Development of a crawler and indexer for an internet forum

Master, Bachelor, TFE, SPP thesis


Project Description

The ADAM project addresses several questions in the field of Web Information Retrieval (Web IR). One of our goals is to extract common patterns from websites (e.g., a forum) and to include this information in an automatically generated document collection (e.g., an FAQ containing frequently asked questions and their answers from a forum).

The goal of the thesis is to lay the groundwork for this idea. A program has to be developed that behaves similarly to a web crawler and indexer but covers only one specific website, e.g., an internet forum. Because the site is fixed, knowledge about its layout and structure can be used to extract only the content part of the web pages. The resulting data will then be subject to further analysis (which is not necessarily part of the thesis, depending on the level of the thesis).
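To make the idea more concrete, the following is a minimal sketch of such a site-restricted crawler and indexer, not a prescribed design. Everything in it is an assumption made for illustration: the class name `ForumCrawlerSketch`, the content markers (`<div class="post">` ... `</div>`), the regular-expression link extraction, and the `fetcher` function that maps a URL to its HTML (so the sketch can run on in-memory pages without network access). A real forum template would require its own markers, an HTML parser, nested-element handling, and error handling for malformed links.

```java
import java.net.URI;
import java.util.*;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ForumCrawlerSketch {
    // Hypothetical layout markers; a real forum's template would dictate these.
    static final String CONTENT_START = "<div class=\"post\">";
    static final String CONTENT_END = "</div>";
    // Naive link extraction, sufficient for a sketch only.
    static final Pattern LINK = Pattern.compile("href=\"([^\"]+)\"");

    /** Inverted index: term -> URLs of the pages whose content contains it. */
    final Map<String, Set<String>> index = new TreeMap<>();

    /** True if both URLs share the same host, so the crawl never leaves the site. */
    static boolean sameSite(String seed, String url) {
        String a = URI.create(seed).getHost();
        String b = URI.create(url).getHost();
        return a != null && a.equalsIgnoreCase(b);
    }

    /** Keeps only the text inside the content markers and strips remaining tags. */
    static String extractContent(String html) {
        StringBuilder text = new StringBuilder();
        int from = 0;
        while (true) {
            int start = html.indexOf(CONTENT_START, from);
            if (start < 0) break;
            start += CONTENT_START.length();
            int end = html.indexOf(CONTENT_END, start);
            if (end < 0) break;
            text.append(html, start, end).append(' ');
            from = end + CONTENT_END.length();
        }
        return text.toString().replaceAll("<[^>]*>", " ").trim();
    }

    /** Breadth-first crawl from the seed; returns the visited URLs in visit order. */
    List<String> crawl(String seed, Function<String, String> fetcher) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>(List.of(seed));
        Deque<String> frontier = new ArrayDeque<>(List.of(seed));
        while (!frontier.isEmpty()) {
            String url = frontier.poll();
            String html = fetcher.apply(url);
            if (html == null) continue; // page could not be fetched
            visited.add(url);
            // Index only the extracted content, not navigation or other layout parts.
            for (String term : extractContent(html).toLowerCase().split("\\W+")) {
                if (!term.isEmpty()) index.computeIfAbsent(term, k -> new TreeSet<>()).add(url);
            }
            // Follow links, but only those that stay on the seed's host.
            Matcher m = LINK.matcher(html);
            while (m.find()) {
                String next = URI.create(url).resolve(m.group(1)).toString();
                if (sameSite(seed, next) && seen.add(next)) frontier.add(next);
            }
        }
        return visited;
    }
}
```

Calling `crawl` with a seed URL and a fetcher (for example `pages::get` on a `Map<String, String>` of test pages) fills `index`, which can then be queried with `index.get("term")`. Passing the fetcher in as a function keeps the crawl logic separate from HTTP access, which makes the extraction and indexing steps easy to test in isolation.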

Starting date & duration

The project should be started as soon as possible. It should run for at least 15 weeks.

Requirements

  • Good programming skills (preferably in Java)
  • Knowledge of concepts and algorithms in Information Retrieval is an advantage (but not a prerequisite)
  • The project (and thesis) language is either English (preferred) or German

Interested? Contact us:

Ralph Weires, PhD Student

Modified: 2006-12-07 17:38:42
Exported: 2011-03-09 11:27:05