Naipeng Dong



About

Mrs. Naipeng Dong is currently working on her Master Thesis (specialization: Security and Trust) as part of a cooperation project between UL and the University of Shandong, China.

Project

Abstract

Texts are generally written by human authors, each of whom has an individual style. This style may be characterized by many attributes, for example the author’s vocabulary, the use of linguistic features such as hapax legomena (words that occur only once in a text), or the domain to which the text belongs. Statistical parameters, such as the average sentence length or the number of words that occur more than k times, are taken into account as well. In the following, we call these attributes objective attributes.
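The objective attributes listed above can be illustrated with a minimal sketch. The function name `objective_attributes`, the dictionary keys, and the regular-expression tokenization are assumptions made for illustration only; they are not taken from the FINE project.

```python
import re
from collections import Counter


def objective_attributes(text, k=2):
    """Compute a few objective style attributes of a text (illustrative only)."""
    # Crude sentence split on ., ! and ? (an assumption, not a real sentence tokenizer).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased word tokens made of ASCII letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text.lower())
    counts = Counter(words)
    return {
        # Number of distinct words used by the author.
        "vocabulary_size": len(counts),
        # Hapax legomena: words that occur exactly once.
        "hapax_legomena": sum(1 for c in counts.values() if c == 1),
        # Average number of words per sentence.
        "avg_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        # Number of distinct words that occur more than k times.
        "words_more_than_k": sum(1 for c in counts.values() if c > k),
    }


attrs = objective_attributes("The cat sat. The cat ran. A dog barked.", k=1)
```

A real engine would likely need a proper tokenizer and language-specific handling; the sketch only shows how each attribute reduces to a simple count.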

Objective attributes may be sufficient up to a certain level, and they give acceptable results if the text is written in an objective style, stating the facts of a story or describing the structure of an object. However, if the text contains subjective components, for example the author’s opinion, then the objective attributes fail. The Master Thesis is therefore concerned with building a fingerprint engine that takes into account both objective and subjective attributes. A subjective attribute might be, for example, Text-Contains-A-Belief or Text-Contains-A-Desire. Assume that we have a text like


Meeting today with Palestinian leaders in the West Bank, I believe that President Bush
will say that a Middle East peace treaty would be signed.



then we can extract subjective attributes like Text-Contains-A-Belief = yes and Text-Contains-A-Desire = no.
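As a rough illustration of how such subjective attributes could be extracted, the sketch below uses simple cue-word matching. This technique, the cue lists `BELIEF_CUES` and `DESIRE_CUES`, and the function name `subjective_attributes` are all assumptions made for illustration; the abstract does not say how the engine detects beliefs or desires.

```python
import re

# Hypothetical cue-word prefixes; a real engine would need a far richer model.
BELIEF_CUES = ("believe", "think", "suppose", "assume")
DESIRE_CUES = ("want", "wish", "hope", "desire")


def subjective_attributes(text):
    """Flag beliefs and desires by prefix-matching cue words (illustrative only)."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))

    def has_cue(cues):
        # "yes" if any token starts with one of the cue prefixes, else "no".
        found = any(token.startswith(cue) for token in tokens for cue in cues)
        return "yes" if found else "no"

    return {
        "Text-Contains-A-Belief": has_cue(BELIEF_CUES),
        "Text-Contains-A-Desire": has_cue(DESIRE_CUES),
    }


example = (
    "Meeting today with Palestinian leaders in the West Bank, I believe that "
    "President Bush will say that a Middle East peace treaty would be signed."
)
result = subjective_attributes(example)
```

Prefix matching is deliberately crude: it catches inflected forms such as "believes" but also produces false positives, which is one reason a dedicated subjective analysis would be needed.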

The aim of FINE is to serve as a prototype showing how such a subjective-objective engine works. The input must be texts from the selected domain; the output is a vector representing both the objective and the subjective attributes (the vector representation, the similarity measure, and the comparison itself must all be defined). Ideally, the vectors can be compared and a distance or similarity calculated, so that FINE may then suggest another text that is close to a given one.
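Since the similarity measure is left open, the following sketch assumes cosine similarity over plain numeric vectors, purely as one possible choice. The names `cosine_similarity` and `closest_text`, and the toy vectors, are hypothetical and not part of FINE.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Convention chosen here: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def closest_text(query, candidates):
    """Return the name of the candidate vector most similar to the query."""
    return max(candidates, key=lambda name: cosine_similarity(query, candidates[name]))


# Toy fingerprints: two objective values followed by one subjective flag (1 = yes).
fingerprints = {
    "text_a": [0.8, 0.1, 1.0],
    "text_b": [0.1, 0.9, 0.0],
}
suggestion = closest_text([0.9, 0.2, 1.0], fingerprints)
```

In practice the attributes would probably need to be scaled to comparable ranges first, since a raw count such as vocabulary size would otherwise dominate a yes/no flag.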


Modified: 2008-01-14 11:25:16