On Monday the 11th of July 2011 I defended my PhD. Its title was (in French):
Évaluation sémantique d'informations symboliques : la cotation (Semantic evaluation of symbolic information: scoring)
During my PhD I worked both at the French Aerospace Lab (Office National d'Études et de Recherches Aérospatiales - ONERA) and at the Laboratoire d'Informatique de Paris 6 (LIP6). Below are the abstract (the only part in English), the members of the jury, and the slides from the viva voce and the dissertation, both of which can be downloaded.
Confidence in information should represent how far one can believe in it, how much faith to put in it. Trust is a thriving field of study yet, in general, it tends to measure the quality of the process responsible for producing the information rather than to inform on whether to believe it or not. In the same way that hearing a fact from a trustworthy source is insufficient to fully believe it, automatic evaluation of trust requires a rich model capable of making explicit why what it qualifies should or should not be believed. This is the problem we have tackled in our work.
From a careful study of an existing representation of confidence, we choose to split the problem in two: the encoding of trust, how it is represented, and the rules governing its appraisal, how it is evaluated. We derive the essential dimensions participating in the building of trust from the examination of the prerequisites imposed on the definition of its encoding. We offer a categorisation of these dimensions which clusters the evaluated criteria according to their object and influence and ensures their independence and non-redundancy. We also take great care to ensure the readability of the measures involved in the assessment by expressing them along discrete scales made explicit through the use of linguistic labels.
Now that the dimensions have been selected, we can address the problem of their combination, to model the trust-building process. By proposing a philosophy of dimension integration, we shape the architecture of information scoring. We provide this architecture with a scoring-chain representation which highlights the order in which dimensions are considered and the influence they have on the increase or decrease of the confidence evaluation. We also show how the flexibility of our model can be used to represent different user gullibility postures, an essential adaptability for the modeling of subjective matters.
Once these definitions are set, we propose a theoretical formalisation of the scoring process and of its expression, the score. Drawing on the expressiveness of multi-valued logic, we set our solutions in this formalism. To reintroduce the important distinction between the impossibility of measuring and a neutral measure, we extend this formalism by adding a new truth degree. Within this framework of an extended symbolic logic, we define combination operators to represent all of our proposals and formalise credulity modeling.
We then consider the implementation of our model for the extraction and scoring of symbolic information. We first examine the transposition of information scoring to the problem of knowledge extraction from text. We describe in turn the scoring of information extraction and that of information fusion, examining for both how the scoring dimensions translate. We then develop a prototype to put our model to use. Finally, we apply both model and prototype to a real case of extraction and scoring of a social network from texts.
Herman Akdag (LIP6)
Salem Benferhat (Centre de Recherche en Informatique de Lens - CRIL)
Bernadette Bouchon-Meunier (LIP6)
Philippe Capet (Thalès)
Laurence Cholvy (ONERA - Toulouse)
Michel Goya (Institut de Recherches Stratégiques de l'École Militaire)
Marie-Jeanne Lesot (LIP6)
Olivier Poirel (ONERA - Palaiseau)
Download the slides of the viva voce here, or browse through them below if your browser allows, and the manuscript there. Both are in French.