OATAO - Open Archive Toulouse Archive Ouverte Open Access Week

Enhancing Information Retrieval Through Concept-Based Language Modeling and Semantic Smoothing

Said Lhadj, Lynda and Boughanem, Mohand and Amrouche, Karima Enhancing Information Retrieval Through Concept-Based Language Modeling and Semantic Smoothing. (2015) Journal of the Association for Information Science and Technology, 67 (12). 2909-2927. ISSN 2330-1635

[img]
Preview
(Document in English)

PDF (Author's version) - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
814kB

Official URL: https://doi.org/10.1002/asi.23553

Abstract

Traditionally, many information retrieval models assume that terms occur in documents independently. Although these models have already shown good performance, the word independency assumption seems to be unrealistic from a natural language point of view, which considers that terms are related to each other. Therefore, such an assumption leads to two well‐known problems in information retrieval (IR), namely, polysemy, or term mismatch, and synonymy. In language models, these issues have been addressed by considering dependencies such as bigrams, phrasal‐concepts, or word relationships, but such models are estimated using simple n‐grams or concept counting. In this paper, we address polysemy and synonymy mismatch with a concept‐based language modeling approach that combines ontological concepts from external resources with frequently found collocations from the document collection. In addition, the concept‐based model is enriched with subconcepts and semantic relationships through a semantic smoothing technique so as to perform semantic matching. Experiments carried out on TREC collections show that our model achieves significant improvements over a single word‐based model and the Markov Random Field model (using a Markov classifier).

Item Type:Article
Audience (journal):International peer-reviewed journal
Uncontrolled Keywords:
Institution:French research institutions > Centre National de la Recherche Scientifique - CNRS (FRANCE)
Other partners > Ecole nationale supérieure d'informatique - ESI (ALGERIA)
Université de Toulouse > Institut National Polytechnique de Toulouse - INPT (FRANCE)
Université de Toulouse > Université Toulouse III - Paul Sabatier - UPS (FRANCE)
Université de Toulouse > Université Toulouse - Jean Jaurès - UT2J (FRANCE)
Université de Toulouse > Université Toulouse 1 Capitole - UT1 (FRANCE)
Laboratory name:
Statistics:download
Deposited By: IRIT IRIT
Deposited On:14 Feb 2019 09:20

Repository Staff Only: item control page