OATAO - Open Archive Toulouse Archive Ouverte

Combining compound and single terms under language model framework

Hammache, Arezki and Boughanem, Mohand and Ahmed-Ouamar, Rachid. Combining compound and single terms under language model framework. (2014) Knowledge and Information Systems, 39 (2). 329-349. ISSN 0219-1377

(Document in English)

PDF (Author's version), 369kB

Official URL: http://dx.doi.org/10.1007/s10115-013-0618-x

Abstract

Most existing Information Retrieval models, including probabilistic and vector space models, are based on the term independence hypothesis. To go beyond this assumption and thereby capture the semantics of documents and queries more accurately, several works have incorporated phrases or other syntactic information into IR; such attempts have shown slight benefit at best. In language modeling approaches in particular, this extension is achieved through the use of bigram or n-gram models. However, in these models all bigrams/n-grams are considered and weighted uniformly. In this paper we introduce a new approach to select and weight relevant n-grams associated with a document. Experimental results on three TREC test collections showed an improvement over three strong state-of-the-art baselines: the original unigram language model, the Markov Random Field model, and the positional language model.
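
To make the baseline described in the abstract concrete, the following is a minimal sketch of a query-likelihood scorer that mixes unigram and bigram evidence and gives every query bigram the same weight. It is not the method proposed in the paper. The function names, the weighted sum of log-probabilities, the Dirichlet smoothing, and the values of `lam` and `mu` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams (as tuples) of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def dirichlet_prob(count, length, coll_prob, mu):
    """Dirichlet-smoothed probability of one unit (unigram or bigram)."""
    return (count + mu * coll_prob) / (length + mu)


def score(query, doc, collection, lam=0.8, mu=10.0):
    """Weighted sum of smoothed log-probabilities of query unigrams and bigrams.

    lam and mu are illustrative values, not taken from the paper.
    Every query bigram receives the same weight (1 - lam), which is the
    uniform treatment the abstract attributes to bigram/n-gram models.
    """
    total = 0.0
    for n, weight in ((1, lam), (2, 1.0 - lam)):
        doc_units = Counter(ngrams(doc, n))
        coll_units = Counter(g for d in collection for g in ngrams(d, n))
        doc_len = sum(doc_units.values())
        coll_len = sum(coll_units.values())
        for unit in ngrams(query, n):
            # A small floor avoids log(0) for units unseen in the collection.
            coll_prob = max(coll_units[unit] / coll_len, 1e-9) if coll_len else 1e-9
            prob = dirichlet_prob(doc_units[unit], doc_len, coll_prob, mu)
            total += weight * math.log(prob)
    return total
```

With `lam=1.0` the scorer reduces to a plain unigram model and ignores word order. With `lam < 1.0` a document containing the query words as an adjacent pair scores higher than one containing the same words in a different order. A selective approach of the kind the abstract proposes would replace the single uniform weight with a per-n-gram decision about which n-grams to keep and how much each should count.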

Item Type: Article
Additional Information: Thanks to Springer editor. This paper appears in volume 39 of Knowledge and Information Systems (ISSN: 0219-1377). The original PDF is available at: http://link.springer.com/article/10.1007%2Fs10115-013-0618-x
HAL Id: hal-01282933
Audience (journal): International peer-reviewed journal
Institution: French research institutions > Centre National de la Recherche Scientifique - CNRS (FRANCE)
Université de Toulouse > Institut National Polytechnique de Toulouse - INPT (FRANCE)
Université de Toulouse > Université Toulouse III - Paul Sabatier - UPS (FRANCE)
Université de Toulouse > Université Toulouse - Jean Jaurès - UT2J (FRANCE)
Université de Toulouse > Université Toulouse 1 Capitole - UT1 (FRANCE)
Other partners > Université Mouloud Mammeri Tizi Ouzou - UMMTO (ALGERIA)
Deposited By: IRIT
Deposited On: 19 Feb 2016 14:17
