OATAO - Open Archive Toulouse Archive Ouverte

Learning to Choose the Best System Configuration in Information Retrieval: the case of repeated queries

Bigot, Anthony and Déjean, Sébastien and Mothe, Josiane. Learning to Choose the Best System Configuration in Information Retrieval: the case of repeated queries. (2015) Journal of Universal Computer Science, 21 (13). 1726-1745. ISSN 0948-695X

(Document in English)

Official URL: http://dx.doi.org/10.3217/jucs-021-13-1726

Abstract

This paper presents a method that automatically decides which system configuration should be used to process a query. The method is developed for the case of repeated queries and implements a new kind of meta-system. It is based on a training process: the meta-system learns the best system configuration to use on a per-query basis. After training, the meta-system knows which configuration should process a given query. The Learning to Choose method we developed selects the best configurations among many. This selection rests on data analytics applied to system parameter values and their link with system effectiveness. Moreover, we optimize the parameters on a per-query basis. The training phase uses a limited number of document relevance judgments. When the query is repeated or when an equal query is submitted to the system, the meta-system automatically knows which parameters it should use to process the query. This method fits the case of changing collections, since what is learnt is the relationship between a query and the best parameters to use to process it, rather than the relationship between a query and the documents to retrieve. In this paper, we describe how data analysis can help to select, among various configurations, the ones that will be useful. The Learning to Choose method is presented and evaluated using simulated data from TREC campaigns. We show that system performance increases substantially in terms of precision, specifically for queries that are difficult or moderately difficult to answer. The other parameters of the method are also studied.
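
The core idea of the abstract, learning which configuration works best for each query and reusing that choice when the query comes back, can be illustrated with a minimal sketch. This is not the authors' implementation: the configuration names, the effectiveness scores, the `normalize` rule for deciding that two queries are equal, and the fallback to a default configuration are all assumptions made for illustration.

```python
from typing import Dict

def normalize(query: str) -> str:
    """Assumption: two queries are 'equal' if they contain the same terms."""
    return " ".join(sorted(query.lower().split()))

def learn_best_config(scores: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Map each normalized training query to its highest-scoring configuration."""
    return {
        normalize(query): max(by_config, key=by_config.get)
        for query, by_config in scores.items()
    }

def choose_config(query: str, learned: Dict[str, str], default: str) -> str:
    """Return the learned configuration for a repeated query, else a default."""
    return learned.get(normalize(query), default)

# Hypothetical training data: effectiveness (e.g. average precision) of each
# configuration on each training query. Names and values are illustrative only.
training_scores = {
    "information retrieval evaluation": {
        "bm25_no_expansion": 0.21,
        "lm_with_expansion": 0.34,
    },
    "open access archive": {
        "bm25_no_expansion": 0.40,
        "lm_with_expansion": 0.28,
    },
}

learned = learn_best_config(training_scores)
```

Calling `choose_config("Evaluation information retrieval", learned, "bm25_no_expansion")` would return the configuration learned for the matching training query. Because only the query-to-configuration mapping is stored, not the retrieved documents, the same mapping can in principle be reused when the collection changes.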

Item Type: Article
Additional Information: The original PDF can be found at: http://www.jucs.org/jucs_21_13/learning_to_choose_the
HAL Id: hal-01592024
Audience (journal): International peer-reviewed journal
Institution: Université de Toulouse > Institut National Polytechnique de Toulouse - INPT (FRANCE)
French research institutions > Centre National de la Recherche Scientifique - CNRS (FRANCE)
Université de Toulouse > Université Toulouse III - Paul Sabatier - UPS (FRANCE)
Université de Toulouse > Université Toulouse - Jean Jaurès - UT2J (FRANCE)
Université de Toulouse > Université Toulouse 1 Capitole - UT1 (FRANCE)
Deposited By: IRIT
Deposited On: 15 Sep 2017 07:36
