OATAO - Open Archive Toulouse Archive Ouverte

Learning to Adaptively Rank Document Retrieval System Configurations

Deveaud, Romain and Mothe, Josiane and Ullah, Md Zia and Nie, Jian-Yun Learning to Adaptively Rank Document Retrieval System Configurations. (2018) ACM Transactions on Information Systems - TOIS, 37 (1). 1-41. ISSN 1046-8188

(Document in English)

PDF (Author's version)

Official URL: https://doi.org/10.1145/3231937


Modern Information Retrieval (IR) systems have become increasingly complex and involve a large number of parameters. For example, a system may choose from a set of possible retrieval models (BM25, language model, etc.) or from various query expansion parameters, whose values greatly influence overall retrieval effectiveness. Traditionally, these parameters are set at the system level based on training queries, and the same parameters are then used for all queries. We observe that it may not be easy to set all these parameters separately, since they can depend on one another. In addition, a global setting may not best fit individual queries with different characteristics; the parameters should instead be set according to these characteristics. In this paper, we propose a novel approach to tackle this problem by dealing with entire system configurations (i.e., a set of parameters representing an IR system's behaviour) instead of selecting a single parameter at a time. The selection of the best configuration is cast as a problem of ranking the possible configurations given a query, and we apply learning-to-rank approaches to this task. We exploit both query features and system configuration features in the learning-to-rank method so that the selection of a configuration is query-dependent. The experiments we conducted on four TREC ad-hoc collections show that this approach can significantly outperform the traditional method of tuning the system configuration globally (i.e., grid search) and leads to higher effectiveness than the top-performing systems of the TREC tracks. We also perform an ablation analysis of the impact of different features on the model's learning capability and show that query expansion features are among the most important for adaptive systems.
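The idea of ranking configurations per query, rather than fixing one configuration globally, can be illustrated with a minimal sketch. It is not the authors' implementation: the data are invented toy values, the single query feature and single configuration feature are hypothetical, and a plain pointwise least-squares model stands in for whichever learning-to-rank algorithms the paper actually uses. All names (`QUERY_FEATURE`, `CONFIG_FEATURE`, `EFFECTIVENESS`, `fit_pointwise`, `rank_configs`, and so on) are assumptions made for illustration.

```python
from statistics import mean

# Invented toy data, NOT taken from the paper.
# One hypothetical feature per training query.
QUERY_FEATURE = {"q1": 0.0, "q2": 0.0, "q3": 1.0}
# One hypothetical feature per configuration (1.0 = query expansion enabled).
CONFIG_FEATURE = {"no_expansion": 0.0, "expansion": 1.0}
# Invented effectiveness of each configuration on each training query.
EFFECTIVENESS = {
    "q1": {"no_expansion": 0.8, "expansion": 0.2},
    "q2": {"no_expansion": 0.8, "expansion": 0.2},
    "q3": {"no_expansion": 0.2, "expansion": 0.8},
}


def global_best_config(effectiveness):
    """Global baseline: the single configuration with the best mean score."""
    configs = next(iter(effectiveness.values())).keys()
    return max(
        configs,
        key=lambda c: mean(per_query[c] for per_query in effectiveness.values()),
    )


def joint_features(q, c):
    """Bias, query feature, configuration feature, and their interaction."""
    return [1.0, q, c, q * c]


def predict(weights, q, c):
    """Predicted effectiveness of a configuration for a query."""
    return sum(w * x for w, x in zip(weights, joint_features(q, c)))


def fit_pointwise(effectiveness, lr=0.1, epochs=5000):
    """Pointwise learning to rank: least squares by full-batch gradient descent."""
    examples = [
        (joint_features(QUERY_FEATURE[q], CONFIG_FEATURE[c]), score)
        for q, per_query in effectiveness.items()
        for c, score in per_query.items()
    ]
    weights = [0.0] * 4
    for _ in range(epochs):
        grad = [0.0] * 4
        for x, y in examples:
            err = sum(w * xi for w, xi in zip(weights, x)) - y
            for i, xi in enumerate(x):
                grad[i] += err * xi / len(examples)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


def rank_configs(weights, q):
    """Rank all configurations for a given query feature value, best first."""
    return sorted(
        CONFIG_FEATURE,
        key=lambda c: predict(weights, q, CONFIG_FEATURE[c]),
        reverse=True,
    )


WEIGHTS = fit_pointwise(EFFECTIVENESS)
```

On this toy data, `global_best_config(EFFECTIVENESS)` selects one configuration for every query, whereas `rank_configs(WEIGHTS, q)[0]` can differ with the query feature `q`. The interaction term in `joint_features` is what lets the model prefer different configurations for different queries.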

Item Type: Article
Additional Information: Thanks to the ACM editor. This paper appears in ACM Transactions on Information Systems, ISSN 1046-8188. The original PDF is available at: https://dl.acm.org/citation.cfm?id=3231937
HAL Id: hal-02092955
Audience (journal): International peer-reviewed journal
Uncontrolled Keywords:
Institution: French research institutions > Centre National de la Recherche Scientifique - CNRS (FRANCE)
Université de Toulouse > Institut National Polytechnique de Toulouse - Toulouse INP (FRANCE)
Other partners > Université de Montréal - UdeM (CANADA)
Université de Toulouse > Université Toulouse III - Paul Sabatier - UT3 (FRANCE)
Université de Toulouse > Université Toulouse - Jean Jaurès - UT2J (FRANCE)
Université de Toulouse > Université Toulouse 1 Capitole - UT1 (FRANCE)
Laboratory name:
ANR Agence nationale de la recherche
Deposited On: 01 Apr 2019 09:40
