OATAO - Open Archive Toulouse Archive Ouverte

Extracting hypernym relations from Wikipedia disambiguation pages: comparing symbolic and machine learning approaches

Kamel, Mouna and Trojahn, Cassia and Ghamnia, Adel and Aussenac-Gilles, Nathalie and Fabre, Cécile. Extracting hypernym relations from Wikipedia disambiguation pages: comparing symbolic and machine learning approaches. (2017) In: International Conference on Computational Semantics (IWCS 2017), 19 September 2017 - 22 September 2017 (Montpellier, France).

(Document in English)

PDF (Author's version)


Extracting hypernym relations from text is one of the key steps in the construction and enrichment of semantic resources. Several methods have been proposed in the literature. However, the strengths of each approach on the same corpus are still poorly identified, which makes it hard to take advantage of their complementarity. In this paper, we study how complementary two approaches of different natures are when identifying hypernym relations in a structured corpus containing both well-written text and syntactically poor formulations, together with rich formatting. A symbolic approach based on lexico-syntactic patterns and a statistical approach using a supervised learning method are applied to a sub-corpus of the French Wikipedia composed of disambiguation pages. These pages, particularly rich in hypernym relations, contain both kinds of formulations. We compare the results of each approach taken independently with the performance obtained when combining their individual results. We obtain the best results in the latter case, with an F-measure of 0.75. In addition, 55% of the relations identified by our approach, with respect to a reference corpus, are not expressed in the French DBPedia and could be used to enrich this resource.
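As an illustration of the evaluation mentioned in the abstract, the sketch below scores two sets of extracted (hyponym, hypernym) pairs against a reference set and then scores their combination. It is only a sketch under stated assumptions: the abstract does not say how the two outputs are combined, so a plain set union is assumed here, and the pairs, the variable names and the `prf` helper are invented toy examples, not data or code from the paper.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]  # (hyponym, hypernym)


def prf(predicted: Set[Pair], reference: Set[Pair]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) of predicted pairs against a reference set."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Toy data, invented for illustration only (not taken from the paper).
reference = {
    ("mercure", "planète"),
    ("mercure", "métal"),
    ("jaguar", "félin"),
    ("jaguar", "constructeur automobile"),
    ("orange", "fruit"),
}
pattern_pairs = {("mercure", "planète"), ("jaguar", "félin")}
learned_pairs = {("mercure", "métal"), ("jaguar", "félin"), ("jaguar", "rivière")}

# Assumption: the two outputs are combined by set union.
combined = pattern_pairs | learned_pairs

for name, pairs in [("patterns", pattern_pairs),
                    ("learning", learned_pairs),
                    ("combined", combined)]:
    p, r, f = prf(pairs, reference)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

With a union, recall can only stay the same or rise, while precision may fall if either method contributes wrong pairs. The F-measure reflects that trade-off.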

Item Type: Conference or Workshop Item (Paper)
HAL Id: hal-02355277
Audience (conference): International conference proceedings
Uncontrolled Keywords:
Institution: Université de Toulouse > Institut National Polytechnique de Toulouse - Toulouse INP (FRANCE)
French research institutions > Centre National de la Recherche Scientifique - CNRS (FRANCE)
Université de Toulouse > Université Toulouse III - Paul Sabatier - UT3 (FRANCE)
Université de Toulouse > Université Toulouse - Jean Jaurès - UT2J (FRANCE)
Université de Toulouse > Université Toulouse 1 Capitole - UT1 (FRANCE)
Laboratory name:
Région Occitanie (France)
Deposited On: 08 Nov 2019 09:57
