OATAO - Open Archive Toulouse Archive Ouverte

Ontology-based Flexible Topic Classification of Crowdsourcing Textual Resources

Dumitrescu, Stefan Daniel and Trausan Matu, Stefan and Brut, Mihaela and Sèdes, Florence Ontology-based Flexible Topic Classification of Crowdsourcing Textual Resources. (2013) In: 5th International Conference on Management of Emergent Digital EcoSystems (MEDES 2013), 28 October 2013 - 31 October 2013 (Luxembourg, Luxembourg).

(Document in English)

PDF (Publisher's version) - Depositor and staff only - 320kB

Official URL: http://dx.doi.org/10.1145/2536146.2536172

Abstract

The paper presents a solution to the problem of making use, in different contexts and by different stakeholders, of the new time-stamped documents produced by social Web sites (including news, blog entries, and uploaded documents). The core of the solution is an ontology-based method for expressing topics of interest and automatically classifying documents against them. For such textual content obtained in real time, we propose an unsupervised text classification system based on the general YAGO ontology, graph algorithms, and a custom scoring method. The system shows good performance using only ontology information and the ontology structure itself. We compare our system against a classic SVM-based (Support Vector Machine) text classification approach. To determine the relevance of a specific document to a specific topic, our approach builds and compares the ontology subgraphs corresponding to the query and to the document. This gives high flexibility in reusing already classified documents when the topic of interest is refined or changed: a graph-based matching of the previously obtained ontology-based document representation against the new query representation is enough to assess the document's relevance.
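
The matching idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy ontology stands in for YAGO, the names `TOY_ONTOLOGY`, `build_subgraph`, and `relevance` are hypothetical, and the Jaccard overlap of node sets is a simple stand-in, not the paper's custom scoring method, which this record does not describe.

```python
from collections import deque

# Hypothetical toy ontology: concept -> broader concepts.
# This is illustrative data only, NOT taken from YAGO.
TOY_ONTOLOGY = {
    "flood": ["natural_disaster"],
    "earthquake": ["natural_disaster"],
    "natural_disaster": ["event"],
    "election": ["political_event"],
    "political_event": ["event"],
    "event": [],
}


def build_subgraph(seeds, ontology, max_depth=2):
    """Breadth-first expansion from seed concepts toward broader concepts.

    Returns (nodes, edges) reachable within max_depth hops of any seed.
    """
    nodes, edges = set(), set()
    queue = deque((s, 0) for s in seeds if s in ontology)
    while queue:
        concept, depth = queue.popleft()
        if concept in nodes:
            continue
        nodes.add(concept)
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for parent in ontology.get(concept, []):
            edges.add((concept, parent))
            queue.append((parent, depth + 1))
    return nodes, edges


def relevance(doc_nodes, query_nodes):
    """Jaccard overlap of two node sets (assumed stand-in scoring)."""
    union = doc_nodes | query_nodes
    return len(doc_nodes & query_nodes) / len(union) if union else 0.0


# The document representation is computed once...
doc_nodes, _ = build_subgraph(["flood"], TOY_ONTOLOGY)

# ...and can be matched against any number of query representations.
query_a, _ = build_subgraph(["earthquake"], TOY_ONTOLOGY)
query_b, _ = build_subgraph(["election"], TOY_ONTOLOGY)

print(relevance(doc_nodes, query_a))  # 0.5
print(relevance(doc_nodes, query_b))  # 0.2
```

In this sketch, changing the topic only requires building a new query subgraph; the stored document subgraph is reused as is, which mirrors the flexibility the abstract claims.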

Item Type: Conference or Workshop Item (Paper)
Additional Information: ISBN: 978-1-4503-2004-7. The original PDF is available at: http://dl.acm.org/citation.cfm?id=2536172
Audience (conference): International conference proceedings
Uncontrolled Keywords:
Institution: French research institutions > Centre National de la Recherche Scientifique - CNRS (FRANCE)
Université de Toulouse > Institut National Polytechnique de Toulouse - INPT (FRANCE)
Other partners > Thales (FRANCE)
Université de Toulouse > Université Toulouse III - Paul Sabatier - UPS (FRANCE)
Université de Toulouse > Université Toulouse - Jean Jaurès - UT2J (FRANCE)
Université de Toulouse > Université Toulouse 1 Capitole - UT1 (FRANCE)
Other partners > University Politehnica of Bucharest (ROMANIA)
Other partners > Romanian Academy (ROMANIA)
Laboratory name:
Deposited By: IRIT IRIT
Deposited On: 22 Jul 2015 13:00
