OATAO - Open Archive Toulouse Archive Ouverte

Information Quality in Social Networks: Predicting Spammy Naming Patterns for Retrieving Twitter Spam Accounts

Washha, Mahdi and Qaroush, Aziz and Mezghani, Manel and Sèdes, Florence. Information Quality in Social Networks: Predicting Spammy Naming Patterns for Retrieving Twitter Spam Accounts. (2017) In: 19th International Conference on Enterprise Information Systems (ICEIS 2017), 26 April 2017 - 29 April 2017 (Porto, Portugal).

(Document in English)

PDF (Author's version), 1MB

Official URL: http://dx.doi.org/10.5220/0006314006100622

Abstract

The popularity of social networks depends mainly on the integrity and quality of user-generated content, as well as on the preservation of users’ privacy. More precisely, Twitter data (e.g. tweets) are valuable for a wide range of applications, such as search engines and recommendation systems, in which working with high-quality information is a compulsory step. However, the existence of ill-intentioned users on Twitter makes it challenging to maintain an acceptable level of data quality. Spammers are a concrete example of ill-intentioned users. Indeed, they have misused all services provided by Twitter to post spam content, which consequently leads to serious problems such as polluted search results. As a natural reaction, various detection methods have been designed that inspect individual tweets or accounts for the existence of spam. In the context of large collections of Twitter users, applying these conventional methods is time consuming, requiring months to filter out the spam accounts in such collections. Moreover, the Twitter community cannot apply them either randomly or sequentially to each registered user because of the dynamic nature of the Twitter network. Consequently, these limitations raise the need to make the detection process more systematic and faster. Complementary to the conventional detection methods, our proposal takes the collective perspective of users (or accounts) to provide searchable information for retrieving accounts that have a high potential for being spam. We provide a design of an unsupervised automatic method to predict spammy naming patterns, as searchable information, used in naming spam accounts. Our experimental evaluation demonstrates the efficiency of predicting spammy naming patterns to retrieve spam accounts in terms of precision, recall, and normalized discounted cumulative gain at different ranks.

Item Type: Conference or Workshop Item (Paper)
Additional Information: Thanks to Elsevier editor. This paper appears in Proceedings of the 19th International Conference on Enterprise Information Systems, ISBN 978-989-758-248-6. The definitive version is available at: http://www.sciencedirect.com The original PDF of the article can be found at: http://www.scitepress.org/DigitalLibrary/Link.aspx?doi=10.5220/0006314006100622
HAL Id: hal-01809318
Audience (conference): International conference proceedings
Institution: French research institutions > Centre National de la Recherche Scientifique - CNRS (FRANCE)
Université de Toulouse > Institut National Polytechnique de Toulouse - INPT (FRANCE)
Université de Toulouse > Université Toulouse III - Paul Sabatier - UPS (FRANCE)
Université de Toulouse > Université Toulouse - Jean Jaurès - UT2J (FRANCE)
Université de Toulouse > Université Toulouse 1 Capitole - UT1 (FRANCE)
Other partners > Birzeit University - BZU (PALESTINE)
Deposited By: IRIT IRIT
Deposited On: 04 May 2018 13:18
