OATAO - Open Archive Toulouse Archive Ouverte

Behavioural account-based features for filtering out social spammers in large-scale twitter data collections

Washha, Mahdi and Mezghani, Manel and Sèdes, Florence. Behavioural account-based features for filtering out social spammers in large-scale twitter data collections. (2017) Ingénierie des Systèmes d'Information, 22 (3). 65-88. ISSN 1633-1311

(Document in English)

PDF (Author's version) - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader

Official URL: http://www.iieta.org/journals/isi/paper/10.3166/ISI.22.3.65-88


Online social networks (OSNs) have become an important source of information for a wide range of applications and research. However, the high usability and accessibility of OSNs have exposed many information quality (IQ) problems, which in turn degrade the performance of OSN-dependent applications. Social spammers are a particular kind of ill-intentioned user who degrade the quality of OSN information by misusing all the services that OSNs provide. Since Twitter is not immune to the social spam problem, researchers have designed various methods for detecting spam content. However, tweet-based detection methods are not effective at detecting spam content because spam is dynamic and evolves quickly. Moreover, robust account-based features are costly to extract because they require a huge volume of data from Twitter’s servers, while most other account-based features do not model the behavior of social spammers. Hence, in this paper, we introduce 10 new robust behavioral account-based features for filtering out spam accounts in large-scale "crawled" Twitter data collections. Our features focus on modeling the behavior of social spammers, such as the time correlation among tweets. The experimental results show that our new behavioral features correctly classify the majority of social spammers (spam accounts), outperforming 75 account-based features designed in the literature.

Item Type:Article
Additional Information:Thanks to the publisher, Lavoisier. This paper appears in volume 22 of Ingénierie des Systèmes d'Information, ISSN 1633-1311. The original PDF is available at: http://www.iieta.org/journals/isi
HAL Id:hal-02548073
Audience (journal):National peer-reviewed journal
Uncontrolled Keywords:
Institution:Université de Toulouse > Institut National Polytechnique de Toulouse - Toulouse INP (FRANCE)
French research institutions > Centre National de la Recherche Scientifique - CNRS (FRANCE)
Université de Toulouse > Université Toulouse III - Paul Sabatier - UT3 (FRANCE)
Université de Toulouse > Université Toulouse - Jean Jaurès - UT2J (FRANCE)
Université de Toulouse > Université Toulouse 1 Capitole - UT1 (FRANCE)
Laboratory name:
Deposited On:08 Apr 2020 07:22
