A Semi-Automatic Annotation with a Novel Deep Learning Model for Terrorism Tweets
Keywords: classification, deep learning, annotation, labeling, XLM-R Large
Abstract: Terrorist groups increasingly turn to social networks, Twitter in particular, to promote their operations. To recruit new members, these groups spread radical propaganda through the content they post. Sentiment-based natural language processing methods are widely used in practice to preprocess such posts or tweets. A semi-automatic annotation approach was used: an initial set of labels was created automatically with a Zero-Shot classifier (ZSc), which assigns text to categories, after which randomly selected tweets were manually verified. The labeled dataset was then used to train the model. This paper presents a terrorism-content classification system that uses the Cross-lingual Language Model RoBERTa Large (XLM-R Large) to categorize Arabic and English Twitter content using structured, labeled data. The system, which combines automated labeling with deep learning, achieved an accuracy of 85% and was successful in identifying extremist and terrorist material.
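The semi-automatic annotation step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the system's actual code: the label set, the function names, and the keyword-based `keyword_stub` (a stand-in for the zero-shot classifier so that the sketch runs offline) are all hypothetical.

```python
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical label set; the abstract does not list the categories used.
LABELS = ["terrorism", "non-terrorism"]

Classifier = Callable[[str, List[str]], Tuple[str, float]]


def keyword_stub(text: str, labels: List[str]) -> Tuple[str, float]:
    """Stand-in for a zero-shot classifier (assumption, for illustration only)."""
    hit = any(word in text.lower() for word in ("attack", "bomb"))
    return (labels[0], 0.9) if hit else (labels[1], 0.6)


def auto_label(tweets: List[str],
               classify: Classifier = keyword_stub,
               labels: List[str] = LABELS) -> List[Dict]:
    """Step 1: give every tweet a provisional label and score."""
    rows = []
    for text in tweets:
        label, score = classify(text, labels)
        rows.append({"text": text, "label": label, "score": score})
    return rows


def sample_for_review(rows: List[Dict], k: int, seed: int = 0) -> List[Dict]:
    """Step 2: draw a random subset of tweets for manual verification."""
    rng = random.Random(seed)
    return rng.sample(rows, min(k, len(rows)))


def apply_corrections(rows: List[Dict], corrections: Dict[int, str]) -> List[Dict]:
    """Step 3: replace provisional labels with verified ones, keyed by row index."""
    return [dict(row, label=corrections.get(i, row["label"]))
            for i, row in enumerate(rows)]
```

In a real pipeline, `classify` would presumably wrap an actual zero-shot model (for example, the `zero-shot-classification` pipeline in the Hugging Face `transformers` library), and the corrected rows would then serve as training data for the XLM-R Large classifier.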