|Title||Identifying Palestinian Political Content from Arabic Tweets|
|Title in Arabic||تعريف المحتوى الفلسطيني السياسي من التغريدات العربية|
Twitter has become widely used by news agencies and media, so a considerable proportion of tweets reflects social perspectives in the real world. Many people also follow the news on Twitter, which motivates news agencies to analyze what is happening there. Press and media agencies are looking for efficient tools to analyze and classify tweets because manual approaches are difficult and costly. Various works have proposed effective solutions for processing tweets into formats that machines can classify and analyze. While such research has received much attention for languages such as English, other languages such as Arabic have received little, despite the wide use of Twitter in the Arab world in general and in Palestine in particular. In this thesis we propose a machine learning approach to automatically classify Arabic tweets related to Palestinian political topics. We focus on Palestinian political topics because Palestine receives great attention in Arab news and social media. The approach begins by collecting tweets with an application that we develop based on TwitterPalPol. It collects tweets from different Twitter APIs according to specific factors such as keywords, region, and language. We then process the collected tweets and manually label part of them as Palestinian political or not Palestinian political, to serve as the training data set for the selected machine learning algorithm, which is then used to classify new tweets automatically. In addition, we create two data sets for learning: the first includes all the collected tweets prepared for learning, and the second includes only tweets that pass a credibility filter, which we created to evaluate the credibility of each tweet and ignore fake tweets.
The filter depends on many factors related to tweet properties. We compare the classification results on both data sets to determine the importance of the filter. The results were satisfactory, ranging between 80% and 97% on the main classification measures such as recall and precision.
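For readers unfamiliar with the two measures named above, the following is a minimal sketch of how precision and recall can be computed for a binary tweet classifier. It is illustrative only and is not the thesis's implementation: the function name `precision_recall`, the label strings, and the sample labels are all hypothetical, and the abstract does not specify which algorithm or tooling was used.

```python
def precision_recall(true_labels, predicted_labels, positive="political"):
    """Compute precision and recall for one positive class.

    Hypothetical helper for illustration; labels are plain strings.
    """
    tp = fp = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if guess == positive and truth == positive:
            tp += 1  # correctly flagged as political
        elif guess == positive:
            fp += 1  # flagged as political, but manually labeled otherwise
        elif truth == positive:
            fn += 1  # political tweet that the classifier missed
    # Guard against division by zero when a class never appears.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up manual labels and classifier outputs for five tweets.
manual = ["political", "political", "other", "other", "political"]
predicted = ["political", "other", "other", "political", "political"]
precision, recall = precision_recall(manual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Running the same computation once on the unfiltered data set and once on the credibility-filtered data set would give the kind of side-by-side comparison the abstract describes.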
|Publisher||الجامعة الإسلامية - غزة|