Lexicon annotation in sentiment analysis for dialectal Arabic: Consensus Expert Standardized Criteria
Abstract
Sentiment analysis (SA), a task in natural language processing (NLP), analyzes the perceptions, attitudes, and emotions expressed in text. It is crucial for decision-making and consumer insights. Recent studies focus on developing lexicons for SA research, and understanding how existing lexicons are constructed and evaluated is key to advancing these efforts. Evaluation and benchmarking of lexicons are vital for identifying the most suitable ones and establishing best practices. Factors such as effectiveness and importance must be considered when building or selecting lexicons. This research outlines three key phases: determining lexicons, identifying evaluation criteria, and engaging experts. The study aims to enhance understanding of lexicon development processes and improve future guidelines. Lexicon development benefits from a structured approach that considers multiple evaluation criteria, and the research emphasizes the importance of expert input in refining lexicons for optimal performance. Evaluating lexicons against these criteria helps identify gaps and areas for improvement in sentiment analysis tools, while benchmarking different lexicons aids in selecting the most appropriate ones for specific applications or domains. Establishing best practices in lexicon development involves thorough evaluation against predefined criteria to ensure quality and reliability. Expert opinions play a crucial role in validating the significance of developed lexicons for sentiment analysis tasks. The research methodology involves the systematic identification of lexicons, relevant criteria, and experts to inform best practices in the field of sentiment analysis. By focusing on these three key phases, this study aims to contribute valuable insights into enhancing sentiment analysis through improved lexicon development processes.
This work is licensed under a Creative Commons Attribution 4.0 International License.