IJEME Vol. 13, No. 5, 8 Oct. 2023
Keywords: SQuAD, AQAD, Machine Reading Comprehension, Question Answering
Machine Reading Comprehension (MRC), the ability of computers to read and understand unstructured text and then answer questions about it, remains an open research field. MRC is considered one of the most research-demanding sub-tasks in Natural Language Processing (NLP) and Natural Language Understanding (NLU), and it introduces multiple research challenges. One challenge is that models should be trained both to answer questions and to abstain from answering when the answer is not covered in the given context. Another lies in dataset availability. These challenges are amplified for non-Latin-script languages such as Arabic. Currently available Arabic MRC datasets are either small, high-quality collections or large, low-quality ones, and they do not include unanswerable questions. This lack of resources leaves models unfit for real-world deployment. To tackle these challenges, this paper proposes a novel large, high-quality Arabic MRC dataset that includes unanswerable questions, named "Arabic-SQuADv2.0". The dataset consists of 96,051 {question, context, answer} triplets, in an attempt to enrich the field of Arabic MRC. Furthermore, a Machine Learning (ML)-based model is introduced that effectively solves Arabic MRC with unanswerable questions. The results of the proposed model are satisfactory and comparable with those of Latin-script language models, and they show a significant improvement over the current state of the art in Arabic MRC: the model scores 71.49 F1 and 65.12 Exact Match (EM). The proposed dataset and implementation pave the way for further Arabic MRC research, aiming to reach a state in which MRC models can mimic human text reasoning.
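To make the dataset structure concrete, the following is a minimal sketch of what a single record might look like, assuming Arabic-SQuADv2.0 mirrors the SQuAD v2.0 JSON schema its name suggests; the field names ("answers", "is_impossible") are taken from SQuAD v2.0 and are an assumption, not confirmed by the abstract.

# Hypothetical single record, assuming the SQuAD v2.0 schema
# ("answers", "is_impossible"); not confirmed by the abstract.
record = {
    "context": "…",   # Arabic passage the question is asked against
    "question": "…",  # Arabic question about the passage
    "answers": [
        {"text": "…", "answer_start": 42},  # empty list when unanswerable
    ],
    "is_impossible": False,  # True for unanswerable questions
}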
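The reported F1 and Exact Match figures follow the standard SQuAD-style span evaluation. Below is a simplified sketch of how the two metrics are typically computed, assuming whitespace tokenization and no Arabic-specific normalization; the official SQuAD v2.0 script additionally normalizes text and handles unanswerable questions by comparing against an empty gold answer.

from collections import Counter

def exact_match(prediction: str, gold: str) -> float:
    # EM: 1.0 only when the predicted span equals the gold span exactly.
    return float(prediction.strip() == gold.strip())

def f1(prediction: str, gold: str) -> float:
    # Token-level F1: harmonic mean of precision and recall
    # over the tokens shared by the prediction and the gold answer.
    pred_tokens, gold_tokens = prediction.split(), gold.split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)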
Zeyad Ahmed, Mariam Zeyada, Youssef Amin, Donia Gamal, Hanan Hindy, "Introducing Arabic-SQuADv2.0 for Effective Arabic Machine Reading Comprehension", International Journal of Education and Management Engineering (IJEME), Vol.13, No.5, pp. 34-42, 2023. DOI:10.5815/ijeme.2023.05.03