Ahmed Abdelali

Workplace: Qatar Computing Research Institute, Qatar

E-mail: aabdelali@qf.org.qa

Research Interests: Natural Language Processing

Biography

Ahmed Abdelali is a Senior Engineer in the Arabic Language Technologies group at QCRI. His research interests include machine translation and Arabic text processing.

He received his PhD degree in computer science from the New Mexico Institute of Mining and Technology, USA, in 2006.

Author Articles
A Semi-Automatic and Low Cost Approach to Build Scalable Lemma-based Lexical Resources for Arabic Verbs

By Noureddine Doumi, Ahmed Lehireche, Denis Maurel, Ahmed Abdelali

DOI: https://doi.org/10.5815/ijitcs.2016.02.01, Pub. Date: 8 Feb. 2016

This work presents a method that enables the Arabic NLP community to build scalable lexical resources. The proposed method is low cost, time efficient, scalable, and extensible. Its extensibility is reflected in its ability to work incrementally in both processing resources and generating lexicons. The method starts from a corpus: first, tokens are drawn from the corpus and lemmatized. Second, finite state transducers (FSTs) are generated semi-automatically. Finally, the FSTs are used to produce all possible inflected verb forms with their full morphological features. One strength of the algorithm is its ability to generate transducers with 184 transitions, which would be very cumbersome to design manually. A second strength is a new inflection scheme for Arabic verbs, which increases the efficiency of the FST generation algorithm. The experiments use a representative corpus of Modern Standard Arabic. The number of semi-automatically generated transducers is 171. The resulting open lexical resources have high coverage: they cover more than 70% of Arabic verbs. The built resources contain 16,855 verb lemmas and 11,080,355 fully vocalized, partially vocalized, and unvocalized inflected verb forms. All these resources are being made public and are currently used as an open package in the Unitex framework, available under the LGPL license.
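
The final step described in the abstract, using a transducer to enumerate inflected forms together with their morphological features, can be illustrated with a small sketch. This is a toy example only: the state names, feature tags, transliterated stem, and the `generate` helper are assumptions made for illustration, and they do not reproduce the paper's transducers, its inflection scheme, or the Unitex format.

```python
from typing import Dict, List, Tuple

# Toy transducer: state -> list of (surface output, feature tag, next state).
# States, tags, and the transliterated suffixes are illustrative assumptions,
# not the transducers or the inflection scheme from the paper.
Transition = Tuple[str, str, str]

TOY_FST: Dict[str, List[Transition]] = {
    "start": [("{stem}", "V+Perf", "suffix")],
    "suffix": [
        ("a", "+3+Masc+Sg", "end"),    # kataba  "he wrote"
        ("at", "+3+Fem+Sg", "end"),    # katabat "she wrote"
        ("tu", "+1+Sg", "end"),        # katabtu "I wrote"
        ("naa", "+1+Pl", "end"),       # katabnaa "we wrote"
    ],
    "end": [],
}


def generate(
    fst: Dict[str, List[Transition]],
    stem: str,
    state: str = "start",
    surface: str = "",
    tags: str = "",
) -> List[Tuple[str, str]]:
    """Enumerate every path from `state` to the final state "end"."""
    if state == "end":
        return [(surface, tags)]
    results: List[Tuple[str, str]] = []
    for output, tag, next_state in fst[state]:
        # "{stem}" is a placeholder filled with the stem; other outputs are literal.
        piece = output.format(stem=stem)
        results.extend(generate(fst, stem, next_state, surface + piece, tags + tag))
    return results


forms = generate(TOY_FST, "katab")
for surface_form, features in forms:
    print(surface_form, features)
```

Each path through the toy transducer yields one inflected form paired with its feature string. A transducer with many more transitions, like those the abstract mentions, would be enumerated in the same way.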

Other Articles