Challenges and Progress in Constructing Arabic Dialect Corpora and Linguistic Tools: A Focus on Moroccan and Tunisian Dialects
Academic Article
Publication Date:
2023
Short description:
Challenges and Progress in Constructing Arabic Dialect Corpora and Linguistic Tools: A Focus on Moroccan and Tunisian Dialects / Nahli, Ouafae; Gugliotta, Elisa; Khlif, Nadia; Benotto, Giulia. - INTERNATIONAL IEEE CONGRESS ON INFORMATION SCIENCE AND TECHNOLOGY (2023), pp. 293-298. [10.1109/cist56084.2023.10410009]
Abstract:
Given the lack of resources for Arabic dialects, constructing corpora, lexical resources, and tools is a non-trivial challenge. This article describes our in-progress work to address these deficiencies. We start with the Moroccan and Tunisian dialects, for which we provide annotated corpora and corpus-based lexical resources. We also aim to extend an existing morphological engine with linguistic resources built ad hoc for each dialect. In addition, we are developing a component integrated into the morphological engine to better address linguistic and sociolinguistic characteristics while preserving the integrity of dialectal texts.
Iris type:
1.1 Journal article
Keywords:
Arabic dialects, corpora, lexical resources, morphological engine, annotated corpora, Moroccan dialect, Tunisian dialect, linguistic resources, sociolinguistic characteristics, dialectal texts
List of contributors: