TArC: Tunisian Arabish Corpus, First complete release

Book chapter
Publication date:
2022
Citation:
TArC: Tunisian Arabish Corpus, First complete release / Gugliotta, Elisa; Dinarelli, Marco. - (2022), pp. 1125-1136.
Abstract:
In this paper we present the final result of a project focused on Tunisian Arabic encoded in Arabizi, the Latin-based writing system for digital conversations. The project led to the realization of two integrated and independent tools: a linguistic corpus and a neural network architecture created to annotate the former with various levels of linguistic information (code-switching classification, transliteration, tokenization, POS-tagging, lemmatization). We discuss the choices made in terms of computational and linguistic methodology and the strategies adopted to improve our results. We report on the experiments performed in order to outline our research path. Finally, we explain the reasons why we believe in the potential of these tools for both computational and linguistic research.
CRIS type:
2.1 Contribution in volume (chapter or essay)
Keywords:
Tunisian Arabizi, Annotated Corpus, Neural Network Architecture
Author list:
Gugliotta, Elisa; Dinarelli, Marco
University authors:
GUGLIOTTA Elisa
Link to the full record:
https://iris.uniss.it/handle/11388/361754
Book title:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
General information

URL

http://www.lrec-conf.org/proceedings/lrec2022/LREC-2022.pdf