An Arabic Dialects Dictionary Using Word Embeddings

OAI: oai:igi-global.com:251899
DOI: 10.4018/IJRSDA.2019070102
Published by: IGI Global

Abstract

Dialectal Arabic and Modern Standard Arabic lack sufficient standardized language resources for Arabic language processing tasks, despite this being an active research area. This work addresses the issue by first highlighting the steps and issues involved in building a multi-dialect Arabic corpus from web data gathered from blogs and social media platforms (e.g., Facebook and Twitter). The aim is to create a vectorized dictionary for the crawled data using word embeddings. In other words, the goal of this article is to build an up-to-date multi-dialect data set and then to extract an annotated corpus from it.