A Generative Adversarial Network for Data Augmentation: The Case of Arabic Regional Dialects
Published in: | Procedia Computer Science, 2021, Vol. 189, pp. 92-99 |
---|---|
Format: | Article |
Language: | English |
Summary: | Text generation using Generative Adversarial Networks (GANs) has been successful in domains such as sentiment analysis, for example with the Sentimental GAN (SentiGAN) model. We adopt a similar model to generate sentences for five regional Arabic dialects (Egypt, Gulf, Maghreb, Levant, and Iraq). The objective is to overcome the scarcity of richly annotated Dialectal Arabic (DA) datasets by generating such corpora automatically. For a given dialect, the DA generation process relies on a generator to create new text and a discriminator to evaluate that text, with a dynamic update that allows the process to run automatically without supervision. Novelty and diversity are the two metrics used to verify the consistency and quality of the generated DA text before it is added to the target datasets. Experimental results confirm the reliability and value of the generated datasets when tested with different classifiers. |
---|---|
ISSN: | 1877-0509 |
DOI: | 10.1016/j.procs.2021.05.072 |
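
The summary names novelty and diversity as the metrics used to check generated text, but this record does not give their definitions. The sketch below shows one common formulation, the Jaccard-based one used for SentiGAN, and is only an assumption about what the article computes. The function names `jaccard`, `novelty`, and `diversity` are hypothetical, and sentences are assumed to be pre-tokenized lists of strings.

```python
def jaccard(a, b):
    """Jaccard similarity between the token sets of two sentences."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty sentences are treated as identical
    return len(a & b) / len(a | b)


def novelty(generated, training):
    """Per generated sentence: 1 - max similarity to any training sentence.

    Assumes `training` is non-empty. A score of 0.0 means the sentence
    has the same token set as some training sentence.
    """
    return [1.0 - max(jaccard(g, t) for t in training) for g in generated]


def diversity(generated):
    """Per generated sentence: 1 - max similarity to any *other* generated sentence.

    A single sentence with nothing to compare against scores 1.0 (an
    assumption made here for convenience).
    """
    scores = []
    for i, g in enumerate(generated):
        others = [jaccard(g, h) for j, h in enumerate(generated) if j != i]
        scores.append(1.0 - max(others) if others else 1.0)
    return scores
```

Under this formulation, averaging the per-sentence scores gives one number per dialect. Low novelty would suggest the generator is copying training sentences, and low diversity would suggest it is repeating itself.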