
Preserving Privacy in Arabic Judgments: AI-Powered Anonymization For Enhanced Legal Data Privacy

Bibliographic Details
Published in: IEEE Access, 2023-01, Vol. 11, p. 1-1
Main Authors: Moussaoui, Taoufiq El; Chakir, Loqman; Boumhidi, Jaouad
Format: Article
Language: English
Description
Summary: Jurisprudence involves studying, interpreting, and applying the law to understand its societal impact. Judges annually review cases to ensure accurate application of the law, which raises privacy concerns when they access files from other courts. Although the legal field has attracted interest from the research community, masking personal data, particularly in Arabic, a language with limited resources, remains at an early stage. To address this research gap, we develop a two-component system for generating anonymized Arabic judgments. The first component, a personal data extractor model, uses Named Entity Recognition (NER) to identify personal entities such as names, addresses, birthdays, case numbers, and national identity codes. We train this model on a purpose-built Arabic legal corpus. The second component is a Python module that masks the personal entities extracted by the first component. Together, these components enable the generation of anonymized judgments. Our model achieves an F1-score of 96.14% when detecting entities in the Arabic legal corpus we created. Additionally, experiments on the ANERCorp corpus with training/testing splits of 70%-30% and 90%-10% yield F1-scores of 93.78% and 95.77%, respectively. These results indicate that the proposed system has promising potential for generating anonymized Arabic judgments. Furthermore, the Arabic legal corpus provides a valuable resource for researchers aiming to improve domain-specific NER models for Arabic text.
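The masking step described in the summary can be illustrated with a minimal Python sketch. This is not the authors' module: the `Entity` type, the `mask_entities` function, the label names, and the bracketed placeholder format are all hypothetical, and the sketch assumes the NER component returns non-overlapping character spans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical NER output: a labeled character span in the text."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str  # assumed label names, e.g. "PERSON", "CASE_NUMBER"


def mask_entities(text: str, entities: list[Entity]) -> str:
    """Replace each entity span with a bracketed label placeholder.

    Assumes the spans do not overlap.
    """
    masked = text
    # Work from right to left so that earlier offsets stay valid
    # after each replacement changes the string length.
    for ent in sorted(entities, key=lambda e: e.start, reverse=True):
        masked = masked[:ent.start] + f"[{ent.label}]" + masked[ent.end:]
    return masked


def find_span(text: str, surface: str, label: str) -> Entity:
    """Helper for the example only: locate the first occurrence of a string."""
    start = text.index(surface)
    return Entity(start, start + len(surface), label)


text = "Case 123/2020: Ahmed Ali, born 01/01/1980."
entities = [
    find_span(text, "123/2020", "CASE_NUMBER"),
    find_span(text, "Ahmed Ali", "PERSON"),
    find_span(text, "01/01/1980", "BIRTHDAY"),
]
print(mask_entities(text, entities))
# Case [CASE_NUMBER]: [PERSON], born [BIRTHDAY].
```

Because Python strings are indexed by Unicode code point, the same slicing applies to Arabic text, provided the extractor reports offsets in code points rather than bytes or tokens.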
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3324288