
Introduction of the Asian Language Treebank

Bibliographic Details
Main Authors: Riza, Hammam; Purwoadi, Michael; Gunarso; Uliniansyah, Teduh; Aw Ai Ti; Aljunied, Sharifah Mahani; Luong Chi Mai; Vu Tat Thang; Nguyen Phuong Thai; Chea, Vichet; Sun, Rapid; Sam, Sethserey; Seng, Sopheap; Soe, Khin Mar; Nwet, Khin Thandar; Utiyama, Masao; Chenchen Ding
Format: Conference Proceeding
Language: English
Description
Summary: This paper introduces our project for developing the Asian Language Treebank (ALT). The ALT project aims to advance the state of the art in Asian natural language processing (NLP) through open collaboration in developing and using ALT. The project is a joint effort of six institutes to build a parallel treebank for seven languages: English, Indonesian, Japanese, Khmer, Malay, Myanmar, and Vietnamese. Building ALT began with sampling about 20,000 sentences from English Wikinews, which were then translated into the other six languages. ALT will include word segmentation, part-of-speech tags, and syntactic analysis annotations, together with word alignment links among these languages.
ISSN: 2472-7695
DOI: 10.1109/ICSDA.2016.7918974