Sino-Vietnamese Text Transcription using Word Embedding Approach

Sino-Vietnamese (aka Han Viet) characters, derived from ancient Chinese script, held prominence in Vietnam for centuries until the advent of the modern scripting system. Despite their historical significance, contemporary Vietnamese struggle to decipher these characters, hindering access to valuable...

Bibliographic Details
Main Authors: Dao, Trung-Kien; Nguyen, Dinh-Van; Pham, Minh-Vu; Duong, Nhat-Nam
Format: Conference Proceeding
Language: English
container_end_page 397
container_issue
container_start_page 392
container_title
container_volume
creator Dao, Trung-Kien
Nguyen, Dinh-Van
Pham, Minh-Vu
Duong, Nhat-Nam
description Sino-Vietnamese (aka Han Viet) characters, derived from ancient Chinese script, held prominence in Vietnam for centuries until the advent of the modern scripting system. Despite their historical significance, contemporary Vietnamese struggle to decipher these characters, hindering access to valuable cultural artifacts. This paper explores the complexities of Sino-Vietnamese characters, their divergence from modern Chinese characters, and the intricate process of converting them into readable texts. Notably, Sino-Vietnamese characters boast eight tones compared with Mandarin's five, adding to their linguistic complexity. While remnants of these characters persist in artifacts and bibliographies, their comprehension requires linguistic expertise because of the multiple possible readings for a single character. Addressing this challenge requires the development of tools for automatic recognition and translation, enabling broader access to Vietnam's cultural heritage. By facilitating the understanding of ancient texts, these efforts have contributed to the preservation and appreciation of Vietnamese linguistic and cultural traditions.
doi_str_mv 10.1109/ICCE62051.2024.10634601
format conference_proceeding
fullrecord <record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_10634601</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10634601</ieee_id><sourcerecordid>10634601</sourcerecordid><originalsourceid>FETCH-LOGICAL-i106t-d41442d32577a1b3a7374f0c442cdf876a31387d90f7bd6ec18c9e57114fe4d93</originalsourceid><addsrcrecordid>eNo1j91KxDAUhKMguKx9A8G-QNdzctKkuXMp1V1Y8MKql0vapBqxPyQV9O2tqFcz810MM4xdIWwQQV_vy7KSHHLccOBigyBJSMATlmilC8qB1GL4KVvxgmQmSPNzlsT4BgAECIJwxW4e_DBmT97Ng-lddGntPue0DmaIbfDT7Mch_Yh-eEmfx2DTqm-ctT9xO01hNO3rBTvrzHt0yZ-u2eNtVZe77HB_ty-3h8wvw-bMChSCW-K5UgYbMoqU6KBdYGu7QklDSIWyGjrVWOlaLFrtcoUoOiespjW7_O31zrnjFHxvwtfx_zR9A-W8Sww</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Sino-Vietnamese Text Transcription using Word Embedding Approach</title><source>IEEE Xplore All Conference Series</source><creator>Dao, Trung-Kien ; Nguyen, Dinh-Van ; Pham, Minh-Vu ; Duong, Nhat-Nam</creator><creatorcontrib>Dao, Trung-Kien ; Nguyen, Dinh-Van ; Pham, Minh-Vu ; Duong, Nhat-Nam</creatorcontrib><description>Sino-Vietnamese (aka Han Viet) characters, derived from ancient Chinese script, held prominence in Vietnam for centuries until the advent of the modern scripting system. Despite their historical significance, contemporary Vietnamese struggle to decipher these characters, hindering access to valuable cultural artifacts. This paper explores the complexities of Sino-Vietnamese characters, their divergence from modern Chinese characters, and the intricate process of converting them into readable texts. Notably, Sino-Vietnamese characters boast eight tones compared with Mandarin's five, adding to their linguistic complexity. 
While remnants of these characters persist in artifacts and bibliographies, their comprehension requires linguistic expertise because of the multiple possible readings for a single character. Addressing this challenge requires the development of tools for automatic recognition and translation, enabling broader access to Vietnam's cultural heritage. By facilitating the understanding of ancient texts, these efforts have contributed to the preservation and appreciation of Vietnamese linguistic and cultural traditions.</description><identifier>EISSN: 2836-4392</identifier><identifier>EISBN: 9798350379792</identifier><identifier>DOI: 10.1109/ICCE62051.2024.10634601</identifier><language>eng</language><publisher>IEEE</publisher><subject>Bibliographies ; Complexity theory ; Cultural differences ; Linguistics ; natural language processing ; Pressing ; Sino-Vietnamese transcription ; word embeddings</subject><ispartof>2024 Tenth International Conference on Communications and Electronics (ICCE), 2024, p.392-397</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10634601$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,27925,54555,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10634601$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Dao, Trung-Kien</creatorcontrib><creatorcontrib>Nguyen, Dinh-Van</creatorcontrib><creatorcontrib>Pham, Minh-Vu</creatorcontrib><creatorcontrib>Duong, Nhat-Nam</creatorcontrib><title>Sino-Vietnamese Text Transcription using Word Embedding Approach</title><title>2024 Tenth International Conference on Communications and Electronics (ICCE)</title><addtitle>ICCE</addtitle><description>Sino-Vietnamese (aka Han Viet) characters, 
derived from ancient Chinese script, held prominence in Vietnam for centuries until the advent of the modern scripting system. Despite their historical significance, contemporary Vietnamese struggle to decipher these characters, hindering access to valuable cultural artifacts. This paper explores the complexities of Sino-Vietnamese characters, their divergence from modern Chinese characters, and the intricate process of converting them into readable texts. Notably, Sino-Vietnamese characters boast eight tones compared with Mandarin's five, adding to their linguistic complexity. While remnants of these characters persist in artifacts and bibliographies, their comprehension requires linguistic expertise because of the multiple possible readings for a single character. Addressing this challenge requires the development of tools for automatic recognition and translation, enabling broader access to Vietnam's cultural heritage. By facilitating the understanding of ancient texts, these efforts have contributed to the preservation and appreciation of Vietnamese linguistic and cultural traditions.</description><subject>Bibliographies</subject><subject>Complexity theory</subject><subject>Cultural differences</subject><subject>Linguistics</subject><subject>natural language processing</subject><subject>Pressing</subject><subject>Sino-Vietnamese transcription</subject><subject>word 
embeddings</subject><issn>2836-4392</issn><isbn>9798350379792</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2024</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNo1j91KxDAUhKMguKx9A8G-QNdzctKkuXMp1V1Y8MKql0vapBqxPyQV9O2tqFcz810MM4xdIWwQQV_vy7KSHHLccOBigyBJSMATlmilC8qB1GL4KVvxgmQmSPNzlsT4BgAECIJwxW4e_DBmT97Ng-lddGntPue0DmaIbfDT7Mch_Yh-eEmfx2DTqm-ctT9xO01hNO3rBTvrzHt0yZ-u2eNtVZe77HB_ty-3h8wvw-bMChSCW-K5UgYbMoqU6KBdYGu7QklDSIWyGjrVWOlaLFrtcoUoOiespjW7_O31zrnjFHxvwtfx_zR9A-W8Sww</recordid><startdate>20240731</startdate><enddate>20240731</enddate><creator>Dao, Trung-Kien</creator><creator>Nguyen, Dinh-Van</creator><creator>Pham, Minh-Vu</creator><creator>Duong, Nhat-Nam</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>20240731</creationdate><title>Sino-Vietnamese Text Transcription using Word Embedding Approach</title><author>Dao, Trung-Kien ; Nguyen, Dinh-Van ; Pham, Minh-Vu ; Duong, Nhat-Nam</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i106t-d41442d32577a1b3a7374f0c442cdf876a31387d90f7bd6ec18c9e57114fe4d93</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Bibliographies</topic><topic>Complexity theory</topic><topic>Cultural differences</topic><topic>Linguistics</topic><topic>natural language processing</topic><topic>Pressing</topic><topic>Sino-Vietnamese transcription</topic><topic>word embeddings</topic><toplevel>online_resources</toplevel><creatorcontrib>Dao, Trung-Kien</creatorcontrib><creatorcontrib>Nguyen, Dinh-Van</creatorcontrib><creatorcontrib>Pham, Minh-Vu</creatorcontrib><creatorcontrib>Duong, Nhat-Nam</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings 
Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore (Online service)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Dao, Trung-Kien</au><au>Nguyen, Dinh-Van</au><au>Pham, Minh-Vu</au><au>Duong, Nhat-Nam</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Sino-Vietnamese Text Transcription using Word Embedding Approach</atitle><btitle>2024 Tenth International Conference on Communications and Electronics (ICCE)</btitle><stitle>ICCE</stitle><date>2024-07-31</date><risdate>2024</risdate><spage>392</spage><epage>397</epage><pages>392-397</pages><eissn>2836-4392</eissn><eisbn>9798350379792</eisbn><abstract>Sino-Vietnamese (aka Han Viet) characters, derived from ancient Chinese script, held prominence in Vietnam for centuries until the advent of the modern scripting system. Despite their historical significance, contemporary Vietnamese struggle to decipher these characters, hindering access to valuable cultural artifacts. This paper explores the complexities of Sino-Vietnamese characters, their divergence from modern Chinese characters, and the intricate process of converting them into readable texts. Notably, Sino-Vietnamese characters boast eight tones compared with Mandarin's five, adding to their linguistic complexity. While remnants of these characters persist in artifacts and bibliographies, their comprehension requires linguistic expertise because of the multiple possible readings for a single character. Addressing this challenge requires the development of tools for automatic recognition and translation, enabling broader access to Vietnam's cultural heritage. 
By facilitating the understanding of ancient texts, these efforts have contributed to the preservation and appreciation of Vietnamese linguistic and cultural traditions.</abstract><pub>IEEE</pub><doi>10.1109/ICCE62051.2024.10634601</doi><tpages>6</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier EISSN: 2836-4392
ispartof 2024 Tenth International Conference on Communications and Electronics (ICCE), 2024, p.392-397
issn 2836-4392
language eng
recordid cdi_ieee_primary_10634601
source IEEE Xplore All Conference Series
subjects Bibliographies
Complexity theory
Cultural differences
Linguistics
natural language processing
Pressing
Sino-Vietnamese transcription
word embeddings
title Sino-Vietnamese Text Transcription using Word Embedding Approach
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T10%3A33%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Sino-Vietnamese%20Text%20Transcription%20using%20Word%20Embedding%20Approach&rft.btitle=2024%20Tenth%20International%20Conference%20on%20Communications%20and%20Electronics%20(ICCE)&rft.au=Dao,%20Trung-Kien&rft.date=2024-07-31&rft.spage=392&rft.epage=397&rft.pages=392-397&rft.eissn=2836-4392&rft_id=info:doi/10.1109/ICCE62051.2024.10634601&rft.eisbn=9798350379792&rft_dat=%3Cieee_CHZPO%3E10634601%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i106t-d41442d32577a1b3a7374f0c442cdf876a31387d90f7bd6ec18c9e57114fe4d93%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10634601&rfr_iscdi=true