BCWS: Bilingual Contextual Word Similarity
This paper introduces the first dataset for evaluating English-Chinese Bilingual Contextual Word Similarity, namely BCWS (https://github.com/MiuLab/BCWS). The dataset consists of 2,091 English-Chinese word pairs with the corresponding sentential contexts and their similarity scores annotated by the...
Published in: | arXiv.org 2018-10 |
---|---|
Main Authors: | Ta-Chung Chi, Ching-Yen Shih, Yun-Nung Chen |
Format: | Article |
Language: | English |
Subjects: | Artificial intelligence; Bilingualism; Datasets; Similarity |
cited_by | |
---|---|
container_title | arXiv.org |
creator | Ta-Chung, Chi ; Ching-Yen, Shih ; Chen, Yun-Nung |
description | This paper introduces BCWS (https://github.com/MiuLab/BCWS), the first dataset for evaluating English-Chinese Bilingual Contextual Word Similarity. The dataset consists of 2,091 English-Chinese word pairs, each with its sentential contexts and human-annotated similarity scores. Our annotated dataset shows higher consistency than other similar datasets. We establish several baselines for the bilingual embedding task to serve as benchmarks. Modeling cross-lingual sense representations, as this dataset supports, has the potential to move artificial intelligence from monolingual understanding towards multilingual understanding. |
format | article |
publisher | Ithaca: Cornell University Library, arXiv.org |
creationdate | 2018-10-21 |
rights | 2018. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2018-10 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2124345079 |
source | Publicly Available Content Database |
subjects | Artificial intelligence ; Bilingualism ; Datasets ; Similarity |
title | BCWS: Bilingual Contextual Word Similarity |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T00%3A48%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=BCWS:%20Bilingual%20Contextual%20Word%20Similarity&rft.jtitle=arXiv.org&rft.au=Ta-Chung,%20Chi&rft.date=2018-10-21&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2124345079%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_21243450793%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2124345079&rft_id=info:pmid/&rfr_iscdi=true |