Instilling Type Knowledge in Language Models via Multi-Task QA

Bibliographic Details
Published in:	arXiv.org 2022-04
Main Authors:	Li, Shuyang; Sridhar, Mukund; Prakash, Chandana Satya; Cao, Jin; Hamza, Wael; McAuley, Julian
Format:	Article
Language:	English
Subjects:	Datasets; Encyclopedias; Knowledge; Knowledge bases (artificial intelligence); Knowledge representation; Taxonomy; Training

Description
Summary:	Understanding human language often requires understanding entities and their place in a taxonomy of knowledge, that is, their types. Previous methods for learning entity types rely on training classifiers on datasets with coarse, noisy, and incomplete labels. We introduce a method to instill fine-grained type knowledge in language models through text-to-text pre-training on type-centric questions that leverage knowledge base documents and knowledge graphs. We create the WikiWiki dataset: entities and passages from 10M Wikipedia articles linked to the Wikidata knowledge graph with 41K types. Models trained on WikiWiki achieve state-of-the-art performance on zero-shot dialog state tracking benchmarks, accurately infer entity types in Wikipedia articles, and can discover new types deemed useful by human judges.
ISSN:	2331-8422