Open-Domain Long-Form Question–Answering Using Transformer-Based Pipeline
Published in: | SN computer science 2023-09, Vol.4 (5), p.595, Article 595 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Summary: | Question–answering has long been a crucial part of natural language processing (NLP). The task is to produce accurate and complete answers to a question using support documents or other knowledge sources. Much work has been done in this field in recent years, especially since the introduction of transformer models. However, most of that research focuses on questions curated to have short answers, and fewer works address long-form question–answering (LFQA). LFQA systems generate explanatory answers and pose more challenges than the short-form version. This paper investigates the LFQA task by proposing a pipeline of transformer-based models that gives explanatory answers to open-domain long-form questions. The pipeline consists mainly of a retriever module and a generator module. The retriever module retrieves, from a comprehensive knowledge source, the relevant support documents containing evidence to answer a question. The generator module then generates the final answer from those retrieved documents. The Explain Like I’m Five (ELI5) dataset is used to train and evaluate the system, and the final results are reported using appropriate metrics. The system is implemented in Python using the PyTorch framework. According to the evaluation, the proposed LFQA pipeline outperforms existing work on the Knowledge-Intensive Language Tasks (KILT) benchmark and is thus effective for question–answering tasks. |
---|---|
ISSN: | 2662-995X 2661-8907 |
DOI: | 10.1007/s42979-023-02039-x |
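
The retriever-then-generator structure described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses transformer-based models for both modules, whereas the sketch below swaps in a bag-of-words cosine-similarity retriever and a placeholder generator that only assembles the input a sequence-to-sequence model might receive. The function names (`retrieve`, `generate`, `answer`), the `k` parameter, and the `question: ... context: ...` input format are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, documents, k=2):
    """Toy retriever (assumption): rank documents by cosine similarity
    of raw term counts and return up to k documents with a nonzero score."""
    q = Counter(tokenize(question))
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    scored = []
    for doc in documents:
        d = Counter(tokenize(doc))
        d_norm = math.sqrt(sum(v * v for v in d.values()))
        dot = sum(q[t] * d[t] for t in q)
        norm = q_norm * d_norm
        scored.append((dot / norm if norm else 0.0, doc))
    # Highest similarity first; the sort is stable for ties.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def generate(question, support_docs):
    """Placeholder generator (assumption): a real system would feed this
    string to a trained seq2seq transformer and decode a long-form answer."""
    context = " ".join(support_docs)
    return f"question: {question} context: {context}"


def answer(question, documents, k=2):
    """Run the two-stage pipeline: retrieve evidence, then generate."""
    return generate(question, retrieve(question, documents, k))


if __name__ == "__main__":
    knowledge_source = [
        "The sky is blue because air scatters blue light.",
        "Bread rises because yeast produces gas.",
        "Cats purr when content.",
    ]
    print(answer("Why is the sky blue?", knowledge_source))
```

Keeping the two stages behind separate functions mirrors the modular design in the summary: either stage could be replaced (for example, a dense retriever or a fine-tuned generator) without changing the other.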