
SILT: Efficient transformer training for inter-lingual inference

Bibliographic Details
Published in: Expert Systems with Applications, 2022-08, Vol. 200, Article 116923
Main Authors: Huertas-Tato, Javier; Martín, Alejandro; Camacho, David
Format: Article
Language:English
Description
The ability of transformers to perform precision tasks such as question answering, Natural Language Inference (NLI), or summarization has allowed them to be ranked among the best paradigms for addressing Natural Language Processing (NLP) tasks. NLI is one of the best scenarios for testing these architectures, due to the knowledge required to understand complex sentences and the relationships established between a hypothesis and a premise. Nevertheless, these models suffer from an inability to generalize to other domains and from difficulties handling multilingual and inter-lingual scenarios. The leading pathway in the literature to address these issues involves designing and training extremely large architectures, but this causes unpredictable behaviors and establishes barriers that impede broad access and fine-tuning. In this paper, we propose a new architecture called Siamese Inter-Lingual Transformer (SILT), which efficiently aligns multilingual embeddings for Natural Language Inference and allows unmatched language pairs to be processed. SILT leverages siamese pre-trained multilingual transformers with frozen weights, in which the two input sentences attend to each other before being combined through a matrix alignment method. The experimental results show that SILT drastically reduces the number of trainable parameters while enabling inter-lingual NLI and achieving state-of-the-art performance on common benchmarks.

Highlights:
•Efficient alignment of multilingual embeddings for Natural Language Inference.
•Siamese pre-trained multilingual transformers with frozen weights and mutual attention.
•A curated Spanish version of the SICK dataset, called SICK-es, is provided.
•Drastic reduction of trainable parameters and ability to perform inter-lingual tasks.
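To make the described mechanism concrete, the PyTorch sketch below illustrates the idea summarized above: a small trainable head over a frozen siamese multilingual encoder, with mutual cross-attention between premise and hypothesis followed by an alignment-matrix-style combination. The class name SILTHead, the layer sizes, and the pooled dot-product alignment are illustrative assumptions made for this sketch; they are not the authors' exact implementation, which is specified in the paper itself.

import torch
import torch.nn as nn

class SILTHead(nn.Module):
    """Sketch of a trainable head over a frozen siamese multilingual encoder.

    The pre-trained encoder is assumed to run separately with its weights
    frozen, producing token embeddings for premise and hypothesis; only
    this head is trained, hence the small trainable-parameter count.
    """

    def __init__(self, dim: int = 768, n_heads: int = 8, n_classes: int = 3):
        super().__init__()
        # Mutual attention: one shared module is used in both directions,
        # mirroring the siamese (weight-sharing) design.
        self.cross_attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.classifier = nn.Sequential(
            nn.Linear(2 * dim, dim),
            nn.ReLU(),
            nn.Linear(dim, n_classes),
        )

    def forward(self, premise: torch.Tensor, hypothesis: torch.Tensor):
        # Each sentence attends to the other (mutual attention).
        p_att, _ = self.cross_attn(premise, hypothesis, hypothesis)
        h_att, _ = self.cross_attn(hypothesis, premise, premise)
        # Stand-in for the matrix alignment step: a token-level alignment
        # matrix from dot products, used to pool each side into one vector.
        scores = p_att @ h_att.transpose(1, 2)               # (B, Lp, Lh)
        p_vec = (torch.softmax(scores, -1) @ h_att).mean(1)  # (B, dim)
        h_vec = (torch.softmax(scores.transpose(1, 2), -1) @ p_att).mean(1)
        return self.classifier(torch.cat([p_vec, h_vec], -1))

# Hypothetical usage: random tensors stand in for the frozen encoder's
# token embeddings (batch of 2 sentence pairs, 768-dimensional tokens).
head = SILTHead()
premise = torch.randn(2, 16, 768)
hypothesis = torch.randn(2, 12, 768)
logits = head(premise, hypothesis)  # (2, 3): entailment/neutral/contradiction

Because the encoder is frozen, only the cross-attention module and the classifier contribute trainable parameters, which is the efficiency argument the abstract makes.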
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2022.116923