Loading…

Multi-Task Learning with Calibrated Mixture of Insightful Experts

Multi-task learning has been established as an important machine learning framework for leveraging shared knowledge among multiple different but related tasks, with the generalization performance of models enhanced. As a promising learning paradigm, multi-task learning has been widely adopted by var...

Full description

Saved in:
Bibliographic Details
Main Authors: Wang, Sinan, Li, Yumeng, Li, Hongyan, Zhu, Tanchao, Li, Zhao, Ou, Wenwu
Format: Conference Proceeding
Language:English
Subjects:
Online Access:Request full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Multi-task learning has been established as an important machine learning framework for leveraging shared knowledge among multiple different but related tasks, with the generalization performance of models enhanced. As a promising learning paradigm, multi-task learning has been widely adopted by various real-world applications, such as recommendation systems. Multi-gate Mixture-of-Experts (MMoE), a well-received multi-task learning method in industry, based on the classic and inspiring Mixture-of-Experts (MoE) structure, explicitly models task relationships and learns task-specific functionalities, generating significant improvements. However, in our applications, negative transfer, which confuses considerable existing multi-task learning methods, is still observed to happen to MMoE. In this paper, an in-depth empirical investigation into negative transfer is launched. And it reveals that, incompetent experts, which play fundamental roles under the learning framework of MoE, are the key technique bottleneck. To tackle this dilemma, we propose the Calibrated Mixture of Insightful Experts (CMoIE), with three novel modules (Conflict Resolution, Expert Communication, and Mixture Calibration), customed for multi-task learning. Hence a group of insightful experts are constructed with enhanced diversity, communication and specialization. To validate the proposed method CMoIE, experiments are conducted on three public datasets and one real-world click-through-rate prediction dataset we construct based on traffic logs collected from a large-scale online product recommendation system. Our approach yields best performance across all of these benchmarks, demonstrating the superiority of it.
ISSN:2375-026X
DOI:10.1109/ICDE53745.2022.00312