A Multi-modal Multi-task based Approach for Movie Recommendation
Format: Conference Proceeding
Language: English
Summary: An online recommendation system is one of the key needs of digital e-commerce sectors and OTT platforms such as Amazon Prime, Netflix, and SonyLiv. In recent times, with increasing user interaction on these platforms, recommendation systems analyze users' likes and dislikes to predict their preferences and recommend new items that may capture their attention. In the current study, a multi-task-based architecture is designed to solve the multi-modal movie recommendation problem. Our hypothesis is that solving two related tasks, namely (a) genre classification of movies and (b) rating identification for a user-movie pair, helps generate good-quality movie embeddings in an end-to-end setting without using a rating vector. To generate movie representations, unlike state-of-the-art techniques, feature vectors extracted from multiple modalities are fused together: the textual summary, the audio and video information present in the movie trailers, and meta-data information. To represent a user, the average representation of the movies liked by that user is used. Different multi-task models, namely fully shared (FS), shared-private (SP), and adversarial shared-private (ASP) feature models, are developed to solve the two tasks, genre classification and user-movie rating prediction, simultaneously. For experimental purposes, MMTF-14K, a multifaceted movie trailer feature dataset, was extended by incorporating textual features and meta-data information, and a multi-modal version of the MovieLens-100K dataset is also used. Results of the different multi-task models are reported in terms of RMSE and several rank-based metrics. The proposed multi-task model with adversarial training outperforms the state-of-the-art models when applied to the MMTF-14K and multi-modal MovieLens-100K datasets.
ISSN: 2161-4407
DOI: 10.1109/IJCNN54540.2023.10191882
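
The summary above describes the architecture only in prose. Below is a minimal, hypothetical PyTorch sketch of the fully shared (FS) variant: the four modality feature vectors are fused into one movie embedding, the user is represented as the mean embedding of liked movies, and two heads (genre classification, rating prediction) are trained jointly. All class and function names, layer sizes, and feature dimensions are illustrative assumptions, not taken from the paper; the shared-private (SP) and adversarial shared-private (ASP) variants, which add task-specific encoders and adversarial training of the shared features, are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FullySharedMultiTaskRecommender(nn.Module):
    """Fully shared (FS) multi-task sketch: one movie encoder feeds both a
    genre-classification head and a user-movie rating head."""

    def __init__(self, text_dim=300, audio_dim=128, video_dim=512,
                 meta_dim=32, embed_dim=128, num_genres=18):
        super().__init__()
        fused_dim = text_dim + audio_dim + video_dim + meta_dim
        # Shared encoder: early fusion of the four modality feature vectors
        # (textual summary, trailer audio, trailer video, meta-data).
        self.movie_encoder = nn.Sequential(
            nn.Linear(fused_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )
        # Task (a): multi-label genre classification from the movie embedding.
        self.genre_head = nn.Linear(embed_dim, num_genres)
        # Task (b): rating regression from concatenated user/movie embeddings.
        self.rating_head = nn.Sequential(
            nn.Linear(2 * embed_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, 1),
        )

    def encode_movie(self, text, audio, video, meta):
        return self.movie_encoder(torch.cat([text, audio, video, meta], dim=-1))

    def forward(self, movie_feats, liked_movie_feats):
        # movie_feats: (text, audio, video, meta) tensors of shape (B, dim)
        # liked_movie_feats: same tuple with shape (B, N, dim) for the N
        # movies each user has liked.
        movie_emb = self.encode_movie(*movie_feats)          # (B, D)
        liked_emb = self.encode_movie(*liked_movie_feats)    # (B, N, D)
        user_emb = liked_emb.mean(dim=1)                     # user = mean of liked movies
        genre_logits = self.genre_head(movie_emb)
        rating_pred = self.rating_head(
            torch.cat([user_emb, movie_emb], dim=-1)).squeeze(-1)
        return genre_logits, rating_pred


def joint_loss(genre_logits, genre_targets, rating_pred, rating_targets,
               alpha=1.0):
    """Joint objective: BCE for multi-label genres plus MSE for ratings."""
    return (F.binary_cross_entropy_with_logits(genre_logits, genre_targets)
            + alpha * F.mse_loss(rating_pred, rating_targets))
```

Note that the mean over liked movies assumes a fixed-length liked list per user in a batch; in practice one would pad and mask variable-length lists, and the RMSE reported in the abstract corresponds to the square root of the MSE term above.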