AutoGluon-Multimodal (AutoMM): Supercharging Multimodal AutoML with Foundation Models

AutoGluon-Multimodal (AutoMM) is introduced as an open-source AutoML library designed specifically for multimodal learning. Distinguished by its exceptional ease of use, AutoMM enables fine-tuning of foundation models with just three lines of code. Supporting various modalities including image, text, and tabular data, both independently and in combination, the library offers a comprehensive suite of functionalities spanning classification, regression, object detection, semantic matching, and image segmentation. Experiments across diverse datasets and tasks showcase AutoMM's superior performance over existing AutoML tools on basic classification and regression tasks, while also demonstrating competitive results on advanced tasks, matching specialized toolboxes designed for such purposes.
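
The "three lines of code" claim refers to AutoMM's MultiModalPredictor interface. Below is a minimal sketch of that workflow; the toy DataFrame and the column names "review" and "sentiment" are illustrative, not taken from the paper.

```python
import pandas as pd
from autogluon.multimodal import MultiModalPredictor

# Illustrative toy data: one text feature column plus a label column.
# Image-path, text, and tabular columns can be mixed in the same DataFrame.
train_data = pd.DataFrame({
    "review":    ["great product", "terrible quality", "works fine", "awful"],
    "sentiment": ["positive", "negative", "positive", "negative"],
})

# The three core lines: construct a predictor, fit, predict.
predictor = MultiModalPredictor(label="sentiment")
predictor.fit(train_data)
print(predictor.predict(pd.DataFrame({"review": ["not bad at all"]})))
```

The predictor infers the problem type (here, classification) from the label column and selects and fine-tunes an appropriate foundation model automatically.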

Bibliographic Details
Published in: arXiv.org, 2024-04-30
Main Authors: Tang, Zhiqiang; Fang, Haoyang; Zhou, Su; Yang, Taojiannan; Zhong, Zihan; Hu, Tony; Kirchhoff, Katrin; Karypis, George
Format: Article
Language: English
EISSN: 2331-8422
Publisher: Cornell University Library, arXiv.org (Ithaca)
Subjects: Classification; Image segmentation; Libraries; Machine learning; Object recognition
Source: Publicly Available Content Database (ProQuest Open Access Database)
Rights: Published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/