PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines

In recent years, a wide variety of automated machine learning (AutoML) methods have been proposed to generate end-to-end ML pipelines. While these techniques facilitate the creation of models, given their black-box nature, the complexity of the underlying algorithms, and the large number of pipeline...

Bibliographic Details
Published in: IEEE transactions on visualization and computer graphics 2021-02, Vol.27 (2), p.390-400
Main Authors: Ono, Jorge Piazentin, Castelo, Sonia, Lopez, Roque, Bertini, Enrico, Freire, Juliana, Silva, Claudio
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c349t-54507d2b6ff1df4e384ca16bc4537bacc0a992decb5fce9ccc47fd2024abc59a3
cites cdi_FETCH-LOGICAL-c349t-54507d2b6ff1df4e384ca16bc4537bacc0a992decb5fce9ccc47fd2024abc59a3
container_end_page 400
container_issue 2
container_start_page 390
container_title IEEE transactions on visualization and computer graphics
container_volume 27
creator Ono, Jorge Piazentin
Castelo, Sonia
Lopez, Roque
Bertini, Enrico
Freire, Juliana
Silva, Claudio
description In recent years, a wide variety of automated machine learning (AutoML) methods have been proposed to generate end-to-end ML pipelines. While these techniques facilitate the creation of models, given their black-box nature, the complexity of the underlying algorithms, and the large number of pipelines they derive, they are difficult for developers to debug. It is also challenging for machine learning experts to select an AutoML system that is well suited for a given problem. In this paper, we present PipelineProfiler, an interactive visualization tool that allows the exploration and comparison of the solution space of machine learning (ML) pipelines produced by AutoML systems. PipelineProfiler is integrated with Jupyter Notebook and can be combined with common data science tools to enable a rich set of analyses of the ML pipelines, providing users a better understanding of the algorithms that generated them as well as insights into how they can be improved. We demonstrate the utility of our tool through use cases where PipelineProfiler is used to better understand and improve a real-world AutoML system. Furthermore, we validate our approach by presenting a detailed analysis of a think-aloud experiment with six data scientists who develop and evaluate AutoML tools.
doi_str_mv 10.1109/TVCG.2020.3030361
format article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmed_primary_33048694</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9222086</ieee_id><sourcerecordid>2483266814</sourcerecordid><originalsourceid>FETCH-LOGICAL-c349t-54507d2b6ff1df4e384ca16bc4537bacc0a992decb5fce9ccc47fd2024abc59a3</originalsourceid><addsrcrecordid>eNpdkE1LAzEQhoMotlZ_gAgS8OJla742u_G2lFqFihVqr0s2m2BK2tRkF-y_d0s_DjKHGZhnXpgHgFuMhhgj8TRfjCZDgggaUtQVx2egjwXDCUoRP-9mlGUJ4YT3wFWMS4QwY7m4BD1KEcu5YH3wObMb7exaz4I31unwDAu4sLGVDhZr6baNVRHOvXfQ-ACbbw3Hvxvng2ysX0NvYNE2_n0KjznxGlwY6aK-OfQB-HoZz0evyfRj8jYqpomiTDRJylKU1aTixuDaME1zpiTmlWIpzSqpFJJCkFqrKjVKC6UUy0zdPctkpVIh6QA87nM3wf-0OjblykalnZNr7dtYEpZiTGmWpR368A9d-jZ03-2onBLOc8w6Cu8pFXyMQZtyE-xKhm2JUbnzXe58lzvf5cF3d3N_SG6rla5PF0fBHXC3B6zW-rQWhBCUc_oHsyyD5g</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2483266814</pqid></control><display><type>article</type><title>PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines</title><source>IEEE Xplore (Online service)</source><creator>Ono, Jorge Piazentin ; Castelo, Sonia ; Lopez, Roque ; Bertini, Enrico ; Freire, Juliana ; Silva, Claudio</creator><creatorcontrib>Ono, Jorge Piazentin ; Castelo, Sonia ; Lopez, Roque ; Bertini, Enrico ; Freire, Juliana ; Silva, Claudio</creatorcontrib><description>In recent years, a wide variety of automated machine learning (AutoML) methods have been proposed to generate end-to-end ML pipelines. While these techniques facilitate the creation of models, given their black-box nature, the complexity of the underlying algorithms, and the large number of pipelines they derive, they are difficult for developers to debug. It is also challenging for machine learning experts to select an AutoML system that is well suited for a given problem. 
In this paper, we present the Pipeline Profiler, an interactive visualization tool that allows the exploration and comparison of the solution space of machine learning (ML) pipelines produced by AutoML systems. PipelineProfiler is integrated with Jupyter Notebook and can be combined with common data science tools to enable a rich set of analyses of the ML pipelines, providing users a better understanding of the algorithms that generated them as well as insights into how they can be improved. We demonstrate the utility of our tool through use cases where PipelineProfiler is used to better understand and improve a real-world AutoML system. Furthermore, we validate our approach by presenting a detailed analysis of a think-aloud experiment with six data scientists who develop and evaluate AutoML tools.</description><identifier>ISSN: 1077-2626</identifier><identifier>EISSN: 1941-0506</identifier><identifier>DOI: 10.1109/TVCG.2020.3030361</identifier><identifier>PMID: 33048694</identifier><identifier>CODEN: ITVGEA</identifier><language>eng</language><publisher>United States: IEEE</publisher><subject>Algorithms ; Automatic Machine Learning ; Correlation ; Data visualization ; Machine learning ; Model Evaluation ; Pipeline Visualization ; Pipelines ; Search problems ; Solution space ; Visual analytics</subject><ispartof>IEEE transactions on visualization and computer graphics, 2021-02, Vol.27 (2), p.390-400</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2021</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c349t-54507d2b6ff1df4e384ca16bc4537bacc0a992decb5fce9ccc47fd2024abc59a3</citedby><cites>FETCH-LOGICAL-c349t-54507d2b6ff1df4e384ca16bc4537bacc0a992decb5fce9ccc47fd2024abc59a3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9222086$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,54796</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/33048694$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Ono, Jorge Piazentin</creatorcontrib><creatorcontrib>Castelo, Sonia</creatorcontrib><creatorcontrib>Lopez, Roque</creatorcontrib><creatorcontrib>Bertini, Enrico</creatorcontrib><creatorcontrib>Freire, Juliana</creatorcontrib><creatorcontrib>Silva, Claudio</creatorcontrib><title>PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines</title><title>IEEE transactions on visualization and computer graphics</title><addtitle>TVCG</addtitle><addtitle>IEEE Trans Vis Comput Graph</addtitle><description>In recent years, a wide variety of automated machine learning (AutoML) methods have been proposed to generate end-to-end ML pipelines. While these techniques facilitate the creation of models, given their black-box nature, the complexity of the underlying algorithms, and the large number of pipelines they derive, they are difficult for developers to debug. It is also challenging for machine learning experts to select an AutoML system that is well suited for a given problem. In this paper, we present the Pipeline Profiler, an interactive visualization tool that allows the exploration and comparison of the solution space of machine learning (ML) pipelines produced by AutoML systems. 
PipelineProfiler is integrated with Jupyter Notebook and can be combined with common data science tools to enable a rich set of analyses of the ML pipelines, providing users a better understanding of the algorithms that generated them as well as insights into how they can be improved. We demonstrate the utility of our tool through use cases where PipelineProfiler is used to better understand and improve a real-world AutoML system. Furthermore, we validate our approach by presenting a detailed analysis of a think-aloud experiment with six data scientists who develop and evaluate AutoML tools.</description><subject>Algorithms</subject><subject>Automatic Machine Learning</subject><subject>Correlation</subject><subject>Data visualization</subject><subject>Machine learning</subject><subject>Model Evaluation</subject><subject>Pipeline Visualization</subject><subject>Pipelines</subject><subject>Search problems</subject><subject>Solution space</subject><subject>Visual analytics</subject><issn>1077-2626</issn><issn>1941-0506</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNpdkE1LAzEQhoMotlZ_gAgS8OJla742u_G2lFqFihVqr0s2m2BK2tRkF-y_d0s_DjKHGZhnXpgHgFuMhhgj8TRfjCZDgggaUtQVx2egjwXDCUoRP-9mlGUJ4YT3wFWMS4QwY7m4BD1KEcu5YH3wObMb7exaz4I31unwDAu4sLGVDhZr6baNVRHOvXfQ-ACbbw3Hvxvng2ysX0NvYNE2_n0KjznxGlwY6aK-OfQB-HoZz0evyfRj8jYqpomiTDRJylKU1aTixuDaME1zpiTmlWIpzSqpFJJCkFqrKjVKC6UUy0zdPctkpVIh6QA87nM3wf-0OjblykalnZNr7dtYEpZiTGmWpR368A9d-jZ03-2onBLOc8w6Cu8pFXyMQZtyE-xKhm2JUbnzXe58lzvf5cF3d3N_SG6rla5PF0fBHXC3B6zW-rQWhBCUc_oHsyyD5g</recordid><startdate>20210201</startdate><enddate>20210201</enddate><creator>Ono, Jorge Piazentin</creator><creator>Castelo, Sonia</creator><creator>Lopez, Roque</creator><creator>Bertini, Enrico</creator><creator>Freire, Juliana</creator><creator>Silva, Claudio</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope></search><sort><creationdate>20210201</creationdate><title>PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines</title><author>Ono, Jorge Piazentin ; Castelo, Sonia ; Lopez, Roque ; Bertini, Enrico ; Freire, Juliana ; Silva, Claudio</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c349t-54507d2b6ff1df4e384ca16bc4537bacc0a992decb5fce9ccc47fd2024abc59a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Automatic Machine Learning</topic><topic>Correlation</topic><topic>Data visualization</topic><topic>Machine learning</topic><topic>Model Evaluation</topic><topic>Pipeline Visualization</topic><topic>Pipelines</topic><topic>Search problems</topic><topic>Solution space</topic><topic>Visual analytics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ono, Jorge Piazentin</creatorcontrib><creatorcontrib>Castelo, Sonia</creatorcontrib><creatorcontrib>Lopez, Roque</creatorcontrib><creatorcontrib>Bertini, Enrico</creatorcontrib><creatorcontrib>Freire, Juliana</creatorcontrib><creatorcontrib>Silva, Claudio</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Xplore (Online service)</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest 
Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on visualization and computer graphics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ono, Jorge Piazentin</au><au>Castelo, Sonia</au><au>Lopez, Roque</au><au>Bertini, Enrico</au><au>Freire, Juliana</au><au>Silva, Claudio</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines</atitle><jtitle>IEEE transactions on visualization and computer graphics</jtitle><stitle>TVCG</stitle><addtitle>IEEE Trans Vis Comput Graph</addtitle><date>2021-02-01</date><risdate>2021</risdate><volume>27</volume><issue>2</issue><spage>390</spage><epage>400</epage><pages>390-400</pages><issn>1077-2626</issn><eissn>1941-0506</eissn><coden>ITVGEA</coden><abstract>In recent years, a wide variety of automated machine learning (AutoML) methods have been proposed to generate end-to-end ML pipelines. While these techniques facilitate the creation of models, given their black-box nature, the complexity of the underlying algorithms, and the large number of pipelines they derive, they are difficult for developers to debug. It is also challenging for machine learning experts to select an AutoML system that is well suited for a given problem. In this paper, we present the Pipeline Profiler, an interactive visualization tool that allows the exploration and comparison of the solution space of machine learning (ML) pipelines produced by AutoML systems. 
PipelineProfiler is integrated with Jupyter Notebook and can be combined with common data science tools to enable a rich set of analyses of the ML pipelines, providing users a better understanding of the algorithms that generated them as well as insights into how they can be improved. We demonstrate the utility of our tool through use cases where PipelineProfiler is used to better understand and improve a real-world AutoML system. Furthermore, we validate our approach by presenting a detailed analysis of a think-aloud experiment with six data scientists who develop and evaluate AutoML tools.</abstract><cop>United States</cop><pub>IEEE</pub><pmid>33048694</pmid><doi>10.1109/TVCG.2020.3030361</doi><tpages>11</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1077-2626
ispartof IEEE transactions on visualization and computer graphics, 2021-02, Vol.27 (2), p.390-400
issn 1077-2626
1941-0506
language eng
recordid cdi_pubmed_primary_33048694
source IEEE Xplore (Online service)
subjects Algorithms
Automatic Machine Learning
Correlation
Data visualization
Machine learning
Model Evaluation
Pipeline Visualization
Pipelines
Search problems
Solution space
Visual analytics
title PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T13%3A25%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PipelineProfiler:%20A%20Visual%20Analytics%20Tool%20for%20the%20Exploration%20of%20AutoML%20Pipelines&rft.jtitle=IEEE%20transactions%20on%20visualization%20and%20computer%20graphics&rft.au=Ono,%20Jorge%20Piazentin&rft.date=2021-02-01&rft.volume=27&rft.issue=2&rft.spage=390&rft.epage=400&rft.pages=390-400&rft.issn=1077-2626&rft.eissn=1941-0506&rft.coden=ITVGEA&rft_id=info:doi/10.1109/TVCG.2020.3030361&rft_dat=%3Cproquest_pubme%3E2483266814%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c349t-54507d2b6ff1df4e384ca16bc4537bacc0a992decb5fce9ccc47fd2024abc59a3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2483266814&rft_id=info:pmid/33048694&rft_ieee_id=9222086&rfr_iscdi=true