Imputation of missing values in multi-view data
Data for which a set of objects is described by multiple distinct feature sets (called views) is known as multi-view data. When missing values occur in multi-view data, all features in a view are likely to be missing simultaneously. This may lead to very large quantities of missing data which, espec...
Published in: | Information fusion 2024-11, Vol.111, p.102524, Article 102524 |
---|---|
Main Authors: | van Loon, Wouter; Fokkema, Marjolein; de Vos, Frank; Koini, Marisa; Schmidt, Reinhold; de Rooij, Mark |
Format: | Article |
Language: | English |
Subjects: | Feature selection; Imputation; Missing data; Multi-view learning; Stacked generalization |
|
cited_by | |
---|---|
cites | cdi_FETCH-LOGICAL-c231t-2ea128fe920889ea5336401d60c6a33a19dc8864a08d16c0299e8aebf5f853433 |
container_end_page | |
container_issue | |
container_start_page | 102524 |
container_title | Information fusion |
container_volume | 111 |
creator | van Loon, Wouter; Fokkema, Marjolein; de Vos, Frank; Koini, Marisa; Schmidt, Reinhold; de Rooij, Mark |
description | Data for which a set of objects is described by multiple distinct feature sets (called views) is known as multi-view data. When missing values occur in multi-view data, all features in a view are likely to be missing simultaneously. This may lead to very large quantities of missing data which, especially when combined with high-dimensionality, can make the application of conditional imputation methods computationally infeasible. However, the multi-view structure could be leveraged to reduce the complexity and computational load of imputation. We introduce a new imputation method based on the existing stacked penalized logistic regression (StaPLR) algorithm for multi-view learning. It performs imputation in a dimension-reduced space to address computational challenges inherent to the multi-view context. We compare the performance of the new imputation method with several existing imputation algorithms in simulated data sets and a real data application. The results show that the new imputation method leads to competitive results at a much lower computational cost, and makes the use of advanced imputation algorithms such as missForest and predictive mean matching possible in settings where they would otherwise be computationally infeasible.
• A new imputation method for multi-view data is introduced. • The new method shows competitive results at a much lower computational cost. • The new method allows state-of-the-art algorithms to be used in much larger data sets than before. |
doi_str_mv | 10.1016/j.inffus.2024.102524 |
format | article |
publisher | Elsevier B.V |
rights | 2024 The Authors |
oa | free_for_read |
toplevel | peer_reviewed online_resources |
fulltext | fulltext |
identifier | ISSN: 1566-2535 |
ispartof | Information fusion, 2024-11, Vol.111, p.102524, Article 102524 |
issn | 1566-2535; 1872-6305 |
language | eng |
recordid | cdi_crossref_primary_10_1016_j_inffus_2024_102524 |
source | ScienceDirect Journals |
subjects | Feature selection; Imputation; Missing data; Multi-view learning; Stacked generalization |
title | Imputation of missing values in multi-view data |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T18%3A53%3A14IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-elsevier_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Imputation%20of%20missing%20values%20in%20multi-view%20data&rft.jtitle=Information%20fusion&rft.au=van%20Loon,%20Wouter&rft.date=2024-11&rft.volume=111&rft.spage=102524&rft.pages=102524-&rft.artnum=102524&rft.issn=1566-2535&rft.eissn=1872-6305&rft_id=info:doi/10.1016/j.inffus.2024.102524&rft_dat=%3Celsevier_cross%3ES1566253524003026%3C/elsevier_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c231t-2ea128fe920889ea5336401d60c6a33a19dc8864a08d16c0299e8aebf5f853433%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |