
A Novel Framework for Imputation of Missing Values in Databases

Many of the industrial and research databases are plagued by the problem of missing values. Some evident examples include databases associated with instrument maintenance, medical applications, and surveys. One of the common ways to cope with missing values is to complete their imputation (filling i...

Bibliographic Details
Published in: IEEE transactions on systems, man and cybernetics. Part A, Systems and humans, 2007-09, Vol.37 (5), p.692-709
Main Authors: Farhangfar, A., Kurgan, L.A., Pedrycz, W.
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c438t-f18e29f77e674a0fe0decb1c98efe362e88803501582a74b1fe44a27c44575a53
cites cdi_FETCH-LOGICAL-c438t-f18e29f77e674a0fe0decb1c98efe362e88803501582a74b1fe44a27c44575a53
container_end_page 709
container_issue 5
container_start_page 692
container_title IEEE transactions on systems, man and cybernetics. Part A, Systems and humans
container_volume 37
creator Farhangfar, A.
Kurgan, L.A.
Pedrycz, W.
description Many of the industrial and research databases are plagued by the problem of missing values. Some evident examples include databases associated with instrument maintenance, medical applications, and surveys. One of the common ways to cope with missing values is to complete their imputation (filling in). Given the rapid growth of sizes of databases, it becomes imperative to come up with a new imputation methodology along with efficient algorithms. The main objective of this paper is to develop a unified framework supporting a host of imputation methods. In the development of this framework, we require that its usage should (on average) lead to the significant improvement of accuracy of imputation while maintaining the same asymptotic computational complexity of the individual methods. Our intent is to provide a comprehensive review of the representative imputation techniques. It is noticeable that the use of the framework in the case of a low-quality single-imputation method has resulted in the imputation accuracy that is comparable to the one achieved when dealing with some other advanced imputation techniques. We also demonstrate, both theoretically and experimentally, that the application of the proposed framework leads to a linear computational complexity and, therefore, does not affect the asymptotic complexity of the associated imputation method.
doi_str_mv 10.1109/TSMCA.2007.902631
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TSMCA_2007_902631</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>4292217</ieee_id><sourcerecordid>880659392</sourcerecordid><originalsourceid>FETCH-LOGICAL-c438t-f18e29f77e674a0fe0decb1c98efe362e88803501582a74b1fe44a27c44575a53</originalsourceid><addsrcrecordid>eNpdkLFOwzAURS0EEqXwAYjFYmFKsZ3n2JlQVShUamGgsFpueEYpSVzsBMTfk1LEwPTecO7V1SHklLMR5yy_XD4uJuORYEyNciaylO-RAZdSJwJEtt__TKcJgFCH5CjGNWMcIIcBuRrTe_-BFZ0GW-OnD2_U-UBn9aZrbVv6hnpHF2WMZfNKn23VYaRlQ69ta1c2YjwmB85WEU9-75A8TW-Wk7tk_nA7m4znSQGpbhPHNYrcKYWZAsscshcsVrzINTpMM4Faa5ZKxqUWVsGKOwSwQhUAUkkr0yG52PVugn_vR7SmLmOBVWUb9F00fTyTeZqLnjz_R659F5p-nNEZcK6BQw_xHVQEH2NAZzahrG34MpyZrVDzI9RshZqd0D5ztsuUiPjHg8iF4Cr9Bt8tb9s</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>864118414</pqid></control><display><type>article</type><title>A Novel Framework for Imputation of Missing Values in Databases</title><source>IEEE Electronic Library (IEL) Journals</source><creator>Farhangfar, A. ; Kurgan, L.A. ; Pedrycz, W.</creator><creatorcontrib>Farhangfar, A. ; Kurgan, L.A. ; Pedrycz, W.</creatorcontrib><description>Many of the industrial and research databases are plagued by the problem of missing values. Some evident examples include databases associated with instrument maintenance, medical applications, and surveys. One of the common ways to cope with missing values is to complete their imputation (filling in). Given the rapid growth of sizes of databases, it becomes imperative to come up with a new imputation methodology along with efficient algorithms. The main objective of this paper is to develop a unified framework supporting a host of imputation methods. 
In the development of this framework, we require that its usage should (on average) lead to the significant improvement of accuracy of imputation while maintaining the same asymptotic computational complexity of the individual methods. Our intent is to provide a comprehensive review of the representative imputation techniques. It is noticeable that the use of the framework in the case of a low-quality single-imputation method has resulted in the imputation accuracy that is comparable to the one achieved when dealing with some other advanced imputation techniques. We also demonstrate, both theoretically and experimentally, that the application of the proposed framework leads to a linear computational complexity and, therefore, does not affect the asymptotic complexity of the associated imputation method.</description><identifier>ISSN: 1083-4427</identifier><identifier>ISSN: 2168-2216</identifier><identifier>EISSN: 1558-2426</identifier><identifier>EISSN: 2168-2232</identifier><identifier>DOI: 10.1109/TSMCA.2007.902631</identifier><identifier>CODEN: ITSHFX</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Accuracy ; Algorithms ; Asymptotic properties ; Biomedical engineering ; Biomedical equipment ; Complexity ; Computation ; Computational complexity ; Councils ; Cybernetics ; Data analysis ; Dealing ; Filling ; Human ; Instruments ; Medical services ; missing values ; multiple imputation (MI) ; single imputation ; Studies ; Testing</subject><ispartof>IEEE transactions on systems, man and cybernetics. Part A, Systems and humans, 2007-09, Vol.37 (5), p.692-709</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2007</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c438t-f18e29f77e674a0fe0decb1c98efe362e88803501582a74b1fe44a27c44575a53</citedby><cites>FETCH-LOGICAL-c438t-f18e29f77e674a0fe0decb1c98efe362e88803501582a74b1fe44a27c44575a53</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/4292217$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27922,27923,54794</link.rule.ids></links><search><creatorcontrib>Farhangfar, A.</creatorcontrib><creatorcontrib>Kurgan, L.A.</creatorcontrib><creatorcontrib>Pedrycz, W.</creatorcontrib><title>A Novel Framework for Imputation of Missing Values in Databases</title><title>IEEE transactions on systems, man and cybernetics. Part A, Systems and humans</title><addtitle>TSMCA</addtitle><description>Many of the industrial and research databases are plagued by the problem of missing values. Some evident examples include databases associated with instrument maintenance, medical applications, and surveys. One of the common ways to cope with missing values is to complete their imputation (filling in). Given the rapid growth of sizes of databases, it becomes imperative to come up with a new imputation methodology along with efficient algorithms. The main objective of this paper is to develop a unified framework supporting a host of imputation methods. In the development of this framework, we require that its usage should (on average) lead to the significant improvement of accuracy of imputation while maintaining the same asymptotic computational complexity of the individual methods. Our intent is to provide a comprehensive review of the representative imputation techniques. 
It is noticeable that the use of the framework in the case of a low-quality single-imputation method has resulted in the imputation accuracy that is comparable to the one achieved when dealing with some other advanced imputation techniques. We also demonstrate, both theoretically and experimentally, that the application of the proposed framework leads to a linear computational complexity and, therefore, does not affect the asymptotic complexity of the associated imputation method.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Asymptotic properties</subject><subject>Biomedical engineering</subject><subject>Biomedical equipment</subject><subject>Complexity</subject><subject>Computation</subject><subject>Computational complexity</subject><subject>Councils</subject><subject>Cybernetics</subject><subject>Data analysis</subject><subject>Dealing</subject><subject>Filling</subject><subject>Human</subject><subject>Instruments</subject><subject>Medical services</subject><subject>missing values</subject><subject>multiple imputation (MI)</subject><subject>single imputation</subject><subject>Studies</subject><subject>Testing</subject><issn>1083-4427</issn><issn>2168-2216</issn><issn>1558-2426</issn><issn>2168-2232</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2007</creationdate><recordtype>article</recordtype><recordid>eNpdkLFOwzAURS0EEqXwAYjFYmFKsZ3n2JlQVShUamGgsFpueEYpSVzsBMTfk1LEwPTecO7V1SHklLMR5yy_XD4uJuORYEyNciaylO-RAZdSJwJEtt__TKcJgFCH5CjGNWMcIIcBuRrTe_-BFZ0GW-OnD2_U-UBn9aZrbVv6hnpHF2WMZfNKn23VYaRlQ69ta1c2YjwmB85WEU9-75A8TW-Wk7tk_nA7m4znSQGpbhPHNYrcKYWZAsscshcsVrzINTpMM4Faa5ZKxqUWVsGKOwSwQhUAUkkr0yG52PVugn_vR7SmLmOBVWUb9F00fTyTeZqLnjz_R659F5p-nNEZcK6BQw_xHVQEH2NAZzahrG34MpyZrVDzI9RshZqd0D5ztsuUiPjHg8iF4Cr9Bt8tb9s</recordid><startdate>20070901</startdate><enddate>20070901</enddate><creator>Farhangfar, A.</creator><creator>Kurgan, L.A.</creator><creator>Pedrycz, W.</creator><general>IEEE</general><general>The 
Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7TB</scope><scope>8FD</scope><scope>FR3</scope><scope>H8D</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>F28</scope></search><sort><creationdate>20070901</creationdate><title>A Novel Framework for Imputation of Missing Values in Databases</title><author>Farhangfar, A. ; Kurgan, L.A. ; Pedrycz, W.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c438t-f18e29f77e674a0fe0decb1c98efe362e88803501582a74b1fe44a27c44575a53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2007</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Asymptotic properties</topic><topic>Biomedical engineering</topic><topic>Biomedical equipment</topic><topic>Complexity</topic><topic>Computation</topic><topic>Computational complexity</topic><topic>Councils</topic><topic>Cybernetics</topic><topic>Data analysis</topic><topic>Dealing</topic><topic>Filling</topic><topic>Human</topic><topic>Instruments</topic><topic>Medical services</topic><topic>missing values</topic><topic>multiple imputation (MI)</topic><topic>single imputation</topic><topic>Studies</topic><topic>Testing</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Farhangfar, A.</creatorcontrib><creatorcontrib>Kurgan, L.A.</creatorcontrib><creatorcontrib>Pedrycz, W.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) Online</collection><collection>IEEE Xplore Digital Library</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications 
Abstracts</collection><collection>Mechanical &amp; Transportation Engineering Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><jtitle>IEEE transactions on systems, man and cybernetics. Part A, Systems and humans</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Farhangfar, A.</au><au>Kurgan, L.A.</au><au>Pedrycz, W.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Novel Framework for Imputation of Missing Values in Databases</atitle><jtitle>IEEE transactions on systems, man and cybernetics. Part A, Systems and humans</jtitle><stitle>TSMCA</stitle><date>2007-09-01</date><risdate>2007</risdate><volume>37</volume><issue>5</issue><spage>692</spage><epage>709</epage><pages>692-709</pages><issn>1083-4427</issn><issn>2168-2216</issn><eissn>1558-2426</eissn><eissn>2168-2232</eissn><coden>ITSHFX</coden><abstract>Many of the industrial and research databases are plagued by the problem of missing values. Some evident examples include databases associated with instrument maintenance, medical applications, and surveys. One of the common ways to cope with missing values is to complete their imputation (filling in). Given the rapid growth of sizes of databases, it becomes imperative to come up with a new imputation methodology along with efficient algorithms. The main objective of this paper is to develop a unified framework supporting a host of imputation methods. 
In the development of this framework, we require that its usage should (on average) lead to the significant improvement of accuracy of imputation while maintaining the same asymptotic computational complexity of the individual methods. Our intent is to provide a comprehensive review of the representative imputation techniques. It is noticeable that the use of the framework in the case of a low-quality single-imputation method has resulted in the imputation accuracy that is comparable to the one achieved when dealing with some other advanced imputation techniques. We also demonstrate, both theoretically and experimentally, that the application of the proposed framework leads to a linear computational complexity and, therefore, does not affect the asymptotic complexity of the associated imputation method.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TSMCA.2007.902631</doi><tpages>18</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1083-4427
ispartof IEEE transactions on systems, man and cybernetics. Part A, Systems and humans, 2007-09, Vol.37 (5), p.692-709
issn 1083-4427
2168-2216
1558-2426
2168-2232
language eng
recordid cdi_crossref_primary_10_1109_TSMCA_2007_902631
source IEEE Electronic Library (IEL) Journals
subjects Accuracy
Algorithms
Asymptotic properties
Biomedical engineering
Biomedical equipment
Complexity
Computation
Computational complexity
Councils
Cybernetics
Data analysis
Dealing
Filling
Human
Instruments
Medical services
missing values
multiple imputation (MI)
single imputation
Studies
Testing
title A Novel Framework for Imputation of Missing Values in Databases
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-14T10%3A18%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Novel%20Framework%20for%20Imputation%20of%20Missing%20Values%20in%20Databases&rft.jtitle=IEEE%20transactions%20on%20systems,%20man%20and%20cybernetics.%20Part%20A,%20Systems%20and%20humans&rft.au=Farhangfar,%20A.&rft.date=2007-09-01&rft.volume=37&rft.issue=5&rft.spage=692&rft.epage=709&rft.pages=692-709&rft.issn=1083-4427&rft.eissn=1558-2426&rft.coden=ITSHFX&rft_id=info:doi/10.1109/TSMCA.2007.902631&rft_dat=%3Cproquest_cross%3E880659392%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c438t-f18e29f77e674a0fe0decb1c98efe362e88803501582a74b1fe44a27c44575a53%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=864118414&rft_id=info:pmid/&rft_ieee_id=4292217&rfr_iscdi=true