Online updating method to correct for measurement error in big data streams

When huge amounts of data arrive in streams, online updating is an important method to alleviate both computational and data storage issues. The scope of previous research for online updating is extended in the context of the classical linear measurement error model. In the case where some covariate...
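As a rough illustration of the setting the abstract describes, the sketch below shows generic online updating for ordinary least squares: each arriving block only adds to the running sums X'X and X'y, so raw data need not be stored. It is not the authors' bias-correction procedure. The names `OnlineOLS` and `simulate_stream`, the true coefficients, and the noise levels are assumptions made purely for illustration. Setting `noisy_blocks` above zero adds classical measurement error to the covariate in the early blocks, which pulls the naively updated slope toward zero.

```python
import numpy as np


class OnlineOLS:
    """Hypothetical helper: least squares from running sums, one block at a time."""

    def __init__(self, n_features):
        self.xtx = np.zeros((n_features, n_features))  # running X'X
        self.xty = np.zeros(n_features)                # running X'y
        self.n = 0                                     # observations seen so far

    def update(self, X, y):
        """Fold one block into the running sums; the block can then be discarded."""
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        self.xtx += X.T @ X
        self.xty += X.T @ y
        self.n += X.shape[0]

    def coef(self):
        """Current estimate; equals batch least squares on all blocks seen so far."""
        return np.linalg.solve(self.xtx, self.xty)


def simulate_stream(n_blocks=20, block_size=500, noisy_blocks=0, seed=0):
    """Illustrative stream (assumed values): y = 1 + 2x + noise.

    In the first `noisy_blocks` blocks the covariate is observed as
    w = x + u with u ~ N(0, 1), i.e. classical measurement error.
    """
    rng = np.random.default_rng(seed)
    model = OnlineOLS(2)
    for k in range(n_blocks):
        x = rng.normal(size=block_size)
        y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=block_size)
        w = x + rng.normal(size=block_size) if k < noisy_blocks else x
        model.update(np.column_stack([np.ones(block_size), w]), y)
    return model.coef()


if __name__ == "__main__":
    print("no measurement error:   ", simulate_stream(noisy_blocks=0))
    print("error in first 10 blocks:", simulate_stream(noisy_blocks=10))
```

Comparing the two printed slope estimates shows why a correction step is needed once error-free covariates start to arrive: continuing the naive update does not remove the bias already accumulated in the running sums.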

Bibliographic Details
Published in: Computational statistics & data analysis 2020-09, Vol.149, p.106976, Article 106976
Main Authors: Lee, JooChul, Wang, HaiYing, Schifano, Elizabeth D.
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c300t-b099004abb3543ae2ab6da5025530c91357d5da47cb6834d65afe0d69251d14f3
cites cdi_FETCH-LOGICAL-c300t-b099004abb3543ae2ab6da5025530c91357d5da47cb6834d65afe0d69251d14f3
container_end_page
container_issue
container_start_page 106976
container_title Computational statistics & data analysis
container_volume 149
creator Lee, JooChul
Wang, HaiYing
Schifano, Elizabeth D.
description When huge amounts of data arrive in streams, online updating is an important method to alleviate both computational and data storage issues. The scope of previous research for online updating is extended in the context of the classical linear measurement error model. In the case where some covariates are unknowingly measured with error at the beginning of the stream, but then are measured without error after a particular point along the data stream, the updated estimators ignoring the measurement error are biased for the true parameters. Once the covariates measured without error are first observed, a method to correct the bias of the estimators, as well as to correct the biases in their variance estimator, is proposed; after correction, the traditional online updating method can then proceed as usual. Further, asymptotic distributions for the corrected and updated estimators are established. Simulation studies and a real data analysis with an airline on-time dataset are provided to illustrate the performance of the proposed method. • Extends scope of online updating methods to linear measurement error models. • Corrects measurement error biases once measurements are observed precisely. • Establishes asymptotic distributions for the corrected and updated estimators.
doi_str_mv 10.1016/j.csda.2020.106976
format article
fullrecord <record><control><sourceid>elsevier_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1016_j_csda_2020_106976</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0167947320300670</els_id><sourcerecordid>S0167947320300670</sourcerecordid><originalsourceid>FETCH-LOGICAL-c300t-b099004abb3543ae2ab6da5025530c91357d5da47cb6834d65afe0d69251d14f3</originalsourceid><addsrcrecordid>eNp9kE1LAzEQhoMoWKt_wFP-wNbJdxe8SFErFnrRc8gmszWlu1uSVPDfu0s9exp4medl5iHknsGCAdMP-4XPwS048CnQtdEXZMaWhldGKH5JZuOSqWppxDW5yXkPAFya5Yy8b_tD7JGejsGV2O9oh-VrCLQM1A8poS-0HdKYunxK2GFfKKY0JrGnTdzRkXI0l4Suy7fkqnWHjHd_c04-X54_Vutqs319Wz1tKi8AStVAXQNI1zRCSeGQu0YHp4ArJcDXTCgTVHDS-EYvhQxauRYh6JorFphsxZzwc69PQ84JW3tMsXPpxzKwkw67t5MOO-mwZx0j9HiGcLzsO2Ky2UfsPYY4fWnDEP_DfwGGdWj0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Online updating method to correct for measurement error in big data streams</title><source>ScienceDirect Freedom Collection</source><source>ScienceDirect Journals</source><source>Backfile Package - Computer Science (Legacy) [YCS]</source><source>Backfile Package - Mathematics (Legacy) [YMT]</source><creator>Lee, JooChul ; Wang, HaiYing ; Schifano, Elizabeth D.</creator><creatorcontrib>Lee, JooChul ; Wang, HaiYing ; Schifano, Elizabeth D.</creatorcontrib><description>When huge amounts of data arrive in streams, online updating is an important method to alleviate both computational and data storage issues. The scope of previous research for online updating is extended in the context of the classical linear measurement error model. In the case where some covariates are unknowingly measured with error at the beginning of the stream, but then are measured without error after a particular point along the data stream, the updated estimators ignoring the measurement error are biased for the true parameters. 
Once the covariates measured without error are first observed, a method to correct the bias of the estimators, as well as to correct the biases in their variance estimator, is proposed; after correction, the traditional online updating method can then proceed as usual. Further, asymptotic distributions for the corrected and updated estimators are established. Simulation studies and a real data analysis with an airline on-time dataset are provided to illustrate the performance of the proposed method. •Extends scope of online updating methods to linear measurement error models.•Corrects measurement error biases once measurements are observed precisely.•Establishes asymptotic distributions for the corrected and updated estimators.</description><identifier>ISSN: 0167-9473</identifier><identifier>EISSN: 1872-7352</identifier><identifier>DOI: 10.1016/j.csda.2020.106976</identifier><language>eng</language><publisher>Elsevier B.V</publisher><subject>Data compression ; Errors-in-variables ; Linear regression ; Streaming data</subject><ispartof>Computational statistics &amp; data analysis, 2020-09, Vol.149, p.106976, Article 106976</ispartof><rights>2020 Elsevier B.V.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c300t-b099004abb3543ae2ab6da5025530c91357d5da47cb6834d65afe0d69251d14f3</citedby><cites>FETCH-LOGICAL-c300t-b099004abb3543ae2ab6da5025530c91357d5da47cb6834d65afe0d69251d14f3</cites><orcidid>0000-0002-9793-332X ; 0000-0001-7729-0243</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S0167947320300670$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3428,3439,3563,27923,27924,45971,45990,46002</link.rule.ids></links><search><creatorcontrib>Lee, JooChul</creatorcontrib><creatorcontrib>Wang, 
HaiYing</creatorcontrib><creatorcontrib>Schifano, Elizabeth D.</creatorcontrib><title>Online updating method to correct for measurement error in big data streams</title><title>Computational statistics &amp; data analysis</title><description>When huge amounts of data arrive in streams, online updating is an important method to alleviate both computational and data storage issues. The scope of previous research for online updating is extended in the context of the classical linear measurement error model. In the case where some covariates are unknowingly measured with error at the beginning of the stream, but then are measured without error after a particular point along the data stream, the updated estimators ignoring the measurement error are biased for the true parameters. Once the covariates measured without error are first observed, a method to correct the bias of the estimators, as well as to correct the biases in their variance estimator, is proposed; after correction, the traditional online updating method can then proceed as usual. Further, asymptotic distributions for the corrected and updated estimators are established. Simulation studies and a real data analysis with an airline on-time dataset are provided to illustrate the performance of the proposed method. 
•Extends scope of online updating methods to linear measurement error models.•Corrects measurement error biases once measurements are observed precisely.•Establishes asymptotic distributions for the corrected and updated estimators.</description><subject>Data compression</subject><subject>Errors-in-variables</subject><subject>Linear regression</subject><subject>Streaming data</subject><issn>0167-9473</issn><issn>1872-7352</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNp9kE1LAzEQhoMoWKt_wFP-wNbJdxe8SFErFnrRc8gmszWlu1uSVPDfu0s9exp4medl5iHknsGCAdMP-4XPwS048CnQtdEXZMaWhldGKH5JZuOSqWppxDW5yXkPAFya5Yy8b_tD7JGejsGV2O9oh-VrCLQM1A8poS-0HdKYunxK2GFfKKY0JrGnTdzRkXI0l4Suy7fkqnWHjHd_c04-X54_Vutqs319Wz1tKi8AStVAXQNI1zRCSeGQu0YHp4ArJcDXTCgTVHDS-EYvhQxauRYh6JorFphsxZzwc69PQ84JW3tMsXPpxzKwkw67t5MOO-mwZx0j9HiGcLzsO2Ky2UfsPYY4fWnDEP_DfwGGdWj0</recordid><startdate>202009</startdate><enddate>202009</enddate><creator>Lee, JooChul</creator><creator>Wang, HaiYing</creator><creator>Schifano, Elizabeth D.</creator><general>Elsevier B.V</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-9793-332X</orcidid><orcidid>https://orcid.org/0000-0001-7729-0243</orcidid></search><sort><creationdate>202009</creationdate><title>Online updating method to correct for measurement error in big data streams</title><author>Lee, JooChul ; Wang, HaiYing ; Schifano, Elizabeth D.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c300t-b099004abb3543ae2ab6da5025530c91357d5da47cb6834d65afe0d69251d14f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Data compression</topic><topic>Errors-in-variables</topic><topic>Linear regression</topic><topic>Streaming data</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lee, 
JooChul</creatorcontrib><creatorcontrib>Wang, HaiYing</creatorcontrib><creatorcontrib>Schifano, Elizabeth D.</creatorcontrib><collection>CrossRef</collection><jtitle>Computational statistics &amp; data analysis</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lee, JooChul</au><au>Wang, HaiYing</au><au>Schifano, Elizabeth D.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Online updating method to correct for measurement error in big data streams</atitle><jtitle>Computational statistics &amp; data analysis</jtitle><date>2020-09</date><risdate>2020</risdate><volume>149</volume><spage>106976</spage><pages>106976-</pages><artnum>106976</artnum><issn>0167-9473</issn><eissn>1872-7352</eissn><abstract>When huge amounts of data arrive in streams, online updating is an important method to alleviate both computational and data storage issues. The scope of previous research for online updating is extended in the context of the classical linear measurement error model. In the case where some covariates are unknowingly measured with error at the beginning of the stream, but then are measured without error after a particular point along the data stream, the updated estimators ignoring the measurement error are biased for the true parameters. Once the covariates measured without error are first observed, a method to correct the bias of the estimators, as well as to correct the biases in their variance estimator, is proposed; after correction, the traditional online updating method can then proceed as usual. Further, asymptotic distributions for the corrected and updated estimators are established. Simulation studies and a real data analysis with an airline on-time dataset are provided to illustrate the performance of the proposed method. 
•Extends scope of online updating methods to linear measurement error models.•Corrects measurement error biases once measurements are observed precisely.•Establishes asymptotic distributions for the corrected and updated estimators.</abstract><pub>Elsevier B.V</pub><doi>10.1016/j.csda.2020.106976</doi><orcidid>https://orcid.org/0000-0002-9793-332X</orcidid><orcidid>https://orcid.org/0000-0001-7729-0243</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0167-9473
ispartof Computational statistics & data analysis, 2020-09, Vol.149, p.106976, Article 106976
issn 0167-9473
1872-7352
language eng
recordid cdi_crossref_primary_10_1016_j_csda_2020_106976
source ScienceDirect Freedom Collection; ScienceDirect Journals; Backfile Package - Computer Science (Legacy) [YCS]; Backfile Package - Mathematics (Legacy) [YMT]
subjects Data compression
Errors-in-variables
Linear regression
Streaming data
title Online updating method to correct for measurement error in big data streams
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T15%3A22%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-elsevier_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Online%20updating%20method%20to%20correct%20for%20measurement%20error%20in%20big%20data%20streams&rft.jtitle=Computational%20statistics%20&%20data%20analysis&rft.au=Lee,%20JooChul&rft.date=2020-09&rft.volume=149&rft.spage=106976&rft.pages=106976-&rft.artnum=106976&rft.issn=0167-9473&rft.eissn=1872-7352&rft_id=info:doi/10.1016/j.csda.2020.106976&rft_dat=%3Celsevier_cross%3ES0167947320300670%3C/elsevier_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c300t-b099004abb3543ae2ab6da5025530c91357d5da47cb6834d65afe0d69251d14f3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true