Structured Matrix Completion with Applications to Genomic Data Integration
Matrix completion has attracted significant recent attention in many fields including statistics, applied mathematics, and electrical engineering. Current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independ...
Published in: | Journal of the American Statistical Association 2016-06, Vol.111 (514), p.621-633 |
---|---|
Main Authors: | Cai, Tianxi ; Cai, T. Tony ; Zhang, Anru |
Format: | Article |
Language: | English |
Subjects: | |
cited_by | cdi_FETCH-LOGICAL-c584t-c237fa2e843140bc5a3e32ad1739b80b7b570773a8f3310afadd5bbab3012af3 |
---|---|
cites | cdi_FETCH-LOGICAL-c584t-c237fa2e843140bc5a3e32ad1739b80b7b570773a8f3310afadd5bbab3012af3 |
container_end_page | 633 |
container_issue | 514 |
container_start_page | 621 |
container_title | Journal of the American Statistical Association |
container_volume | 111 |
creator | Cai, Tianxi ; Cai, T. Tony ; Zhang, Anru |
description | Matrix completion has attracted significant recent attention in many fields including statistics, applied mathematics, and electrical engineering. Current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design. Specifically, our proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix are observed. We provide theoretical justification for the proposed SMC method and derive lower bound for the estimation errors, which together establish the optimal rate of recovery over certain classes of approximately low-rank matrices. Simulation studies show that the method performs well in finite sample under a variety of configurations. The method is applied to integrate several ovarian cancer genomic studies with different extent of genomic measurements, which enables us to construct more accurate prediction rules for ovarian cancer survival. Supplementary materials for this article are available online. |
doi_str_mv | 10.1080/01621459.2015.1021005 |
format | article |
fullrecord | <record><control><sourceid>jstor_pubme</sourceid><recordid>TN_cdi_jstor_primary_24739556</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>24739556</jstor_id><sourcerecordid>24739556</sourcerecordid><originalsourceid>FETCH-LOGICAL-c584t-c237fa2e843140bc5a3e32ad1739b80b7b570773a8f3310afadd5bbab3012af3</originalsourceid><addsrcrecordid>eNqFkUtv1DAUhS0EokPhJxRFYtNNip-1s0FUQylFRSzogp114zitR4kdbIe2_x6HmZbHAryx5PPdo-tzEDog-IhghV9jckwJF80RxUSUJ0owFo_Qiggmayr518dotTD1Au2hZyltcDlSqadojyrMKVFqhT5-yXE2eY62qz5Bju62WodxGmx2wVc3Ll9XJ9M0OAPLQ6pyqM6sD6Mz1TvIUJ37bK_iT_E5etLDkOyL3b2PLt-fXq4_1Befz87XJxe1EYrn2lAme6BWcUY4bo0AZhmFjkjWtAq3shUSS8lA9YwRDD10nWhbaBkmFHq2j95sbae5HW1nrM8RBj1FN0K80wGc_lPx7lpfhe9akEYpzovB4c4ghm-zTVmPLhk7DOBtmJOmJSbSNJTh_6JECV6iLGEW9NVf6CbM0ZcgClXsCBF4MRRbysSQUrT9w94E66VXfd-rXnrVu17L3MvfP_0wdV9kAQ62wCblEH_pvKQqxHHR32515_sQR7gJceh0hrshxD6CNy5p9u8dfgCEC7tz</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1819911500</pqid></control><display><type>article</type><title>Structured Matrix Completion with Applications to Genomic Data Integration</title><source>International Bibliography of the Social Sciences (IBSS)</source><source>JSTOR Archival Journals and Primary Sources Collection</source><source>Taylor and Francis:Jisc Collections:Taylor and Francis Read and Publish Agreement 2024-2025:Science and Technology Collection (Reading list)</source><creator>Cai, Tianxi ; Cai, T. Tony ; Zhang, Anru</creator><creatorcontrib>Cai, Tianxi ; Cai, T. Tony ; Zhang, Anru</creatorcontrib><description>Matrix completion has attracted significant recent attention in many fields including statistics, applied mathematics, and electrical engineering. Current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independently. 
Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design. Specifically, our proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix are observed. We provide theoretical justification for the proposed SMC method and derive lower bound for the estimation errors, which together establish the optimal rate of recovery over certain classes of approximately low-rank matrices. Simulation studies show that the method performs well in finite sample under a variety of configurations. The method is applied to integrate several ovarian cancer genomic studies with different extent of genomic measurements, which enables us to construct more accurate prediction rules for ovarian cancer survival. Supplementary materials for this article are available online.</description><identifier>ISSN: 0162-1459</identifier><identifier>ISSN: 1537-274X</identifier><identifier>EISSN: 1537-274X</identifier><identifier>DOI: 10.1080/01621459.2015.1021005</identifier><identifier>PMID: 28042188</identifier><identifier>CODEN: JSTNAL</identifier><language>eng</language><publisher>United States: Taylor & Francis</publisher><subject>animal ovaries ; Constrained minimization ; engineering ; equations ; Genomic data integration ; Genomics ; Low-rank matrix ; mathematics ; Matrix ; Matrix completion ; Ovarian cancer ; ovarian neoplasms ; prediction ; Sampling ; Simulation ; Singular value decomposition ; Statistics ; Structured matrix completion ; Theory and Methods</subject><ispartof>Journal of the American Statistical Association, 2016-06, Vol.111 (514), p.621-633</ispartof><rights>American Statistical Association 2016</rights><rights>2016 American Statistical Association</rights><rights>Copyright Taylor & Francis Ltd. 
Jun 2016</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c584t-c237fa2e843140bc5a3e32ad1739b80b7b570773a8f3310afadd5bbab3012af3</citedby><cites>FETCH-LOGICAL-c584t-c237fa2e843140bc5a3e32ad1739b80b7b570773a8f3310afadd5bbab3012af3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/24739556$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/24739556$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>230,314,776,780,881,27901,27902,33200,58213,58446</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/28042188$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Cai, Tianxi</creatorcontrib><creatorcontrib>Cai, T. Tony</creatorcontrib><creatorcontrib>Zhang, Anru</creatorcontrib><title>Structured Matrix Completion with Applications to Genomic Data Integration</title><title>Journal of the American Statistical Association</title><addtitle>J Am Stat Assoc</addtitle><description>Matrix completion has attracted significant recent attention in many fields including statistics, applied mathematics, and electrical engineering. Current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design. Specifically, our proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix are observed. 
We provide theoretical justification for the proposed SMC method and derive lower bound for the estimation errors, which together establish the optimal rate of recovery over certain classes of approximately low-rank matrices. Simulation studies show that the method performs well in finite sample under a variety of configurations. The method is applied to integrate several ovarian cancer genomic studies with different extent of genomic measurements, which enables us to construct more accurate prediction rules for ovarian cancer survival. Supplementary materials for this article are available online.</description><subject>animal ovaries</subject><subject>Constrained minimization</subject><subject>engineering</subject><subject>equations</subject><subject>Genomic data integration</subject><subject>Genomics</subject><subject>Low-rank matrix</subject><subject>mathematics</subject><subject>Matrix</subject><subject>Matrix completion</subject><subject>Ovarian cancer</subject><subject>ovarian neoplasms</subject><subject>prediction</subject><subject>Sampling</subject><subject>Simulation</subject><subject>Singular value decomposition</subject><subject>Statistics</subject><subject>Structured matrix completion</subject><subject>Theory and 
Methods</subject><issn>0162-1459</issn><issn>1537-274X</issn><issn>1537-274X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><sourceid>8BJ</sourceid><recordid>eNqFkUtv1DAUhS0EokPhJxRFYtNNip-1s0FUQylFRSzogp114zitR4kdbIe2_x6HmZbHAryx5PPdo-tzEDog-IhghV9jckwJF80RxUSUJ0owFo_Qiggmayr518dotTD1Au2hZyltcDlSqadojyrMKVFqhT5-yXE2eY62qz5Bju62WodxGmx2wVc3Ll9XJ9M0OAPLQ6pyqM6sD6Mz1TvIUJ37bK_iT_E5etLDkOyL3b2PLt-fXq4_1Befz87XJxe1EYrn2lAme6BWcUY4bo0AZhmFjkjWtAq3shUSS8lA9YwRDD10nWhbaBkmFHq2j95sbae5HW1nrM8RBj1FN0K80wGc_lPx7lpfhe9akEYpzovB4c4ghm-zTVmPLhk7DOBtmJOmJSbSNJTh_6JECV6iLGEW9NVf6CbM0ZcgClXsCBF4MRRbysSQUrT9w94E66VXfd-rXnrVu17L3MvfP_0wdV9kAQ62wCblEH_pvKQqxHHR32515_sQR7gJceh0hrshxD6CNy5p9u8dfgCEC7tz</recordid><startdate>20160601</startdate><enddate>20160601</enddate><creator>Cai, Tianxi</creator><creator>Cai, T. Tony</creator><creator>Zhang, Anru</creator><general>Taylor & Francis</general><general>Taylor & Francis Group, LLC</general><general>Taylor & Francis Ltd</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>8BJ</scope><scope>FQK</scope><scope>JBE</scope><scope>K9.</scope><scope>7X8</scope><scope>7S9</scope><scope>L.6</scope><scope>5PM</scope></search><sort><creationdate>20160601</creationdate><title>Structured Matrix Completion with Applications to Genomic Data Integration</title><author>Cai, Tianxi ; Cai, T. 
Tony ; Zhang, Anru</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c584t-c237fa2e843140bc5a3e32ad1739b80b7b570773a8f3310afadd5bbab3012af3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>animal ovaries</topic><topic>Constrained minimization</topic><topic>engineering</topic><topic>equations</topic><topic>Genomic data integration</topic><topic>Genomics</topic><topic>Low-rank matrix</topic><topic>mathematics</topic><topic>Matrix</topic><topic>Matrix completion</topic><topic>Ovarian cancer</topic><topic>ovarian neoplasms</topic><topic>prediction</topic><topic>Sampling</topic><topic>Simulation</topic><topic>Singular value decomposition</topic><topic>Statistics</topic><topic>Structured matrix completion</topic><topic>Theory and Methods</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Cai, Tianxi</creatorcontrib><creatorcontrib>Cai, T. Tony</creatorcontrib><creatorcontrib>Zhang, Anru</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>International Bibliography of the Social Sciences (IBSS)</collection><collection>International Bibliography of the Social Sciences</collection><collection>International Bibliography of the Social Sciences</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>MEDLINE - Academic</collection><collection>AGRICOLA</collection><collection>AGRICOLA - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Journal of the American Statistical Association</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Cai, Tianxi</au><au>Cai, T. 
Tony</au><au>Zhang, Anru</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Structured Matrix Completion with Applications to Genomic Data Integration</atitle><jtitle>Journal of the American Statistical Association</jtitle><addtitle>J Am Stat Assoc</addtitle><date>2016-06-01</date><risdate>2016</risdate><volume>111</volume><issue>514</issue><spage>621</spage><epage>633</epage><pages>621-633</pages><issn>0162-1459</issn><issn>1537-274X</issn><eissn>1537-274X</eissn><coden>JSTNAL</coden><abstract>Matrix completion has attracted significant recent attention in many fields including statistics, applied mathematics, and electrical engineering. Current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design. Specifically, our proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix are observed. We provide theoretical justification for the proposed SMC method and derive lower bound for the estimation errors, which together establish the optimal rate of recovery over certain classes of approximately low-rank matrices. Simulation studies show that the method performs well in finite sample under a variety of configurations. The method is applied to integrate several ovarian cancer genomic studies with different extent of genomic measurements, which enables us to construct more accurate prediction rules for ovarian cancer survival. Supplementary materials for this article are available online.</abstract><cop>United States</cop><pub>Taylor & Francis</pub><pmid>28042188</pmid><doi>10.1080/01621459.2015.1021005</doi><tpages>13</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0162-1459 |
ispartof | Journal of the American Statistical Association, 2016-06, Vol.111 (514), p.621-633 |
issn | 0162-1459 1537-274X 1537-274X |
language | eng |
recordid | cdi_jstor_primary_24739556 |
source | International Bibliography of the Social Sciences (IBSS); JSTOR Archival Journals and Primary Sources Collection; Taylor and Francis:Jisc Collections:Taylor and Francis Read and Publish Agreement 2024-2025:Science and Technology Collection (Reading list) |
subjects | animal ovaries ; Constrained minimization ; engineering ; equations ; Genomic data integration ; Genomics ; Low-rank matrix ; mathematics ; Matrix ; Matrix completion ; Ovarian cancer ; ovarian neoplasms ; prediction ; Sampling ; Simulation ; Singular value decomposition ; Statistics ; Structured matrix completion ; Theory and Methods |
title | Structured Matrix Completion with Applications to Genomic Data Integration |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-10T16%3A48%3A45IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Structured%20Matrix%20Completion%20with%20Applications%20to%20Genomic%20Data%20Integration&rft.jtitle=Journal%20of%20the%20American%20Statistical%20Association&rft.au=Cai,%20Tianxi&rft.date=2016-06-01&rft.volume=111&rft.issue=514&rft.spage=621&rft.epage=633&rft.pages=621-633&rft.issn=0162-1459&rft.eissn=1537-274X&rft.coden=JSTNAL&rft_id=info:doi/10.1080/01621459.2015.1021005&rft_dat=%3Cjstor_pubme%3E24739556%3C/jstor_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c584t-c237fa2e843140bc5a3e32ad1739b80b7b570773a8f3310afadd5bbab3012af3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1819911500&rft_id=info:pmid/28042188&rft_jstor_id=24739556&rfr_iscdi=true |
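
The abstract describes recovering a matrix when only a subset of its rows and columns is observed. The sketch below illustrates that setting in the simplest, noise-free, exactly low-rank case and is not the estimator from the article. It relies on a standard linear-algebra identity: if A is partitioned as [[A11, A12], [A21, A22]] and rank(A11) = rank(A), then A22 = A21 A11⁺ A12, where A11⁺ is the pseudoinverse. The function name `complete_block`, the tolerance `tol`, and the simulated dimensions are all assumptions made for illustration.

```python
import numpy as np


def complete_block(a11, a12, a21, tol=1e-10):
    """Fill in the unobserved block A22 of [[A11, A12], [A21, A22]].

    Illustrative only: assumes an exactly low-rank matrix without noise
    and rank(A11) equal to the rank of the full matrix.
    """
    # Pseudoinverse of A11 via a truncated SVD; singular values below
    # tol * (largest singular value) are treated as zero.
    u, s, vt = np.linalg.svd(a11, full_matrices=False)
    keep = s > tol * s[0]
    a11_pinv = (vt[keep].T / s[keep]) @ u[:, keep].T
    return a21 @ a11_pinv @ a12


# Simulate a rank-2 matrix with 8 rows and 6 columns (hypothetical sizes).
rng = np.random.default_rng(0)
rank, n_rows, n_cols = 2, 8, 6
a = rng.standard_normal((n_rows, rank)) @ rng.standard_normal((rank, n_cols))

# Observe the first m1 rows and the first m2 columns; A22 is missing.
m1, m2 = 4, 3
a11, a12 = a[:m1, :m2], a[:m1, m2:]
a21, a22 = a[m1:, :m2], a[m1:, m2:]

a22_hat = complete_block(a11, a12, a21)
max_abs_error = float(np.max(np.abs(a22_hat - a22)))
print(f"max abs error on the missing block: {max_abs_error:.2e}")
```

With noisy or only approximately low-rank data, the cutoff used to truncate the singular values matters a great deal; the article's abstract indicates that the proposed SMC method and its theory address that approximately low-rank case, which this sketch does not.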