A methodological assessment of privacy preserving record linkage using survey and administrative data

The National Center for Health Statistics (NCHS) links data from surveys to administrative data sources, but privacy concerns make accessing new data sources difficult. Privacy-preserving record linkage (PPRL) is an alternative to traditional linkage approaches that may overcome this barrier. Howeve...

Bibliographic Details
Published in: Statistical journal of the IAOS 2022-06, Vol.38 (2), p.413-421
Main Authors: Mirel, Lisa B; Resnick, Dean M; Aram, Jonathan; Cox, Christine S
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c3541-da38acff5cb148882f287a9c73bebf6ea257312af2b7628e5c191cd46105bd913
cites cdi_FETCH-LOGICAL-c3541-da38acff5cb148882f287a9c73bebf6ea257312af2b7628e5c191cd46105bd913
container_end_page 421
container_issue 2
container_start_page 413
container_title Statistical journal of the IAOS
container_volume 38
creator Mirel, Lisa B
Resnick, Dean M
Aram, Jonathan
Cox, Christine S
description The National Center for Health Statistics (NCHS) links data from surveys to administrative data sources, but privacy concerns make accessing new data sources difficult. Privacy-preserving record linkage (PPRL) is an alternative to traditional linkage approaches that may overcome this barrier. However, prior to implementing PPRL techniques, it is important to understand their effect on data quality. Results from PPRL were compared to results from an established linkage method, which uses unencrypted (plain text) identifiers and both deterministic and probabilistic techniques. The established method was used as the gold standard. Links performed with PPRL were evaluated for precision and recall. An initial assessment and a refined approach were implemented. The impact of PPRL on secondary data analysis, including match and mortality rates, was assessed. The match rates for all approaches were similar: 5.1% for the gold standard, 5.4% for the initial PPRL and 5.0% for the refined PPRL approach. Precision ranged from 93.8% to 98.9% and recall ranged from 98.7% to 97.8%, depending on the selection of tokens from PPRL. The impact of PPRL on secondary data analysis was minimal. The findings suggest PPRL works well to link patient records to the National Death Index (NDI) since both sources have a high level of non-missing personally identifiable information, especially among adults 65 and older who may also have a higher likelihood of linking to the NDI. The results from this study are encouraging as first steps for a statistical agency in the implementation of PPRL approaches; however, future research is still needed.
doi_str_mv 10.3233/SJI-210891
format article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9335262</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2697096334</sourcerecordid><originalsourceid>FETCH-LOGICAL-c3541-da38acff5cb148882f287a9c73bebf6ea257312af2b7628e5c191cd46105bd913</originalsourceid><addsrcrecordid>eNpdkV1rFjEQhYMo9kNv_AES8EaE1U2y-boRStFaKXihXofZZPZt6u6mJrsL77839a1FvZph5uFwDoeQF6x9K7gQ775-vmw4a41lj8gxM1o2lsvu8e-9a7SS8oiclHLTttLqrntKjoS0rFVWHBM8oxMu1ymkMe2ih5FCKVjKhPNC00Bvc9zA7-vEgnmL845m9CkHOsb5B-yQruXuWNa84Z7CHCiEKc6xLBmWuCENsMAz8mSAseDz-3lKvn_88O38U3P15eLy_Oyq8UJ2rAkgDPhhkL5nnTGGD9xosF6LHvtBIXCpBeMw8F4rblB6ZpkPnWKt7INl4pS8P-jerv2EwdcQGUZXQ0yQ9y5BdP9-5njtdmlzVgjJFa8Cr-8Fcvq5YlncFIvHcYQZ01ocV1a3VgnRVfTVf-hNWvNc41VK68ooIyr15kD5nErJODyYYa27a8_V9tyhvQq__Nv-A_qnLvELSPWXYA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2677633683</pqid></control><display><type>article</type><title>A methodological assessment of privacy preserving record linkage using survey and administrative data</title><source>EconLit s plnými texty</source><source>EBSCOhost Business Source Ultimate</source><creator>Mirel, Lisa B ; Resnick, Dean M ; Aram, Jonathan ; Cox, Christine S</creator><creatorcontrib>Mirel, Lisa B ; Resnick, Dean M ; Aram, Jonathan ; Cox, Christine S</creatorcontrib><description>The National Center for Health Statistics (NCHS) links data from surveys to administrative data sources, but privacy concerns make accessing new data sources difficult. Privacy-preserving record linkage (PPRL) is an alternative to traditional linkage approaches that may overcome this barrier. However, prior to implementing PPRL techniques it is important to understand their effect on data quality. 
Results from PPRL were compared to results from an established linkage method, which uses unencrypted (plain text) identifiers and both deterministic and probabilistic techniques. The established method was used as the gold standard. Links performed with PPRL were evaluated for precision and recall. An initial assessment and a refined approach were implemented. The impact of PPRL on secondary data analysis, including match and mortality rates, was assessed. The match rates for all approaches were similar, 5.1% for the gold standard, 5.4% for the initial PPRL and 5.0% for the refined PPRL approach. Precision ranged from 93.8% to 98.9% and recall ranged from 98.7% to 97.8%, depending on the selection of tokens from PPRL. The impact of PPRL on secondary data analysis was minimal. The findings suggest PPRL works well to link patient records to the National Death Index (NDI) since both sources have a high level of non-missing personally identifiable information, especially among adults 65 and older who may also have a higher likelihood of linking to the NDI. 
The results from this study are encouraging for first steps for a statistical agency in the implementation of PPRL approaches, however, future research is still needed.</description><identifier>ISSN: 1874-7655</identifier><identifier>EISSN: 1875-9254</identifier><identifier>DOI: 10.3233/SJI-210891</identifier><identifier>PMID: 35910693</identifier><language>eng</language><publisher>Netherlands: IOS Press BV</publisher><subject>Data analysis ; Data sources ; Privacy ; Recall ; Statistical analysis</subject><ispartof>Statistical journal of the IAOS, 2022-06, Vol.38 (2), p.413-421</ispartof><rights>Copyright IOS Press BV 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c3541-da38acff5cb148882f287a9c73bebf6ea257312af2b7628e5c191cd46105bd913</citedby><cites>FETCH-LOGICAL-c3541-da38acff5cb148882f287a9c73bebf6ea257312af2b7628e5c191cd46105bd913</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,780,784,885,27924,27925</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/35910693$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Mirel, Lisa B</creatorcontrib><creatorcontrib>Resnick, Dean M</creatorcontrib><creatorcontrib>Aram, Jonathan</creatorcontrib><creatorcontrib>Cox, Christine S</creatorcontrib><title>A methodological assessment of privacy preserving record linkage using survey and administrative data</title><title>Statistical journal of the IAOS</title><addtitle>Stat J IAOS</addtitle><description>The National Center for Health Statistics (NCHS) links data from surveys to administrative data sources, but privacy concerns make accessing new data sources difficult. 
Privacy-preserving record linkage (PPRL) is an alternative to traditional linkage approaches that may overcome this barrier. However, prior to implementing PPRL techniques it is important to understand their effect on data quality. Results from PPRL were compared to results from an established linkage method, which uses unencrypted (plain text) identifiers and both deterministic and probabilistic techniques. The established method was used as the gold standard. Links performed with PPRL were evaluated for precision and recall. An initial assessment and a refined approach were implemented. The impact of PPRL on secondary data analysis, including match and mortality rates, was assessed. The match rates for all approaches were similar, 5.1% for the gold standard, 5.4% for the initial PPRL and 5.0% for the refined PPRL approach. Precision ranged from 93.8% to 98.9% and recall ranged from 98.7% to 97.8%, depending on the selection of tokens from PPRL. The impact of PPRL on secondary data analysis was minimal. The findings suggest PPRL works well to link patient records to the National Death Index (NDI) since both sources have a high level of non-missing personally identifiable information, especially among adults 65 and older who may also have a higher likelihood of linking to the NDI. 
The results from this study are encouraging for first steps for a statistical agency in the implementation of PPRL approaches, however, future research is still needed.</description><subject>Data analysis</subject><subject>Data sources</subject><subject>Privacy</subject><subject>Recall</subject><subject>Statistical analysis</subject><issn>1874-7655</issn><issn>1875-9254</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNpdkV1rFjEQhYMo9kNv_AES8EaE1U2y-boRStFaKXihXofZZPZt6u6mJrsL77839a1FvZph5uFwDoeQF6x9K7gQ775-vmw4a41lj8gxM1o2lsvu8e-9a7SS8oiclHLTttLqrntKjoS0rFVWHBM8oxMu1ymkMe2ih5FCKVjKhPNC00Bvc9zA7-vEgnmL845m9CkHOsb5B-yQruXuWNa84Z7CHCiEKc6xLBmWuCENsMAz8mSAseDz-3lKvn_88O38U3P15eLy_Oyq8UJ2rAkgDPhhkL5nnTGGD9xosF6LHvtBIXCpBeMw8F4rblB6ZpkPnWKt7INl4pS8P-jerv2EwdcQGUZXQ0yQ9y5BdP9-5njtdmlzVgjJFa8Cr-8Fcvq5YlncFIvHcYQZ01ocV1a3VgnRVfTVf-hNWvNc41VK68ooIyr15kD5nErJODyYYa27a8_V9tyhvQq__Nv-A_qnLvELSPWXYA</recordid><startdate>20220607</startdate><enddate>20220607</enddate><creator>Mirel, Lisa B</creator><creator>Resnick, Dean M</creator><creator>Aram, Jonathan</creator><creator>Cox, Christine S</creator><general>IOS Press BV</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20220607</creationdate><title>A methodological assessment of privacy preserving record linkage using survey and administrative data</title><author>Mirel, Lisa B ; Resnick, Dean M ; Aram, Jonathan ; Cox, Christine S</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c3541-da38acff5cb148882f287a9c73bebf6ea257312af2b7628e5c191cd46105bd913</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Data analysis</topic><topic>Data 
sources</topic><topic>Privacy</topic><topic>Recall</topic><topic>Statistical analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mirel, Lisa B</creatorcontrib><creatorcontrib>Resnick, Dean M</creatorcontrib><creatorcontrib>Aram, Jonathan</creatorcontrib><creatorcontrib>Cox, Christine S</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Statistical journal of the IAOS</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mirel, Lisa B</au><au>Resnick, Dean M</au><au>Aram, Jonathan</au><au>Cox, Christine S</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A methodological assessment of privacy preserving record linkage using survey and administrative data</atitle><jtitle>Statistical journal of the IAOS</jtitle><addtitle>Stat J IAOS</addtitle><date>2022-06-07</date><risdate>2022</risdate><volume>38</volume><issue>2</issue><spage>413</spage><epage>421</epage><pages>413-421</pages><issn>1874-7655</issn><eissn>1875-9254</eissn><abstract>The National Center for Health Statistics (NCHS) links data from surveys to administrative data sources, but privacy concerns make accessing new data sources difficult. Privacy-preserving record linkage (PPRL) is an alternative to traditional linkage approaches that may overcome this barrier. 
However, prior to implementing PPRL techniques it is important to understand their effect on data quality. Results from PPRL were compared to results from an established linkage method, which uses unencrypted (plain text) identifiers and both deterministic and probabilistic techniques. The established method was used as the gold standard. Links performed with PPRL were evaluated for precision and recall. An initial assessment and a refined approach were implemented. The impact of PPRL on secondary data analysis, including match and mortality rates, was assessed. The match rates for all approaches were similar, 5.1% for the gold standard, 5.4% for the initial PPRL and 5.0% for the refined PPRL approach. Precision ranged from 93.8% to 98.9% and recall ranged from 98.7% to 97.8%, depending on the selection of tokens from PPRL. The impact of PPRL on secondary data analysis was minimal. The findings suggest PPRL works well to link patient records to the National Death Index (NDI) since both sources have a high level of non-missing personally identifiable information, especially among adults 65 and older who may also have a higher likelihood of linking to the NDI. The results from this study are encouraging for first steps for a statistical agency in the implementation of PPRL approaches, however, future research is still needed.</abstract><cop>Netherlands</cop><pub>IOS Press BV</pub><pmid>35910693</pmid><doi>10.3233/SJI-210891</doi><tpages>9</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1874-7655
ispartof Statistical journal of the IAOS, 2022-06, Vol.38 (2), p.413-421
issn 1874-7655
1875-9254
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9335262
source EconLit s plnými texty; EBSCOhost Business Source Ultimate
subjects Data analysis
Data sources
Privacy
Recall
Statistical analysis
title A methodological assessment of privacy preserving record linkage using survey and administrative data
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T22%3A52%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20methodological%20assessment%20of%20privacy%20preserving%20record%20linkage%20using%20survey%20and%20administrative%20data&rft.jtitle=Statistical%20journal%20of%20the%20IAOS&rft.au=Mirel,%20Lisa%20B&rft.date=2022-06-07&rft.volume=38&rft.issue=2&rft.spage=413&rft.epage=421&rft.pages=413-421&rft.issn=1874-7655&rft.eissn=1875-9254&rft_id=info:doi/10.3233/SJI-210891&rft_dat=%3Cproquest_pubme%3E2697096334%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c3541-da38acff5cb148882f287a9c73bebf6ea257312af2b7628e5c191cd46105bd913%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2677633683&rft_id=info:pmid/35910693&rfr_iscdi=true
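
The description above reports precision and recall for PPRL links evaluated against a gold-standard linkage. A minimal sketch of that calculation follows, assuming the standard definitions (precision = true links among PPRL links; recall = gold-standard links recovered by PPRL). The function name, the pair representation, and the toy identifiers are hypothetical and are not taken from the study.

```python
def precision_recall(candidate_links, gold_links):
    """Return (precision, recall) of candidate links against gold-standard links.

    Both arguments are iterables of (survey_id, admin_id) pairs.
    This pair representation is an assumption made for illustration.
    """
    candidate = set(candidate_links)
    gold = set(gold_links)
    # A candidate link counts as correct only if the gold standard contains it.
    true_positives = len(candidate & gold)
    precision = true_positives / len(candidate) if candidate else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data, not results from the study.
gold = {("s1", "d1"), ("s2", "d2"), ("s3", "d3"), ("s4", "d4"), ("s5", "d5")}
pprl = {("s1", "d1"), ("s2", "d2"), ("s3", "d3"), ("s6", "d9")}

precision, recall = precision_recall(pprl, gold)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this toy case three of the four PPRL links appear in the gold standard, and three of the five gold-standard links are recovered. Treating the established plain-text linkage as the gold standard means any error in that linkage is carried into both measures.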