A methodological assessment of privacy preserving record linkage using survey and administrative data
The National Center for Health Statistics (NCHS) links data from surveys to administrative data sources, but privacy concerns make accessing new data sources difficult. Privacy-preserving record linkage (PPRL) is an alternative to traditional linkage approaches that may overcome this barrier. Howeve...
Published in: | Statistical journal of the IAOS 2022-06, Vol.38 (2), p.413-421 |
---|---|
Main Authors: | Mirel, Lisa B; Resnick, Dean M; Aram, Jonathan; Cox, Christine S |
Format: | Article |
Language: | English |
Subjects: | Data analysis; Data sources; Privacy; Recall; Statistical analysis |
cited_by | cdi_FETCH-LOGICAL-c3541-da38acff5cb148882f287a9c73bebf6ea257312af2b7628e5c191cd46105bd913 |
---|---|
cites | cdi_FETCH-LOGICAL-c3541-da38acff5cb148882f287a9c73bebf6ea257312af2b7628e5c191cd46105bd913 |
container_end_page | 421 |
container_issue | 2 |
container_start_page | 413 |
container_title | Statistical journal of the IAOS |
container_volume | 38 |
creator | Mirel, Lisa B; Resnick, Dean M; Aram, Jonathan; Cox, Christine S |
description | The National Center for Health Statistics (NCHS) links data from surveys to administrative data sources, but privacy concerns make accessing new data sources difficult. Privacy-preserving record linkage (PPRL) is an alternative to traditional linkage approaches that may overcome this barrier. However, prior to implementing PPRL techniques it is important to understand their effect on data quality.
Results from PPRL were compared to results from an established linkage method, which uses unencrypted (plain text) identifiers and both deterministic and probabilistic techniques. The established method was used as the gold standard. Links performed with PPRL were evaluated for precision and recall. An initial assessment and a refined approach were implemented. The impact of PPRL on secondary data analysis, including match and mortality rates, was assessed.
The match rates for all approaches were similar, 5.1% for the gold standard, 5.4% for the initial PPRL and 5.0% for the refined PPRL approach. Precision ranged from 93.8% to 98.9% and recall ranged from 98.7% to 97.8%, depending on the selection of tokens from PPRL. The impact of PPRL on secondary data analysis was minimal.
The findings suggest PPRL works well to link patient records to the National Death Index (NDI) since both sources have a high level of non-missing personally identifiable information, especially among adults 65 and older who may also have a higher likelihood of linking to the NDI.
The results from this study are encouraging for first steps for a statistical agency in the implementation of PPRL approaches; however, future research is still needed. |
doi_str_mv | 10.3233/SJI-210891 |
format | article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9335262</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2697096334</sourcerecordid><originalsourceid>FETCH-LOGICAL-c3541-da38acff5cb148882f287a9c73bebf6ea257312af2b7628e5c191cd46105bd913</originalsourceid><addsrcrecordid>eNpdkV1rFjEQhYMo9kNv_AES8EaE1U2y-boRStFaKXihXofZZPZt6u6mJrsL77839a1FvZph5uFwDoeQF6x9K7gQ775-vmw4a41lj8gxM1o2lsvu8e-9a7SS8oiclHLTttLqrntKjoS0rFVWHBM8oxMu1ymkMe2ih5FCKVjKhPNC00Bvc9zA7-vEgnmL845m9CkHOsb5B-yQruXuWNa84Z7CHCiEKc6xLBmWuCENsMAz8mSAseDz-3lKvn_88O38U3P15eLy_Oyq8UJ2rAkgDPhhkL5nnTGGD9xosF6LHvtBIXCpBeMw8F4rblB6ZpkPnWKt7INl4pS8P-jerv2EwdcQGUZXQ0yQ9y5BdP9-5njtdmlzVgjJFa8Cr-8Fcvq5YlncFIvHcYQZ01ocV1a3VgnRVfTVf-hNWvNc41VK68ooIyr15kD5nErJODyYYa27a8_V9tyhvQq__Nv-A_qnLvELSPWXYA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2677633683</pqid></control><display><type>article</type><title>A methodological assessment of privacy preserving record linkage using survey and administrative data</title><source>EconLit s plnými texty</source><source>EBSCOhost Business Source Ultimate</source><creator>Mirel, Lisa B ; Resnick, Dean M ; Aram, Jonathan ; Cox, Christine S</creator><creatorcontrib>Mirel, Lisa B ; Resnick, Dean M ; Aram, Jonathan ; Cox, Christine S</creatorcontrib><description>The National Center for Health Statistics (NCHS) links data from surveys to administrative data sources, but privacy concerns make accessing new data sources difficult. Privacy-preserving record linkage (PPRL) is an alternative to traditional linkage approaches that may overcome this barrier. However, prior to implementing PPRL techniques it is important to understand their effect on data quality.
Results from PPRL were compared to results from an established linkage method, which uses unencrypted (plain text) identifiers and both deterministic and probabilistic techniques. The established method was used as the gold standard. Links performed with PPRL were evaluated for precision and recall. An initial assessment and a refined approach were implemented. The impact of PPRL on secondary data analysis, including match and mortality rates, was assessed.
The match rates for all approaches were similar, 5.1% for the gold standard, 5.4% for the initial PPRL and 5.0% for the refined PPRL approach. Precision ranged from 93.8% to 98.9% and recall ranged from 98.7% to 97.8%, depending on the selection of tokens from PPRL. The impact of PPRL on secondary data analysis was minimal.
The findings suggest PPRL works well to link patient records to the National Death Index (NDI) since both sources have a high level of non-missing personally identifiable information, especially among adults 65 and older who may also have a higher likelihood of linking to the NDI.
The results from this study are encouraging for first steps for a statistical agency in the implementation of PPRL approaches, however, future research is still needed.</description><identifier>ISSN: 1874-7655</identifier><identifier>EISSN: 1875-9254</identifier><identifier>DOI: 10.3233/SJI-210891</identifier><identifier>PMID: 35910693</identifier><language>eng</language><publisher>Netherlands: IOS Press BV</publisher><subject>Data analysis ; Data sources ; Privacy ; Recall ; Statistical analysis</subject><ispartof>Statistical journal of the IAOS, 2022-06, Vol.38 (2), p.413-421</ispartof><rights>Copyright IOS Press BV 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c3541-da38acff5cb148882f287a9c73bebf6ea257312af2b7628e5c191cd46105bd913</citedby><cites>FETCH-LOGICAL-c3541-da38acff5cb148882f287a9c73bebf6ea257312af2b7628e5c191cd46105bd913</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,780,784,885,27924,27925</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/35910693$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Mirel, Lisa B</creatorcontrib><creatorcontrib>Resnick, Dean M</creatorcontrib><creatorcontrib>Aram, Jonathan</creatorcontrib><creatorcontrib>Cox, Christine S</creatorcontrib><title>A methodological assessment of privacy preserving record linkage using survey and administrative data</title><title>Statistical journal of the IAOS</title><addtitle>Stat J IAOS</addtitle><description>The National Center for Health Statistics (NCHS) links data from surveys to administrative data sources, but privacy concerns make accessing new data sources difficult. 
Privacy-preserving record linkage (PPRL) is an alternative to traditional linkage approaches that may overcome this barrier. However, prior to implementing PPRL techniques it is important to understand their effect on data quality.
Results from PPRL were compared to results from an established linkage method, which uses unencrypted (plain text) identifiers and both deterministic and probabilistic techniques. The established method was used as the gold standard. Links performed with PPRL were evaluated for precision and recall. An initial assessment and a refined approach were implemented. The impact of PPRL on secondary data analysis, including match and mortality rates, was assessed.
The match rates for all approaches were similar, 5.1% for the gold standard, 5.4% for the initial PPRL and 5.0% for the refined PPRL approach. Precision ranged from 93.8% to 98.9% and recall ranged from 98.7% to 97.8%, depending on the selection of tokens from PPRL. The impact of PPRL on secondary data analysis was minimal.
The findings suggest PPRL works well to link patient records to the National Death Index (NDI) since both sources have a high level of non-missing personally identifiable information, especially among adults 65 and older who may also have a higher likelihood of linking to the NDI.
The results from this study are encouraging for first steps for a statistical agency in the implementation of PPRL approaches, however, future research is still needed.</description><subject>Data analysis</subject><subject>Data sources</subject><subject>Privacy</subject><subject>Recall</subject><subject>Statistical analysis</subject><issn>1874-7655</issn><issn>1875-9254</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNpdkV1rFjEQhYMo9kNv_AES8EaE1U2y-boRStFaKXihXofZZPZt6u6mJrsL77839a1FvZph5uFwDoeQF6x9K7gQ775-vmw4a41lj8gxM1o2lsvu8e-9a7SS8oiclHLTttLqrntKjoS0rFVWHBM8oxMu1ymkMe2ih5FCKVjKhPNC00Bvc9zA7-vEgnmL845m9CkHOsb5B-yQruXuWNa84Z7CHCiEKc6xLBmWuCENsMAz8mSAseDz-3lKvn_88O38U3P15eLy_Oyq8UJ2rAkgDPhhkL5nnTGGD9xosF6LHvtBIXCpBeMw8F4rblB6ZpkPnWKt7INl4pS8P-jerv2EwdcQGUZXQ0yQ9y5BdP9-5njtdmlzVgjJFa8Cr-8Fcvq5YlncFIvHcYQZ01ocV1a3VgnRVfTVf-hNWvNc41VK68ooIyr15kD5nErJODyYYa27a8_V9tyhvQq__Nv-A_qnLvELSPWXYA</recordid><startdate>20220607</startdate><enddate>20220607</enddate><creator>Mirel, Lisa B</creator><creator>Resnick, Dean M</creator><creator>Aram, Jonathan</creator><creator>Cox, Christine S</creator><general>IOS Press BV</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20220607</creationdate><title>A methodological assessment of privacy preserving record linkage using survey and administrative data</title><author>Mirel, Lisa B ; Resnick, Dean M ; Aram, Jonathan ; Cox, Christine S</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c3541-da38acff5cb148882f287a9c73bebf6ea257312af2b7628e5c191cd46105bd913</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Data analysis</topic><topic>Data 
sources</topic><topic>Privacy</topic><topic>Recall</topic><topic>Statistical analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mirel, Lisa B</creatorcontrib><creatorcontrib>Resnick, Dean M</creatorcontrib><creatorcontrib>Aram, Jonathan</creatorcontrib><creatorcontrib>Cox, Christine S</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Statistical journal of the IAOS</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mirel, Lisa B</au><au>Resnick, Dean M</au><au>Aram, Jonathan</au><au>Cox, Christine S</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A methodological assessment of privacy preserving record linkage using survey and administrative data</atitle><jtitle>Statistical journal of the IAOS</jtitle><addtitle>Stat J IAOS</addtitle><date>2022-06-07</date><risdate>2022</risdate><volume>38</volume><issue>2</issue><spage>413</spage><epage>421</epage><pages>413-421</pages><issn>1874-7655</issn><eissn>1875-9254</eissn><abstract>The National Center for Health Statistics (NCHS) links data from surveys to administrative data sources, but privacy concerns make accessing new data sources difficult. Privacy-preserving record linkage (PPRL) is an alternative to traditional linkage approaches that may overcome this barrier. 
However, prior to implementing PPRL techniques it is important to understand their effect on data quality.
Results from PPRL were compared to results from an established linkage method, which uses unencrypted (plain text) identifiers and both deterministic and probabilistic techniques. The established method was used as the gold standard. Links performed with PPRL were evaluated for precision and recall. An initial assessment and a refined approach were implemented. The impact of PPRL on secondary data analysis, including match and mortality rates, was assessed.
The match rates for all approaches were similar, 5.1% for the gold standard, 5.4% for the initial PPRL and 5.0% for the refined PPRL approach. Precision ranged from 93.8% to 98.9% and recall ranged from 98.7% to 97.8%, depending on the selection of tokens from PPRL. The impact of PPRL on secondary data analysis was minimal.
The findings suggest PPRL works well to link patient records to the National Death Index (NDI) since both sources have a high level of non-missing personally identifiable information, especially among adults 65 and older who may also have a higher likelihood of linking to the NDI.
The results from this study are encouraging for first steps for a statistical agency in the implementation of PPRL approaches, however, future research is still needed.</abstract><cop>Netherlands</cop><pub>IOS Press BV</pub><pmid>35910693</pmid><doi>10.3233/SJI-210891</doi><tpages>9</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1874-7655 |
ispartof | Statistical journal of the IAOS, 2022-06, Vol.38 (2), p.413-421 |
issn | 1874-7655; 1875-9254 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9335262 |
source | EconLit with Full Text; EBSCOhost Business Source Ultimate |
subjects | Data analysis; Data sources; Privacy; Recall; Statistical analysis |
title | A methodological assessment of privacy preserving record linkage using survey and administrative data |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T22%3A52%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20methodological%20assessment%20of%20privacy%20preserving%20record%20linkage%20using%20survey%20and%20administrative%20data&rft.jtitle=Statistical%20journal%20of%20the%20IAOS&rft.au=Mirel,%20Lisa%20B&rft.date=2022-06-07&rft.volume=38&rft.issue=2&rft.spage=413&rft.epage=421&rft.pages=413-421&rft.issn=1874-7655&rft.eissn=1875-9254&rft_id=info:doi/10.3233/SJI-210891&rft_dat=%3Cproquest_pubme%3E2697096334%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c3541-da38acff5cb148882f287a9c73bebf6ea257312af2b7628e5c191cd46105bd913%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2677633683&rft_id=info:pmid/35910693&rfr_iscdi=true |
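
The abstract in this record says that links produced by PPRL were evaluated for precision and recall against an established linkage method treated as the gold standard. A minimal sketch of that kind of evaluation follows. It is not the authors' implementation: the function name `precision_recall` and the record-pair identifiers are hypothetical, and it assumes each link can be represented as a (survey record, administrative record) pair.

```python
def precision_recall(candidate_links, gold_links):
    """Compare a set of candidate links against gold-standard links.

    Each link is a hashable pair (survey_id, admin_id). These identifiers
    are illustrative assumptions, not taken from the study.
    """
    candidate = set(candidate_links)
    gold = set(gold_links)
    # Links found by the candidate method that the gold standard also contains
    true_positives = len(candidate & gold)
    # Precision: share of candidate links that are correct
    precision = true_positives / len(candidate) if candidate else 0.0
    # Recall: share of gold-standard links that the candidate method found
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example data: four gold-standard links, three candidate links
gold = {("s1", "d1"), ("s2", "d2"), ("s3", "d3"), ("s4", "d4")}
pprl = {("s1", "d1"), ("s2", "d2"), ("s3", "d9")}

precision, recall = precision_recall(pprl, gold)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Treating links as sets of pairs keeps the comparison independent of record order. Under this framing, adding candidate links can raise recall while lowering precision, which is the kind of trade-off the reported ranges depend on when different tokens are selected.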