Bootstrapping Privacy Compliance in Big Data Systems

Bibliographic Details
Main Authors: Sen, Shayak, Guha, Saikat, Datta, Anupam, Rajamani, Sriram K., Tsai, Janice, Wing, Jeannette M.
Format: Conference Proceeding
Language: English
cited_by
cites
container_end_page 342
container_issue
container_start_page 327
container_title
container_volume
creator Sen, Shayak
Guha, Saikat
Datta, Anupam
Rajamani, Sriram K.
Tsai, Janice
Wing, Jeannette M.
description With the rapid increase in cloud services collecting and using user data to offer personalized experiences, ensuring that these services comply with their privacy policies has become a business imperative for building user trust. However, most compliance efforts in industry today rely on manual review processes and audits designed to safeguard user data, and therefore are resource intensive and lack coverage. In this paper, we present our experience building and operating a system to automate privacy policy compliance checking in Bing. Central to the design of the system are (a) Legalease, a language that allows specification of privacy policies that impose restrictions on how user data is handled, and (b) Grok, a data inventory for Map-Reduce-like big data systems that tracks how user data flows among programs. Grok maps code-level schema elements to data types in Legalease, in essence annotating existing programs with information flow types with minimal human input. Compliance checking is thus reduced to information flow analysis of Big Data systems. The system, bootstrapped by a small team, checks compliance daily of millions of lines of ever-changing source code written by several thousand developers.
doi_str_mv 10.1109/SP.2014.28
format conference_proceeding
eisbn 1479946869
9781479946860
coden IEEPAD
pub IEEE
date 2014-11-13
tpages 16
oa free_for_read
fulltext fulltext
identifier ISSN: 1081-6011
ispartof 2014 IEEE Symposium on Security and Privacy, 2014, p.327-342
issn 1081-6011
2375-1207
language eng
recordid cdi_ieee_primary_6956573
source IEEE Xplore All Conference Series
subjects Advertising
Big data
bing
compliance
Data privacy
information flow
IP networks
Lattices
policy
Privacy
program analysis
Semantics
title Bootstrapping Privacy Compliance in Big Data Systems
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-17T23%3A28%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Bootstrapping%20Privacy%20Compliance%20in%20Big%20Data%20Systems&rft.btitle=2014%20IEEE%20Symposium%20on%20Security%20and%20Privacy&rft.au=Sen,%20Shayak&rft.date=2014-11-13&rft.spage=327&rft.epage=342&rft.pages=327-342&rft.issn=1081-6011&rft.eissn=2375-1207&rft.coden=IEEPAD&rft_id=info:doi/10.1109/SP.2014.28&rft.eisbn=1479946869&rft.eisbn_list=9781479946860&rft_dat=%3Cieee%3E6956573%3C/ieee%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i211t-9728f1565d65a87049cb3218ab8a869f46d0c5be113982c3441b3a7decb4f5643%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6956573&rfr_iscdi=true
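
The description field above says that compliance checking is reduced to information flow analysis over a data inventory. The following is a minimal sketch of that general idea only, not the paper's implementation: the column names, flow edges, purpose labels, and the deny-pair rule format are all hypothetical, and the rule format is not Legalease syntax. Data-type labels are pushed along data-flow edges until nothing changes, and every labelled column is then checked against the deny rules.

```python
from collections import deque

# Hypothetical inventory: source column -> policy data types (assumed labels).
COLUMN_TYPES = {
    "query_log.client_ip": {"IPAddress"},
    "query_log.query_text": {"SearchQuery"},
}

# Hypothetical column-to-column data flows, as a job graph might expose them.
FLOWS = {
    "query_log.client_ip": ["sessions.ip"],
    "query_log.query_text": ["sessions.query"],
    "sessions.ip": ["ads_features.geo_key"],
    "sessions.query": ["ads_features.topic"],
}

# Hypothetical purpose annotation for consuming datasets.
DATASET_PURPOSE = {"ads_features": "Advertising"}

# Simplified deny rules (an assumption): (data type, purpose) pairs
# that must not occur together.
DENY = {("IPAddress", "Advertising")}


def propagate(column_types, flows):
    """Push data-type labels along flow edges until a fixed point is reached."""
    labels = {col: set(types) for col, types in column_types.items()}
    work = deque(labels)
    while work:
        src = work.popleft()
        for dst in flows.get(src, []):
            current = labels.setdefault(dst, set())
            missing = labels[src] - current
            if missing:
                current |= missing  # in-place update of labels[dst]
                work.append(dst)    # revisit dst so its successors update too
    return labels


def violations(labels, purposes, deny):
    """List (column, data type, purpose) triples that match a deny rule."""
    found = []
    for col in sorted(labels):
        dataset = col.split(".", 1)[0]
        purpose = purposes.get(dataset)
        for dtype in sorted(labels[col]):
            if (dtype, purpose) in deny:
                found.append((col, dtype, purpose))
    return found


labels = propagate(COLUMN_TYPES, FLOWS)
report = violations(labels, DATASET_PURPOSE, DENY)
```

In this toy graph the IP address label reaches `ads_features.geo_key` through `sessions.ip`, so that column is the only one reported; the search-query label also reaches the advertising dataset but matches no deny rule.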