Data Scientists in Software Teams: State of the Art and Challenges
The demand for analyzing large-scale telemetry, machine, and quality data is rapidly increasing in the software industry. Data scientists are becoming popular within software teams; e.g., Facebook, LinkedIn, and Microsoft are creating a new career path for data scientists. In this paper, we present a lar...
Published in: | IEEE transactions on software engineering 2018-11, Vol.44 (11), p.1024-1038 |
---|---|
Main Authors: | Kim, Miryung; Zimmermann, Thomas; DeLine, Robert; Begel, Andrew |
Format: | Article |
Language: | English |
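Subjects: | Best practices; Data science; Demand analysis; development roles; industry; Interviews; Scientists; Sociology; Software; Software engineering; State of the art; Statistics; Telemetry |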
cited_by | cdi_FETCH-LOGICAL-c404t-aa1197028420ef0e4c18a6a526846940c413b336d0067f7326a78fec76a650893 |
---|---|
cites | cdi_FETCH-LOGICAL-c404t-aa1197028420ef0e4c18a6a526846940c413b336d0067f7326a78fec76a650893 |
container_end_page | 1038 |
container_issue | 11 |
container_start_page | 1024 |
container_title | IEEE transactions on software engineering |
container_volume | 44 |
creator | Kim, Miryung; Zimmermann, Thomas; DeLine, Robert; Begel, Andrew |
description | The demand for analyzing large-scale telemetry, machine, and quality data is rapidly increasing in the software industry. Data scientists are becoming popular within software teams; e.g., Facebook, LinkedIn, and Microsoft are creating a new career path for data scientists. In this paper, we present a large-scale survey with 793 professional data scientists at Microsoft to understand their educational background, problem topics that they work on, tool usage, and activities. We cluster these data scientists based on the time spent on various activities and identify 9 distinct clusters of data scientists and their corresponding characteristics. We also discuss the challenges that they face and the best practices they share with other data scientists. Our study finds several trends about data scientists in the software engineering context at Microsoft, and should inform managers on how to leverage data science capability effectively within their teams. |
doi_str_mv | 10.1109/TSE.2017.2754374 |
format | article |
fullrecord | Record type: article; source: IEEE Xplore (Online service); publisher: New York: IEEE; rights: Copyright IEEE Computer Society 2018; peer reviewed; publication date: 2018-11-01; total pages: 15; EISSN: 1939-3520; CODEN: IESEDJ; IEEE ID: 8046093; ProQuest ID: 2132075479; ORCID: https://orcid.org/0000-0003-4905-1469, https://orcid.org/0000-0003-3802-1512 |
fulltext | fulltext |
identifier | ISSN: 0098-5589 |
ispartof | IEEE transactions on software engineering, 2018-11, Vol.44 (11), p.1024-1038 |
issn | 0098-5589; 1939-3520 |
language | eng |
recordid | cdi_ieee_primary_8046093 |
source | IEEE Xplore (Online service) |
subjects | Best practices; Data science; Demand analysis; development roles; industry; Interviews; Scientists; Sociology; Software; Software engineering; State of the art; Statistics; Telemetry |
title | Data Scientists in Software Teams: State of the Art and Challenges |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T23%3A25%3A45IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Data%20Scientists%20in%20Software%20Teams:%20State%20of%20the%20Art%20and%20Challenges&rft.jtitle=IEEE%20transactions%20on%20software%20engineering&rft.au=Kim,%20Miryung&rft.date=2018-11-01&rft.volume=44&rft.issue=11&rft.spage=1024&rft.epage=1038&rft.pages=1024-1038&rft.issn=0098-5589&rft.eissn=1939-3520&rft.coden=IESEDJ&rft_id=info:doi/10.1109/TSE.2017.2754374&rft_dat=%3Cproquest_ieee_%3E2132075479%3C/proquest_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c404t-aa1197028420ef0e4c18a6a526846940c413b336d0067f7326a78fec76a650893%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2132075479&rft_id=info:pmid/&rft_ieee_id=8046093&rfr_iscdi=true |