A Review of Best Practice Recommendations for Text Analysis in R (and a User-Friendly App)
Published in: | Journal of business and psychology 2018-08, Vol.33 (4), p.445-459 |
---|---|
Main Authors: | Banks, George C.; Woznyj, Haley M.; Wesslen, Ryan S.; Ross, Roxanne L. |
Format: | Article |
Language: | English |
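Subjects: | Behavioral Science and Psychology; Business and Management; Community and Environmental Psychology; Computer programming; Data mining; Industrial and Organizational Psychology; Information technology; Organization theory; ORIGINAL PAPER; Personality and Social Psychology; Psychology; Social Sciences; Text analysis |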
cited_by | cdi_FETCH-LOGICAL-c418t-d544b875c4dc108fce9250f192b8d6d9944001d91791048540e0dcd38c8c59183 |
---|---|
cites | cdi_FETCH-LOGICAL-c418t-d544b875c4dc108fce9250f192b8d6d9944001d91791048540e0dcd38c8c59183 |
container_end_page | 459 |
container_issue | 4 |
container_start_page | 445 |
container_title | Journal of business and psychology |
container_volume | 33 |
creator | Banks, George C.; Woznyj, Haley M.; Wesslen, Ryan S.; Ross, Roxanne L. |
description | In recent decades, the amount of text available for organizational science research has grown tremendously. Despite the availability of text and advances in text analysis methods, many of these techniques remain largely segmented by discipline. Moreover, there is an increasing number of open-source tools (R, Python) for text analysis, yet these tools are not easily taken advantage of by social science researchers who likely have limited programming knowledge and exposure to computational methods. In this article, we compare quantitative and qualitative text analysis methods used across social sciences. We describe basic terminology and the overlooked, but critically important, steps in pre-processing raw text (e.g., selection of stop words; stemming). Next, we provide an exploratory analysis of open-ended responses from a prototypical survey dataset using topic modeling with R. We provide a list of best practice recommendations for text analysis focused on (1) hypothesis and question formation, (2) design and data collection, (3) data pre-processing, and (4) topic modeling. We also discuss the creation of scale scores for more traditional correlation and regression analyses. All the data are available in an online repository for the interested reader to practice with, along with a reference list for additional reading, an R markdown file, and an open source interactive topic model tool (topicApp; see https://github.com/wesslen/topicApp, https://github.com/wesslen/text-analysis-org-science, https://dataverse.unc.edu/dataset.xhtml?persistentId=doi:10.15139/S3/R4W7ZS). |
doi_str_mv | 10.1007/s10869-017-9528-3 |
format | article |
publisher | New York: Springer Science + Business Media |
rights | Springer Science+Business Media, LLC, part of Springer Nature 2018 |
toplevel | peer_reviewed; online_resources |
stitle | J Bus Psychol |
date | 2018-08-01 |
tpages | 15 |
jstor_id | 48700765 |
fulltext | fulltext |
identifier | ISSN: 0889-3268 |
ispartof | Journal of business and psychology, 2018-08, Vol.33 (4), p.445-459 |
issn | 0889-3268; 1573-353X |
language | eng |
recordid | cdi_proquest_journals_1992788273 |
source | EBSCOhost Business Source Ultimate; ABI/INFORM global; JSTOR Archival Journals and Primary Sources Collection; Springer Nature |
subjects | Behavioral Science and Psychology; Business and Management; Community and Environmental Psychology; Computer programming; Data mining; Industrial and Organizational Psychology; Information technology; Organization theory; ORIGINAL PAPER; Personality and Social Psychology; Psychology; Social Sciences; Text analysis |
title | A Review of Best Practice Recommendations for Text Analysis in R (and a User-Friendly App) |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T02%3A17%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Review%20of%20Best%20Practice%20Recommendations%20for%20Text%20Analysis%20in%20R%20(and%20a%20User-Friendly%20App)&rft.jtitle=Journal%20of%20business%20and%20psychology&rft.au=Banks,%20George%20C.&rft.date=2018-08-01&rft.volume=33&rft.issue=4&rft.spage=445&rft.epage=459&rft.pages=445-459&rft.issn=0889-3268&rft.eissn=1573-353X&rft_id=info:doi/10.1007/s10869-017-9528-3&rft_dat=%3Cjstor_proqu%3E48700765%3C/jstor_proqu%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c418t-d544b875c4dc108fce9250f192b8d6d9944001d91791048540e0dcd38c8c59183%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1992788273&rft_id=info:pmid/&rft_jstor_id=48700765&rfr_iscdi=true |