Methodological Issues in Predicting Pediatric Epilepsy Surgery Candidates Through Natural Language Processing and Machine Learning

Objective: We describe the development and evaluation of a system that uses machine learning and natural language processing techniques to identify potential candidates for surgical intervention for drug-resistant pediatric epilepsy. The data are comprised of free-text clinical notes extracted from...

Bibliographic Details
Published in:	Biomedical informatics insights 2016-01, Vol.2016 (8), p.11-18
Main Authors:	Cohen, Kevin Bretonnel; Glass, Benjamin; Greiner, Hansel M; Holland-Bouley, Katherine; Standridge, Shannon; Arya, Ravindra; Faist, Robert; Morita, Diego; Mangano, Francesco; Connolly, Brian; Glauser, Tracy; Pestian, John
Format:	Article
Language:	English
Subjects:	Classification; Electronic health records; Epilepsy; Hypotheses; Machine learning; Natural language processing; Original Research; Surgery

container_end_page	18
container_issue	8
container_start_page	11
container_title	Biomedical informatics insights
container_volume	2016
creator	Cohen, Kevin Bretonnel; Glass, Benjamin; Greiner, Hansel M; Holland-Bouley, Katherine; Standridge, Shannon; Arya, Ravindra; Faist, Robert; Morita, Diego; Mangano, Francesco; Connolly, Brian; Glauser, Tracy; Pestian, John
description	Objective: We describe the development and evaluation of a system that uses machine learning and natural language processing techniques to identify potential candidates for surgical intervention for drug-resistant pediatric epilepsy. The data are comprised of free-text clinical notes extracted from the electronic health record (EHR). Both known clinical outcomes from the EHR and manual chart annotations provide gold standards for the patient's status. The following hypotheses are then tested: 1) machine learning methods can identify epilepsy surgery candidates as well as physicians do and 2) machine learning methods can identify candidates earlier than physicians do. These hypotheses are tested by systematically evaluating the effects of the data source, amount of training data, class balance, classification algorithm, and feature set on classifier performance. The results support both hypotheses, with F-measures ranging from 0.71 to 0.82. The feature set, classification algorithm, amount of training data, class balance, and gold standard all significantly affected classification performance. It was further observed that classification performance was better than the highest agreement between two annotators, even at one year before documented surgery referral. The results demonstrate that such machine learning methods can contribute to predicting pediatric epilepsy surgery candidates and reducing lag time to surgery referral.
doi_str_mv	10.4137/BII.S38308
format	article
pmid	27257386
publisher	London, England: SAGE Publishing
rights	2016 SAGE Publications. Copyright Libertas Academica Ltd 2016. 2016 the author(s), publisher and licensee Libertas Academica Ltd.
peer_reviewed	true
open_access	free_for_read
identifier	ISSN: 1178-2226
ispartof	Biomedical informatics insights, 2016-01, Vol.2016 (8), p.11-18
issn	1178-2226 1178-2226
language	eng
recordid	cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4876984
source	Publicly Available Content (ProQuest); PubMed Central
subjects	Classification; Electronic health records; Epilepsy; Hypotheses; Machine learning; Natural language processing; Original Research; Surgery
title	Methodological Issues in Predicting Pediatric Epilepsy Surgery Candidates Through Natural Language Processing and Machine Learning