Methodological Issues in Predicting Pediatric Epilepsy Surgery Candidates Through Natural Language Processing and Machine Learning
Objective: We describe the development and evaluation of a system that uses machine learning and natural language processing techniques to identify potential candidates for surgical intervention for drug-resistant pediatric epilepsy. The data consist of free-text clinical notes extracted from...
Published in: | Biomedical informatics insights 2016-01, Vol.2016 (8), p.11-18 |
---|---|
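Main Authors: | Cohen, Kevin Bretonnel; Glass, Benjamin; Greiner, Hansel M; Holland-Bouley, Katherine; Standridge, Shannon; Arya, Ravindra; Faist, Robert; Morita, Diego; Mangano, Francesco; Connolly, Brian; Glauser, Tracy; Pestian, John |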
Format: | Article |
Language: | English |
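Subjects: | Classification; Electronic health records; Epilepsy; Hypotheses; Machine learning; Natural language processing; Original Research; Surgery |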
cited_by | cdi_FETCH-LOGICAL-c456t-85bd4352b27c51dc1bdfc9cd67826453be960daa0144a7ed07e8f98e83bea76f3 |
---|---|
cites | cdi_FETCH-LOGICAL-c456t-85bd4352b27c51dc1bdfc9cd67826453be960daa0144a7ed07e8f98e83bea76f3 |
container_end_page | 18 |
container_issue | 8 |
container_start_page | 11 |
container_title | Biomedical informatics insights |
container_volume | 2016 |
creator | Cohen, Kevin Bretonnel; Glass, Benjamin; Greiner, Hansel M; Holland-Bouley, Katherine; Standridge, Shannon; Arya, Ravindra; Faist, Robert; Morita, Diego; Mangano, Francesco; Connolly, Brian; Glauser, Tracy; Pestian, John |
description | Objective: We describe the development and evaluation of a system that uses machine learning and natural language processing techniques to identify potential candidates for surgical intervention for drug-resistant pediatric epilepsy. The data consist of free-text clinical notes extracted from the electronic health record (EHR). Both known clinical outcomes from the EHR and manual chart annotations provide gold standards for the patient's status. The following hypotheses are then tested: 1) machine learning methods can identify epilepsy surgery candidates as well as physicians do and 2) machine learning methods can identify candidates earlier than physicians do. These hypotheses are tested by systematically evaluating the effects of the data source, amount of training data, class balance, classification algorithm, and feature set on classifier performance. The results support both hypotheses, with F-measures ranging from 0.71 to 0.82. The feature set, classification algorithm, amount of training data, class balance, and gold standard all significantly affected classification performance. It was further observed that classification performance was better than the highest agreement between two annotators, even at one year before documented surgery referral. The results demonstrate that such machine learning methods can contribute to predicting pediatric epilepsy surgery candidates and reducing lag time to surgery referral. |
doi_str_mv | 10.4137/BII.S38308 |
format | article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4876984</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sage_id>10.4137_BII.S38308</sage_id><sourcerecordid>1793902191</sourcerecordid><originalsourceid>FETCH-LOGICAL-c456t-85bd4352b27c51dc1bdfc9cd67826453be960daa0144a7ed07e8f98e83bea76f3</originalsourceid><addsrcrecordid>eNqFkt9rFDEQxxdRbKl98Q-QBR8U4Wp-7CbZF0GPqgdXLbQ-h9lkdjdld3Mmu8K9-peb485yLYJ5yZD55JvJdybLXlJyUVAu339arS5uuOJEPclOKZVqwRgTT4_ik-w8xjuyW5KUXD7PTphkpeRKnGa_r3DqvPW9b52BPl_FOGPM3ZhfB7TOTG5s8-sUwRScyS83rsdN3OY3c2gxbPMljNZZmNKd2y74ue3ybzDNIUmtYWxnaDEpeYMx7pQSnV-B6dyI-RohjOnwRfasgT7i-WE_y358vrxdfl2sv39ZLT-uF6YoxbRQZW0LXrKaSVNSa2htG1MZK6Rioih5jZUgFoDQogCJlkhUTaVQpQxI0fCz7MNedzPXA1qD45TK1JvgBghb7cHph5nRdbr1v3ShpKhUkQTeHgSC_5lcmvTgosG-hxH9HDVVrEylUiH_j8qKV4TRiib09SP0zs9hTE7sKMqEKhRL1Ls9ZYKPMWBzXzclejcIOg2C3g9Cgl8d__Qe_dv2BLzZAzH15-i9f0kdTOtdjWGCCAYsDg4e-PYoaUAbP-hSFIT_AYNA1A0</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1791268482</pqid></control><display><type>article</type><title>Methodological Issues in Predicting Pediatric Epilepsy Surgery Candidates Through Natural Language Processing and Machine Learning</title><source>Publicly Available Content (ProQuest)</source><source>PubMed Central</source><creator>Cohen, Kevin Bretonnel ; Glass, Benjamin ; Greiner, Hansel M ; Holland-Bouley, Katherine ; Standridge, Shannon ; Arya, Ravindra ; Faist, Robert ; Morita, Diego ; Mangano, Francesco ; Connolly, Brian ; Glauser, Tracy ; Pestian, John</creator><creatorcontrib>Cohen, Kevin Bretonnel ; Glass, Benjamin ; Greiner, Hansel M ; Holland-Bouley, Katherine ; Standridge, Shannon ; Arya, Ravindra ; Faist, Robert ; Morita, Diego ; Mangano, Francesco ; Connolly, Brian ; Glauser, Tracy ; Pestian, John</creatorcontrib><description>Objective: We describe the development and evaluation of a system 
that uses machine learning and natural language processing techniques to identify potential candidates for surgical intervention for drug-resistant pediatric epilepsy. The data are comprised of free-text clinical notes extracted from the electronic health record (EHR). Both known clinical outcomes from the EHR and manual chart annotations provide gold standards for the patient's status. The following hypotheses are then tested: 1) machine learning methods can identify epilepsy surgery candidates as well as physicians do and 2) machine learning methods can identify candidates earlier than physicians do. These hypotheses are tested by systematically evaluating the effects of the data source, amount of training data, class balance, classification algorithm, and feature set on classifier performance. The results support both hypotheses, with F-measures ranging from 0.71 to 0.82. The feature set, classification algorithm, amount of training data, class balance, and gold standard all significantly affected classification performance. It was further observed that classification performance was better than the highest agreement between two annotators, even at one year before documented surgery referral. 
The results demonstrate that such machine learning methods can contribute to predicting pediatric epilepsy surgery candidates and reducing lag time to surgery referral.</description><identifier>ISSN: 1178-2226</identifier><identifier>EISSN: 1178-2226</identifier><identifier>DOI: 10.4137/BII.S38308</identifier><identifier>PMID: 27257386</identifier><language>eng</language><publisher>London, England: SAGE Publishing</publisher><subject>Classification ; Electronic health records ; Epilepsy ; Hypotheses ; Machine learning ; Natural language processing ; Original Research ; Surgery</subject><ispartof>Biomedical informatics insights, 2016-01, Vol.2016 (8), p.11-18</ispartof><rights>2016 SAGE Publications.</rights><rights>Copyright Libertas Academica Ltd 2016</rights><rights>2016 the author(s), publisher and licensee Libertas Academica Ltd. 2016</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c456t-85bd4352b27c51dc1bdfc9cd67826453be960daa0144a7ed07e8f98e83bea76f3</citedby><cites>FETCH-LOGICAL-c456t-85bd4352b27c51dc1bdfc9cd67826453be960daa0144a7ed07e8f98e83bea76f3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC4876984/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/1791268482?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,25733,27903,27904,36991,36992,44569,53769,53771</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/27257386$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Cohen, Kevin Bretonnel</creatorcontrib><creatorcontrib>Glass, Benjamin</creatorcontrib><creatorcontrib>Greiner, Hansel 
M</creatorcontrib><creatorcontrib>Holland-Bouley, Katherine</creatorcontrib><creatorcontrib>Standridge, Shannon</creatorcontrib><creatorcontrib>Arya, Ravindra</creatorcontrib><creatorcontrib>Faist, Robert</creatorcontrib><creatorcontrib>Morita, Diego</creatorcontrib><creatorcontrib>Mangano, Francesco</creatorcontrib><creatorcontrib>Connolly, Brian</creatorcontrib><creatorcontrib>Glauser, Tracy</creatorcontrib><creatorcontrib>Pestian, John</creatorcontrib><title>Methodological Issues in Predicting Pediatric Epilepsy Surgery Candidates Through Natural Language Processing and Machine Learning</title><title>Biomedical informatics insights</title><addtitle>Biomed Inform Insights</addtitle><description>Objective: We describe the development and evaluation of a system that uses machine learning and natural language processing techniques to identify potential candidates for surgical intervention for drug-resistant pediatric epilepsy. The data are comprised of free-text clinical notes extracted from the electronic health record (EHR). Both known clinical outcomes from the EHR and manual chart annotations provide gold standards for the patient's status. The following hypotheses are then tested: 1) machine learning methods can identify epilepsy surgery candidates as well as physicians do and 2) machine learning methods can identify candidates earlier than physicians do. These hypotheses are tested by systematically evaluating the effects of the data source, amount of training data, class balance, classification algorithm, and feature set on classifier performance. The results support both hypotheses, with F-measures ranging from 0.71 to 0.82. The feature set, classification algorithm, amount of training data, class balance, and gold standard all significantly affected classification performance. It was further observed that classification performance was better than the highest agreement between two annotators, even at one year before documented surgery referral. 
The results demonstrate that such machine learning methods can contribute to predicting pediatric epilepsy surgery candidates and reducing lag time to surgery referral.</description><subject>Classification</subject><subject>Electronic health records</subject><subject>Epilepsy</subject><subject>Hypotheses</subject><subject>Machine learning</subject><subject>Natural language processing</subject><subject>Original Research</subject><subject>Surgery</subject><issn>1178-2226</issn><issn>1178-2226</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><sourceid>AFRWT</sourceid><sourceid>PIMPY</sourceid><recordid>eNqFkt9rFDEQxxdRbKl98Q-QBR8U4Wp-7CbZF0GPqgdXLbQ-h9lkdjdld3Mmu8K9-peb485yLYJ5yZD55JvJdybLXlJyUVAu339arS5uuOJEPclOKZVqwRgTT4_ik-w8xjuyW5KUXD7PTphkpeRKnGa_r3DqvPW9b52BPl_FOGPM3ZhfB7TOTG5s8-sUwRScyS83rsdN3OY3c2gxbPMljNZZmNKd2y74ue3ybzDNIUmtYWxnaDEpeYMx7pQSnV-B6dyI-RohjOnwRfasgT7i-WE_y358vrxdfl2sv39ZLT-uF6YoxbRQZW0LXrKaSVNSa2htG1MZK6Rioih5jZUgFoDQogCJlkhUTaVQpQxI0fCz7MNedzPXA1qD45TK1JvgBghb7cHph5nRdbr1v3ShpKhUkQTeHgSC_5lcmvTgosG-hxH9HDVVrEylUiH_j8qKV4TRiib09SP0zs9hTE7sKMqEKhRL1Ls9ZYKPMWBzXzclejcIOg2C3g9Cgl8d__Qe_dv2BLzZAzH15-i9f0kdTOtdjWGCCAYsDg4e-PYoaUAbP-hSFIT_AYNA1A0</recordid><startdate>20160101</startdate><enddate>20160101</enddate><creator>Cohen, Kevin Bretonnel</creator><creator>Glass, Benjamin</creator><creator>Greiner, Hansel M</creator><creator>Holland-Bouley, Katherine</creator><creator>Standridge, Shannon</creator><creator>Arya, Ravindra</creator><creator>Faist, Robert</creator><creator>Morita, Diego</creator><creator>Mangano, Francesco</creator><creator>Connolly, Brian</creator><creator>Glauser, Tracy</creator><creator>Pestian, John</creator><general>SAGE Publishing</general><general>SAGE Publications</general><general>Sage Publications Ltd</general><general>Libertas 
Academica</general><scope>AFRWT</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AYAGU</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>L6V</scope><scope>L7M</scope><scope>LK8</scope><scope>L~C</scope><scope>L~D</scope><scope>M7P</scope><scope>M7S</scope><scope>P5Z</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20160101</creationdate><title>Methodological Issues in Predicting Pediatric Epilepsy Surgery Candidates Through Natural Language Processing and Machine Learning</title><author>Cohen, Kevin Bretonnel ; Glass, Benjamin ; Greiner, Hansel M ; Holland-Bouley, Katherine ; Standridge, Shannon ; Arya, Ravindra ; Faist, Robert ; Morita, Diego ; Mangano, Francesco ; Connolly, Brian ; Glauser, Tracy ; Pestian, John</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c456t-85bd4352b27c51dc1bdfc9cd67826453be960daa0144a7ed07e8f98e83bea76f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>Classification</topic><topic>Electronic health records</topic><topic>Epilepsy</topic><topic>Hypotheses</topic><topic>Machine learning</topic><topic>Natural language processing</topic><topic>Original Research</topic><topic>Surgery</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Cohen, Kevin Bretonnel</creatorcontrib><creatorcontrib>Glass, Benjamin</creatorcontrib><creatorcontrib>Greiner, Hansel 
M</creatorcontrib><creatorcontrib>Holland-Bouley, Katherine</creatorcontrib><creatorcontrib>Standridge, Shannon</creatorcontrib><creatorcontrib>Arya, Ravindra</creatorcontrib><creatorcontrib>Faist, Robert</creatorcontrib><creatorcontrib>Morita, Diego</creatorcontrib><creatorcontrib>Mangano, Francesco</creatorcontrib><creatorcontrib>Connolly, Brian</creatorcontrib><creatorcontrib>Glauser, Tracy</creatorcontrib><creatorcontrib>Pestian, John</creatorcontrib><collection>SAGE Open Access Journals</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>Australia & New Zealand Database</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Engineering Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Biological Sciences</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems 
Abstracts Professional</collection><collection>Biological Science Database</collection><collection>Engineering Database</collection><collection>ProQuest advanced technologies & aerospace journals</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Publicly Available Content (ProQuest)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering collection</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Biomedical informatics insights</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Cohen, Kevin Bretonnel</au><au>Glass, Benjamin</au><au>Greiner, Hansel M</au><au>Holland-Bouley, Katherine</au><au>Standridge, Shannon</au><au>Arya, Ravindra</au><au>Faist, Robert</au><au>Morita, Diego</au><au>Mangano, Francesco</au><au>Connolly, Brian</au><au>Glauser, Tracy</au><au>Pestian, John</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Methodological Issues in Predicting Pediatric Epilepsy Surgery Candidates Through Natural Language Processing and Machine Learning</atitle><jtitle>Biomedical informatics insights</jtitle><addtitle>Biomed Inform Insights</addtitle><date>2016-01-01</date><risdate>2016</risdate><volume>2016</volume><issue>8</issue><spage>11</spage><epage>18</epage><pages>11-18</pages><issn>1178-2226</issn><eissn>1178-2226</eissn><abstract>Objective: We describe the development and evaluation of a system that uses machine learning and natural language processing techniques to identify potential candidates for surgical intervention for drug-resistant pediatric epilepsy. 
The data are comprised of free-text clinical notes extracted from the electronic health record (EHR). Both known clinical outcomes from the EHR and manual chart annotations provide gold standards for the patient's status. The following hypotheses are then tested: 1) machine learning methods can identify epilepsy surgery candidates as well as physicians do and 2) machine learning methods can identify candidates earlier than physicians do. These hypotheses are tested by systematically evaluating the effects of the data source, amount of training data, class balance, classification algorithm, and feature set on classifier performance. The results support both hypotheses, with F-measures ranging from 0.71 to 0.82. The feature set, classification algorithm, amount of training data, class balance, and gold standard all significantly affected classification performance. It was further observed that classification performance was better than the highest agreement between two annotators, even at one year before documented surgery referral. The results demonstrate that such machine learning methods can contribute to predicting pediatric epilepsy surgery candidates and reducing lag time to surgery referral.</abstract><cop>London, England</cop><pub>SAGE Publishing</pub><pmid>27257386</pmid><doi>10.4137/BII.S38308</doi><tpages>8</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1178-2226 |
ispartof | Biomedical informatics insights, 2016-01, Vol.2016 (8), p.11-18 |
issn | 1178-2226 1178-2226 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4876984 |
source | Publicly Available Content (ProQuest); PubMed Central |
subjects | Classification; Electronic health records; Epilepsy; Hypotheses; Machine learning; Natural language processing; Original Research; Surgery |
title | Methodological Issues in Predicting Pediatric Epilepsy Surgery Candidates Through Natural Language Processing and Machine Learning |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T02%3A19%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Methodological%20Issues%20in%20Predicting%20Pediatric%20Epilepsy%20Surgery%20Candidates%20Through%20Natural%20Language%20Processing%20and%20Machine%20Learning&rft.jtitle=Biomedical%20informatics%20insights&rft.au=Cohen,%20Kevin%20Bretonnel&rft.date=2016-01-01&rft.volume=2016&rft.issue=8&rft.spage=11&rft.epage=18&rft.pages=11-18&rft.issn=1178-2226&rft.eissn=1178-2226&rft_id=info:doi/10.4137/BII.S38308&rft_dat=%3Cproquest_pubme%3E1793902191%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c456t-85bd4352b27c51dc1bdfc9cd67826453be960daa0144a7ed07e8f98e83bea76f3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1791268482&rft_id=info:pmid/27257386&rft_sage_id=10.4137_BII.S38308&rfr_iscdi=true |