A deep tree-based model for software defect prediction
Defects are common in software systems and can potentially cause various problems to software users. Different methods have been developed to quickly predict the most likely locations of defects in large code bases. Most of them focus on designing features (e.g. complexity metrics) that correlate wi...
Published in: | arXiv.org 2018-02 |
---|---|
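Main Authors: | Hoa Khanh Dam ; Pham, Trang ; Shien Wee Ng ; Tran, Truyen ; Grundy, John ; Ghose, Aditya ; Kim, Taeksu ; Chul-Joo, Kim |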
Format: | Article |
Language: | English |
Subjects: | Defects ; Machine learning ; Mathematical models ; Semantics ; Software ; Source code ; Syntax |
Online Access: | Get full text |
cited_by | |
---|---|
cites | |
container_end_page | |
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Hoa Khanh Dam ; Pham, Trang ; Shien Wee Ng ; Tran, Truyen ; Grundy, John ; Ghose, Aditya ; Kim, Taeksu ; Chul-Joo, Kim |
description | Defects are common in software systems and can potentially cause various problems to software users. Different methods have been developed to quickly predict the most likely locations of defects in large code bases. Most of them focus on designing features (e.g. complexity metrics) that correlate with potentially defective code. Those approaches however do not sufficiently capture the syntax and different levels of semantics of source code, an important capability for building accurate prediction models. In this paper, we develop a novel prediction model which is capable of automatically learning features for representing source code and using them for defect prediction. Our prediction system is built upon the powerful deep learning, tree-structured Long Short Term Memory network which directly matches with the Abstract Syntax Tree representation of source code. An evaluation on two datasets, one from open source projects contributed by Samsung and the other from the public PROMISE repository, demonstrates the effectiveness of our approach for both within-project and cross-project predictions. |
format | article |
fullrecord | <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2071309606</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2071309606</sourcerecordid><originalsourceid>FETCH-proquest_journals_20713096063</originalsourceid><addsrcrecordid>eNqNykEKwjAQQNEgCBbtHQZcB6aJTXUpongA9yU2E2ipTcykeH278ACu_uK_lSiU1pU8HpTaiJJ5QERlGlXXuhDmDI4oQk5E8mmZHLyCoxF8SMDB549NtBBPXYaYyPVd7sO0E2tvR6by163Y366Py13GFN4zcW6HMKdpWa3CptJ4Mmj0f-oLlUs0-Q</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2071309606</pqid></control><display><type>article</type><title>A deep tree-based model for software defect prediction</title><source>Publicly Available Content Database</source><creator>Hoa Khanh Dam ; Pham, Trang ; Shien Wee Ng ; Tran, Truyen ; Grundy, John ; Ghose, Aditya ; Kim, Taeksu ; Chul-Joo, Kim</creator><creatorcontrib>Hoa Khanh Dam ; Pham, Trang ; Shien Wee Ng ; Tran, Truyen ; Grundy, John ; Ghose, Aditya ; Kim, Taeksu ; Chul-Joo, Kim</creatorcontrib><description>Defects are common in software systems and can potentially cause various problems to software users. Different methods have been developed to quickly predict the most likely locations of defects in large code bases. Most of them focus on designing features (e.g. complexity metrics) that correlate with potentially defective code. Those approaches however do not sufficiently capture the syntax and different levels of semantics of source code, an important capability for building accurate prediction models. In this paper, we develop a novel prediction model which is capable of automatically learning features for representing source code and using them for defect prediction. Our prediction system is built upon the powerful deep learning, tree-structured Long Short Term Memory network which directly matches with the Abstract Syntax Tree representation of source code. 
An evaluation on two datasets, one from open source projects contributed by Samsung and the other from the public PROMISE repository, demonstrates the effectiveness of our approach for both within-project and cross-project predictions.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Defects ; Machine learning ; Mathematical models ; Semantics ; Software ; Source code ; Syntax</subject><ispartof>arXiv.org, 2018-02</ispartof><rights>2018. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.proquest.com/docview/2071309606?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>780,784,25753,37012,44590</link.rule.ids></links><search><creatorcontrib>Hoa Khanh Dam</creatorcontrib><creatorcontrib>Pham, Trang</creatorcontrib><creatorcontrib>Shien Wee Ng</creatorcontrib><creatorcontrib>Tran, Truyen</creatorcontrib><creatorcontrib>Grundy, John</creatorcontrib><creatorcontrib>Ghose, Aditya</creatorcontrib><creatorcontrib>Kim, Taeksu</creatorcontrib><creatorcontrib>Chul-Joo, Kim</creatorcontrib><title>A deep tree-based model for software defect prediction</title><title>arXiv.org</title><description>Defects are common in software systems and can potentially cause various problems to software users. Different methods have been developed to quickly predict the most likely locations of defects in large code bases. Most of them focus on designing features (e.g. 
complexity metrics) that correlate with potentially defective code. Those approaches however do not sufficiently capture the syntax and different levels of semantics of source code, an important capability for building accurate prediction models. In this paper, we develop a novel prediction model which is capable of automatically learning features for representing source code and using them for defect prediction. Our prediction system is built upon the powerful deep learning, tree-structured Long Short Term Memory network which directly matches with the Abstract Syntax Tree representation of source code. An evaluation on two datasets, one from open source projects contributed by Samsung and the other from the public PROMISE repository, demonstrates the effectiveness of our approach for both within-project and cross-project predictions.</description><subject>Defects</subject><subject>Machine learning</subject><subject>Mathematical models</subject><subject>Semantics</subject><subject>Software</subject><subject>Source code</subject><subject>Syntax</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNqNykEKwjAQQNEgCBbtHQZcB6aJTXUpongA9yU2E2ipTcykeH278ACu_uK_lSiU1pU8HpTaiJJ5QERlGlXXuhDmDI4oQk5E8mmZHLyCoxF8SMDB549NtBBPXYaYyPVd7sO0E2tvR6by163Y366Py13GFN4zcW6HMKdpWa3CptJ4Mmj0f-oLlUs0-Q</recordid><startdate>20180203</startdate><enddate>20180203</enddate><creator>Hoa Khanh Dam</creator><creator>Pham, Trang</creator><creator>Shien Wee Ng</creator><creator>Tran, Truyen</creator><creator>Grundy, John</creator><creator>Ghose, Aditya</creator><creator>Kim, Taeksu</creator><creator>Chul-Joo, Kim</creator><general>Cornell University Library, 
arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20180203</creationdate><title>A deep tree-based model for software defect prediction</title><author>Hoa Khanh Dam ; Pham, Trang ; Shien Wee Ng ; Tran, Truyen ; Grundy, John ; Ghose, Aditya ; Kim, Taeksu ; Chul-Joo, Kim</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_20713096063</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Defects</topic><topic>Machine learning</topic><topic>Mathematical models</topic><topic>Semantics</topic><topic>Software</topic><topic>Source code</topic><topic>Syntax</topic><toplevel>online_resources</toplevel><creatorcontrib>Hoa Khanh Dam</creatorcontrib><creatorcontrib>Pham, Trang</creatorcontrib><creatorcontrib>Shien Wee Ng</creatorcontrib><creatorcontrib>Tran, Truyen</creatorcontrib><creatorcontrib>Grundy, John</creatorcontrib><creatorcontrib>Ghose, Aditya</creatorcontrib><creatorcontrib>Kim, Taeksu</creatorcontrib><creatorcontrib>Chul-Joo, Kim</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech 
Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hoa Khanh Dam</au><au>Pham, Trang</au><au>Shien Wee Ng</au><au>Tran, Truyen</au><au>Grundy, John</au><au>Ghose, Aditya</au><au>Kim, Taeksu</au><au>Chul-Joo, Kim</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>A deep tree-based model for software defect prediction</atitle><jtitle>arXiv.org</jtitle><date>2018-02-03</date><risdate>2018</risdate><eissn>2331-8422</eissn><abstract>Defects are common in software systems and can potentially cause various problems to software users. Different methods have been developed to quickly predict the most likely locations of defects in large code bases. Most of them focus on designing features (e.g. complexity metrics) that correlate with potentially defective code. Those approaches however do not sufficiently capture the syntax and different levels of semantics of source code, an important capability for building accurate prediction models. In this paper, we develop a novel prediction model which is capable of automatically learning features for representing source code and using them for defect prediction. Our prediction system is built upon the powerful deep learning, tree-structured Long Short Term Memory network which directly matches with the Abstract Syntax Tree representation of source code. 
An evaluation on two datasets, one from open source projects contributed by Samsung and the other from the public PROMISE repository, demonstrates the effectiveness of our approach for both within-project and cross-project predictions.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2018-02 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2071309606 |
source | Publicly Available Content Database |
subjects | Defects ; Machine learning ; Mathematical models ; Semantics ; Software ; Source code ; Syntax |
title | A deep tree-based model for software defect prediction |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T08%3A19%3A18IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=A%20deep%20tree-based%20model%20for%20software%20defect%20prediction&rft.jtitle=arXiv.org&rft.au=Hoa%20Khanh%20Dam&rft.date=2018-02-03&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2071309606%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_20713096063%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2071309606&rft_id=info:pmid/&rfr_iscdi=true |