Domain knowledge-based security bug reports prediction

To eliminate security attack risks of software products, the security bug report (SBR) prediction has been increasingly investigated. However, there is still much room for improving the performance of automatic SBR prediction. This work is inspired by the work of two recent studies proposed by Peter...

Bibliographic Details
Published in: Knowledge-based systems, 2022-04, Vol.241, p.108293, Article 108293
Main Authors: Zheng, Wei; Cheng, JingYuan; Wu, Xiaoxue; Sun, Ruiyang; Wang, Xiaolong; Sun, Xiaobing
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c334t-45a373216b0de678d02ac5e06f5e3a2817735499d3cdf44f5f297f44d2e147153
cites cdi_FETCH-LOGICAL-c334t-45a373216b0de678d02ac5e06f5e3a2817735499d3cdf44f5f297f44d2e147153
container_end_page
container_issue
container_start_page 108293
container_title Knowledge-based systems
container_volume 241
creator Zheng, Wei
Cheng, JingYuan
Wu, Xiaoxue
Sun, Ruiyang
Wang, Xiaolong
Sun, Xiaobing
description To reduce the risk of security attacks on software products, security bug report (SBR) prediction has been increasingly investigated. However, there is still much room for improving the performance of automatic SBR prediction. This work is inspired by two recent studies by Peters et al. and Wu et al., which focused on SBR prediction and were published in the top-tier journal TSE (IEEE Transactions on Software Engineering). The goal of this work is to improve the effectiveness of supervised machine learning-based SBR prediction with the help of software security domain knowledge. First, we split the words in the summary and description fields of the SBRs. Then, we use customized relationships to label entities and build a rule-based entity recognition corpus. After that, we establish relationships between entities and construct knowledge graphs. Information from CWE (Common Weakness Enumeration) is used to expand our corpus, and the security-related words and phrases are integrated. Finally, we predict SBRs from the target project by calculating the cosine similarity between our integrated corpus and the target bug reports. Our experimental evaluation on 5 open-source SBR datasets shows that our domain knowledge-guided approach could improve the effectiveness of SBR prediction by 52% in terms of F1-score on average.
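The final step of the abstract, scoring a target bug report by its cosine similarity to a security corpus, can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how text is vectorized or which decision threshold is used, so the plain term-frequency vectors, the `tokenize` and `predict_sbr` helpers, the example corpus terms, and the `threshold=0.2` default are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    # Counter returns 0 for missing terms, so only shared terms contribute.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_sbr(report_text, corpus_terms, threshold=0.2):
    """Return (score, is_security) for one bug report.

    corpus_terms is a hypothetical list of security words and phrases;
    threshold is an assumed cut-off, not a value taken from the paper.
    """
    corpus_vec = Counter(tokenize(" ".join(corpus_terms)))
    report_vec = Counter(tokenize(report_text))
    score = cosine_similarity(report_vec, corpus_vec)
    return score, score >= threshold


# Hypothetical corpus terms and bug reports, for illustration only.
corpus = ["buffer overflow", "injection", "remote code execution"]
security_score, security_flag = predict_sbr(
    "Buffer overflow in parser allows remote code execution", corpus
)
plain_score, plain_flag = predict_sbr(
    "Button label is misaligned on settings page", corpus
)
```

In this sketch a report that shares vocabulary with the corpus receives a higher score than one that shares none. A real pipeline would likely weight terms (for example with TF-IDF) and tune the threshold on labeled data.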
doi_str_mv 10.1016/j.knosys.2022.108293
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2642938803</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S095070512200096X</els_id><sourcerecordid>2642938803</sourcerecordid><originalsourceid>FETCH-LOGICAL-c334t-45a373216b0de678d02ac5e06f5e3a2817735499d3cdf44f5f297f44d2e147153</originalsourceid><addsrcrecordid>eNp9kEtLxDAUhYMoOI7-AxcF1x3zbNKNIOMTBtzoOqTJ7ZA609SkVebfm6GuXd3L5ZxzOR9C1wSvCCbVbbf67EM6pBXFlOaTojU7QQuiJC0lx_UpWuBa4FJiQc7RRUodxllJ1AJVD2FvfF_kgJ8duC2UjUngigR2in48FM20LSIMIY6pGCI4b0cf-kt01ppdgqu_uUQfT4_v65dy8_b8ur7flJYxPpZcGCYZJVWDHVRSOUyNFYCrVgAzVBEpmeB17Zh1LeetaGkt8-IoEC6JYEt0M-cOMXxNkEbdhSn2-aWmFc81lcIsq_issjGkFKHVQ_R7Ew-aYH0kpDs9E9JHQnomlG13sw1yg28PUSfrobe5ZAQ7ahf8_wG_baNv6A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2642938803</pqid></control><display><type>article</type><title>Domain knowledge-based security bug reports prediction</title><source>Library &amp; Information Science Abstracts (LISA)</source><source>ScienceDirect Freedom Collection 2022-2024</source><creator>Zheng, Wei ; Cheng, JingYuan ; Wu, Xiaoxue ; Sun, Ruiyang ; Wang, Xiaolong ; Sun, Xiaobing</creator><creatorcontrib>Zheng, Wei ; Cheng, JingYuan ; Wu, Xiaoxue ; Sun, Ruiyang ; Wang, Xiaolong ; Sun, Xiaobing</creatorcontrib><description>To eliminate security attack risks of software products, the security bug report (SBR) prediction has been increasingly investigated. However, there is still much room for improving the performance of automatic SBR prediction. This work is inspired by the work of two recent studies proposed by Peters et al. and Wu et al., which focused on SBR prediction and have been published on the top tier journal TSE (IEEE Transactions on Software Engineering). 
The goal of this work is to improve the effectiveness of supervised machine learning-based SBR prediction with the help of software security domain knowledge. First, we split the words in summary and description fields of the SBRs. Then, we use customized relationships to label entities and build a rule-based entity recognition corpus. After that, we establish relationships between entities and construct knowledge graphs. The information of CWE (Common Weakness Enumeration) is used to expand our corpus and the security-related words and phrases are integrated. Finally, we predict SBRs from target project by calculating the cosine similarity between our integrated corpus and the target bug reports. Our experimental evaluation on 5 open-source SBR datasets shows that our domain knowledge-guided approach could improve the effectiveness of SBRs prediction by 52% in terms of F1-score on average.</description><identifier>ISSN: 0950-7051</identifier><identifier>EISSN: 1872-7409</identifier><identifier>DOI: 10.1016/j.knosys.2022.108293</identifier><language>eng</language><publisher>Amsterdam: Elsevier B.V</publisher><subject>Debugging ; Domain knowledge ; Domains ; Entity recognition ; Enumeration ; Knowledge graph ; Knowledge representation ; Machine learning ; Security ; Security bug report prediction ; Software engineering ; Software security</subject><ispartof>Knowledge-based systems, 2022-04, Vol.241, p.108293, Article 108293</ispartof><rights>2022 Elsevier B.V.</rights><rights>Copyright Elsevier Science Ltd. 
Apr 6, 2022</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c334t-45a373216b0de678d02ac5e06f5e3a2817735499d3cdf44f5f297f44d2e147153</citedby><cites>FETCH-LOGICAL-c334t-45a373216b0de678d02ac5e06f5e3a2817735499d3cdf44f5f297f44d2e147153</cites><orcidid>0000-0002-7567-3643</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925,34135</link.rule.ids></links><search><creatorcontrib>Zheng, Wei</creatorcontrib><creatorcontrib>Cheng, JingYuan</creatorcontrib><creatorcontrib>Wu, Xiaoxue</creatorcontrib><creatorcontrib>Sun, Ruiyang</creatorcontrib><creatorcontrib>Wang, Xiaolong</creatorcontrib><creatorcontrib>Sun, Xiaobing</creatorcontrib><title>Domain knowledge-based security bug reports prediction</title><title>Knowledge-based systems</title><description>To eliminate security attack risks of software products, the security bug report (SBR) prediction has been increasingly investigated. However, there is still much room for improving the performance of automatic SBR prediction. This work is inspired by the work of two recent studies proposed by Peters et al. and Wu et al., which focused on SBR prediction and have been published on the top tier journal TSE (IEEE Transactions on Software Engineering). The goal of this work is to improve the effectiveness of supervised machine learning-based SBR prediction with the help of software security domain knowledge. First, we split the words in summary and description fields of the SBRs. Then, we use customized relationships to label entities and build a rule-based entity recognition corpus. After that, we establish relationships between entities and construct knowledge graphs. 
The information of CWE (Common Weakness Enumeration) is used to expand our corpus and the security-related words and phrases are integrated. Finally, we predict SBRs from target project by calculating the cosine similarity between our integrated corpus and the target bug reports. Our experimental evaluation on 5 open-source SBR datasets shows that our domain knowledge-guided approach could improve the effectiveness of SBRs prediction by 52% in terms of F1-score on average.</description><subject>Debugging</subject><subject>Domain knowledge</subject><subject>Domains</subject><subject>Entity recognition</subject><subject>Enumeration</subject><subject>Knowledge graph</subject><subject>Knowledge representation</subject><subject>Machine learning</subject><subject>Security</subject><subject>Security bug report prediction</subject><subject>Software engineering</subject><subject>Software security</subject><issn>0950-7051</issn><issn>1872-7409</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>F2A</sourceid><recordid>eNp9kEtLxDAUhYMoOI7-AxcF1x3zbNKNIOMTBtzoOqTJ7ZA609SkVebfm6GuXd3L5ZxzOR9C1wSvCCbVbbf67EM6pBXFlOaTojU7QQuiJC0lx_UpWuBa4FJiQc7RRUodxllJ1AJVD2FvfF_kgJ8duC2UjUngigR2in48FM20LSIMIY6pGCI4b0cf-kt01ppdgqu_uUQfT4_v65dy8_b8ur7flJYxPpZcGCYZJVWDHVRSOUyNFYCrVgAzVBEpmeB17Zh1LeetaGkt8-IoEC6JYEt0M-cOMXxNkEbdhSn2-aWmFc81lcIsq_issjGkFKHVQ_R7Ew-aYH0kpDs9E9JHQnomlG13sw1yg28PUSfrobe5ZAQ7ahf8_wG_baNv6A</recordid><startdate>20220406</startdate><enddate>20220406</enddate><creator>Zheng, Wei</creator><creator>Cheng, JingYuan</creator><creator>Wu, Xiaoxue</creator><creator>Sun, Ruiyang</creator><creator>Wang, Xiaolong</creator><creator>Sun, Xiaobing</creator><general>Elsevier B.V</general><general>Elsevier Science 
Ltd</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>E3H</scope><scope>F2A</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-7567-3643</orcidid></search><sort><creationdate>20220406</creationdate><title>Domain knowledge-based security bug reports prediction</title><author>Zheng, Wei ; Cheng, JingYuan ; Wu, Xiaoxue ; Sun, Ruiyang ; Wang, Xiaolong ; Sun, Xiaobing</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c334t-45a373216b0de678d02ac5e06f5e3a2817735499d3cdf44f5f297f44d2e147153</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Debugging</topic><topic>Domain knowledge</topic><topic>Domains</topic><topic>Entity recognition</topic><topic>Enumeration</topic><topic>Knowledge graph</topic><topic>Knowledge representation</topic><topic>Machine learning</topic><topic>Security</topic><topic>Security bug report prediction</topic><topic>Software engineering</topic><topic>Software security</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zheng, Wei</creatorcontrib><creatorcontrib>Cheng, JingYuan</creatorcontrib><creatorcontrib>Wu, Xiaoxue</creatorcontrib><creatorcontrib>Sun, Ruiyang</creatorcontrib><creatorcontrib>Wang, Xiaolong</creatorcontrib><creatorcontrib>Sun, Xiaobing</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Library &amp; Information Sciences Abstracts (LISA)</collection><collection>Library &amp; Information Science Abstracts (LISA)</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – 
Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Knowledge-based systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zheng, Wei</au><au>Cheng, JingYuan</au><au>Wu, Xiaoxue</au><au>Sun, Ruiyang</au><au>Wang, Xiaolong</au><au>Sun, Xiaobing</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Domain knowledge-based security bug reports prediction</atitle><jtitle>Knowledge-based systems</jtitle><date>2022-04-06</date><risdate>2022</risdate><volume>241</volume><spage>108293</spage><pages>108293-</pages><artnum>108293</artnum><issn>0950-7051</issn><eissn>1872-7409</eissn><abstract>To eliminate security attack risks of software products, the security bug report (SBR) prediction has been increasingly investigated. However, there is still much room for improving the performance of automatic SBR prediction. This work is inspired by the work of two recent studies proposed by Peters et al. and Wu et al., which focused on SBR prediction and have been published on the top tier journal TSE (IEEE Transactions on Software Engineering). The goal of this work is to improve the effectiveness of supervised machine learning-based SBR prediction with the help of software security domain knowledge. First, we split the words in summary and description fields of the SBRs. Then, we use customized relationships to label entities and build a rule-based entity recognition corpus. After that, we establish relationships between entities and construct knowledge graphs. The information of CWE (Common Weakness Enumeration) is used to expand our corpus and the security-related words and phrases are integrated. Finally, we predict SBRs from target project by calculating the cosine similarity between our integrated corpus and the target bug reports. 
Our experimental evaluation on 5 open-source SBR datasets shows that our domain knowledge-guided approach could improve the effectiveness of SBRs prediction by 52% in terms of F1-score on average.</abstract><cop>Amsterdam</cop><pub>Elsevier B.V</pub><doi>10.1016/j.knosys.2022.108293</doi><orcidid>https://orcid.org/0000-0002-7567-3643</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0950-7051
ispartof Knowledge-based systems, 2022-04, Vol.241, p.108293, Article 108293
issn 0950-7051
1872-7409
language eng
recordid cdi_proquest_journals_2642938803
source Library & Information Science Abstracts (LISA); ScienceDirect Freedom Collection 2022-2024
subjects Debugging
Domain knowledge
Domains
Entity recognition
Enumeration
Knowledge graph
Knowledge representation
Machine learning
Security
Security bug report prediction
Software engineering
Software security
title Domain knowledge-based security bug reports prediction
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T23%3A22%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Domain%20knowledge-based%20security%20bug%20reports%20prediction&rft.jtitle=Knowledge-based%20systems&rft.au=Zheng,%20Wei&rft.date=2022-04-06&rft.volume=241&rft.spage=108293&rft.pages=108293-&rft.artnum=108293&rft.issn=0950-7051&rft.eissn=1872-7409&rft_id=info:doi/10.1016/j.knosys.2022.108293&rft_dat=%3Cproquest_cross%3E2642938803%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c334t-45a373216b0de678d02ac5e06f5e3a2817735499d3cdf44f5f297f44d2e147153%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2642938803&rft_id=info:pmid/&rfr_iscdi=true