
Are My Invariants Valid? A Learning Approach

Ensuring that a program operates correctly is a difficult task in large, complex systems. Enshrining invariants -- desired properties of correct execution -- in code or comments can support maintainability and help sustain correctness. Tools that can automatically infer and recommend invariants can thus be very beneficial. However, current invariant-suggesting tools, such as Daikon, suffer from high rates of false positives, in part because they only leverage traced program values from available test cases, rather than directly exploiting knowledge of the source code per se. We propose a machine-learning approach to judging the validity of invariants, specifically of method pre- and post-conditions, based directly on a method's source code. We introduce a new, scalable approach to creating labeled invariants: using programs with large test suites, we generate Daikon invariants using traces from subsets of these test suites, and then label these as valid/invalid by cross-validating them with held-out tests. This process induces a large set of labels that provide a form of noisy supervision, which is then used to train a deep neural model based on gated graph neural networks. Our model learns to map the lexical, syntactic, and semantic structure of a given method's body into a probability that a candidate pre- or post-condition on that method is correct, and is able to accurately label invariants based on the noisy signal, even in cross-project settings. Most importantly, it performs well on a hand-curated dataset of invariants.

Bibliographic Details

Published in: arXiv.org, 2019-03
Main Authors: Hellendoorn, Vincent J; Devanbu, Premkumar T; Polozov, Oleksandr; Marron, Mark
Format: Article
Language: English
Publisher: Ithaca: Cornell University Library, arXiv.org
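The labeling pipeline the abstract describes can be sketched with a toy stand-in for Daikon: candidate postconditions are instantiated from templates and kept only if every trace in the inference subset satisfies them; a held-out subset of the test suite then supplies a (noisy) validity label. The method `abs_max` and the three templates below are illustrative assumptions, not the paper's actual invariant grammar.

```python
def abs_max(xs):
    """Toy method under analysis: the largest absolute value in xs."""
    return max(abs(x) for x in xs)

def trace(inputs):
    """Record (input, output) pairs, as a dynamic tracer would."""
    return [(xs, abs_max(xs)) for xs in inputs]

def infer_invariants(traces):
    """Daikon-style template instantiation: keep each candidate
    postcondition that holds on every observed trace."""
    candidates = {
        "ret >= 0": lambda xs, ret: ret >= 0,
        "ret in xs": lambda xs, ret: ret in xs,            # falsified once negatives appear
        "ret >= len(xs)": lambda xs, ret: ret >= len(xs),  # spurious; fits small samples
    }
    return {name: f for name, f in candidates.items()
            if all(f(xs, ret) for xs, ret in traces)}

def label_invariants(invariants, held_out_traces):
    """Cross-validate: an inferred invariant is labeled valid only if the
    held-out tests never falsify it. Scalable, but only a noisy signal."""
    return {name: all(f(xs, ret) for xs, ret in held_out_traces)
            for name, f in invariants.items()}

# A small test suite, split into an inference subset and a held-out subset.
suite = [[1, 2, 3], [5], [2, 2], [-7, 1], [0, 4], [-1, -2, -9]]
inferred = infer_invariants(trace(suite[:3]))
labels = label_invariants(inferred, trace(suite[3:]))
```

Note that `ret >= len(xs)` survives cross-validation here even though it is not a true postcondition (e.g. `abs_max([0])` falsifies it); this is exactly the label noise that the learned model is trained to tolerate.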
Identifier: EISSN 2331-8422
Source: Publicly Available Content Database
Subjects: Complex systems; Graph neural networks; Invariants; Machine learning; Maintainability; Source code
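The scoring model the abstract describes reads a graph built from a method's body. As a rough, untrained illustration of gated-graph-neural-network message passing, the sketch below runs a few rounds of typed message passing with a GRU-style update over a hand-made toy graph. The dimensions, the two edge types, mean pooling, the simplified gate (no reset gate), and the random weights are all assumptions for illustration, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8          # node state dimension (illustrative)
num_nodes = 5  # toy program graph: nodes for tokens/AST elements

# Typed edges, e.g. AST child and data-flow relations (hypothetical graph).
edges = {
    "ast":  [(0, 1), (0, 2), (2, 3)],
    "flow": [(1, 3), (3, 4)],
}

# Random, untrained parameters: one message matrix per edge type,
# plus GRU-style gate and candidate-state matrices and an output vector.
W_msg = {t: rng.normal(0, 0.1, (d, d)) for t in edges}
W_z, U_z = rng.normal(0, 0.1, (d, d)), rng.normal(0, 0.1, (d, d))
W_h, U_h = rng.normal(0, 0.1, (d, d)), rng.normal(0, 0.1, (d, d))
w_out = rng.normal(0, 0.1, d)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ggnn_step(h):
    """One round of typed message passing with a gated (GRU-like) update."""
    m = np.zeros_like(h)
    for t, pairs in edges.items():
        for src, dst in pairs:
            m[dst] += W_msg[t] @ h[src]   # aggregate typed messages per node
    z = sigmoid(m @ W_z.T + h @ U_z.T)    # update gate
    h_cand = np.tanh(m @ W_h.T + h @ U_h.T)
    return (1 - z) * h + z * h_cand

h = rng.normal(0, 1, (num_nodes, d))      # initial node embeddings
for _ in range(4):                        # a few propagation rounds
    h = ggnn_step(h)

# Pool node states and map to a probability that a candidate invariant holds.
p_valid = float(sigmoid(w_out @ h.mean(axis=0)))
```

With trained weights, `p_valid` would play the role of the model's judgment that a candidate pre- or post-condition is correct; here it only demonstrates the shape of the computation.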