Learning to Improve Code Efficiency

Improvements in the performance of computing systems, driven by Moore's Law, have transformed society. As such hardware-driven gains slow down, it becomes even more important for software developers to focus on performance and efficiency during development. While several studies have demonstrat...

Bibliographic Details
Published in:	arXiv.org 2022-08
Main Authors:	Chen, Binghong; Tarlow, Daniel; Swersky, Kevin; Maas, Martin; Heiber, Pablo; Naik, Ashish; Hashemi, Milad; Parthasarathy Ranganathan
Format:	Article
Language:	English
Subjects:	Coders; Datasets; Efficiency; Hardware; Machine learning; Moore's law; Software development

container_title	arXiv.org
creator	Chen, Binghong; Tarlow, Daniel; Swersky, Kevin; Maas, Martin; Heiber, Pablo; Naik, Ashish; Hashemi, Milad; Parthasarathy Ranganathan
description	Improvements in the performance of computing systems, driven by Moore's Law, have transformed society. As such hardware-driven gains slow down, it becomes even more important for software developers to focus on performance and efficiency during development. While several studies have demonstrated the potential from such improved code efficiency (e.g., 2x better generational improvements compared to hardware), unlocking these gains in practice has been challenging. Reasoning about algorithmic complexity and the interaction of coding patterns on hardware can be challenging for the average programmer, especially when combined with pragmatic constraints around development velocity and multi-person development. This paper seeks to address this problem. We analyze a large competitive programming dataset from the Google Code Jam competition and find that efficient code is indeed rare, with a 2x runtime difference between the median and the 90th percentile of solutions. We propose using machine learning to automatically provide prescriptive feedback in the form of hints, to guide programmers towards writing high-performance code. To automatically learn these hints from the dataset, we propose a novel discrete variational auto-encoder, where each discrete latent variable represents a different learned category of code-edit that increases performance. We show that this method represents the multi-modal space of code efficiency edits better than a sequence-to-sequence baseline and generates a distribution of more efficient solutions.
format	article
date	2022-08-09
publisher	Ithaca: Cornell University Library, arXiv.org
rights	2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
oa	free_for_read
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2022-08
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2700907998
source	Publicly Available Content Database (Proquest) (PQ_SDU_P3)
subjects	Coders; Datasets; Efficiency; Hardware; Machine learning; Moore's law; Software development
title	Learning to Improve Code Efficiency
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T15%3A16%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Learning%20to%20Improve%20Code%20Efficiency&rft.jtitle=arXiv.org&rft.au=Chen,%20Binghong&rft.date=2022-08-09&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2700907998%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_27009079983%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2700907998&rft_id=info:pmid/&rfr_iscdi=true