A Multi-Modal Transformer-based Code Summarization Approach for Smart Contracts

Code comment has been an important part of computer programs, greatly facilitating the understanding and maintenance of source code. However, high-quality code comments are often unavailable in smart contracts, the increasingly popular programs that run on the blockchain. In this paper, we propose a...

Bibliographic Details
Main Authors: Yang, Zhen; Keung, Jacky; Yu, Xiao; Gu, Xiaodong; Wei, Zhengyuan; Ma, Xiaoxue; Zhang, Miao
Format: Conference Proceeding
Language: English
cited_by cdi_FETCH-LOGICAL-c300t-4fa5e198d04d423eafa3e62c8c4ae057a282eed35daa4bb6bae1857d77d1837e3
cites
container_end_page 12
container_issue
container_start_page 1
container_title
container_volume
creator Yang, Zhen
Keung, Jacky
Yu, Xiao
Gu, Xiaodong
Wei, Zhengyuan
Ma, Xiaoxue
Zhang, Miao
description Code comment has been an important part of computer programs, greatly facilitating the understanding and maintenance of source code. However, high-quality code comments are often unavailable in smart contracts, the increasingly popular programs that run on the blockchain. In this paper, we propose a Multi-Modal Transformer-based (MMTrans) code summarization approach for smart contracts. Specifically, the MMTrans learns the representation of source code from the two heterogeneous modalities of the Abstract Syntax Tree (AST), i.e., Structure-based Traversal (SBT) sequences and graphs. The SBT sequence provides the global semantic information of AST, while the graph convolution focuses on the local details. The MMTrans uses two encoders to extract both global and local semantic information from the two modalities respectively, and then uses a joint decoder to generate code comments. Both the encoders and the decoder employ the multi-head attention structure of the Transformer to enhance the ability to capture the long-range dependencies between code tokens. We build a dataset with over 300K <method, comment> pairs of smart contracts, and evaluate the MMTrans on it. The experimental results demonstrate that the MMTrans outperforms the state-of-the-art baselines in terms of four evaluation metrics by a substantial margin, and can generate higher quality comments.
doi_str_mv 10.1109/ICPC52881.2021.00010
format conference_proceeding
fullrecord <record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_9463060</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9463060</ieee_id><sourcerecordid>9463060</sourcerecordid><originalsourceid>FETCH-LOGICAL-c300t-4fa5e198d04d423eafa3e62c8c4ae057a282eed35daa4bb6bae1857d77d1837e3</originalsourceid><addsrcrecordid>eNotzs1Kw0AUBeBREKy1T6CLeYHEe2cmM9NlCP4UWiq0rstN5gYj-SmTdKFPb0BXZ3E-DkeIR4QUEdZPm-K9yJT3mCpQmAIAwpW4Q2szgwY0XIuFskYnDh3eitU4fs1GK9DGuYXY53J3aacm2Q2BWnmM1I_1EDuOSUkjB1kMgeXh0nUUmx-amqGX-fkcB6o-5QzlYS6mWfVTpGoa78VNTe3Iq_9cio-X52Pxlmz3r5si3yaVBpgSU1PGuPYBTDBKM9Wk2arKV4YYMkfKK-ags0BkytKWxOgzF5wL6LVjvRQPf7sNM5_OsZlvfJ_WxmqwoH8Bh8dP_Q</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>A Multi-Modal Transformer-based Code Summarization Approach for Smart Contracts</title><source>IEEE Xplore All Conference Series</source><creator>Yang, Zhen ; Keung, Jacky ; Yu, Xiao ; Gu, Xiaodong ; Wei, Zhengyuan ; Ma, Xiaoxue ; Zhang, Miao</creator><creatorcontrib>Yang, Zhen ; Keung, Jacky ; Yu, Xiao ; Gu, Xiaodong ; Wei, Zhengyuan ; Ma, Xiaoxue ; Zhang, Miao</creatorcontrib><description>Code comment has been an important part of computer programs, greatly facilitating the understanding and maintenance of source code. However, high-quality code comments are often unavailable in smart contracts, the increasingly popular programs that run on the blockchain. In this paper, we propose a Multi-Modal Transformer-based (MMTrans) code summarization approach for smart contracts. Specifically, the MMTrans learns the representation of source code from the two heterogeneous modalities of the Abstract Syntax Tree (AST), i.e., Structure-based Traversal (SBT) sequences and graphs. The SBT sequence provides the global semantic information of AST, while the graph convolution focuses on the local details. 
The MMTrans uses two encoders to extract both global and local semantic information from the two modalities respectively, and then uses a joint decoder to generate code comments. Both the encoders and the decoder employ the multi-head attention structure of the Transformer to enhance the ability to capture the long-range dependencies between code tokens. We build a dataset with over 300K &lt;method, comment&gt; pairs of smart contracts, and evaluate the MMTrans on it. The experimental results demonstrate that the MMTrans outperforms the state-of-the-art baselines in terms of four evaluation metrics by a substantial margin, and can generate higher quality comments.</description><identifier>EISSN: 2643-7171</identifier><identifier>EISBN: 1665414030</identifier><identifier>EISBN: 9781665414036</identifier><identifier>DOI: 10.1109/ICPC52881.2021.00010</identifier><identifier>CODEN: IEEPAD</identifier><language>eng</language><publisher>IEEE</publisher><subject>Code Summarization ; Convolution ; Convolutional codes ; Graph Convolution ; Maintenance engineering ; Measurement ; Semantics ; Smart contracts ; Structure-based Traversal ; Syntactics ; Transformer</subject><ispartof>2021 IEEE/ACM 29th International Conference on Program Comprehension (ICPC), 2021, p.1-12</ispartof><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c300t-4fa5e198d04d423eafa3e62c8c4ae057a282eed35daa4bb6bae1857d77d1837e3</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9463060$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,23930,23931,25140,27925,54555,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9463060$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Yang, 
Zhen</creatorcontrib><creatorcontrib>Keung, Jacky</creatorcontrib><creatorcontrib>Yu, Xiao</creatorcontrib><creatorcontrib>Gu, Xiaodong</creatorcontrib><creatorcontrib>Wei, Zhengyuan</creatorcontrib><creatorcontrib>Ma, Xiaoxue</creatorcontrib><creatorcontrib>Zhang, Miao</creatorcontrib><title>A Multi-Modal Transformer-based Code Summarization Approach for Smart Contracts</title><title>2021 IEEE/ACM 29th International Conference on Program Comprehension (ICPC)</title><addtitle>ICPC</addtitle><description>Code comment has been an important part of computer programs, greatly facilitating the understanding and maintenance of source code. However, high-quality code comments are often unavailable in smart contracts, the increasingly popular programs that run on the blockchain. In this paper, we propose a Multi-Modal Transformer-based (MMTrans) code summarization approach for smart contracts. Specifically, the MMTrans learns the representation of source code from the two heterogeneous modalities of the Abstract Syntax Tree (AST), i.e., Structure-based Traversal (SBT) sequences and graphs. The SBT sequence provides the global semantic information of AST, while the graph convolution focuses on the local details. The MMTrans uses two encoders to extract both global and local semantic information from the two modalities respectively, and then uses a joint decoder to generate code comments. Both the encoders and the decoder employ the multi-head attention structure of the Transformer to enhance the ability to capture the long-range dependencies between code tokens. We build a dataset with over 300K &lt;method, comment&gt; pairs of smart contracts, and evaluate the MMTrans on it. 
The experimental results demonstrate that the MMTrans outperforms the state-of-the-art baselines in terms of four evaluation metrics by a substantial margin, and can generate higher quality comments.</description><subject>Code Summarization</subject><subject>Convolution</subject><subject>Convolutional codes</subject><subject>Graph Convolution</subject><subject>Maintenance engineering</subject><subject>Measurement</subject><subject>Semantics</subject><subject>Smart contracts</subject><subject>Structure-based Traversal</subject><subject>Syntactics</subject><subject>Transformer</subject><issn>2643-7171</issn><isbn>1665414030</isbn><isbn>9781665414036</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2021</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotzs1Kw0AUBeBREKy1T6CLeYHEe2cmM9NlCP4UWiq0rstN5gYj-SmTdKFPb0BXZ3E-DkeIR4QUEdZPm-K9yJT3mCpQmAIAwpW4Q2szgwY0XIuFskYnDh3eitU4fs1GK9DGuYXY53J3aacm2Q2BWnmM1I_1EDuOSUkjB1kMgeXh0nUUmx-amqGX-fkcB6o-5QzlYS6mWfVTpGoa78VNTe3Iq_9cio-X52Pxlmz3r5si3yaVBpgSU1PGuPYBTDBKM9Wk2arKV4YYMkfKK-ags0BkytKWxOgzF5wL6LVjvRQPf7sNM5_OsZlvfJ_WxmqwoH8Bh8dP_Q</recordid><startdate>202105</startdate><enddate>202105</enddate><creator>Yang, Zhen</creator><creator>Keung, Jacky</creator><creator>Yu, Xiao</creator><creator>Gu, Xiaodong</creator><creator>Wei, Zhengyuan</creator><creator>Ma, Xiaoxue</creator><creator>Zhang, Miao</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>202105</creationdate><title>A Multi-Modal Transformer-based Code Summarization Approach for Smart Contracts</title><author>Yang, Zhen ; Keung, Jacky ; Yu, Xiao ; Gu, Xiaodong ; Wei, Zhengyuan ; Ma, Xiaoxue ; Zhang, 
Miao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c300t-4fa5e198d04d423eafa3e62c8c4ae057a282eed35daa4bb6bae1857d77d1837e3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Code Summarization</topic><topic>Convolution</topic><topic>Convolutional codes</topic><topic>Graph Convolution</topic><topic>Maintenance engineering</topic><topic>Measurement</topic><topic>Semantics</topic><topic>Smart contracts</topic><topic>Structure-based Traversal</topic><topic>Syntactics</topic><topic>Transformer</topic><toplevel>online_resources</toplevel><creatorcontrib>Yang, Zhen</creatorcontrib><creatorcontrib>Keung, Jacky</creatorcontrib><creatorcontrib>Yu, Xiao</creatorcontrib><creatorcontrib>Gu, Xiaodong</creatorcontrib><creatorcontrib>Wei, Zhengyuan</creatorcontrib><creatorcontrib>Ma, Xiaoxue</creatorcontrib><creatorcontrib>Zhang, Miao</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yang, Zhen</au><au>Keung, Jacky</au><au>Yu, Xiao</au><au>Gu, Xiaodong</au><au>Wei, Zhengyuan</au><au>Ma, Xiaoxue</au><au>Zhang, Miao</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>A Multi-Modal Transformer-based Code Summarization Approach for Smart Contracts</atitle><btitle>2021 IEEE/ACM 29th International Conference on Program Comprehension 
(ICPC)</btitle><stitle>ICPC</stitle><date>2021-05</date><risdate>2021</risdate><spage>1</spage><epage>12</epage><pages>1-12</pages><eissn>2643-7171</eissn><eisbn>1665414030</eisbn><eisbn>9781665414036</eisbn><coden>IEEPAD</coden><abstract>Code comment has been an important part of computer programs, greatly facilitating the understanding and maintenance of source code. However, high-quality code comments are often unavailable in smart contracts, the increasingly popular programs that run on the blockchain. In this paper, we propose a Multi-Modal Transformer-based (MMTrans) code summarization approach for smart contracts. Specifically, the MMTrans learns the representation of source code from the two heterogeneous modalities of the Abstract Syntax Tree (AST), i.e., Structure-based Traversal (SBT) sequences and graphs. The SBT sequence provides the global semantic information of AST, while the graph convolution focuses on the local details. The MMTrans uses two encoders to extract both global and local semantic information from the two modalities respectively, and then uses a joint decoder to generate code comments. Both the encoders and the decoder employ the multi-head attention structure of the Transformer to enhance the ability to capture the long-range dependencies between code tokens. We build a dataset with over 300K &lt;method, comment&gt; pairs of smart contracts, and evaluate the MMTrans on it. The experimental results demonstrate that the MMTrans outperforms the state-of-the-art baselines in terms of four evaluation metrics by a substantial margin, and can generate higher quality comments.</abstract><pub>IEEE</pub><doi>10.1109/ICPC52881.2021.00010</doi><tpages>12</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier EISSN: 2643-7171
ispartof 2021 IEEE/ACM 29th International Conference on Program Comprehension (ICPC), 2021, p.1-12
issn 2643-7171
language eng
recordid cdi_ieee_primary_9463060
source IEEE Xplore All Conference Series
subjects Code Summarization
Convolution
Convolutional codes
Graph Convolution
Maintenance engineering
Measurement
Semantics
Smart contracts
Structure-based Traversal
Syntactics
Transformer
title A Multi-Modal Transformer-based Code Summarization Approach for Smart Contracts
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T12%3A22%3A17IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=A%20Multi-Modal%20Transformer-based%20Code%20Summarization%20Approach%20for%20Smart%20Contracts&rft.btitle=2021%20IEEE/ACM%2029th%20International%20Conference%20on%20Program%20Comprehension%20(ICPC)&rft.au=Yang,%20Zhen&rft.date=2021-05&rft.spage=1&rft.epage=12&rft.pages=1-12&rft.eissn=2643-7171&rft.coden=IEEPAD&rft_id=info:doi/10.1109/ICPC52881.2021.00010&rft.eisbn=1665414030&rft.eisbn_list=9781665414036&rft_dat=%3Cieee_CHZPO%3E9463060%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c300t-4fa5e198d04d423eafa3e62c8c4ae057a282eed35daa4bb6bae1857d77d1837e3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=9463060&rfr_iscdi=true