Identifying Authorship in Malicious Binaries: Features, Challenges & Datasets
Attributing a piece of malware to its creator typically requires threat intelligence. Binary attribution increases the level of difficulty as it mostly relies upon the ability to disassemble binaries to obtain authorship-related features. We perform a systematic analysis of works in the area of malw...
Published in: | ACM computing surveys 2024-08, Vol.56 (8), p.1-36, Article 212 |
---|---|
Main Authors: | Gray, Jason; Sgandurra, Daniele; Cavallaro, Lorenzo; Blasco Alis, Jorge |
Format: | Article |
Language: | English |
cited_by | |
---|---|
cites | cdi_FETCH-LOGICAL-a267t-36430d624b198843d5a851a0c7d2ef9e9f2224da95a309820f8009056a5fcfdb3 |
container_end_page | 36 |
container_issue | 8 |
container_start_page | 1 |
container_title | ACM computing surveys |
container_volume | 56 |
creator | Gray, Jason; Sgandurra, Daniele; Cavallaro, Lorenzo; Blasco Alis, Jorge |
description | Attributing a piece of malware to its creator typically requires threat intelligence. Binary attribution increases the level of difficulty as it mostly relies upon the ability to disassemble binaries to obtain authorship-related features. We perform a systematic analysis of works in the area of malware authorship attribution. We identify key findings and some shortcomings of current approaches and explore the open research challenges. To mitigate the lack of ground-truth datasets in this domain, we publish alongside this survey the largest and most diverse meta-information dataset of 17,513 malware labeled to 275 threat actor groups. |
doi_str_mv | 10.1145/3653973 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3090665955</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3090665955</sourcerecordid><originalsourceid>FETCH-LOGICAL-a267t-36430d624b198843d5a851a0c7d2ef9e9f2224da95a309820f8009056a5fcfdb3</originalsourceid><addsrcrecordid>eNo90D1PwzAQBmALgUQpiJ3JEhIsBM6fidnaQqFSKxaYo2tit67SpNjJ0H9PUAvTDfe8d9JLyDWDR8akehJaCZOKEzJgSqVJKiQ7JQMQGhIQAOfkIsYNAHDJ9IAsZqWtW-_2vl7RUdeumxDXfkd9TRdY-cI3XaRjX2PwNj7TqcW2CzY-0Mkaq8rWKxvpHX3BFqNt4yU5c1hFe3WcQ_I1ff2cvCfzj7fZZDRPkOu0TYSWAkrN5ZKZLJOiVJgphlCkJbfOWOM457JEo1CAyTi4DMCA0qhc4cqlGJLbw91daL47G9t803Sh7l_mfQC0VkapXt0fVBGaGIN1-S74LYZ9ziD_7So_dtXLm4PEYvuP_pY_HqVhtw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3090665955</pqid></control><display><type>article</type><title>Identifying Authorship in Malicious Binaries: Features, Challenges & Datasets</title><source>Business Source Ultimate</source><source>Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)</source><creator>Gray, Jason ; Sgandurra, Daniele ; Cavallaro, Lorenzo ; Blasco Alis, Jorge</creator><creatorcontrib>Gray, Jason ; Sgandurra, Daniele ; Cavallaro, Lorenzo ; Blasco Alis, Jorge</creatorcontrib><description>Attributing a piece of malware to its creator typically requires threat intelligence. Binary attribution increases the level of difficulty as it mostly relies upon the ability to disassemble binaries to obtain authorship-related features. We perform a systematic analysis of works in the area of malware authorship attribution. We identify key findings and some shortcomings of current approaches and explore the open research challenges. 
To mitigate the lack of ground-truth datasets in this domain, we publish alongside this survey the largest and most diverse meta-information dataset of 17,513 malware labeled to 275 threat actor groups.</description><identifier>ISSN: 0360-0300</identifier><identifier>EISSN: 1557-7341</identifier><identifier>DOI: 10.1145/3653973</identifier><language>eng</language><publisher>New York, NY, USA: ACM</publisher><subject>Applied computing ; Computer science ; Datasets ; Evidence collection, storage and analysis ; General and reference ; Intelligence gathering ; Investigation techniques ; Malware ; Malware and its mitigation ; Pseudonymity, anonymity and untraceability ; Security and privacy ; Surveys and overviews ; Threat evaluation</subject><ispartof>ACM computing surveys, 2024-08, Vol.56 (8), p.1-36, Article 212</ispartof><rights>Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. 
Request permissions from</rights><rights>Copyright Association for Computing Machinery Aug 2024</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-a267t-36430d624b198843d5a851a0c7d2ef9e9f2224da95a309820f8009056a5fcfdb3</cites><orcidid>0000-0003-4392-9023 ; 0000-0001-5238-8068 ; 0000-0003-3518-7023 ; 0000-0002-3878-2680</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Gray, Jason</creatorcontrib><creatorcontrib>Sgandurra, Daniele</creatorcontrib><creatorcontrib>Cavallaro, Lorenzo</creatorcontrib><creatorcontrib>Blasco Alis, Jorge</creatorcontrib><title>Identifying Authorship in Malicious Binaries: Features, Challenges & Datasets</title><title>ACM computing surveys</title><addtitle>ACM CSUR</addtitle><description>Attributing a piece of malware to its creator typically requires threat intelligence. Binary attribution increases the level of difficulty as it mostly relies upon the ability to disassemble binaries to obtain authorship-related features. We perform a systematic analysis of works in the area of malware authorship attribution. We identify key findings and some shortcomings of current approaches and explore the open research challenges. 
To mitigate the lack of ground-truth datasets in this domain, we publish alongside this survey the largest and most diverse meta-information dataset of 17,513 malware labeled to 275 threat actor groups.</description><subject>Applied computing</subject><subject>Computer science</subject><subject>Datasets</subject><subject>Evidence collection, storage and analysis</subject><subject>General and reference</subject><subject>Intelligence gathering</subject><subject>Investigation techniques</subject><subject>Malware</subject><subject>Malware and its mitigation</subject><subject>Pseudonymity, anonymity and untraceability</subject><subject>Security and privacy</subject><subject>Surveys and overviews</subject><subject>Threat evaluation</subject><issn>0360-0300</issn><issn>1557-7341</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNo90D1PwzAQBmALgUQpiJ3JEhIsBM6fidnaQqFSKxaYo2tit67SpNjJ0H9PUAvTDfe8d9JLyDWDR8akehJaCZOKEzJgSqVJKiQ7JQMQGhIQAOfkIsYNAHDJ9IAsZqWtW-_2vl7RUdeumxDXfkd9TRdY-cI3XaRjX2PwNj7TqcW2CzY-0Mkaq8rWKxvpHX3BFqNt4yU5c1hFe3WcQ_I1ff2cvCfzj7fZZDRPkOu0TYSWAkrN5ZKZLJOiVJgphlCkJbfOWOM457JEo1CAyTi4DMCA0qhc4cqlGJLbw91daL47G9t803Sh7l_mfQC0VkapXt0fVBGaGIN1-S74LYZ9ziD_7So_dtXLm4PEYvuP_pY_HqVhtw</recordid><startdate>20240801</startdate><enddate>20240801</enddate><creator>Gray, Jason</creator><creator>Sgandurra, Daniele</creator><creator>Cavallaro, Lorenzo</creator><creator>Blasco Alis, Jorge</creator><general>ACM</general><general>Association for Computing 
Machinery</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0003-4392-9023</orcidid><orcidid>https://orcid.org/0000-0001-5238-8068</orcidid><orcidid>https://orcid.org/0000-0003-3518-7023</orcidid><orcidid>https://orcid.org/0000-0002-3878-2680</orcidid></search><sort><creationdate>20240801</creationdate><title>Identifying Authorship in Malicious Binaries: Features, Challenges & Datasets</title><author>Gray, Jason ; Sgandurra, Daniele ; Cavallaro, Lorenzo ; Blasco Alis, Jorge</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a267t-36430d624b198843d5a851a0c7d2ef9e9f2224da95a309820f8009056a5fcfdb3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Applied computing</topic><topic>Computer science</topic><topic>Datasets</topic><topic>Evidence collection, storage and analysis</topic><topic>General and reference</topic><topic>Intelligence gathering</topic><topic>Investigation techniques</topic><topic>Malware</topic><topic>Malware and its mitigation</topic><topic>Pseudonymity, anonymity and untraceability</topic><topic>Security and privacy</topic><topic>Surveys and overviews</topic><topic>Threat evaluation</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Gray, Jason</creatorcontrib><creatorcontrib>Sgandurra, Daniele</creatorcontrib><creatorcontrib>Cavallaro, Lorenzo</creatorcontrib><creatorcontrib>Blasco Alis, Jorge</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – 
Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>ACM computing surveys</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Gray, Jason</au><au>Sgandurra, Daniele</au><au>Cavallaro, Lorenzo</au><au>Blasco Alis, Jorge</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Identifying Authorship in Malicious Binaries: Features, Challenges & Datasets</atitle><jtitle>ACM computing surveys</jtitle><stitle>ACM CSUR</stitle><date>2024-08-01</date><risdate>2024</risdate><volume>56</volume><issue>8</issue><spage>1</spage><epage>36</epage><pages>1-36</pages><artnum>212</artnum><issn>0360-0300</issn><eissn>1557-7341</eissn><abstract>Attributing a piece of malware to its creator typically requires threat intelligence. Binary attribution increases the level of difficulty as it mostly relies upon the ability to disassemble binaries to obtain authorship-related features. We perform a systematic analysis of works in the area of malware authorship attribution. We identify key findings and some shortcomings of current approaches and explore the open research challenges. To mitigate the lack of ground-truth datasets in this domain, we publish alongside this survey the largest and most diverse meta-information dataset of 17,513 malware labeled to 275 threat actor groups.</abstract><cop>New York, NY, USA</cop><pub>ACM</pub><doi>10.1145/3653973</doi><tpages>36</tpages><orcidid>https://orcid.org/0000-0003-4392-9023</orcidid><orcidid>https://orcid.org/0000-0001-5238-8068</orcidid><orcidid>https://orcid.org/0000-0003-3518-7023</orcidid><orcidid>https://orcid.org/0000-0002-3878-2680</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0360-0300 |
ispartof | ACM computing surveys, 2024-08, Vol.56 (8), p.1-36, Article 212 |
issn | 0360-0300; 1557-7341 |
language | eng |
recordid | cdi_proquest_journals_3090665955 |
source | Business Source Ultimate; Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list) |
subjects | Applied computing; Computer science; Datasets; Evidence collection, storage and analysis; General and reference; Intelligence gathering; Investigation techniques; Malware; Malware and its mitigation; Pseudonymity, anonymity and untraceability; Security and privacy; Surveys and overviews; Threat evaluation |
title | Identifying Authorship in Malicious Binaries: Features, Challenges & Datasets |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T11%3A14%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Identifying%20Authorship%20in%20Malicious%20Binaries:%20Features,%20Challenges%20&%20Datasets&rft.jtitle=ACM%20computing%20surveys&rft.au=Gray,%20Jason&rft.date=2024-08-01&rft.volume=56&rft.issue=8&rft.spage=1&rft.epage=36&rft.pages=1-36&rft.artnum=212&rft.issn=0360-0300&rft.eissn=1557-7341&rft_id=info:doi/10.1145/3653973&rft_dat=%3Cproquest_cross%3E3090665955%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-a267t-36430d624b198843d5a851a0c7d2ef9e9f2224da95a309820f8009056a5fcfdb3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3090665955&rft_id=info:pmid/&rfr_iscdi=true |