The MALICIA dataset: identification and analysis of drive-by download operations

Bibliographic Details
Published in: International journal of information security 2015-02, Vol.14 (1), p.15-33
Main Authors: Nappa, Antonio; Rafique, M. Zubair; Caballero, Juan
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c485t-33ef853d1efc510017518ca4ac184d1f8f03d543025cfb0a7c779e7c0bdbb9633
cites cdi_FETCH-LOGICAL-c485t-33ef853d1efc510017518ca4ac184d1f8f03d543025cfb0a7c779e7c0bdbb9633
container_end_page 33
container_issue 1
container_start_page 15
container_title International journal of information security
container_volume 14
creator Nappa, Antonio
Rafique, M. Zubair
Caballero, Juan
description Drive-by downloads are the preferred distribution vector for many malware families. In the drive-by ecosystem, many exploit servers run the same exploit kit, and it is a challenge to understand whether a given exploit server is part of a larger operation. In this paper, we propose a technique to identify exploit servers managed by the same organization. We collect over time how exploit servers are configured, which exploits they use, and what malware they distribute, grouping servers with similar configurations into operations. Our operational analysis reveals that although individual exploit servers have a median lifetime of 16 h, long-lived operations exist that operate for several months. To sustain long-lived operations, miscreants are turning to the cloud, with 60 % of the exploit servers hosted by specialized cloud hosting services. We also observe operations that distribute multiple malware families, and find that pay-per-install affiliate programs are managing exploit servers for their affiliates to convert traffic into installations. Furthermore, we analyze the exploit polymorphism problem, measuring the repacking rate for different exploit types. To understand how difficult it is to take down exploit servers, we analyze the abuse reporting process and issue abuse reports for 19 long-lived servers. We describe the interaction with ISPs and hosting providers and monitor the result of each report. We find that 61 % of the reports are not even acknowledged. On average, an exploit server still lives for 4.3 days after a report. Finally, we detail the Malicia dataset we have collected and are making available to other researchers.
doi_str_mv 10.1007/s10207-014-0248-7
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1660095003</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1660095003</sourcerecordid><originalsourceid>FETCH-LOGICAL-c485t-33ef853d1efc510017518ca4ac184d1f8f03d543025cfb0a7c779e7c0bdbb9633</originalsourceid><addsrcrecordid>eNp1kE1LxDAQhoMoqKs_wFvBi5fqTJM0rbdl8WNhRQ_rOaT50C7dZk26yv57Wysigodh5vC8L8xDyBnCJQKIq4iQgUgBWQoZK1KxR44wR57yTMD-z51nh-Q4xhVAhlDiEXlavtrkYbqYz-bTxKhORdtdJ7WxbVe7Wquu9m2iWtOPanaxjol3iQn1u02rXWL8R9t4ZRK_seGLjSfkwKkm2tPvPSHPtzfL2X26eLybz6aLVLOCdyml1hWcGrRO8_4DFBwLrZjSWDCDrnBADWcUMq5dBUpoIUorNFSmqsqc0gm5GHs3wb9tbezkuo7aNo1qrd9GiXkOUHKAAT3_g678NvT_DBRjlBYFw57CkdLBxxisk5tQr1XYSQQ5OJajY9k7loNjKfpMNmZiz7YvNvxq_jf0CUgHfUs</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1644338841</pqid></control><display><type>article</type><title>The MALICIA dataset: identification and analysis of drive-by download operations</title><source>Criminology Collection</source><source>EBSCOhost Business Source Ultimate</source><source>Social Science Premium Collection</source><source>ABI/INFORM Global</source><source>Springer Nature</source><creator>Nappa, Antonio ; Rafique, M. Zubair ; Caballero, Juan</creator><creatorcontrib>Nappa, Antonio ; Rafique, M. Zubair ; Caballero, Juan</creatorcontrib><description>Drive-by downloads are the preferred distribution vector for many malware families. In the drive-by ecosystem, many exploit servers run the same exploit kit and it is a challenge understanding whether the exploit server is part of a larger operation. In this paper, we propose a technique to identify exploit servers managed by the same organization. We collect over time how exploit servers are configured, which exploits they use, and what malware they distribute, grouping servers with similar configurations into operations. 
Our operational analysis reveals that although individual exploit servers have a median lifetime of 16 h, long-lived operations exist that operate for several months. To sustain long-lived operations, miscreants are turning to the cloud, with 60 % of the exploit servers hosted by specialized cloud hosting services. We also observe operations that distribute multiple malware families, and find that pay-per-install affiliate programs are managing exploit servers for their affiliates to convert traffic into installations. Furthermore, we analyze the exploit polymorphism problem, measuring the repacking rate for different exploit types. To understand how difficult it is to take down exploit servers, we analyze the abuse reporting process and issue abuse reports for 19 long-lived servers. We describe the interaction with ISPs and hosting providers and monitor the result of each report. We find that 61 % of the reports are not even acknowledged. On average, an exploit server still lives for 4.3 days after a report. 
Finally, we detail the Malicia  dataset we have collected and are making available to other researchers.</description><identifier>ISSN: 1615-5262</identifier><identifier>EISSN: 1615-5270</identifier><identifier>DOI: 10.1007/s10207-014-0248-7</identifier><language>eng</language><publisher>Berlin/Heidelberg: Springer Berlin Heidelberg</publisher><subject>Automation ; Cloning ; Cloud computing ; Clouds ; Coding and Information Theory ; Communications Engineering ; Computer Communication Networks ; Computer information security ; Computer Science ; Computer viruses ; Cryptology ; Datasets ; Downloading ; Ecosystems ; Exploitation ; Hackers ; Infrastructure ; Internet service providers ; Law enforcement ; Malware ; Management of Computing and Information Systems ; Mathematical analysis ; Monitors ; Networks ; Operating Systems ; Polymorphism ; Regular Contribution ; Servers ; Statistical analysis ; Vectors (mathematics)</subject><ispartof>International journal of information security, 2015-02, Vol.14 (1), p.15-33</ispartof><rights>Springer-Verlag Berlin Heidelberg 2014</rights><rights>Springer-Verlag Berlin Heidelberg 
2015</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c485t-33ef853d1efc510017518ca4ac184d1f8f03d543025cfb0a7c779e7c0bdbb9633</citedby><cites>FETCH-LOGICAL-c485t-33ef853d1efc510017518ca4ac184d1f8f03d543025cfb0a7c779e7c0bdbb9633</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/1644338841/fulltextPDF?pq-origsite=primo$$EPDF$$P50$$Gproquest$$H</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/1644338841?pq-origsite=primo$$EHTML$$P50$$Gproquest$$H</linktohtml><link.rule.ids>314,780,784,11688,21376,21394,27924,27925,33611,33612,33769,33770,36060,36061,43733,43814,44363,74221,74310,74895</link.rule.ids></links><search><creatorcontrib>Nappa, Antonio</creatorcontrib><creatorcontrib>Rafique, M. Zubair</creatorcontrib><creatorcontrib>Caballero, Juan</creatorcontrib><title>The MALICIA dataset: identification and analysis of drive-by download operations</title><title>International journal of information security</title><addtitle>Int. J. Inf. Secur</addtitle><description>Drive-by downloads are the preferred distribution vector for many malware families. In the drive-by ecosystem, many exploit servers run the same exploit kit and it is a challenge understanding whether the exploit server is part of a larger operation. In this paper, we propose a technique to identify exploit servers managed by the same organization. We collect over time how exploit servers are configured, which exploits they use, and what malware they distribute, grouping servers with similar configurations into operations. Our operational analysis reveals that although individual exploit servers have a median lifetime of 16 h, long-lived operations exist that operate for several months. 
To sustain long-lived operations, miscreants are turning to the cloud, with 60 % of the exploit servers hosted by specialized cloud hosting services. We also observe operations that distribute multiple malware families and that pay-per-install affiliate programs are managing exploit servers for their affiliates to convert traffic into installations. Furthermore, we analyze the exploit polymorphism problem, measuring the repacking rate for different exploit types. To understand how difficult is to takedown exploit servers, we analyze the abuse reporting process and issue abuse reports for 19 long-lived servers. We describe the interaction with ISPs and hosting providers and monitor the result of the report. We find that 61 % of the reports are not even acknowledged. On average, an exploit server still lives for 4.3 days after a report. Finally, we detail the Malicia  dataset we have collected and are making available to other researchers.</description><subject>Automation</subject><subject>Cloning</subject><subject>Cloud computing</subject><subject>Clouds</subject><subject>Coding and Information Theory</subject><subject>Communications Engineering</subject><subject>Computer Communication Networks</subject><subject>Computer information security</subject><subject>Computer Science</subject><subject>Computer viruses</subject><subject>Cryptology</subject><subject>Datasets</subject><subject>Downloading</subject><subject>Ecosystems</subject><subject>Exploitation</subject><subject>Hackers</subject><subject>Infrastructure</subject><subject>Internet service providers</subject><subject>Law enforcement</subject><subject>Malware</subject><subject>Management of Computing and Information Systems</subject><subject>Mathematical analysis</subject><subject>Monitors</subject><subject>Networks</subject><subject>Operating Systems</subject><subject>Polymorphism</subject><subject>Regular Contribution</subject><subject>Servers</subject><subject>Statistical analysis</subject><subject>Vectors 
(mathematics)</subject><issn>1615-5262</issn><issn>1615-5270</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><sourceid>ALSLI</sourceid><sourceid>BGRYB</sourceid><sourceid>M0C</sourceid><sourceid>M0O</sourceid><recordid>eNp1kE1LxDAQhoMoqKs_wFvBi5fqTJM0rbdl8WNhRQ_rOaT50C7dZk26yv57Wysigodh5vC8L8xDyBnCJQKIq4iQgUgBWQoZK1KxR44wR57yTMD-z51nh-Q4xhVAhlDiEXlavtrkYbqYz-bTxKhORdtdJ7WxbVe7Wquu9m2iWtOPanaxjol3iQn1u02rXWL8R9t4ZRK_seGLjSfkwKkm2tPvPSHPtzfL2X26eLybz6aLVLOCdyml1hWcGrRO8_4DFBwLrZjSWDCDrnBADWcUMq5dBUpoIUorNFSmqsqc0gm5GHs3wb9tbezkuo7aNo1qrd9GiXkOUHKAAT3_g678NvT_DBRjlBYFw57CkdLBxxisk5tQr1XYSQQ5OJajY9k7loNjKfpMNmZiz7YvNvxq_jf0CUgHfUs</recordid><startdate>20150201</startdate><enddate>20150201</enddate><creator>Nappa, Antonio</creator><creator>Rafique, M. Zubair</creator><creator>Caballero, Juan</creator><general>Springer Berlin Heidelberg</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>0-V</scope><scope>0U~</scope><scope>1-H</scope><scope>3V.</scope><scope>7SC</scope><scope>7WY</scope><scope>7WZ</scope><scope>7XB</scope><scope>87Z</scope><scope>88F</scope><scope>8AL</scope><scope>8AM</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>8FL</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ALSLI</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BEZIV</scope><scope>BGLVJ</scope><scope>BGRYB</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FRNLG</scope><scope>F~G</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K60</scope><scope>K6~</scope><scope>K7-</scope><scope>K7.</scope><scope>L.-</scope><scope>L.0</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0C</scope><scope>M0N</scope><scope>M0O</scope><scope>M1Q</scope><scope>P5Z</scope><scope>P62</scope><scope>PQBIZ</scope><scope>PQBZA</scope><scope>PQEST</scope><scope>PQQKQ</scope><
scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope></search><sort><creationdate>20150201</creationdate><title>The MALICIA dataset: identification and analysis of drive-by download operations</title><author>Nappa, Antonio ; Rafique, M. Zubair ; Caballero, Juan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c485t-33ef853d1efc510017518ca4ac184d1f8f03d543025cfb0a7c779e7c0bdbb9633</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Automation</topic><topic>Cloning</topic><topic>Cloud computing</topic><topic>Clouds</topic><topic>Coding and Information Theory</topic><topic>Communications Engineering</topic><topic>Computer Communication Networks</topic><topic>Computer information security</topic><topic>Computer Science</topic><topic>Computer viruses</topic><topic>Cryptology</topic><topic>Datasets</topic><topic>Downloading</topic><topic>Ecosystems</topic><topic>Exploitation</topic><topic>Hackers</topic><topic>Infrastructure</topic><topic>Internet service providers</topic><topic>Law enforcement</topic><topic>Malware</topic><topic>Management of Computing and Information Systems</topic><topic>Mathematical analysis</topic><topic>Monitors</topic><topic>Networks</topic><topic>Operating Systems</topic><topic>Polymorphism</topic><topic>Regular Contribution</topic><topic>Servers</topic><topic>Statistical analysis</topic><topic>Vectors (mathematics)</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Nappa, Antonio</creatorcontrib><creatorcontrib>Rafique, M. 
Zubair</creatorcontrib><creatorcontrib>Caballero, Juan</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Social Sciences Premium Collection</collection><collection>Global News &amp; ABI/Inform Professional</collection><collection>Trade PRO</collection><collection>ProQuest Central (Corporate)</collection><collection>Computer and Information Systems Abstracts</collection><collection>ABI/INFORM Collection</collection><collection>ABI/INFORM Global (PDF only)</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>ABI/INFORM Global (Alumni Edition)</collection><collection>Military Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>Criminal Justice Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>Social Science Premium Collection</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Business Premium Collection</collection><collection>Technology Collection</collection><collection>Criminology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Business Premium Collection (Alumni)</collection><collection>ABI/INFORM Global (Corporate)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science 
Collection</collection><collection>ProQuest Business Collection (Alumni Edition)</collection><collection>ProQuest Business Collection</collection><collection>Computer science database</collection><collection>ProQuest Criminal Justice (Alumni)</collection><collection>ABI/INFORM Professional Advanced</collection><collection>ABI/INFORM Professional Standard</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ABI/INFORM Global</collection><collection>Computing Database</collection><collection>Criminal Justice Database</collection><collection>Military Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest One Business</collection><collection>ProQuest One Business (Alumni)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><jtitle>International journal of information security</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Nappa, Antonio</au><au>Rafique, M. Zubair</au><au>Caballero, Juan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>The MALICIA dataset: identification and analysis of drive-by download operations</atitle><jtitle>International journal of information security</jtitle><stitle>Int. J. Inf. 
Secur</stitle><date>2015-02-01</date><risdate>2015</risdate><volume>14</volume><issue>1</issue><spage>15</spage><epage>33</epage><pages>15-33</pages><issn>1615-5262</issn><eissn>1615-5270</eissn><abstract>Drive-by downloads are the preferred distribution vector for many malware families. In the drive-by ecosystem, many exploit servers run the same exploit kit and it is a challenge understanding whether the exploit server is part of a larger operation. In this paper, we propose a technique to identify exploit servers managed by the same organization. We collect over time how exploit servers are configured, which exploits they use, and what malware they distribute, grouping servers with similar configurations into operations. Our operational analysis reveals that although individual exploit servers have a median lifetime of 16 h, long-lived operations exist that operate for several months. To sustain long-lived operations, miscreants are turning to the cloud, with 60 % of the exploit servers hosted by specialized cloud hosting services. We also observe operations that distribute multiple malware families and that pay-per-install affiliate programs are managing exploit servers for their affiliates to convert traffic into installations. Furthermore, we analyze the exploit polymorphism problem, measuring the repacking rate for different exploit types. To understand how difficult is to takedown exploit servers, we analyze the abuse reporting process and issue abuse reports for 19 long-lived servers. We describe the interaction with ISPs and hosting providers and monitor the result of the report. We find that 61 % of the reports are not even acknowledged. On average, an exploit server still lives for 4.3 days after a report. 
Finally, we detail the Malicia  dataset we have collected and are making available to other researchers.</abstract><cop>Berlin/Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/s10207-014-0248-7</doi><tpages>19</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1615-5262
ispartof International journal of information security, 2015-02, Vol.14 (1), p.15-33
issn 1615-5262
1615-5270
language eng
recordid cdi_proquest_miscellaneous_1660095003
source Criminology Collection; EBSCOhost Business Source Ultimate; Social Science Premium Collection; ABI/INFORM Global; Springer Nature
subjects Automation
Cloning
Cloud computing
Clouds
Coding and Information Theory
Communications Engineering
Computer Communication Networks
Computer information security
Computer Science
Computer viruses
Cryptology
Datasets
Downloading
Ecosystems
Exploitation
Hackers
Infrastructure
Internet service providers
Law enforcement
Malware
Management of Computing and Information Systems
Mathematical analysis
Monitors
Networks
Operating Systems
Polymorphism
Regular Contribution
Servers
Statistical analysis
Vectors (mathematics)
title The MALICIA dataset: identification and analysis of drive-by download operations
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T20%3A24%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20MALICIA%20dataset:%20identification%20and%20analysis%20of%20drive-by%20download%20operations&rft.jtitle=International%20journal%20of%20information%20security&rft.au=Nappa,%20Antonio&rft.date=2015-02-01&rft.volume=14&rft.issue=1&rft.spage=15&rft.epage=33&rft.pages=15-33&rft.issn=1615-5262&rft.eissn=1615-5270&rft_id=info:doi/10.1007/s10207-014-0248-7&rft_dat=%3Cproquest_cross%3E1660095003%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c485t-33ef853d1efc510017518ca4ac184d1f8f03d543025cfb0a7c779e7c0bdbb9633%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1644338841&rft_id=info:pmid/&rfr_iscdi=true