Identifying runtime libraries in statically linked Linux binaries
Published in: | Future generation computer systems 2025-03, Vol.164, p.107602, Article 107602 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Vulnerabilities in unpatched applications can originate from third-party dependencies in statically linked applications, since such applications must be relinked each time a library is updated to fix a vulnerability. Despite this, malware binaries are often statically linked to ensure they run on target platforms and to complicate malware analysis. Identifying libraries therefore becomes crucial in malware analysis, as it helps analysts filter out library functions and focus on the malware's own functions. In this paper, we introduce MANTILLA, a system for identifying runtime libraries in statically linked Linux-based binaries. Our system uses radare2 to identify functions and extract their features (independent of the underlying architecture of the binary) through static binary analysis, and applies the K-nearest neighbors supervised machine learning model with a majority rule to predict final values. MANTILLA is evaluated on a dataset consisting of binaries built for different architectures (MIPSeb, ARMel, Intel x86, and Intel x86-64) and different runtime libraries (uClibc, glibc, and musl), achieving very high accuracy. We also evaluate it in two case studies: first on a dataset of binary files belonging to the binutils collection, and second on an IoT malware dataset. In both cases, it achieves good accuracy in both runtime library detection (94.4% and 95.5%, respectively) and architecture identification (100% and 98.6%, respectively).
• We present MANTILLA, a system that automatically identifies the runtime library within a given binary using static binary analysis and KNN classification.
• We evaluate MANTILLA on a dataset of real-world (statically linked) IoT malware, observing that the majority of samples use uClibc rather than glibc or musl.
• Our system achieves accuracy close to 95.5% on runtime library identification.
• In addition, we evaluate our system on a dataset built from the tools in the binutils collection, achieving an accuracy of 94.4%. |
---|---|
ISSN: | 0167-739X |
DOI: | 10.1016/j.future.2024.107602 |
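
The summary describes classifying each function with K-nearest neighbors and then applying a majority rule to reach a final prediction for the binary. The sketch below illustrates that general idea only; it is not MANTILLA's implementation. The two-dimensional feature vectors, the training samples, the value of `k`, and the helper names `knn_predict` and `classify_binary` are all assumptions made up for illustration, and the radare2 feature-extraction step is omitted entirely.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label one feature vector by majority among its k nearest training vectors."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def classify_binary(train, function_vectors, k=3):
    """Predict a label per function, then apply a majority rule over the binary."""
    votes = Counter(knn_predict(train, vec, k) for vec in function_vectors)
    return votes.most_common(1)[0][0], votes


# Hypothetical training set: (feature vector, runtime library label).
# Real features would come from static analysis; these numbers are invented.
TRAIN = [
    ((10.0, 2.0), "uClibc"), ((11.0, 3.0), "uClibc"), ((9.0, 2.5), "uClibc"),
    ((30.0, 8.0), "glibc"), ((32.0, 9.0), "glibc"), ((29.0, 7.5), "glibc"),
    ((20.0, 15.0), "musl"), ((21.0, 16.0), "musl"), ((19.0, 14.0), "musl"),
]

# Hypothetical per-function feature vectors from one unknown binary.
FUNCTIONS = [(10.5, 2.2), (9.5, 2.8), (31.0, 8.5), (10.0, 3.0)]

label, votes = classify_binary(TRAIN, FUNCTIONS)
print(label, dict(votes))
```

In this toy run, one function is closest to the glibc samples, but the majority rule still assigns the binary to uClibc. This is the reason for voting across functions: a few misclassified functions do not change the final answer.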