Loading…
OSSFP: Precise and Scalable C/C++ Third-Party Library Detection Using Fingerprinting Functions
Third-party libraries (TPLs) are frequently used in software to boost efficiency by avoiding repeated developments. However, the massive using TPLs also brings security threats since TPLs may introduce bugs and vulnerabilities. Therefore, software composition analysis (SCA) tools have been proposed...
Saved in:
Main Authors: | , , , , , , , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Third-party libraries (TPLs) are frequently used in software to boost efficiency by avoiding repeated developments. However, the massive using TPLs also brings security threats since TPLs may introduce bugs and vulnerabilities. Therefore, software composition analysis (SCA) tools have been proposed to detect and manage TPL usage. Unfortunately, due to the presence of common and trivial functions in the bloated feature dataset, existing tools fail to precisely and rapidly identify TPLs in C/C++ real-world projects. To this end, we propose OSSFP, a novel SCA framework for effective and efficient TPL detection in large-scale real-world projects via generating unique fingerprints for open source software. By removing common and trivial functions and keeping only the core functions to build the fingerprint index for each TPL project, OSSFP significantly reduces the database size and accelerates the detection process. It also improves TPL detection accuracy since noises are excluded from the fingerprints. We applied OSSFP on a large data set containing 23,427 C/C++ repositories, which included 585,683 versions and 90 billion lines of code. The result showed that it could achieve 90.84% of recall and 90.34% of precision, which outperformed the state-of-the-art tool by 35.31% and 3.71%, respectively. OSSFP took only 0.12 seconds on average to identify all TPLs per project, which was 22 times faster than the other tool. OSSFP has proven to be highly scalable on large-scale datasets. |
---|---|
ISSN: | 1558-1225 |
DOI: | 10.1109/ICSE48619.2023.00034 |