Discovering API usage specifications for security detection using two-stage code mining

Bibliographic Details
Published in: Cybersecurity (Singapore) 2024-12, Vol.7 (1), p.30-23, Article 30
Main Authors: Yin, Zhongxu; Song, Yiran; Zong, Guoxiao
Format: Article
Language: English
Description
Summary: An application programming interface (API) usage specification, which includes the conditions, calling sequences, and semantic relationships of the API, is important for verifying its correct usage, which is in turn critical for ensuring the security and availability of the target program. However, existing techniques either mine the co-occurring relationships of multiple APIs without considering their semantic relationships, or use data flow and control flow information to extract semantic beliefs on API pairs, an approach that is difficult to extend to mining specifications for multiple APIs. Hence, we propose an API specification mining approach that efficiently extracts a relatively complete list of the API combinations and semantic relationships between APIs. This approach analyzes a target program in two stages. The first stage uses frequent API set mining based on frequent common API identification and filtration to extract the maximal set of frequent context-sensitive API sequences. In the second stage, the API relationship graph is constructed using three semantic relationships extracted from the symbolic path information, and the specifications containing semantic relationships for multiple APIs are mined. The experimental results on six popular open-source code bases of different scales show that the proposed two-stage approach not only yields better results than existing typical approaches, but can also effectively discover the specifications along with the semantic relationships for multiple APIs. Instance analysis shows that analyzing security-related API call violations can assist in the cause analysis and patching of software vulnerabilities.
ISSN:2523-3246
DOI:10.1186/s42400-024-00224-w
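
Illustrative sketch (not from the article): the summary describes a first stage that mines maximal frequent API sets. The Python below shows a generic Apriori-style version of that idea, assuming each program path has already been reduced to a list of API names. It is not the authors' algorithm: it ignores call order and context sensitivity, omits the common-API filtration step, and does not attempt the second-stage semantic relationship graph. The function names `frequent_api_sets` and `maximal_sets` and the `min_support` parameter are hypothetical.

```python
from itertools import combinations


def frequent_api_sets(traces, min_support):
    """Return {api_set: support} for every API set seen in >= min_support traces.

    Assumption: each trace is a list of API names taken from one program path.
    """
    transactions = [frozenset(t) for t in traces]
    # Level 1 candidates: every individual API name.
    apis = sorted({api for t in transactions for api in t})
    candidates = [frozenset([a]) for a in apis]
    frequent = {}
    while candidates:
        # Support = number of traces that contain the whole candidate set.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-sets that differ in one element to form (k+1)-sets.
        keys = list(level)
        size = len(keys[0]) + 1 if keys else 0
        candidates = list(
            {a | b for a, b in combinations(keys, 2) if len(a | b) == size}
        )
    return frequent


def maximal_sets(frequent):
    """Keep only the frequent sets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]
```

For example, calling `maximal_sets(frequent_api_sets(traces, 2))` on a handful of traces that repeatedly pair `malloc` with `free` would surface that pairing as a candidate usage rule; a path that calls one without the other could then be flagged for review.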