RepairMaster: Enhancing LLM-Based Automated Vulnerability Repair through Cross-Fragment Information Fusion, Structure-Aware Fine-Tuning, and Bimodal Semantic Retrieval
Abstract
Software vulnerabilities in C/C++ codebases pose critical security threats, and the manual effort needed to remediate them is a bottleneck in secure software development lifecycles. Large Language Model (LLM)-based Automated Program Repair (APR) could accelerate vulnerability remediation, but existing approaches face two limitations that restrict their effectiveness in open-world deployment: (1) real-world vulnerability logic is complex, involving interdependencies among multiple code fragments across function boundaries and intricate control flow that naive code-feeding approaches cannot reason about; and (2) the historical patch knowledge accumulated in vulnerability databases is underused, even though it contains directly relevant repair strategies that could guide LLM generation, because accessing it effectively requires sophisticated retrieval. To address these challenges, this paper proposes RepairMaster, an LLM-based vulnerability repair framework that integrates three complementary components. The Cross-Fragment Information Fusion (CFIF) module lets the LLM reason across multiple related code fragments (callee functions, global variable definitions, and type declarations) that supply the context needed to understand a vulnerability's root cause. The Structure-Aware Fine-Tuning (S-AST) mechanism incorporates simplified Abstract Syntax Tree (AST), Control Flow Graph (CFG), and Program Dependence Graph (PDG) representations into the fine-tuning objective, so that the model learns repair patterns at the level of code structure rather than token sequences alone. The Bimodal Semantic Retrieval Enhancement (BSRE) module retrieves relevant historical patches by combining code embedding similarity with natural language description similarity, supplying the LLM with contextually matched repair examples from a database of more than 5,800 vulnerable C/C++ functions drawn from 1,700 real-world projects. On the benchmark dataset, RepairMaster improves Exact Match (EM) from 20.00% to 31.76%, BLEU from 25.70% to 29.12%, and CodeBLEU from 39.40% to 43.68% relative to the best prior methods. Validation on real CVE vulnerabilities yields a CodeBLEU of 28.74%, which supports the framework's practical applicability.
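The bimodal retrieval idea behind BSRE can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that the code and description similarities are combined by a weighted sum with a hypothetical weight `alpha`, and it uses a toy bag-of-tokens `embed` function in place of a learned encoder. The `PatchRecord` type and the two database entries are likewise invented for illustration.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class PatchRecord:
    """Hypothetical entry in a historical patch database."""
    vulnerable_code: str
    description: str
    patch: str


def embed(text: str) -> Counter:
    """Toy stand-in for a learned encoder: bag of lowercase word/identifier tokens."""
    return Counter(re.findall(r"[A-Za-z_]\w*", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_code, query_desc, database, k=1, alpha=0.5):
    """Rank records by a weighted sum of code and description similarity.

    alpha is an assumed mixing weight: 1.0 uses code only, 0.0 description only.
    Returns the top-k (score, record) pairs, best first.
    """
    q_code, q_desc = embed(query_code), embed(query_desc)
    scored = [
        (
            alpha * cosine(q_code, embed(rec.vulnerable_code))
            + (1 - alpha) * cosine(q_desc, embed(rec.description)),
            rec,
        )
        for rec in database
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Illustrative database; these entries are made up for the sketch.
DATABASE = [
    PatchRecord(
        vulnerable_code="char buf[64]; strcpy(buf, input);",
        description="stack buffer overflow from unbounded string copy",
        patch="strncpy(buf, input, sizeof(buf) - 1);",
    ),
    PatchRecord(
        vulnerable_code="free(ptr); ptr->len = 0;",
        description="use after free when writing to released memory",
        patch="ptr->len = 0; free(ptr); ptr = NULL;",
    ),
]

top = retrieve(
    "char name[32]; strcpy(name, user_input);",
    "buffer overflow in string copy",
    DATABASE,
)
```

Here `top[0][1]` is the `strcpy` record, since it overlaps the query in both modalities while the use-after-free record overlaps in neither. In a real system, the retrieved `patch` would be placed in the LLM prompt as a repair example.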
