Evaluating Large Language Models for Real World Vulnerability Repair in C/C++ Code
Published
Author(s)
Lan Zhang, Qingtian Zou, Anoop Singhal, Xiaoyan Sun, Peng Liu
Abstract
The advent of Large Language Models (LLMs) has enabled advancement in automated code generation, translation, and summarization. Despite their promise, evaluating the use of LLMs in repairing real-world code vulnerabilities remains underexplored. In this study, we address this gap by evaluating the capability of advanced LLMs, such as ChatGPT-4 and Claude, in fixing memory corruption vulnerabilities in real-world C/C++ code. We meticulously curated 223 real-world C/C++ code snippets encompassing a spectrum of memory corruption vulnerabilities, ranging from straightforward memory leaks to intricate buffer errors. Our findings demonstrate the proficiency of LLMs in rectifying simple memory errors like leaks, where fixes are confined to localized code segments. However, their effectiveness diminishes when addressing complicated vulnerabilities necessitating reasoning about cross-cutting concerns and deeper program semantics. Furthermore, we explore techniques for augmenting LLM performance by incorporating additional knowledge. Our results shed light on both the strengths and limitations of LLMs in automated program repair on genuine code, underscoring the need for advancements in reasoning abilities for handling complex code repair tasks.
Proceedings Title
IWSPA 2024: Proceedings of the 10th ACM International Workshop on Security and Privacy Analytics
Conference Dates
June 21, 2024
Conference Location
Porto, PT
Conference Title
CODASPY 2024: Fourteenth ACM Conference on Data and Application Security and Privacy
Zhang, L., Zou, Q., Singhal, A., Sun, X. and Liu, P. (2024), Evaluating Large Language Models for Real World Vulnerability Repair in C/C++ Code, IWSPA 2024: Proceedings of the 10th ACM International Workshop on Security and Privacy Analytics, Porto, PT, [online], https://doi.org/10.1145/3643651.3659892, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=957853 (Accessed October 9, 2025)