Artificial intelligence is increasingly being used to help developers identify security weaknesses in software, and a new tool from OpenAI reflects that shift.
The company has introduced Codex Security, an automated security assistant designed to examine software projects, detect vulnerabilities, confirm whether they can actually be exploited, and recommend ways to fix them.
The feature is currently being released as a research preview and can be accessed through the Codex interface by users subscribed to ChatGPT Pro, Enterprise, Business, and Edu plans. OpenAI said customers will be able to use the capability without cost during its first month of availability.
According to the company, the system studies how a codebase functions as a whole before attempting to locate security flaws. By building a detailed understanding of how the software operates, the tool aims to detect complicated vulnerabilities that may escape conventional automated scanners while filtering out minor or irrelevant issues that can overwhelm security teams.
The technology is an advancement of Aardvark, an internal project that entered private testing in October 2025 to help development and security teams locate and resolve weaknesses across large collections of source code.
During the last month of beta testing, Codex Security examined more than 1.2 million individual code commits across publicly accessible repositories. The analysis produced 792 critical vulnerabilities and 10,561 issues classified as high severity.
Several well-known open-source projects were affected, including OpenSSH, GnuTLS, GOGS, Thorium, libssh, PHP, and Chromium.
Some of the identified weaknesses were assigned official vulnerability identifiers. These included CVE-2026-24881 and CVE-2026-24882 linked to GnuPG, CVE-2025-32988 and CVE-2025-32989 affecting GnuTLS, and CVE-2025-64175 along with CVE-2026-25242 associated with GOGS. In the Thorium browser project, researchers also reported seven separate issues ranging from CVE-2025-35430 through CVE-2025-35436.
OpenAI explained that the system relies on advanced reasoning capabilities from its latest AI models together with automated verification techniques. This combination is intended to reduce the number of incorrect alerts while producing remediation guidance that developers can apply directly.
Repeated scans of the same repositories during testing also showed measurable improvements in accuracy. The company reported that the number of false alarms declined by more than 50 percent while the precision of vulnerability detection increased.
The platform operates through a multi-step process. It begins by examining a repository in order to understand the structure of the application and map areas where security risks are most likely to appear. From this analysis, the system produces an editable threat model describing the software’s behavior and potential attack surfaces.
Using that model as a reference point, the tool searches for weaknesses and evaluates how serious they could be in real-world scenarios. Suspected vulnerabilities are then executed in a sandbox environment to determine whether they can actually be exploited.
When configured with a project-specific runtime environment, the system can test potential vulnerabilities directly against a functioning version of the software. In some cases it can also generate proof-of-concept exploits, allowing security teams to confirm the problem before deploying a fix.
Once validation is complete, the tool suggests code changes designed to address the weakness while preserving the original behavior of the application. This approach is intended to reduce the risk that security patches introduce new software defects.
The launch of Codex Security follows the introduction of Claude Code Security by Anthropic, another system that analyzes software repositories to uncover vulnerabilities and propose remediation steps.
The emergence of these tools reflects a broader trend within cybersecurity: using artificial intelligence to review vast amounts of software code, detect vulnerabilities earlier in the development cycle, and assist developers in securing critical digital infrastructure.