OpenAI just rolled out the full version of GPT 5.5 Cyber, a specialized AI built for cybersecurity. And they're pretty excited about one thing: it beat Anthropic’s Claude Mythos 5 on CyberGym, which is a tough benchmark that tests whether AI can spot known software vulnerabilities. This launch follows a preview version from a few months ago and really shows OpenAI is putting more energy into cybersecurity as a major use case for advanced AI.
This new model doesn’t just find basic vulnerabilities. OpenAI says GPT 5.5 Cyber digs into big codebases, finds security-critical components, checks if risky code is actually exploitable, and helps teams figure out how to fix issues. They designed it to be powerful and more flexible for expert, authorized cybersecurity work, while still keeping all the reasoning and coding skills you’d expect from the regular GPT 5.5.
HIGHLIGHTS
OpenAI’s new numbers show GPT 5.5 Cyber leading the pack on CyberGym, a benchmark that tests how well AI can identify known software vulnerabilities. GPT 5.5 Cyber scored 85.6%, beating the general GPT 5.5 model at 81.8% and nudging past Claude Mythos 5, which landed at 83.8%. This isn’t the widest gap, but it’s enough for OpenAI to claim an edge - especially since CyberGym is a key test in the world of cybersecurity AI.
What does all this mean? Specialized training for cybersecurity really helps. OpenAI wants to show that narrowing the AI’s focus pays off, especially when it comes to catching real-world security flaws.
These results also highlight the bigger story: AI companies are now racing to build models that cater to security professionals. In that field, accuracy and technical know-how matter a lot more than general conversation skills. And with this benchmark, OpenAI can say their specialized model is leading - at least for now.
In its security research, OpenAI also compares the specialist model against its generalist sibling, in this case GPT 5.5 Cyber against standard GPT 5.5. OpenAI claims clear gains for the cybersecurity-specialised model: on ExploitGym, the Cyber version scores 39.5% against 25.95% for GPT 5.5. ExploitGym tests whether an AI can turn a known vulnerability into a working exploit that achieves unauthorized code execution.
On SEC-bench Pro, the specialist model scores 69.8%, compared with 63.1% for standard GPT 5.5.
Taken together, these results suggest the model does more than recognise bugs in code: it has also picked up broader security risks and the exploit development paths that can follow from a flaw. With such a system as an assistant, cybersecurity teams could dig into much larger codebases and judge risk more accurately, with potential fixes already identified. OpenAI itself asserts that the system could “help security professionals deal with sophisticated and expansive software domains while still having the flexibility needed for extended and complex analysis.”
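To put the reported gaps side by side, here is a minimal Python sketch that works out the percentage-point difference and the relative gain for each benchmark. The scores are the ones quoted in this article; the dictionary layout and the function names `point_gap` and `relative_gain` are our own illustration, not anything OpenAI has published.

```python
# Scores in percent, as reported in this article.
# The data layout and helper names are illustrative assumptions.
SCORES = {
    "CyberGym": {"GPT 5.5 Cyber": 85.6, "GPT 5.5": 81.8, "Claude Mythos 5": 83.8},
    "ExploitGym": {"GPT 5.5 Cyber": 39.5, "GPT 5.5": 25.95},
    "SEC-bench Pro": {"GPT 5.5 Cyber": 69.8, "GPT 5.5": 63.1},
}


def point_gap(benchmark: str, model: str, baseline: str) -> float:
    """Absolute gap in percentage points, rounded to two decimals."""
    scores = SCORES[benchmark]
    return round(scores[model] - scores[baseline], 2)


def relative_gain(benchmark: str, model: str, baseline: str) -> float:
    """Improvement relative to the baseline score, in percent, rounded to one decimal."""
    scores = SCORES[benchmark]
    return round(100 * (scores[model] - scores[baseline]) / scores[baseline], 1)


if __name__ == "__main__":
    specialist = "GPT 5.5 Cyber"
    for benchmark, scores in SCORES.items():
        for baseline in scores:
            if baseline == specialist:
                continue
            gap = point_gap(benchmark, specialist, baseline)
            gain = relative_gain(benchmark, specialist, baseline)
            print(f"{benchmark}: {specialist} vs {baseline}: "
                  f"+{gap} points ({gain}% relative)")
```

Seen this way, the lead over Claude Mythos 5 on CyberGym is under two percentage points, while the largest jump over standard GPT 5.5 is on ExploitGym. Relative gains look bigger when the baseline score is low, so the two measures are best read together.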
Despite highlighting benchmark gains, OpenAI stressed that evaluation results alone do not determine whether a cybersecurity model is useful in practice. The company said real-world performance depends on whether an AI system can identify genuine vulnerabilities, separate actionable findings from noise, and help security teams deploy fixes safely.
OpenAI noted that it continues to assess GPT 5.5 Cyber on complex repositories and real remediation workflows. According to the company, practical security work often involves understanding software dependencies, tracing vulnerability paths and validating fixes across large systems, tasks that can be difficult to capture through benchmark testing alone.
The company added that it is continuing evaluations as coordinated disclosure processes conclude, suggesting that real-world testing remains an ongoing part of development for the model.
Alongside the rollout, OpenAI also unveiled the Daybreak Cyber Partner Program, which is designed to embed GPT 5.5 Cyber's capabilities in the products enterprises and security teams already use. Through the program, participating vendors can tap into the company's model with Trusted Access for Cyber. The initial partners listed include Akamai, Accenture, Palo Alto Networks, Zscaler, Cisco, Wiz, IBM, Cloudflare, Proofpoint, CrowdStrike, and SentinelOne.
The breadth of the partner list suggests OpenAI wants its technology to reach beyond its own direct products.
This is becoming commonplace in enterprise AI: model vendors need partners to get their models into the trusted technology ecosystems that firms already rely on.
In parallel, OpenAI revealed it’s been “closely engaging with relevant US government institutions” across cybersecurity and AI safety, including CAISI, ONCD and OSTP. The company positions GPT 5.5 Cyber as a model built for verified defenders who need advanced security functionality. In other words, for general use, you're better off with trusted-access GPT 5.5.
These updates build on GPT 5.5's existing strengths in programming, debugging, data analysis, web search and long-running tasks.
The model can now manage complex, unattended workflows, with better decision-making that includes verifying outcomes and choosing tools flexibly. GPT 5.5 Cyber applies these abilities to cybersecurity, where demand for AI-based solutions is currently high.
1. Why is CyberGym important for assessing AI capabilities in cybersecurity?
CyberGym checks whether an AI can identify and reproduce known, documented software vulnerabilities, which indicates how well it handles real software bugs.
2. What is the difference between the normal GPT 5.5 model and the GPT 5.5 Cyber model?
GPT 5.5 Cyber was developed for authorized cybersecurity work: identifying vulnerabilities, assessing whether risky code is actually exploitable, and recommending fixes, all at a much larger scale of code exploration than the standard model targets.
3. Why is OpenAI measuring GPT 5.5 Cyber against Claude Mythos 5 in benchmark tests?
The comparison is a way to show where AI security capabilities really stand. Both models target the same complex security problems enterprises commonly face, such as large-scale code review and validating potential bugs.
4. What are some things security experts will be able to use the new model for?
OpenAI says security organizations can use GPT 5.5 Cyber to analyse large codebases, pinpoint the components that matter most to a product's security, check whether a suspected vulnerability is real, and draw up recommendations for fixing any security holes.
5. How does the Daybreak Cyber Partner Program help make the GPT 5.5 Cyber technology more widely used?
By enlisting enterprise security partners and independent security vendors, OpenAI gets GPT 5.5 Cyber built into their products and services, so customers can use the new capabilities inside the tools and systems they already have.
6. Why does OpenAI say benchmark scores alone don't settle how useful a cybersecurity model is?
OpenAI's argument is that a model's real value lies in whether it can find genuine, actionable vulnerabilities, keep false findings to a minimum, and help organizations manage and fix security risks safely.
7. What does this announcement signal about where AI is heading in terms of its role in cybersecurity?
The announcement marks a notable milestone, especially as a number of sectors and security vendors are already showing strong interest in this kind of model. It is meant to help security analysts work through the whole process of finding, classifying and fixing vulnerabilities far more efficiently.
- OpenAI has launched the full version of GPT 5.5 Cyber for verified cybersecurity defenders.
- GPT 5.5 Cyber scored 85.6% on CyberGym, ahead of Claude Mythos 5's 83.8%.
- The model also outperformed standard GPT 5.5 on CyberGym, ExploitGym and SEC-bench Pro.
- OpenAI says the model can analyse large code repositories and assist with remediation workflows.
- The company has launched the Daybreak Cyber Partner Program with firms including Cisco, IBM, Cloudflare and CrowdStrike.
- GPT 5.5 Cyber is being released through a limited rollout focused on verified defenders.
GPT 5.5 Cyber edges past Claude Mythos 5
Improvements extend beyond a single benchmark
OpenAI says real-world security work matters more than scores
New partner programme expands OpenAI's cybersecurity ambitions
Part of a wider push into cybersecurity and AI safety
Frequently Asked Questions