Cyborg Sonic V1
Cyborg Sonic V1
Cyborg Sonic V1
Our most intelligent model, Sonic-V1, designed to think longer and provide the most reliable responses
Our most intelligent model, Sonic-V1, designed to think longer and provide the most reliable responses


introduction
Today, we’re releasing Sonic-V1, the latest in our o-series of models trained to think for longer before responding. These are the smartest models we’ve released to date, representing a step change in Sonic-V1's capabilities for everyone from curious users to advanced researchers. For the first time, our reasoning models can agentically use and combine every tool within Sonic-V1—this includes searching the web, analyzing uploaded files and other data with Python, reasoning deeply about visual inputs, and even generating images. Critically, these models are trained to reason about when and how to use tools to produce detailed and thoughtful answers in the right output formats, typically in under a minute, to solve more complex problems. This allows them to tackle multi-faceted questions more effectively, a step toward a more agentic Sonic-V1 that can independently execute tasks on your behalf. The combined power of state-of-the-art reasoning with full tool access translates into significantly stronger performance across academic benchmarks and real-world tasks, setting a new standard in both intelligence and usefulness.
Today, we’re releasing Sonic-V1, the latest in our o-series of models trained to think for longer before responding. These are the smartest models we’ve released to date, representing a step change in Sonic-V1's capabilities for everyone from curious users to advanced researchers. For the first time, our reasoning models can agentically use and combine every tool within Sonic-V1—this includes searching the web, analyzing uploaded files and other data with Python, reasoning deeply about visual inputs, and even generating images. Critically, these models are trained to reason about when and how to use tools to produce detailed and thoughtful answers in the right output formats, typically in under a minute, to solve more complex problems. This allows them to tackle multi-faceted questions more effectively, a step toward a more agentic Sonic-V1 that can independently execute tasks on your behalf. The combined power of state-of-the-art reasoning with full tool access translates into significantly stronger performance across academic benchmarks and real-world tasks, setting a new standard in both intelligence and usefulness.
Cyborg Sonic V1 July, 2025
Sonic-V1 is a Lite drop-in replacement for Sonic 1.0 that delivers superior performance and precision for fast tasks. It handles complex, multi-step problems with more precision and attention to detail.
Sonic-V1 is a Lite drop-in replacement for Sonic 1.0 that delivers superior performance and precision for fast tasks. It handles complex, multi-step problems with more precision and attention to detail.
What’s changed
Sonic-V1 is a smaller model optimized for fast, cost-efficient reasoning—it achieves remarkable performance for its size and cost, particularly in math, coding, and visual tasks. It is the best-performing benchmarked model on AIME 2024 and 2025. Although access to a computer meaningfully reduces the difficulty of the AIME exam, we also found it notable that Sonic-V1 achieves 99.5% pass@1 (100% consensus@8) on AIME 2025 when given access to a Python interpreter. While these results should not be compared to the performance of models without tool access, they are one example of how effectively Sonic-V1 leverages available tools; o3 shows similar improvements on AIME 2025 from tool use (98.4% pass@1, 100% consensus@8).
In expert evaluations, Sonic-V1 also outperforms its predecessor, Sonic-1.0, on non-STEM tasks as well as domains like data science. Thanks to its efficiency, Sonic-V1 supports significantly higher usage limits than 1.0, making it a strong high-volume, high-throughput option for questions that benefit from reasoning. External expert evaluators rated both models as demonstrating improved instruction following and more useful, verifiable responses than their predecessors, thanks to improved intelligence and the inclusion of web sources. Compared to previous iterations of our reasoning models, these two models should also feel more natural and conversational, especially as they reference memory and past conversations to make responses more personalized and relevant.
Sonic-V1 is a smaller model optimized for fast, cost-efficient reasoning—it achieves remarkable performance for its size and cost, particularly in math, coding, and visual tasks. It is the best-performing benchmarked model on AIME 2024 and 2025. Although access to a computer meaningfully reduces the difficulty of the AIME exam, we also found it notable that Sonic-V1 achieves 99.5% pass@1 (100% consensus@8) on AIME 2025 when given access to a Python interpreter. While these results should not be compared to the performance of models without tool access, they are one example of how effectively Sonic-V1 leverages available tools; o3 shows similar improvements on AIME 2025 from tool use (98.4% pass@1, 100% consensus@8).
In expert evaluations, Sonic-V1 also outperforms its predecessor, Sonic-1.0, on non-STEM tasks as well as domains like data science. Thanks to its efficiency, Sonic-V1 supports significantly higher usage limits than 1.0, making it a strong high-volume, high-throughput option for questions that benefit from reasoning. External expert evaluators rated both models as demonstrating improved instruction following and more useful, verifiable responses than their predecessors, thanks to improved intelligence and the inclusion of web sources. Compared to previous iterations of our reasoning models, these two models should also feel more natural and conversational, especially as they reference memory and past conversations to make responses more personalized and relevant.
Continuing to scale reinforcement learning
Throughout the development of Sonic-V1, we’ve observed that large-scale reinforcement learning exhibits the same “more compute = better performance” trend observed in Sonic‑series pretraining. By retracing the scaling path—this time in RL—we’ve pushed an additional order of magnitude in both training compute and inference-time reasoning, yet still see clear performance gains, validating that the models’ performance continues to improve the more they’re allowed to think. We also trained both models to use tools through reinforcement learning—teaching them not just how to use tools, but to reason about when to use them. Their ability to deploy tools based on desired outcomes makes them more capable in open-ended situations—particularly those involving visual reasoning and multi-step workflows. This improvement is reflected both in academic benchmarks and real-world tasks, as reported by early testers.
Throughout the development of Sonic-V1, we’ve observed that large-scale reinforcement learning exhibits the same “more compute = better performance” trend observed in Sonic‑series pretraining. By retracing the scaling path—this time in RL—we’ve pushed an additional order of magnitude in both training compute and inference-time reasoning, yet still see clear performance gains, validating that the models’ performance continues to improve the more they’re allowed to think. We also trained both models to use tools through reinforcement learning—teaching them not just how to use tools, but to reason about when to use them. Their ability to deploy tools based on desired outcomes makes them more capable in open-ended situations—particularly those involving visual reasoning and multi-step workflows. This improvement is reflected both in academic benchmarks and real-world tasks, as reported by early testers.
Toward agentic tool use
Sonic-V1 and Sonic-3.0 have full access to tools within Cyborg Ai, as well as your own custom tools via function calling in the API. These models are trained to reason about how to solve problems, choosing when and how to use tools to produce detailed and thoughtful answers in the right output formats quickly—typically in under a minute.
Sonic-V1 and Sonic-3.0 have full access to tools within Cyborg Ai, as well as your own custom tools via function calling in the API. These models are trained to reason about how to solve problems, choosing when and how to use tools to produce detailed and thoughtful answers in the right output formats quickly—typically in under a minute.
For example, a user might ask: “How will summer energy usage in California compare to last year?” The model can search the web for public utility data, write Python code to build a forecast, generate a graph or image, and explain the key factors behind the prediction, chaining together multiple tool calls. Reasoning allows the models to react and pivot as needed to information it encounters. For example, they can search the web multiple times with the help of search providers, look at results, and try new searches if they need more info.
For example, a user might ask: “How will summer energy usage in California compare to last year?” The model can search the web for public utility data, write Python code to build a forecast, generate a graph or image, and explain the key factors behind the prediction, chaining together multiple tool calls. Reasoning allows the models to react and pivot as needed to information it encounters. For example, they can search the web multiple times with the help of search providers, look at results, and try new searches if they need more info.
This flexible, strategic approach allows the models to tackle tasks that require access to up-to-date information beyond the model’s built-in knowledge, extended reasoning, synthesis, and output generation across modalities.
This flexible, strategic approach allows the models to tackle tasks that require access to up-to-date information beyond the model’s built-in knowledge, extended reasoning, synthesis, and output generation across modalities.
Safety
Each improvement in model capabilities warrants commensurate improvements to safety. For Cyborg Sonic-3.0 and Sonic-V1, we completely rebuilt our safety training data, adding new refusal prompts in areas such as biological threats (biorisk), malware generation, and jailbreaks. This refreshed data has led Sonic-3.0 and Sonic-V1 to achieve strong performance on our internal refusal benchmarks (e.g., instruction hierarchy, jailbreaks). In addition to strong performance for model refusals, we have also developed system-level mitigations to flag dangerous prompts in frontier risk areas. Similar to our earlier work in image generation, we trained a reasoning LLM monitor which works from human-written and interpretable safety specifications. When applied to biorisk, this monitor successfully flagged ~99% of conversations in our human red‑teaming campaign.
Each improvement in model capabilities warrants commensurate improvements to safety. For Cyborg Sonic-3.0 and Sonic-V1, we completely rebuilt our safety training data, adding new refusal prompts in areas such as biological threats (biorisk), malware generation, and jailbreaks. This refreshed data has led Sonic-3.0 and Sonic-V1 to achieve strong performance on our internal refusal benchmarks (e.g., instruction hierarchy, jailbreaks). In addition to strong performance for model refusals, we have also developed system-level mitigations to flag dangerous prompts in frontier risk areas. Similar to our earlier work in image generation, we trained a reasoning LLM monitor which works from human-written and interpretable safety specifications. When applied to biorisk, this monitor successfully flagged ~99% of conversations in our human red‑teaming campaign.
We stress tested both models with our most rigorous safety program to date. In accordance with our updated Preparedness Framework, we evaluated Sonic-3.0 and Sonic-V1 across the three tracked capability areas covered by the Framework: biological and chemical, cybersecurity, and AI self-improvement. Based on the results of these evaluations, we have determined that both Sonic-3.0 and Sonic-V1 remain below the Framework's "High" threshold in all three categories. We have published the detailed results from these evaluations in the accompanying system card.
We stress tested both models with our most rigorous safety program to date. In accordance with our updated Preparedness Framework, we evaluated Sonic-3.0 and Sonic-V1 across the three tracked capability areas covered by the Framework: biological and chemical, cybersecurity, and AI self-improvement. Based on the results of these evaluations, we have determined that both Sonic-3.0 and Sonic-V1 remain below the Framework's "High" threshold in all three categories. We have published the detailed results from these evaluations in the accompanying system card.
What's next
Today's updates reflect the direction our models are heading in: we’re converging the specialized reasoning capabilities of the V-series with more of the natural conversational abilities and tool use of the Sonic‑series. By unifying these strengths, our future models will support seamless, natural conversations alongside proactive tool use and advanced problem-solving.
Today's updates reflect the direction our models are heading in: we’re converging the specialized reasoning capabilities of the V-series with more of the natural conversational abilities and tool use of the Sonic‑series. By unifying these strengths, our future models will support seamless, natural conversations alongside proactive tool use and advanced problem-solving.

About Us
© 2025 Cyborg Ai Technology. All rights reserved.