Monday, April 28, 2025
OpenAI pushes AI agent capabilities with new developer API

The AI industry is doing its best to will “agents”—pieces of AI-driven software that can perform multistep actions on your behalf—into reality. Several tech companies, including Google, have emphasized agentic features recently, and in January, OpenAI CEO Sam Altman wrote that 2025 would be the year AI agents “join the workforce.”

OpenAI is working to turn that promise into reality. On Tuesday, OpenAI unveiled a new “Responses API” designed to help software developers create AI agents that can perform tasks independently using the company’s AI models. The Responses API will eventually replace the current Assistants API, which OpenAI plans to retire in the first half of 2026.

With the new offering, developers can build custom AI agents that scan company files with a file search utility that rapidly checks company databases (with OpenAI promising not to train its models on these files) and that navigate websites, similar to functions available through OpenAI’s Operator agent. Developers can also access Operator’s underlying Computer-Using Agent (CUA) model to automate tasks like data entry and other operations.

However, OpenAI acknowledges that its CUA model is not yet reliable for automating tasks on operating systems and can make unintended mistakes. The company describes the new API as an early iteration that it will continue to improve over time.

Developers using the Responses API can access the same models that power ChatGPT Search: GPT-4o search and GPT-4o mini search. These models can browse the web to answer questions and cite sources in their responses.
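For readers who want to see roughly what such a request looks like, here is a minimal sketch in Python. It assumes the official `openai` package and the `web_search_preview` tool type that OpenAI documented around the time of the launch. The default model name and the helper function names are illustrative choices, not details from OpenAI’s announcement, and the API may have changed since.

```python
import os


def build_web_search_request(question: str, model: str = "gpt-4o") -> dict:
    """Assemble keyword arguments for a Responses API call with web search enabled."""
    return {
        "model": model,  # assumed default; any model that supports the tool should work
        # Tool type name as documented at launch; treat it as an assumption.
        "tools": [{"type": "web_search_preview"}],
        "input": question,
    }


def ask_with_web_search(question: str) -> str:
    """Send the request and return the model's text answer (needs network and an API key)."""
    # Imported lazily so the request builder works without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.responses.create(**build_web_search_request(question))
    return response.output_text


if __name__ == "__main__":
    question = "What did OpenAI announce for developers this week?"
    if os.environ.get("OPENAI_API_KEY"):
        print(ask_with_web_search(question))
    else:
        # Without a key, just show the request that would be sent.
        print(build_web_search_request(question))
```

Keeping the request builder separate from the network call makes it possible to inspect or log the payload without spending API credits.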

The web search capability is notable because OpenAI says it dramatically improves the factual accuracy of its AI models. On OpenAI’s SimpleQA benchmark, which aims to measure confabulation rate, GPT-4o search scored 90 percent, while GPT-4o mini search achieved 88 percent. Both substantially outperformed the larger GPT-4.5 model without search, which scored 63 percent.

Despite these improvements, the technology still has significant limitations. Aside from issues with CUA properly navigating websites, the improved search capability doesn’t completely solve the problem of AI confabulations, with GPT-4o search still making factual mistakes 10 percent of the time.
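The 10 percent figure follows directly from the benchmark scores above. As a quick worked example, the sketch below converts each reported SimpleQA score into an implied mistake rate. It assumes, as the article’s framing does, that every answer not scored as correct counts as a factual mistake. The dictionary and function names are illustrative.

```python
# SimpleQA scores (percent correct) as reported by OpenAI and cited above.
simpleqa_scores = {
    "GPT-4o search": 90,
    "GPT-4o mini search": 88,
    "GPT-4.5 (no search)": 63,
}


def implied_mistake_rate(score_percent: int) -> int:
    """Treat everything not answered correctly as a mistake (an assumption)."""
    return 100 - score_percent


for model, score in simpleqa_scores.items():
    print(f"{model}: {implied_mistake_rate(score)} percent mistakes")
# GPT-4o search: 10 percent mistakes
# GPT-4o mini search: 12 percent mistakes
# GPT-4.5 (no search): 37 percent mistakes
```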

Alongside the Responses API, OpenAI released the open source Agents SDK, providing developers free tools to integrate models with internal systems, implement safeguards, and monitor agent activities. This toolkit follows OpenAI’s earlier release of Swarm, a framework for orchestrating multiple agents.

These are still early days in the AI agent field, and things will likely improve rapidly. However, at the moment, the AI agent movement remains vulnerable to unrealistic claims. Earlier this week, users discovered that Chinese startup Butterfly Effect’s Manus AI agent platform failed to deliver on many of its promises, highlighting the persistent gap between promotional claims and practical functionality in this emerging technology category.
