The AI industry has a new buzzword: “PhD-level AI.” According to a report from The Information, OpenAI may be planning to launch several specialized AI “agent” products including a $20,000 monthly tier focused on supporting “PhD-level research.” Other reportedly planned agents include a “high-income knowledge worker” assistant at $2,000 monthly and a software developer agent at $10,000 monthly.
OpenAI has not yet confirmed these prices, but the company has mentioned PhD-level AI capabilities before. So what exactly constitutes “PhD-level AI”? The term refers to models that supposedly perform tasks requiring doctoral-level expertise. These include agents conducting advanced research, writing and debugging complex code without human intervention, and analyzing large datasets to generate comprehensive reports. The key claim is that these models can tackle problems that typically require years of specialized academic training.
Companies like OpenAI base their “PhD-level” claims on performance in specific benchmark tests. For example, OpenAI’s o1 series models reportedly performed well in science, coding, and math tests, with results similar to human PhD students on challenging tasks. The company’s Deep Research tool, which can generate research papers with citations, scored 26.6 percent on “Humanity’s Last Exam,” a comprehensive evaluation covering over 3,000 questions across more than 100 subjects.
OpenAI’s latest advancement along these lines comes from their o3 and o3-mini models, announced in December. These models build upon the o1 family launched earlier last year. Like o1, the o3 models use what OpenAI calls “private chain of thought,” a simulated reasoning technique where the model runs through an internal dialog and iteratively works through issues before presenting a final answer.
This approach ostensibly mirrors how human researchers spend time thinking through complex problems rather than providing immediate answers. According to OpenAI, the more inference-time compute a model spends on a problem, the better its answers become. So here’s the key point: For $20,000 a month, a customer would presumably be buying tons of thinking time for the AI model to work on difficult problems.
According to OpenAI, o3 earned a record-breaking score on the ARC-AGI visual reasoning benchmark, reaching 87.5 percent in high-compute testing—comparable to human performance at an 85 percent threshold. The model also scored 96.7 percent on the 2024 American Invitational Mathematics Exam, missing just one question, and reached 87.7 percent on GPQA Diamond, which contains graduate-level biology, physics, and chemistry questions.
On the FrontierMath benchmark by Epoch AI, o3 solved 25.2 percent of problems, while no other model has exceeded 2 percent—suggesting a leap in mathematical reasoning capabilities over previous models.
Benchmarks vs. real-world value
In theory, potential applications for a true PhD-level AI model would include analyzing medical research data, supporting climate modeling, and handling routine aspects of research work.
The high price points reported by The Information, if accurate, suggest that OpenAI believes these systems could provide substantial value to businesses. The publication notes that SoftBank, an OpenAI investor, has committed to spending $3 billion on OpenAI’s agent products this year alone—indicating significant business interest despite the costs.
Meanwhile, OpenAI faces financial pressures that may influence its premium pricing strategy. The company reportedly lost approximately $5 billion last year covering operational costs and other expenses related to running its services.
News of OpenAI’s stratospheric pricing plans comes after years of relatively affordable AI services that have conditioned users to expect powerful capabilities at low costs. ChatGPT Plus remains $20 per month and Claude Pro costs $30 monthly—both tiny fractions of these proposed enterprise tiers. Even ChatGPT Pro’s $200-per-month subscription looks modest next to the new proposed fees. Whether the performance difference between these tiers will match their thousandfold price difference is an open question.
Despite their benchmark performances, these simulated reasoning models still struggle with confabulations—instances where they generate plausible-sounding but factually incorrect information. This remains a critical concern for research applications where accuracy and reliability are paramount. A $20,000 monthly investment raises questions about whether organizations can trust these systems not to introduce subtle errors into high-stakes research.
In response to the news, several people quipped on social media that companies could hire an actual PhD student for much cheaper. “In case you have forgotten,” wrote xAI developer Hieu Pham in a viral tweet, “most PhD students, including the brightest stars who can do way better work than any current LLMs—are not paid $20K / month.”
While these systems show strong capabilities on specific benchmarks, the “PhD-level” label remains largely a marketing term. These models can process and synthesize information at impressive speeds, but questions remain about how effectively they can handle the creative thinking, intellectual skepticism, and original research that define actual doctoral-level work. On the other hand, they will never get tired or need health insurance, and they will likely continue to improve in capability and drop in cost over time.