Anthropic Rolls Out Claude Opus 4.8 With New Performance Controls and Better Reliability

Anthropic has released Claude Opus 4.8, an update to its flagship large language model that brings measurable improvements to code generation, complex task execution, and factual accuracy. The release also includes new platform features that give users more direct control over the model's performance and token usage. Opus 4.8 is now live on the Claude API and for all paid plan subscribers, with pricing unchanged from the previous version.

A major focus for Opus 4.8 is model honesty, especially in technical tasks such as software development. According to Anthropic’s internal testing, the new model is about four times less likely than Opus 4.7 to generate code with unflagged errors. It is also more likely to express uncertainty, which the company says makes it a more reliable partner for projects that demand high precision. Pre-release safety assessments point the same way: they found that Opus 4.8 showed significantly fewer instances of misaligned behavior, such as deception or cooperation with misuse.

Performance Controls and Benchmark Metrics

The Claude platform now offers users more granular control over model behavior. A new feature lets users adjust the "effort" the model applies to a task, which governs how many tokens it spends to reach a result. The default is high, with extra (labeled xhigh in Claude Code) and max settings available for more demanding problems. Anthropic recommends the extra setting for complex coding and long-running agentic tasks, and it has increased usage limits in Claude Code to support this.
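To make the effort guidance concrete, here is a minimal Python sketch of how a team might encode it in its own tooling. The three levels and the xhigh label come from the description above. The task category names and the helper functions are hypothetical, and nothing here calls the Claude API or reflects its actual parameter names.

```python
from enum import Enum


class Effort(Enum):
    """Effort levels as described in the article."""
    HIGH = "high"    # the default
    EXTRA = "extra"  # labeled "xhigh" in Claude Code
    MAX = "max"


# Hypothetical task categories, mapped to the setting Anthropic is
# reported to recommend for complex coding and long-running agentic work.
RECOMMENDED = {
    "complex_coding": Effort.EXTRA,
    "long_running_agent": Effort.EXTRA,
}


def pick_effort(task_kind: str) -> Effort:
    """Return the recommended effort, falling back to the default (high)."""
    return RECOMMENDED.get(task_kind, Effort.HIGH)


def claude_code_label(effort: Effort) -> str:
    """Translate an effort level to the label the article says Claude Code uses."""
    return "xhigh" if effort is Effort.EXTRA else effort.value


if __name__ == "__main__":
    for kind in ("complex_coding", "quick_question"):
        effort = pick_effort(kind)
        print(kind, "->", effort.value, "/", claude_code_label(effort))
```

The max setting is left out of the mapping on purpose, since the article gives no specific recommendation for when to use it beyond "more demanding problems."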

For tasks where speed is the priority, a new "fast mode" for Opus 4.8 runs at 2.5 times the standard speed.

On industry benchmarks, Opus 4.8 shows clear performance gains. Using its max effort setting, it scored 1,890 on the GDPval-AA evaluation, outperforming Opus 4.7 by 137 points and its nearest competitor, GPT-5.5 xhigh, by 121 points. The model also leads on academic reasoning benchmarks like Humanity's Last Exam. Anthropic credits this to an "adaptive thinking" architecture that allocates more processing power to difficult problems while maintaining low latency for simpler ones. The model is optimized for professional software engineering and enterprise tasks involving documents, spreadsheets, and presentations.
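The reported margins imply scores for the two comparison models, which the article does not state directly. The short Python sketch below derives them by subtraction. The variable and function names are illustrative, and the derived figures are only as accurate as the reported score and margins.

```python
# Figures as reported for the GDPval-AA evaluation.
OPUS_48_SCORE = 1890
LEAD_OVER_OPUS_47 = 137
LEAD_OVER_GPT_55_XHIGH = 121


def implied_score(leader_score: int, margin: int) -> int:
    """Score of a trailing model, given the leader's score and its lead."""
    return leader_score - margin


opus_47_score = implied_score(OPUS_48_SCORE, LEAD_OVER_OPUS_47)
gpt_55_score = implied_score(OPUS_48_SCORE, LEAD_OVER_GPT_55_XHIGH)

print(f"Opus 4.7 (implied):      {opus_47_score}")   # 1753
print(f"GPT-5.5 xhigh (implied): {gpt_55_score}")    # 1769
```

By this arithmetic, GPT-5.5 xhigh would sit 16 points ahead of Opus 4.7, consistent with its description as the nearest competitor.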

Availability and Development Roadmap

Anthropic also offered a glimpse into its research initiative, Project Glasswing, which is focused on developing a new class of more intelligent models. A preview version from this project, called Claude Mythos Preview, is already in limited use by select organizations for advanced cybersecurity work. A broader release of these "Mythos-class" models is planned for the coming weeks, pending the development of stronger digital safeguards. The company is also working on more cost-effective models with capabilities similar to the Opus series.

Claude Opus 4.8 is available now to developers through the Claude API and to customers on the Claude Pro, Max, Team, and Enterprise plans. It’s also accessible via Amazon Web Services, Google Cloud, and Microsoft Foundry. Standard pricing is set at $5 per million input tokens and $25 per million output tokens. The fast mode is priced at $10 per million input tokens and $50 per million output tokens.
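For readers budgeting a workload, the listed rates translate into per-request costs as in the Python sketch below. The rates are the ones quoted above. The function name, the mode keys, and the token counts in the example are illustrative assumptions, and the calculation ignores any discounts, caching, or other billing details the article does not cover.

```python
from decimal import Decimal

# Dollars per million tokens, as (input, output), from the quoted pricing.
PRICES = {
    "standard": (Decimal("5"), Decimal("25")),
    "fast": (Decimal("10"), Decimal("50")),
}
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, mode: str = "standard") -> Decimal:
    """Estimate the dollar cost of one request at the listed per-token rates."""
    input_rate, output_rate = PRICES[mode]
    return (input_rate * input_tokens + output_rate * output_tokens) / MILLION


if __name__ == "__main__":
    # Hypothetical request: 200,000 input tokens and 40,000 output tokens.
    standard = estimate_cost(200_000, 40_000)
    fast = estimate_cost(200_000, 40_000, mode="fast")
    print(f"standard: ${standard:.2f}")  # $2.00
    print(f"fast:     ${fast:.2f}")      # $4.00
```

At these rates fast mode costs twice as much per token as standard mode, in exchange for the 2.5 times speed increase described earlier.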