Anthropic unveiled its latest AI model, Claude Sonnet 4.5, on Monday, highlighting major improvements in coding, finance, and scientific reasoning as the startup deepens its focus on enterprise AI.
Backed by Alphabet and Amazon, Anthropic is competing with rivals to develop models capable of running software and completing complex, multi-step tasks, a foundation for AI agents that can operate on behalf of humans.
Chief Product Officer Mike Krieger said Claude Sonnet 4.5 demonstrated the ability to build a full web app in internal tests. One enterprise customer reported the model coding autonomously for 30 hours straight, a sharp leap from the seven-hour run achieved by the earlier Claude Opus 4 model.
Unlike rivals chasing viral consumer adoption, Anthropic is deliberately targeting power users and business clients. Claude Sonnet 4.5 has shown stronger results on finance and scientific problems, and it scored 60% on a benchmark testing operating-system skills, compared with about 40% for previous models.
“It’s a lot more visceral when you just see the model using a computer the way a person does if you’re not a coder,” said Jared Kaplan, Chief Science Officer.
Microsoft Expands Anthropic Partnership
On the same day, Microsoft announced new Microsoft 365 Copilot features powered by Anthropic’s models, including “Agent Mode” in Excel and Word and an “Office Agent” within Copilot chat, with PowerPoint integration coming soon.
Microsoft recently revealed plans to integrate Anthropic models into Copilot, diversifying beyond its longtime partner OpenAI.
Workplace-Focused AI
Founded by former OpenAI executives, Anthropic has positioned Claude as a workplace tool with built-in safety guardrails to reduce risky outputs. The company has been marketing Claude's enhanced coding and data-analysis skills to regulated industries and enterprise teams looking for AI that can reliably work across multiple software platforms.
Krieger emphasized that Anthropic’s goal is sustained, reliable performance on long-duration tasks rather than flashy short-term demos.