The Opus 4.5 Inflection Point: When AI Crossed the Threshold

Published on January 7, 2026
By Jordi Visser

In March 2025, Anthropic CEO Dario Amodei made a bold prediction at the Council on Foreign Relations, one that most observers dismissed as another sign of AI hype and bubble thinking. Speaking about the trajectory of AI in coding and programming, he stated: “We are not far from the world. I think we’ll be there in 3 to 6 months where AI is writing 90% of the code and then in 12 months we may be in a world where AI is writing essentially all of the code.”

At the time, the industry was still in its dismissive, post-DeepSeek mood. His forecast seemed audacious, even reckless. Yet those who understand how frontier AI labs operate knew to take Amodei seriously: when he speaks publicly about future capabilities, he is not speculating; he is describing what he has already witnessed in unreleased models. The labs consistently work 3-9 months ahead of public releases, which in the exponential world of AI equates to years, meaning Amodei was likely already seeing signs of the 90% threshold in internal testing when he made that March prediction.

While many people began enjoying the holidays after a long year of tariff overreaction and missed market moves, something remarkable happened between Thanksgiving and New Year’s Day that proved he had not been exaggerating at all; he had been conservative, perhaps even holding back the full extent of what he knew was coming. The catalyst was Claude Opus 4.5, released by Anthropic on November 24th, and more specifically the updated version of Claude Code that accompanied it. Roughly nine months after Amodei’s CFR remarks, and almost exactly at the back end of his 3-to-6-month window, his own company shipped a model that would validate his forecast in the most dramatic way possible.
What unfolded over the holiday break wasn’t just incremental progress; it was an inflection point that sent shockwaves through the AI community, captured in a cascading series of revelations from some of the industry’s most respected voices. For those of you reading this who don’t currently use AI in your life, your window for staying relevant, on many levels, is closing.

The Ten-Day Cascade

Between December 26 and January 3, the span of a single holiday week, an xAI co-founder, an Anthropic AGI researcher, a Google principal engineer, a former Google Gemini lead, and the founder of Midjourney all independently concluded that Opus 4.5 plus Claude Code had fundamentally changed what was possible. This wasn’t coordinated marketing or orchestrated hype. It was organic recognition from technical leaders who suddenly found themselves experiencing capabilities they hadn’t anticipated arriving for months or years.

The recognition began quietly on December 26th with a simple tweet from Igor Babushkin, co-founder of xAI and former researcher at both Google DeepMind and OpenAI. His assessment was understated: “Opus 4.5 is pretty good.” But it was the response from Andrej Karpathy, OpenAI co-founder and renowned AI researcher, that signaled something more profound was occurring. Karpathy, commenting in the same week Opus 4.5 dominated AI circles, replied: “It’s very good. People who aren’t keeping up even over the last 30 days already have a deprecated worldview on this topic.”

This statement deserves emphasis. Just two months earlier, Karpathy had been describing AGI as a decade away and coding models as performing at “elementary-grade student” levels. Now he was declaring that a single month of inattention was enough to render one’s understanding obsolete. For someone of Karpathy’s stature and technical depth, this represented a genuine psychological break, a recognition that the pace itself had fundamentally shifted.
The same day, Jackson Kernion, an Anthropic researcher with four years at the company, posted something even more startling: “I’m trying to figure out what to care about next. I joined Anthropic four plus years ago motivated by the dream of building AGI. I was convinced from studying philosophy of mind that we’re approaching sufficient scale and that anything can be learned in a reinforcement learning environment. And so now I feel like Opus 4.5 is as much AGI as I ever hoped for and I’m not sure I know what I want to spend my waking hours focused on.”

This was not a casual observer or hype merchant. This was a researcher at the company that built the model, someone who had dedicated years to achieving artificial general intelligence, now openly questioning what comes next because he believed they had essentially arrived. He elaborated: “To use Claude Code is to see Claude write arbitrary software, run into errors, reliably fix them, make helpful suggestions, and perfectly follow any given instructions.”

The Creator’s Testament

Perhaps the most striking validation came from inside Anthropic itself. Boris Cherny, the creator of Claude Code, provided concrete evidence that demolished any remaining skepticism about what had been achieved. Over a recent 30-day period, he reported that Claude Code using Opus 4.5 wrote 100% of his contributions to Claude Code itself: 259 pull requests, 497 commits, roughly 40,000 lines of code added and 38,000 lines removed, “with no human-written additions.”

This bears repeating: the person who built Claude Code was now using Claude Code to write all the code for Claude Code. The tool had become capable of improving itself without human coding intervention. Cherny framed this as a turning point for software engineering, stating that “coding is no longer the limiting factor; the bottleneck is execution and guidance”: deciding what to build, reviewing, and integrating, not typing code.

The technical details reveal the transformation’s depth.
Claude Code ran for minutes, hours, even days at a time across approximately 1,600 sessions and 325 million tokens, using stop hooks for long-lived tasks rather than short prompt-reply loops. This wasn’t a parlor trick of generating simple scripts. This was sustained, autonomous software development at a professional level, handling the full complexity of a production codebase.

The Google Validation

If there was any doubt about whether this represented a genuine breakthrough or merely internal enthusiasm, it evaporated on January 2nd when Jaana Dogan, a principal engineer at Google who leads work on the Gemini API, made a confession that stunned the industry. She tweeted: “I’m not joking, and this isn’t funny. We have been trying to build distributed agent orchestrators at Google since last year. There are various options. Not everyone is aligned. I gave Claude Code a description of the problem. It generated what we built last year in an hour.”

This deserves emphasis: a principal engineer at one of the world’s most sophisticated technology companies, embedded in Google’s own frontier AI efforts, with access to virtually unlimited resources, acknowledged that Claude Code had replicated a year’s worth of her team’s work in sixty minutes. When pressed for details, she clarified: “It wasn’t a very detailed prompt and it contained no real details given I cannot share anything proprietary. I was building a toy version on top of some of the existing ideas to evaluate Claude Code. It was a three paragraph description.”

When someone asked when Google’s own Gemini would reach this capability level, her response was telling: “We are working hard right now on the models and the harness.” The acknowledgment was implicit: Google was behind, scrambling to catch up to what Anthropic had achieved.
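For readers unfamiliar with the “stop hooks” mentioned above: Claude Code’s documented hooks system lets a user register a command that runs whenever Claude is about to end its turn, and that command can block the stop so the agent keeps working. A minimal sketch of such a configuration in a project’s `.claude/settings.json` follows; the script path is hypothetical, and Cherny’s actual setup has not been published.

```json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "./check-remaining-work.sh"
          }
        ]
      }
    ]
  }
}
```

In Claude Code’s hooks model, if the hook command signals a block (for example by exiting with code 2, with its stderr fed back to the model), Claude Code continues working instead of stopping, which is what turns a short prompt-reply loop into the hours-long autonomous sessions described here.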
The Insider Perspective

The validation continued with Rohan Anil, a former Google engineer who had led work on the Gemini models and worked on the Google Brain team on foundational training-algorithms research. He didn’t mince words: “I used to be a Google engineer too leveled up all the way and feel if I had agentic coding and particularly Opus I would have saved myself the first si