Research Notes

The Intelligence Era Arrives: Disney and the Next Frontier of AI Productivity

Published on May 28, 2025
By Jordi Visser

“We keep moving forward, opening new doors, and doing new things, because we’re curious and curiosity keeps leading us down new paths.” – Walt Disney

Last week at the 2025 Google I/O developer conference, Google unveiled Veo 3, its most advanced AI video generator to date. Introduced during the keynote address, Veo 3 showcased a range of groundbreaking features, including integrated audio generation, improved prompt adherence, and photorealistic visual quality—all created from simple natural language inputs. But unlike prior AI milestones that centered on language, Veo 3 signals something far more profound: the transition from AI as a chatbot to AI as a system that begins to think, sense, and reason like a human.

This is no longer about generating text—it’s about simulating how we experience the world through sight, sound, and movement, all processed contextually and simultaneously. Veo 3 creates immersive, sensory-rich environments by aligning audio, visual, and motion cues with a deep understanding of scene dynamics. It represents a foundational shift in the development of intelligent systems, setting the stage for embodied reasoning—where AI agents don’t just respond, but interpret and act based on a rich, multi-sensory awareness of their environment.

But the excitement around Veo 3 also exposes the underlying constraint in this new frontier: infrastructure. While some investors had speculated that breakthroughs like DeepSeek might reduce the demand for semiconductors and data centers, Veo 3 puts that notion to rest. Its staggering compute requirements—driven by the complexity of generating coherent video with synchronized audio, realistic motion physics, and temporal continuity—highlight the widening gap between AI’s capabilities and the global supply of GPUs.
This is why Veo 3 currently limits users to 8-second clips and is available only through Google’s $250-per-month AI Ultra plan. If made freely accessible, the current global inventory of GPUs would be insufficient to meet demand.

But this next phase of AI—marked by sensory synthesis and embodied reasoning—is also broadening the semiconductor stack itself. What began with language-based chatbots running on general-purpose GPUs now requires high-bandwidth memory, optical interconnects, specialized accelerators, and custom silicon designed for multimodal processing. The infrastructure challenge is no longer just about scaling compute—it’s about reengineering the entire AI hardware pipeline to support machines that see, hear, and think like humans.

In this new phase of AI—where real-time, multimodal inference enables intelligence to blend seamlessly into physical and digital workflows—the most profound productivity gains won’t be found in the next app or language model, but in organizations that rely on large human workforces, spatial interaction, and rich storytelling. Companies like Disney are uniquely positioned to benefit.

With operations that span immersive physical environments, high-cost creative production, global customer interaction, and real-time media delivery, Disney is a microcosm of the broader U.S. economy. It operates not just in pixels and data but in theme parks, retail supply chains, live performances, and branded experiences—all of which require the kind of intelligent adaptation, personalization, and automation that Veo 3-like models now make possible. Disney has not only anticipated this shift—it has spent the last several years investing heavily in AI tools across its parks, studios, streaming platforms, and merchandising systems. The company offers a lens into how AI’s next chapter—centered on inference, not just training—will transform labor-intensive, guest-facing, and emotionally resonant businesses, with implications for U.S. GDP.
Disney has not only anticipated this moment but has positioned itself as one of the leading legacy companies embracing the agentic, inference-driven AI future. Since 2020, its leadership has been openly bullish about AI’s role in boosting productivity, enhancing creativity, and personalizing customer experiences. “AI may, in fact, be the most powerful technology that our company has ever seen,” said CEO Bob Iger at Disney’s 2025 annual shareholder meeting, adding that it is already helping Disney become more efficient and “we’re only just beginning to deploy it for those purposes.” Iger’s tone has consistently reflected both excitement and caution—emphasizing the need to safeguard creative integrity and intellectual property while embracing AI’s potential to “create efficiencies and ways for us to provide better service to customers.”

For Disney, the imperative to use AI is not abstract—it’s operational and people-centered. AI has become a tool to help cast members work smarter, deliver richer guest experiences, and improve the efficiency of storytelling itself. The result is a clear trend already emerging in financial metrics, including revenue per employee and margins, as Disney begins to harness the real-world power of AI now entering the inference era.

Disney’s studio division—spanning Pixar, Marvel Studios, Lucasfilm, and Walt Disney Pictures—stands to gain immediate and dramatic productivity enhancements from AI-powered video generation technologies like Veo 3. The advent of real-time, multimodal inference—where AI can generate entire scenes from text and image prompts—has direct implications for Disney’s blockbuster-heavy production model. Traditional VFX-laden films like Avengers: Endgame can cost upwards of $10,000 to $20,000 per second of finished footage. In contrast, Veo 3’s estimated cost ranges from $0.39 to $0.75 per second.
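Those per-second figures translate into a stark whole-film comparison. A back-of-the-envelope sketch, using the same assumptions as the comparison that follows (a 90-minute final cut and a 30:1 footage-to-final ratio), shows where the roughly $63,000 figure comes from:

```python
# Back-of-the-envelope cost comparison for a 90-minute film.
# Figures: traditional VFX-heavy production at $10k-$20k per finished
# second; Veo 3 at an estimated $0.39-$0.75 per generated second.
final_seconds = 90 * 60                            # 5,400 s of finished footage
footage_ratio = 30                                 # 30:1 footage-to-final ratio
generated_seconds = final_seconds * footage_ratio  # 162,000 s must be generated

traditional = (final_seconds * 10_000, final_seconds * 20_000)
veo3 = (generated_seconds * 0.39, generated_seconds * 0.75)

print(f"Traditional VFX: ${traditional[0]:,.0f} - ${traditional[1]:,.0f}")
print(f"Veo 3 estimate:  ${veo3[0]:,.0f} - ${veo3[1]:,.0f}")
# Traditional VFX: $54,000,000 - $108,000,000
# Veo 3 estimate:  $63,180 - $121,500
```

Even at the high end of the estimate, generation cost sits roughly three orders of magnitude below a traditional VFX budget, which is why the realistic near-term role is previsualization and secondary content rather than wholesale replacement.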
Assuming a 90-minute film with a 30:1 footage-to-final ratio, Veo 3 could cut production costs to as low as roughly $63,000 for the same output—a staggering shift. While not a wholesale replacement for studio-quality final production, this enables previsualization, rough cuts, and secondary content generation at scale. Disney is already leveraging AI across ILM and Pixar to streamline VFX, automate rotoscoping, and reduce render times. The striking de-aging of Harrison Ford in Indiana Jones 5 shows not only time and cost savings but a deeper potential: actors no longer need to age out of roles. With permission and protection, AI-created digital doubles can keep characters and franchises alive indefinitely. “AI is just another tool in the toolbox,” said ILM’s Rob Bredow, emphasizing its role in accelerating—not replacing—the creative process. From creative development to post-production and marketing, generative AI is becoming a force multiplier for Disney’s storytelling engine.

In the world of streaming and digital media, personalization is not just a convenience—it’s a competitive advantage. Disney’s adoption of AI in its Direct-to-Consumer platforms (Disney+, Hulu, ESPN+) and media channels is central to transforming passive content delivery into a deeply personalized entertainment ecosystem. With AI-enabled recommendation engines, Disney can dynamically curate content based on viewing behavior, age group, mood, and even franchise affinity—delivering experiences that feel tailored to each individual viewer. Unlike its competitors, Disney has a brand relationship that begins in childhood. AI allows that bond to grow alongside the audience, adapting recommendations from Moana and Frozen to Marvel and Star Wars, keeping the brand relevant as preferences evolve. Advertising, too, benefits from inference at scale.
Disney’s unified ad platform enables precise targeting across streaming and broadcast, using viewer data, location, and even offline experiences (like park visits) to match the right message with the right person. Netflix has long set the standard for streaming personalization and churn reduction, but AI gives Disney the tools to close that gap—by increasing content discovery.