New AI Lab Backed by Billions Wants Machines That Listen and Watch in Real Time

Mira Murati, the former chief technology officer of OpenAI, founded a new company called Thinking Machines Lab. On May 11, the company announced a new type of artificial intelligence that can listen, watch, and talk to you all at the same time without delay. The announcement also brought news of major financial backing: a $2 billion funding round that valued the startup at $12 billion, plus a partnership with NVIDIA, the company that makes the computer chips AI systems run on.
This approach is different from what most companies are doing today. Right now, when you use an AI tool that handles both video and sound, the system processes each type separately—a little like a waiter who has to listen to your order, write it down, and then translate it into the kitchen's language before the cooking can even begin. Thinking Machines designed their system from scratch to handle audio, video, and text all together at once, and to respond immediately.
How It Works
The challenge Thinking Machines is solving has frustrated AI developers for years. Today's AI systems that handle multiple types of input—sound, video, text—usually pause slightly between receiving information and giving an answer. That pause is called latency, and it makes conversations feel stilted. For a customer service chatbot or a robot that helps you design something on screen, that delay matters.
Current systems break the work into steps: first they convert audio to text, then they analyze the video separately, then the AI combines both pieces and generates a response. Each step adds a small delay. Thinking Machines built their model to skip those middle steps and process everything together from the start.
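To see why skipping the middle steps matters, here is a minimal sketch of the arithmetic. The stage names and timings are hypothetical, chosen only to illustrate how hand-offs in a sequential pipeline add up, while a model that processes everything in one pass pays only a single delay:

```python
# Hypothetical stage timings (seconds), for illustration only;
# real systems vary widely and are not public.
STAGES = {
    "speech_to_text": 0.30,    # convert audio to text
    "video_analysis": 0.25,    # analyze the video separately
    "fuse_and_respond": 0.40,  # combine both pieces and generate a reply
}

def pipelined_latency(stages):
    """Sequential pipeline: each stage must finish before the next
    starts, so every small delay is added to the total."""
    return sum(stages.values())

def unified_latency(single_pass_time=0.45):
    """A model trained on audio, video, and text together answers in
    one pass, avoiding the hand-offs between stages."""
    return single_pass_time

print(f"pipeline: {pipelined_latency(STAGES):.2f} s")
print(f"unified:  {unified_latency():.2f} s")
```

With these made-up numbers the pipeline takes roughly twice as long, which is the kind of gap that makes a conversation feel stilted rather than natural.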
The company released a research version of their technology for other researchers to experiment with, along with tools to customize it for specific tasks.
The Money and the Partnership
NVIDIA, which manufactures the specialized computer hardware that trains and runs AI systems, is investing in Thinking Machines and committing to provide at least one gigawatt of computing power. A gigawatt is roughly what a mid-sized power plant produces. That gives you a sense of how much computational firepower this company intends to use.
The $2 billion funding and $12 billion valuation place Thinking Machines among the most richly funded startups in AI history. Investors are betting on the track record of the founding team—Murati herself, plus researchers who previously worked at OpenAI. The speed of this funding reflects how competitive the field has become. A few years ago, building AI required large companies. Now, a talented team with a novel idea can attract billions in capital within months.
There is a historical parallel here worth noting. In the 2000s and 2010s, companies like Amazon and Google spent enormous amounts on building cloud computing data centers, but that expansion happened gradually over many years. Today, AI startups need that same scale of computing power almost from day one. The race to deploy cutting-edge AI is happening on a much faster clock.
What Sets Them Apart
The major AI labs at OpenAI, Google, and Anthropic have all built impressive systems that handle multiple types of input. But most started by building text-focused systems first, then adding the ability to process images and sound afterward—like adding extra rooms to a house that was designed to be small. Thinking Machines is designing the house from scratch with multiple rooms in mind.
The company is also planning to release some of their software as open source—meaning free, publicly available code that anyone can use and improve. Most well-funded AI startups keep their technology proprietary, locked behind paid services. Thinking Machines is taking a different route, which could make their tools more accessible but also more complicated to turn into a profitable business.
Sharing Their Work
Thinking Machines has started publishing technical blog posts and code online, explaining how they built parts of their system and inviting feedback from the wider research community. This approach echoes how OpenAI began—as a research organization focused on advancing AI knowledge—before shifting more toward building commercial products.
The company's first research post addressed how to make AI models give the same answer when asked the same question. That sounds simple but is actually a persistent problem: language models can be unpredictable, which makes them risky in situations where consistency is essential, like handling a customer complaint or checking a bank balance.
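A toy sketch shows the basic tension. With a made-up next-word probability table (real models score thousands of candidates at every step), always picking the most likely word is repeatable, while sampling in proportion to probability is not. Note this only illustrates the idea; in real serving systems even the "always pick the top word" strategy can vary between runs because of floating-point and batching effects, which is closer to the problem the research post tackles:

```python
import random

# Hypothetical next-word probabilities, for illustration only.
NEXT_WORD_PROBS = {"balance": 0.62, "statement": 0.25, "account": 0.13}

def greedy_pick(probs):
    """Deterministic: always choose the most likely word, so the same
    question yields the same answer every time."""
    return max(probs, key=probs.get)

def sampled_pick(probs, rng):
    """Stochastic: choose words in proportion to their probability,
    so repeated runs can give different answers."""
    words, weights = zip(*probs.items())
    return rng.choices(words, weights=weights, k=1)[0]

rng = random.Random()                       # unseeded: varies run to run
print(greedy_pick(NEXT_WORD_PROBS))         # repeatable choice
print(sampled_pick(NEXT_WORD_PROBS, rng))   # usually, but not always, the same
```

For a chatbot checking a bank balance, the predictable behavior on the left is what you want; the unpredictable behavior on the right is what makes today's models risky for that job.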
Murati's announcement generated strong interest online, with social media posts receiving over 250,000 views within hours. That suggests many people working in technology are watching closely to see if this approach delivers on its promise.
The Road Ahead
Real-time listening and watching sounds powerful on paper. Building it at a scale large enough to serve millions of people, however, involves challenges that go beyond just designing the AI itself. The system needs to work on phones and computers with limited battery power. Networks need to handle the constant flow of video and audio. Companies need to integrate this into their products without major rewrites. All of these practical hurdles often prove harder than the original technology breakthrough.
The next phase will show whether Thinking Machines can move from promising research demonstrations to systems that actually work reliably in the real world, handling the messy variability of actual human interaction at the speed customers expect.


