Alibaba-backed AI startup PixVerse launches real-time video tool that lets users direct scenes on the fly
PixVerse wants to turn video creation into a live experience. No waiting. No render bars. No export screens. Just instant control as scenes unfold.
The Alibaba-backed AI startup on Tuesday rolled out a real-time AI video tool that lets users steer what happens mid-generation. Think of it like directing a scene as it’s being filmed. Characters can cry, dance, freeze, or strike a pose on command, with changes happening instantly while the video keeps rolling.
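PixVerse has not published implementation details or a public API for the feature, so the following is a minimal, purely hypothetical Python sketch of the interaction pattern the company describes: a generator produces frames one at a time and folds in viewer directives that arrive between frames. Every name in it (generate_next_frame, realtime_video_loop) is invented for illustration and does not correspond to PixVerse's actual system.

```python
# Conceptual sketch only: NOT PixVerse's API, which is unpublished.
# It illustrates "steering mid-generation": frames are produced
# continuously, and directives queued by the viewer between frames
# update the conditioning for the frames that follow.

import queue
import time


def generate_next_frame(prompt: str, frame_index: int) -> str:
    """Hypothetical stand-in for a frame-by-frame video model call."""
    return f"frame {frame_index}: rendered with prompt '{prompt}'"


def realtime_video_loop(base_prompt: str,
                        directives: "queue.Queue[str]",
                        num_frames: int = 10):
    """Yield frames one at a time, folding in directives as they arrive."""
    prompt = base_prompt
    for i in range(num_frames):
        # Fold in any directives queued since the last frame
        # (e.g. "make her cry", "freeze mid-spin").
        while not directives.empty():
            prompt = f"{prompt}, {directives.get_nowait()}"
        yield generate_next_frame(prompt, i)
        time.sleep(0.03)  # rough ~30 fps pacing in a real system


if __name__ == "__main__":
    user_directives: "queue.Queue[str]" = queue.Queue()
    frames = realtime_video_loop("a dancer on a rooftop", user_directives, num_frames=5)
    print(next(frames))                      # first frame uses the base prompt
    user_directives.put("freeze mid-spin")   # viewer steers the scene on the fly
    for frame in frames:
        print(frame)                         # later frames reflect the directive
```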
Co-founder Jaden Xie calls it a shift in how people interact with AI video. Real-time AI video generation can create "new business models," he told CNBC in an interview translated from Chinese. He pointed to interactive micro-dramas where viewers influence storylines, or "infinite" video games that never hit a scripted wall.
PixVerse launched in 2023 and raised more than $60 million last fall in a round led by Alibaba, with Antler joining in. Xie says another funding round is nearly closed, though he declined to share details. More than half of the incoming investors are based outside China, he added.
The timing matters. China-based teams have quietly taken the lead in AI video generation. Data from benchmarking firm Artificial Analysis shows that seven of the top eight video models come from Chinese companies, with Israeli startup Lightricks as the lone exception. Many offer faster speeds and lower pricing than OpenAI’s premium Sora 2 Pro.
“Sora still defines the quality ceiling in video generation, but it is constrained by generation time and API cost,” said Wei Sun, a principal analyst at Counterpoint. “Chinese players are taking a different path. They are turning video generation into a scalable, low-cost, high-throughput production tool,” Sun added.
Beijing-based Shengshu showed what that path looks like last month. Working with researchers from Tsinghua University, the startup claimed its TurboDiffusion framework can generate videos 100 to 200 times faster with minimal quality loss.
PixVerse is betting speed alone isn’t enough. Its new tool is integrated into the company’s social-style sharing platform, which reached 16 million monthly active users in October. Real-time generation removes the gap between creation and distribution, Xie said, changing how people consume AI content.
Growth targets are aggressive. Xie aims to reach 200 million registered users by mid-year, up from 100 million last August. Headcount could double to nearly 200 employees by year’s end.
Most users sit outside China, accessing PixVerse through its web platform and mobile app. That global reach is part of the strategy.
Compared with Chinese tools, “most of the U.S. products are relatively simplistic and minimal” in interface and experience, said Alyssa Lee, chief of staff at DataHub and former Bessemer VP. She sees scenario-focused AI tools carving out clearer revenue paths, pointing to Adobe as a legacy player under pressure. Adobe’s stock has stalled in recent months, she said, signaling that its all-in-one creative suite could get picked apart by specialized AI marketing tools.
PixVerse reported $40 million in annual recurring revenue in October. It isn’t alone. Kling, an AI video product built by TikTok rival Kuaishou, generated close to $100 million in revenue during the first three quarters of 2025, based on CNBC’s calculations from public filings.
For now, Xie says PixVerse prioritizes product over profit. He claims the company has enough capital to operate for a decade.
Quality concerns still linger. Critics often dismiss AI video as “slop,” a flood of low-grade content. Xie pushes back, comparing today’s output to early computer graphics.
“At the beginning, there will be good and bad [content], but gradually the fittest will surely survive … and then some people will improve the technology, and truly meet human needs for emotional and spiritual value.”
PixVerse is betting that the future arrives faster when creators stay in the driver’s seat — frame by frame, in real time.

