Huawei’s Ascend 910C-powered system reportedly outperforms Nvidia’s H100 on key metrics

Just a week after announcing plans for mass shipments of its Ascend 910C AI chip to help fill the void left by Nvidia in China, Huawei has taken things even further. A new analysis from SemiAnalysis shows that Huawei’s CloudMatrix 384 system, powered by Ascend 910C chips, has pulled ahead of Nvidia’s latest GB200 NVL72 rack system in several key areas.
The GB200 NVL72, unveiled at Nvidia’s GTC in March 2024, is a massive configuration of 72 Blackwell GPUs and 36 Grace CPUs. It brings serious upgrades over the H100, offering up to 30 times faster large language model inference and 25 times better energy efficiency.

Huawei’s Ascend 910C AI chip
Huawei and China Now Have AI System Capabilities That Can Beat Nvidia’s
On April 16, 2025, SemiAnalysis revealed that Huawei’s CloudMatrix 384 can outperform even Nvidia’s GB200 NVL72 — an achievement that’s not just about speed, but also about signaling a major shift in the U.S.-China tech race.
“A full CloudMatrix system can now deliver 300 PFLOPs of dense BF16 compute, almost double that of the GB200 NVL72. With more than 3.6x aggregate memory capacity and 2.1x more memory bandwidth, Huawei and China now have AI system capabilities that can beat Nvidia’s,” SemiAnalysis reported.
According to SemiAnalysis, while Huawei’s Ascend chip can technically be fabricated at SMIC, it’s far from a fully domestic product. It still relies heavily on global supply chains—using HBM memory from Korea, primary wafer production from TSMC, and chip fabrication equipment sourced from the U.S., the Netherlands, and Japan. SemiAnalysis notes that while China has made progress, much of Huawei’s success skates close to existing export controls. The report argues the U.S. government should shift its focus to these supply chain loopholes if it wants to slow China’s advances in AI.
Huawei may be behind by a generation in chip technology, but its system-level strategy is ahead of what Nvidia and AMD currently offer. So, what powers Huawei’s CloudMatrix 384?
The CloudMatrix 384 stitches together 384 Ascend 910C chips in a full mesh network. The logic is straightforward: even though each Ascend chip delivers about one-third the performance of Nvidia’s Blackwell GPUs, packing more than five times as many chips more than makes up for it.
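A back-of-the-envelope check of that scale argument, using only the rounded figures in the text (one Ascend 910C at roughly one-third of a Blackwell GPU, 384 chips versus 72):

```python
# Illustrative arithmetic only, using the article's rounded per-chip estimate.
blackwell_per_chip = 1.0                     # normalized Blackwell GPU performance
ascend_per_chip = blackwell_per_chip / 3     # ~1/3 of Blackwell, per the text

gb200_chips = 72
cloudmatrix_chips = 384

gb200_total = gb200_chips * blackwell_per_chip
cloudmatrix_total = cloudmatrix_chips * ascend_per_chip

print(f"chip-count ratio:   {cloudmatrix_chips / gb200_chips:.1f}x")   # 5.3x
print(f"system-level ratio: {cloudmatrix_total / gb200_total:.2f}x")   # 1.78x
```

With these rough inputs the system comes out well ahead despite the weaker chips; the exact reported figures (300 vs. ~150 petaFLOPs BF16) put the real system-level gap at roughly 2x.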
Huawei Prepares Its Next Strike: The Ascend 910D
The performance news surfaced just as The Wall Street Journal reported that Huawei is getting ready to test its new Ascend 910D chip, built to compete directly with Nvidia’s H100. The company has already approached several Chinese tech firms to begin trials of the 910D, aiming to deliver something stronger than the H100.
“Huawei Technologies is gearing up to test its newest and most powerful artificial-intelligence processor, which the company hopes could replace some higher-end products of U.S. chip giant Nvidia,” The Wall Street Journal reported.
Initial samples of the 910D are expected by late May, with mass production following soon after, according to WSJ. The move is seen as another clear effort by Huawei to build a homegrown alternative to Nvidia’s AI dominance, especially after tightening U.S. export controls locked Chinese firms out of Nvidia’s most advanced products.
Meanwhile, a CNBC segment added some confusion, incorrectly suggesting that Huawei’s big performance jump came from the 910D. SemiAnalysis quickly pointed out that the credit belongs to the 910C chip, which has been in production and is already running in real-world deployments.
The CloudMatrix 384 vs GB200 NVL72: By the Numbers
Huawei’s CloudMatrix 384 isn’t just a set of powerful chips—it’s a full system built to train and run trillion-parameter AI models. The architecture connects 384 Ascend 910C chips in an all-optical mesh network. When SemiAnalysis lined it up against Nvidia’s GB200 NVL72, the results were clear.
Here’s where Huawei’s CloudMatrix 384 pulls ahead:
- Compute Power: 300 petaFLOPs of BF16 performance, roughly double the GB200 NVL72’s ~150 petaFLOPs.
- Memory Capacity: 49.2 terabytes of HBM, about 3.6 times the GB200’s 13.8 terabytes of HBM3e.
- Memory Bandwidth: 1,229 terabytes per second, over twice the GB200’s 576 terabytes per second.
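The headline multipliers follow directly from the listed spec figures, as a quick check shows:

```python
# Ratios computed from the figures in the comparison above.
specs = {
    # metric: (CloudMatrix 384, GB200 NVL72)
    "BF16 compute (PFLOPs)":   (300.0, 150.0),
    "HBM capacity (TB)":       (49.2, 13.8),
    "Memory bandwidth (TB/s)": (1229.0, 576.0),
}

for metric, (huawei, nvidia) in specs.items():
    print(f"{metric}: {huawei / nvidia:.1f}x")
# BF16 compute (PFLOPs): 2.0x
# HBM capacity (TB): 3.6x
# Memory bandwidth (TB/s): 2.1x
```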
These gains matter because AI workloads are heavily dependent on compute strength, memory size, and bandwidth. Faster training and quicker model inference can mean the difference between staying ahead—or falling behind—in fields like scientific research, autonomous vehicles, and generative AI.
Since the GB200 NVL72 already delivers massive improvements over the H100, Huawei’s edge here suggests that CloudMatrix 384 can outperform H100-based systems too, such as Nvidia’s DGX H100, which delivers around 16 petaFLOPs BF16 across eight H100 GPUs.
But there’s a trade-off Huawei can’t avoid. CloudMatrix 384 consumes nearly four times more energy than Nvidia’s GB200 NVL72 and delivers 2.3 times worse performance per watt. Huawei wins on brute force, but Nvidia still holds the crown for efficiency—a major advantage in power-hungry data centers.
Clearing Up the Chip Mix-Up
The rush to report Huawei’s breakthrough created some confusion. CNBC, during an April 28 segment, cited SemiAnalysis’ findings but mistakenly credited the performance lead to the new Ascend 910D chip.
“According to SemiAnalysis, their solutions now outperform Nvidia in several key metrics,” CNBC’s Kristina Partsinevelos said during the segment.
In reality, as the Wall Street Journal and SemiAnalysis clarified, the 910D is still in the testing phase, with no verified performance data yet. The system that made waves—CloudMatrix 384—is built with the Ascend 910C, which has been shipping to customers like ByteDance.
As for raw chip performance, Nvidia’s H100 still outmatches a single Ascend 910C. A lone H100 delivers 4 petaFLOPs of FP8 compute and 3.35 terabytes per second of memory bandwidth, while the 910C hits about 2.4 petaFLOPs at FP16, roughly 60% of the H100’s figure.
Huawei’s advantage comes from scale, not individual chip strength. The way it links hundreds of 910C chips together in a high-speed optical network is what allows CloudMatrix 384 to surpass H100-based systems. Without that scale, the 910C alone wouldn’t be enough.
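The per-chip gap and the scale advantage can be put side by side with the figures cited above (note the comparison mixes an FP8 figure for the H100 with an FP16 figure for the 910C, as in the original reporting):

```python
# Per-chip comparison, using the figures cited in the text.
h100_fp8_pflops = 4.0      # single H100, FP8
ascend_fp16_pflops = 2.4   # single Ascend 910C, FP16

per_chip_ratio = ascend_fp16_pflops / h100_fp8_pflops
print(f"910C vs H100 per chip: {per_chip_ratio:.0%}")   # 60%

# Scale: CloudMatrix 384 fields 48x as many chips as an 8-GPU DGX H100 node,
# which is where the system-level lead comes from.
print(f"chip count vs DGX H100: {384 // 8}x")           # 48x
```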
The Sanctions Factor
Huawei’s rise isn’t just about technical milestones—it’s about surviving and innovating under sanctions.
Since the U.S. blacklisted the company in 2019, Huawei has been blocked from accessing top chipmakers like TSMC and high-end memory suppliers. Yet it found workarounds. According to SemiAnalysis, Huawei indirectly uses TSMC’s 7nm process by working through third parties like Sophgo. It also sources HBM2E memory chips from Samsung using intermediaries, staying two generations behind Nvidia’s newer HBM3e technology.
While Huawei pushed ahead, U.S. restrictions tightened the noose around Nvidia’s China business. By 2025, not only were H100 exports banned, but the downgraded H20 chips were blocked too, slashing Nvidia’s potential China revenue and forcing a $5.5 billion write-off, according to reports from Reuters and DCD.
Huawei seized the moment, planning to ship about 1 million 910C chips in 2025 to meet surging demand inside China.
Markets responded fast. After the news broke about Huawei’s performance gains, Nvidia’s stock dropped by around 2%, a sign that investors are watching this tech battle very closely.
Challenges Still Ahead
Despite the big performance wins, Huawei isn’t home free. Big challenges remain:
- Energy Demands: Running data centers with 3.9x more power consumption isn’t easy—or cheap.
- Technology Gap: Huawei’s chips are built on older 7nm process technology with slower HBM2E memory, compared to Nvidia’s advanced HBM3e.
- Software Ecosystem: Nvidia’s CUDA framework remains a core part of AI development, while Huawei’s software still trails behind.
The upcoming Ascend 910D could be the next test. According to WSJ, the 910D is expected to directly challenge the H100 when it becomes available. But for now, it’s the Ascend 910C—and the system-level design of CloudMatrix 384—that’s giving Nvidia real competition.
This fight isn’t just about chips anymore. It’s about who can build better systems, scale faster, and survive the pressures of global politics.
Sources:
- SemiAnalysis, “Huawei AI CloudMatrix 384: China’s Answer to Nvidia GB200 NVL72,” April 16, 2025
- The Wall Street Journal, April 27, 2025 (Ascend 910D testing)
- CNBC, April 2025 (market commentary and 910D error)
- Reuters and DCD (Nvidia sanctions, H20 write-off)
- Tom’s Hardware (Ascend 910C performance benchmarks)