Huawei is set to begin testing its next-generation HiSilicon Ascend 910D AI processor, with the goal of outperforming Nvidia’s H100, according to a Reuters report. While the Ascend 910D will be slower than Nvidia’s Blackwell B200 and B300 GPUs on a chip-to-chip basis, Huawei’s strategy of building pods with hundreds of processors could enable it to compete against Nvidia’s current and upcoming GPU-based systems.
Huawei’s Approach to AI Processing
Huawei plans to start large-scale shipments of its dual-chiplet Ascend 910C AI processors to Chinese customers as early as next month. Most of these processors were reportedly produced by TSMC on behalf of a third-party company. How the Ascend 910D will be produced remains unclear: it is not known whether the chip will be manufactured by China-based SMIC or whether Huawei will find a way to circumvent U.S. sanctions.
Performance Comparison
The company’s latest Ascend 910C offers around 780 BF16 TFLOPS, significantly less than Nvidia’s H100, which is rated at approximately 2,000 BF16 TFLOPS (Nvidia’s peak figure with sparsity). To reach H100 performance levels, Huawei will need to redesign the internal architecture for the Ascend 910D and potentially increase the number of compute chiplets. Huawei’s CloudMatrix 384 system, which combines 384 Ascend 910C processors, can reportedly outperform Nvidia’s GB200 NVL72 in certain workloads, but at the cost of higher power consumption.
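The arithmetic behind the pod strategy can be sketched in a few lines of Python. This is a rough illustration only: it uses the peak BF16 figures quoted above, the function names and the `scaling_efficiency` parameter are hypothetical, and real systems lose throughput to interconnect and software overheads that peak numbers do not capture.

```python
import math

# Peak BF16 figures quoted in the article, in TFLOPS.
# These are vendor peak ratings, not measured throughput.
ASCEND_910C_TFLOPS = 780
H100_TFLOPS = 2_000
CLOUDMATRIX_CHIPS = 384


def pod_peak_tflops(chips, per_chip_tflops, scaling_efficiency=1.0):
    """Aggregate peak throughput of a pod.

    scaling_efficiency is a hypothetical fudge factor (0..1) for
    interconnect and software losses; 1.0 means perfect scaling.
    """
    return chips * per_chip_tflops * scaling_efficiency


def chips_to_match(target_tflops, per_chip_tflops):
    """Smallest whole number of chips whose combined peak meets the target."""
    return math.ceil(target_tflops / per_chip_tflops)


if __name__ == "__main__":
    pod = pod_peak_tflops(CLOUDMATRIX_CHIPS, ASCEND_910C_TFLOPS)
    print(f"CloudMatrix 384 peak: {pod / 1000:.1f} PFLOPS")
    print(f"H100 / 910C per-chip ratio: {H100_TFLOPS / ASCEND_910C_TFLOPS:.2f}")
    print(f"910Cs per H100 (peak): {chips_to_match(H100_TFLOPS, ASCEND_910C_TFLOPS)}")
```

On these figures a single H100 has roughly 2.6 times the peak throughput of a 910C, so Huawei closes the gap by deploying more chips per system, which is also why power consumption rises.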
Challenges Ahead
Without access to leading-edge process technologies, Huawei will find it difficult to stay competitive. Nvidia is set to introduce its Rubin GPUs in 2026. They will be made on TSMC’s N3 process and should offer higher performance per watt than current Blackwell GPUs. The Rubin GPUs are expected to deliver around 8,300 TFLOPS of FP8 training performance, roughly twice that of the B200.
Future Prospects
Despite the likely performance gap, Huawei’s Ascend 910D is likely to become China’s go-to AI training processor. Given the strategic importance of AI, power consumption is unlikely to be treated as a limiting factor, since deploying more units can compensate for lower per-chip efficiency. The main challenge for China will be producing enough processors, either domestically or through overseas proxy companies.