
Huawei Launches CloudMatrix 384 AI System

Michael Willson

Huawei has unveiled the CloudMatrix 384, its most advanced AI computing system to date. It combines 384 Ascend 910C chips into a single cluster that rivals, and in some ways exceeds, Nvidia’s latest AI platform. The launch took place at the 2025 World AI Conference in Shanghai and positions Huawei as a key player in large-scale AI infrastructure—especially as the US tightens restrictions on Nvidia’s exports to China.

This article breaks down what CloudMatrix 384 is, how it performs, what makes it different from competitors, and why it matters in the global AI race.


What Is CloudMatrix 384?

CloudMatrix 384 is a high-performance AI cluster built by Huawei. It connects 384 of Huawei’s own Ascend 910C chips using an all-optical interconnect. The system is designed to support the most demanding AI workloads, including large language models and multi-modal inference.

Unlike traditional GPU clusters that chase raw per-chip speed, Huawei's approach bets on system-level performance. By tightly integrating chips, memory, and interconnect, the platform delivers higher throughput and bandwidth for massive AI workloads.

Key Features and System Design

The system uses a unified bus architecture, which allows direct communication between chips at high speed. This is paired with 192 Kunpeng CPUs and 48 terabytes of HBM memory, making it suitable for training and deploying large-scale foundation models.

CloudMatrix 384 supports advanced parallelism strategies like expert parallel (EP320) and uses token optimization for inference acceleration. The onboard software, called CloudMatrix-Infer, handles peer-to-peer token dispatch and memory-efficient model serving.
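
To make the expert-parallel serving idea concrete, here is a minimal sketch of top-k gated token dispatch, the general mechanism behind an EP320-style deployment. Everything here is illustrative: CloudMatrix-Infer's actual internals are not public, and all names and values below are assumptions.

```python
import numpy as np

# Minimal sketch of expert-parallel token dispatch (illustrative only;
# not the CloudMatrix-Infer API). Each token is routed to its top-k
# experts, which in an EP320-style deployment would live on different NPUs.

NUM_EXPERTS = 320   # matches the EP320 degree mentioned above
TOP_K = 2           # experts consulted per token (assumed value)
HIDDEN = 64         # toy hidden size for the demo

rng = np.random.default_rng(0)
tokens = rng.standard_normal((8, HIDDEN))           # batch of 8 token vectors
gate_w = rng.standard_normal((HIDDEN, NUM_EXPERTS)) # toy gating weights

def dispatch(tokens, gate_w, top_k=TOP_K):
    """Return {expert_id: [token indices]}: the routing table a
    peer-to-peer dispatcher would use to ship tokens between chips."""
    logits = tokens @ gate_w                         # (batch, num_experts)
    chosen = np.argsort(-logits, axis=1)[:, :top_k]  # top-k experts per token
    routing = {}
    for tok_idx, experts in enumerate(chosen):
        for e in experts:
            routing.setdefault(int(e), []).append(tok_idx)
    return routing

routing = dispatch(tokens, gate_w)
print(f"{len(routing)} experts receive work for this batch")
```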

Core Specs and Capabilities of CloudMatrix 384

| Component | Specification | Impact |
| --- | --- | --- |
| Accelerators | 384 Ascend 910C NPUs | High compute power for training and inference |
| Memory | 48 TB HBM | Supports large context windows in LLMs |
| CPUs | 192 Kunpeng cores | Coordinates scheduling and task management |
| Interconnect | All-optical supernode | Low latency, high bandwidth between chips |
| Peak compute (BF16) | Up to 300 PFLOPS | Exceeds Nvidia GB200 NVL72 (180 PFLOPS) |

How It Compares to Nvidia

Huawei openly admits that a single Ascend 910C is not as powerful as Nvidia’s best chips. But CloudMatrix makes up for that with scale. By integrating more chips with better system design, Huawei claims to deliver superior performance at the cluster level.

CloudMatrix 384 offers 3.6 times the memory capacity and more than twice the memory bandwidth of Nvidia's GB200 NVL72. Its peak BF16 compute is roughly 67 percent higher. That scale comes at a cost, however: the system draws around 559 kilowatts, roughly 2.3 times more power per unit of compute than Nvidia's rack.
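
Those headline ratios follow directly from the published figures. Here is a quick back-of-envelope check in Python, using the numbers from the comparison table below (the ~13.5 TB and ~145 kW NVL72 figures are commonly cited estimates, not Huawei's):

```python
# Back-of-envelope comparison from the figures in the table below.
cm384 = {"pflops_bf16": 300, "hbm_tb": 48,   "power_kw": 559}
nvl72 = {"pflops_bf16": 180, "hbm_tb": 13.5, "power_kw": 145}  # estimates

print(f"Compute advantage: {cm384['pflops_bf16'] / nvl72['pflops_bf16']:.2f}x")
print(f"Memory advantage:  {cm384['hbm_tb'] / nvl72['hbm_tb']:.1f}x")

# The trade-off: performance per watt favors Nvidia by roughly 2.3x.
for name, system in (("CloudMatrix 384", cm384), ("GB200 NVL72", nvl72)):
    print(f"{name}: {system['pflops_bf16'] / system['power_kw']:.2f} PFLOPS/kW")
```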

CloudMatrix 384 vs Nvidia GB200 NVL72

| Feature | Huawei CloudMatrix 384 | Nvidia GB200 NVL72 | Winner |
| --- | --- | --- | --- |
| Number of AI chips | 384 Ascend 910C | 72 Blackwell GPUs | Huawei (more chips) |
| Memory | 48 TB HBM | ~13.5 TB HBM3e | Huawei |
| Peak compute (BF16) | 300 PFLOPS | 180 PFLOPS | Huawei |
| Power consumption | ~559 kW | ~145 kW | Nvidia (more efficient) |
| Interconnect | All-optical supernode | NVLink switch | Huawei (lower latency) |

Strategic Value for China

CloudMatrix 384 is more than a technical achievement. It represents China’s push to build domestic AI hardware and reduce reliance on US-based companies like Nvidia. With export controls limiting access to advanced chips, Huawei’s system gives China a homegrown alternative.

Huawei invests around ¥180 billion per year in R&D. CloudMatrix shows that this focus has shifted from single-chip performance to full-stack ecosystem control: hardware, compilers, interconnects, and training software, all built in-house.

Use Cases and Workload Targets

Huawei built CloudMatrix 384 for:

  • Training foundation models
  • Real-time inference at scale
  • Multi-modal systems using vision and language
  • Expert parallel models for higher throughput

The system sustains inference speeds of over 6,600 tokens per second per chip for prefill and close to 2,000 tokens per second for decoding, while keeping per-token decode latency under 50 milliseconds. Under a tighter 15 ms latency cap, it still delivers 538 tokens per second per chip using INT8 quantization. A rough scaling of these figures to the full cluster follows below.
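
To put those per-chip numbers in cluster terms, here is a rough, illustrative Python calculation. It idealizes heavily (it assumes all 384 NPUs serve concurrently with zero scheduling overhead) and uses the rounded figures quoted above:

```python
# Rough, idealized scaling of the per-chip figures quoted above.
# Assumes all 384 NPUs serve concurrently with no overhead.
CHIPS = 384
PREFILL_TPS = 6_600   # tokens/s per chip, prefill (article figure)
DECODE_TPS = 2_000    # tokens/s per chip, decode (rounded, per article)

print(f"Cluster-wide prefill: {CHIPS * PREFILL_TPS:,} tokens/s")
print(f"Cluster-wide decode:  {CHIPS * DECODE_TPS:,} tokens/s")

# A 15 ms per-token latency cap limits any single sequence to about
# 1 / 0.015 ≈ 67 tokens/s, so reaching 538 tokens/s per chip under
# that cap implies roughly 538 / 67 ≈ 8 sequences decoding in parallel.
seq_tps = 1 / 0.015
print(f"Implied concurrent sequences per chip: {538 / seq_tps:.1f}")
```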

Who Should Care

This launch is significant for researchers, enterprises, and governments focused on sovereign AI infrastructure. If you work in AI deployment, edge computing, or national computing policy, CloudMatrix 384 is a case study in scaling up with constraints.

To understand how large systems like this are designed and optimized, professionals should explore programs like the AI Certification. For engineers working with model performance and system bottlenecks, the Data Science Certification offers practical insights. For strategy, enterprise use, and commercialization, the Marketing and Business Certification is ideal.

Final Takeaway

Huawei’s CloudMatrix 384 is a bold response to AI chip restrictions and growing demand for domestic compute infrastructure. By focusing on system architecture and tight integration, Huawei has built an AI cluster that rivals Nvidia’s best—at least at the data center level.

The real test will be adoption. If Chinese tech firms, universities, and government agencies switch to CloudMatrix for large-scale AI tasks, it could change the global balance in AI hardware.

Huawei isn’t just building chips—it’s building control over the AI future, one cluster at a time.
