
Training and inference for large AI models sound sophisticated, but frankly they're just "fortune telling" – except they predict data, not your love life.

In the AI field, GPUs (graphics processing units) matter more than CPUs (central processing units) – and, more to the point, NVIDIA's GPUs are practically the only ones that count, while Intel and AMD lag far behind.


GPU vs CPU: One is a Gang Fight, One is a Lone Wolf

Imagine training an AI large model is like moving bricks.

A CPU is like an "all-rounder" who can do a bit of everything: calculation, logic, management – however complex the job, it's proficient at all of them. But it has few cores, a few dozen at most. However fast it moves bricks, it can only carry a handful at a time, so no amount of hard work makes it efficient.

But what about a GPU? It has a frightening number of cores – easily thousands or tens of thousands. Each core can only move one brick at a time, but there are just so many of them! With thousands upon thousands of underlings working together, the brick pile disappears fast.

The core task of AI training and inference is "matrix calculation" – simply put, huge grids of numbers lined up for addition, subtraction, multiplication, and division, like a massive pile of red bricks waiting to be moved: simple work that needs no brain.

The "massive core parallelism" capability of the GPU comes in handy, allowing it to handle thousands or tens of thousands of small tasks simultaneously, making it dozens or even hundreds of times faster than a CPU.
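The brick-moving analogy can be sketched in a few lines of Python (NumPy's vectorized matmul stands in here for the GPU's thousands of cores – this is an illustration of the idea, not a benchmark):

```python
import numpy as np

# The "brick pile": multiply two matrices. Each output cell is an
# independent dot product -- exactly the kind of small, brainless
# task a single GPU core handles.
A = np.random.rand(64, 64)
B = np.random.rand(64, 64)

# The "lone wolf" CPU way: compute one cell at a time in a Python loop.
C_serial = np.empty((64, 64))
for i in range(64):
    for j in range(64):
        C_serial[i, j] = sum(A[i, k] * B[k, j] for k in range(64))

# The "gang fight" way: hand over the whole pile at once. NumPy's
# vectorized matmul dispatches all 64*64 dot products in one call.
C_parallel = A @ B

# Same bricks, same wall -- only the number of workers differs.
assert np.allclose(C_serial, C_parallel)
```

On real hardware the parallel version wins by orders of magnitude precisely because every output cell is independent, so thousands of cores can each grab their own brick.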

What about the CPU? It's better suited to serial, complex tasks, such as running a single-player game or writing a document. AI simply has too many bricks; carrying a few dozen at a time, the CPU can't catch up with the GPU even if it works itself to death.


Why Does NVIDIA Dominate? AMD and Intel Cry in the Toilet

Okay, now the question: NVIDIA isn't the only one making GPUs – AMD and Intel sell graphics cards too. So why does the AI community happily hand its money to NVIDIA? The answer is simple and crude – NVIDIA doesn't just sell hardware; it has "kidnapped" the entire ecosystem.

First, the software ecosystem is invincible. NVIDIA has a killer feature called CUDA (a programming platform) tailor-made for its GPUs. For AI engineers writing code to train models, using CUDA is like having a cheat code: simple and efficient. AMD has its own ROCm, and Intel has oneAPI, but both are either not mature enough or painful to use – nowhere near as smooth as CUDA.
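A minimal sketch of what that lock-in looks like in practice, using PyTorch (one popular front-end that sits on top of CUDA; the library choice here is illustrative, not something the article specifies):

```python
# With a CUDA-backed framework, moving work onto an NVIDIA GPU is
# essentially one line. The same code silently falls back to CPU
# elsewhere -- which is exactly why researchers get "converted".
try:
    import torch
    # "cuda" only means something on NVIDIA hardware; that's the moat.
    device = "cuda" if torch.cuda.is_available() else "cpu"
except ImportError:  # PyTorch not installed; degrade gracefully
    torch = None
    device = "cpu"

if torch is not None:
    x = torch.rand(4, 4, device=device)
    y = x @ x  # identical code runs on CPU or GPU; CUDA hides the gap

print(device)
```

The one-line device switch is the whole pitch: write once, and the CUDA stack quietly does the brick-moving for you.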

Second, first-mover advantage plus a market bought with money. NVIDIA bet on AI early, launching CUDA more than a decade ago and all but force-converting AI researchers into "NVIDIA believers". AMD and Intel? By the time they reacted, NVIDIA had already locked down the AI territory. Catch up now? Too late.

Third, the hardware is no slouch either. NVIDIA's GPUs (such as the A100 and H100) are specially optimized for AI, with high memory bandwidth and explosive compute. AMD's and Intel's graphics cards are great for gaming but always fall a bit short on AI tasks. Put simply, NVIDIA drives an "AI brick-moving excavator" while AMD and Intel are still wielding "household shovels" – the efficiency gap speaks for itself.


The Rich and Foolish AI Circle

Therefore, the GPU completely defeats the CPU because "many people are strong", and NVIDIA's dominance is a combination of "hardware + software + foresight".

AMD and Intel are not without opportunities, but they have to work harder, otherwise they can only watch NVIDIA continue to count money until their hands cramp.

In the AI industry, burning money is a daily routine. Choosing NVIDIA's GPU is like buying a "cheat code", which is expensive, but you win at the starting line. Isn't it funny? Before AI saves the world, it first saves NVIDIA's stock price!