
Does a Machine Think?


A Mathematical Inquiry into AI · Part VII


If "thinking" is nothing but matrix multiplication, then what is thought?

Let us begin with the most uncomfortable fact. Everything AI does that looks like "thought" (writing poetry, proving theorems, appearing to empathize with your feelings) is, at its physical core, matrix multiplication: thousands of chips performing multiplications and additions simultaneously. That is all. If this disturbs you, that is precisely where this exploration begins.

00 — AN UNCOMFORTABLE STARTING POINT

Does Thought Happen Step by Step, or All at Once?

When you solve a math problem, you think step by step. But the moment you see a landscape, billions of neurons fire simultaneously. A CPU works like the first; a GPU works like the second. And for AI's "thinking," the second wins by an overwhelming margin.

[Interactive demo: one genius vs. thousands of workers. A CPU (4 cores, sequential) races a GPU (hundreds of cores, parallel), with live progress bars and a speedup readout.]

Here is the uncomfortable question. When AI writes a poem, what happens inside is a 4096×4096 matrix multiplication: 68 billion multiply-adds. This is the physical substance of "creativity." On a CPU, one operation at a time, it takes seconds; on a GPU, in parallel, milliseconds. This speed difference is what separates "AI that converses in real time" from "AI that takes 10 minutes per sentence."
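The 68-billion figure is simple arithmetic: multiplying two n×n matrices takes n multiply-adds for each of the n² output elements. A quick check for the 4096 dimension quoted above:

```python
# Multiply-add count for C = A @ B with A, B both n x n:
# each of the n*n output elements is a dot product of length n,
# i.e. n multiply-add operations, so n**3 in total.
n = 4096
macs = n * n * n
print(f"{macs:,} multiply-adds")  # 68,719,476,736 -> ~68.7 billion
```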

01 — THE IDENTITY OF EVERYTHING

Matrix Multiplication: The Physical Identity of Thought

The attention from Part II, the GANs of Part IV, the speech synthesis of Part V, the image recognition of Part VI: the operation underlying everything we have explored in this series is one thing. Matrix multiplication. C = A × B. This is the atom of machine thought.

[Interactive demo: matrix multiplication as row × column dot products. A of size 4×3 times B of size 3×4; a tone sounds as each of the 48 operations completes.]
C[i,j] = Σ_k A[i,k] · B[k,j]. Each element is independent, hence perfectly parallelizable.
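A minimal sketch of the formula above in plain Python. The key observation: the loop body for each (i, j) reads only row i of A and column j of B, and never another output cell, which is exactly the independence that lets a GPU compute every element of C at once. With the 4×3 and 3×4 sizes from the demo, the counter lands on the same 48 multiply-adds:

```python
def matmul(A, B):
    """Naive C[i][j] = sum_k A[i][k] * B[k][j], counting multiply-adds."""
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0] * cols for _ in range(rows)]
    ops = 0
    for i in range(rows):          # each (i, j) pair is independent:
        for j in range(cols):      # a GPU assigns one thread per output cell
            for k in range(inner):
                C[i][j] += A[i][k] * B[k][j]
                ops += 1
    return C, ops

A = [[1, 2, 3]] * 4                # 4x3
B = [[1, 0, 0, 1]] * 3             # 3x4
C, ops = matmul(A, B)
print(ops)                         # 4 outputs x 4 outputs x 3 terms = 48
```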

02 — AN ACCIDENTAL REVOLUTION

GPU Architecture

The great irony of the AI revolution: the chip that made it all possible was built for gaming. Teenagers bought graphics cards for more realistic explosions and shadows; nobody imagined those cards would reshape the intellectual history of humanity.

[Diagram: CPU vs GPU core count and structure]

NVIDIA GPU Evolution

| GPU | Year | CUDA cores | AI impact |
| --- | --- | --- | --- |
| GTX 580 | 2010 | 512 | Used to train AlexNet |
| K80 | 2014 | 4,992 | First datacenter AI GPU |
| V100 | 2017 | 5,120 + 640 Tensor | Transformer training standard |
| A100 | 2020 | 6,912 + 432 Tensor | GPT-3 training |
| H100 | 2022 | 16,896 + 528 Tensor | GPT-4, Claude training |
| B200 | 2024 | 18,432 + 1,152 Tensor | Next-gen model training |
| B300 (Blackwell Ultra) | 2025 | Undisclosed | 5th-gen NVLink, peak performance |

03 — THE PRICE OF A THOUGHT

FLOPS: How Much Does One "Thought" Cost?

Human thought feels free. Machine "thought" has an exact price tag. Training GPT-4 required approximately 10²⁵ operations, at an estimated cost of $100 million. Even when you ask an AI "what's the weather today?", billions of matrix multiplications execute. Every thought has a cost in silicon and electricity.

[Chart: computation required for AI training, log scale]
GPT-4 training ≈ 2 × 10²⁵ FLOPs ≈ 25,000 H100s × 3 months
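The "25,000 H100s × 3 months" figure is consistent with the 2 × 10²⁵ FLOPs total, but only under a modest utilization assumption. A back-of-envelope sketch; the per-GPU peak and the 10% sustained utilization here are illustrative assumptions, not measured numbers:

```python
# Rough check: total FLOPs / sustained cluster throughput = wall-clock time.
total_flops = 2e25                 # estimated GPT-4 training compute
gpus = 25_000
peak_per_gpu = 1e15                # ~1 PFLOP/s peak per GPU (assumed)
utilization = 0.10                 # assumed sustained fraction of peak

seconds = total_flops / (gpus * peak_per_gpu * utilization)
days = seconds / 86_400
print(f"{days:.0f} days")          # ~93 days, i.e. roughly 3 months
```

Real clusters sustain anywhere from 10% to 40% of peak depending on model and interconnect, so the point is the order of magnitude, not the exact day count.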

04 — A CHIP BORN ONLY TO MULTIPLY

Tensor Cores: Hardware Dedicated to "Thought"

Eventually, humanity built hardware dedicated solely to matrix multiplication. A Tensor Core performs a 4×4 matrix multiply in a single clock cycle. Not general-purpose computing: silicon designed exclusively for AI's "thinking." The moment machine thought became important enough to deserve its own physical organ.
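What "a 4×4 multiply per clock" buys you is blocking: a large matmul decomposes exactly into sums of small tile products, so hardware that executes one tile per cycle chews through the whole matrix. A sketch of that decomposition in plain Python; the tile size of 4 is chosen only to mirror the Tensor Core shape, and real kernels pick tiles to fit registers and shared memory:

```python
def matmul_tiled(A, B, t=4):
    """Block matmul: output tile (I,J) accumulates A tile (I,K) x B tile (K,J).
    Each t x t tile product is what a Tensor Core executes as one instruction."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for I in range(0, n, t):
        for J in range(0, n, t):
            for K in range(0, n, t):
                # one t x t tile product -- a single Tensor Core op
                for i in range(I, I + t):
                    for j in range(J, J + t):
                        for k in range(K, K + t):
                            C[i][j] += A[i][k] * B[k][j]
    return C

n = 8
A = [[i + j for j in range(n)] for i in range(n)]
I8 = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
assert matmul_tiled(A, I8) == A    # multiplying by the identity returns A
```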

[Interactive demo: CUDA cores (one multiply at a time) vs Tensor Cores (a 4×4 tile at once), with live progress bars and a speedup readout.]

05 — AN UNCOMFORTABLE LAW

Scaling Laws: Can You Buy Intelligence?

In 2020, OpenAI discovered something disconcerting: AI "intelligence" follows a power law in invested compute. Double the GPUs, the electricity, the money, and the performance gain is predictable. This means intelligence is engineerable, and purchasable. And this is the mathematical justification for a multi-trillion-dollar GPU arms race.

L(C) ∝ C^(−α): loss decreases as a power law of compute (α ≈ 0.05)
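The exponent α ≈ 0.05 makes the law concrete: since L(C) ∝ C^(−α), doubling compute multiplies the loss by 2^(−0.05) ≈ 0.966, roughly a 3.4% reduction per doubling. A few lines show how brutally slow that curve is (α taken from the figure above):

```python
alpha = 0.05
per_doubling = 2 ** -alpha          # loss multiplier for 2x compute
print(f"{per_doubling:.4f}")        # ~0.9659: each doubling shaves ~3.4%

# Doublings needed to halve the loss: solve 2^(-alpha*d) = 0.5 -> d = 1/alpha
doublings = 1 / alpha
print(f"{doublings:.0f} doublings, i.e. {2 ** doublings:,.0f}x more compute")
```

Halving the loss costs about a million times more compute, which is why the curve reads as "you can buy intelligence, but the price grows exponentially."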

06 — THE SILICON ARMS RACE

Who Can "Think" the Most?

| Company | Primary chips | Est. GPU count | Flagship model |
| --- | --- | --- | --- |
| Meta | H100 + custom MTIA | ~600,000 H100-equiv | Llama 4 |
| Google | TPU v5e/v6 (Trillium) | ~millions of TPU chips | Gemini 3 |
| Microsoft/OpenAI | H100/H200 + Azure | ~500,000+ | GPT-5.4 |
| Anthropic | H100 (AWS/GCP) | Undisclosed | Claude Opus 4.6 |
| xAI | H100 (Memphis cluster) | ~200,000 | Grok 3 |

One H100 costs roughly $30,000–40,000; GPT-4's training cost is estimated at ~$100M. AI's "thinking" is not free. Tens of thousands of GPUs consume power for months, multiplying matrices over and over, and electricity alone costs millions. This is why AI has become a game for a handful of giants.
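The "electricity alone costs millions" claim survives a napkin check. Every input below is an assumption for illustration: 25,000 GPUs (matching the training estimate earlier), ~700 W board power per GPU, and a $0.10/kWh industrial rate:

```python
# Napkin estimate of training-run electricity cost (all inputs assumed).
gpus = 25_000
watts_per_gpu = 700                 # H100-class board power
hours = 3 * 30 * 24                 # ~3 months of continuous training
price_per_kwh = 0.10                # assumed industrial rate, USD

kwh = gpus * watts_per_gpu * hours / 1000
cost = kwh * price_per_kwh
print(f"{kwh:,.0f} kWh -> ${cost:,.0f}")   # ~37,800,000 kWh -> ~$3.8M
```

And this counts only the GPUs; cooling and networking overhead (datacenter PUE) would push the real bill higher still.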

07 — THE CONNECTION

The Human Brain vs. GPUs

| | Human Brain | GPU Cluster |
| --- | --- | --- |
| 🧠 Units | 86 billion neurons | Thousands of CUDA/Tensor cores |
| ⚡ Speed | ~100 Hz (slow but massively parallel) | ~2 GHz (fast and massively parallel) |
| 🔌 Power | ~20 W (astonishing efficiency) | ~700 W/GPU × tens of thousands |
| 💾 Memory | ~2.5 PB (estimated) | 80 GB/GPU × tens of thousands |
| Core op | Synaptic transmission, plasticity | Matrix multiply, backpropagation |
| 🎓 Learning | Experience-based, years | Data-based, weeks to months |
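One number the comparison implies but never states: the raw power gap. Taking a brain at ~20 W against a 25,000-GPU cluster at ~700 W each (the cluster size is an assumption carried over from the training estimate earlier):

```python
# Power gap between a brain and a training cluster (GPUs only).
brain_watts = 20
cluster_watts = 25_000 * 700        # excludes cooling and networking
ratio = cluster_watts / brain_watts
print(f"{ratio:,.0f}x")             # the cluster draws ~875,000x more power
```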

So, Does a Machine
Think?

In this series we have explored a machine's dreams, understanding, memory,
imagination, voice, and vision.
And now the physical truth beneath all of it is revealed:
matrix multiplication.

This means two things.
Machine thought is nothing more than multiplication.
And multiplication alone can do all of this.

Which of these facts is more astonishing,
nobody yet knows.

โ† Part VI: See
edu.kimsh.kr