Beat 1 · Concrete
A ball finds the valley
Loss is height. Follow the slope downhill and you settle at the lowest point.
coral = high loss (far) chartreuse = settled at the minimum teal = lowest reachable loss
Beat 2 · Abstract
Error flows backward
Predict forward, then push the error right→left, nudging every weight on the way.
teal = forward prediction coral = error flowing back sand = weights (thickness)
Beat 3 · Interactive
Train it to do XOR
Step the gradient and watch loss fall to rest — learning the problem one perceptron couldn't.
loss — · epoch 0
coral = loss (wrong) chartreuse = solved / settled teal = minimum loss
Footnotes & lineage
1986
Rumelhart, Hinton & Williams
"Learning representations by back-propagating errors" popularised the method and reignited connectionism.
The engine
The chain rule
Backprop is just calculus' chain rule applied layer by layer to assign blame for the error to every weight.
Era 05 callback
Hidden layers solve XOR
A single perceptron can't separate XOR. Add a hidden layer trained by backprop and the wall falls.