Backpropagation Chain Rule Visualiser

TriWei AI Lab

Trace forward activations, inspect gradients at each node, and validate updates against finite differences.

[Figure: computational graph with chain-rule flow, illustrating a small two-layer neural network.]
How to play + what to look for
  • Goal: watch the chain rule compute gradients for a tiny 2→2→1 network.
  • Change inputs x₁, x₂ and target y. Inspect forward activations and backward partials.
  • Click Step to apply one SGD update using the displayed gradients.
  • Keyboard: S=Step, R=Reset, G=Gradient check.
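The Step button applies plain gradient descent. A minimal sketch of one such update (the learning rate and parameter values here are placeholders, not the lab's actual settings):

```python
def sgd_step(theta, grad, lr=0.1):
    # One SGD update per parameter: theta <- theta - lr * dL/dtheta.
    # lr = 0.1 is a placeholder; the lab's learning rate may differ.
    return theta - lr * grad

w = 0.5          # current weight (hypothetical value)
dL_dw = 0.2      # gradient from the backward pass (hypothetical value)
w = sgd_step(w, dL_dw)
```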

Real nets have vectorized ops, batching, and careful numerical stability tricks.

Learning objectives

  • Concept focus: understand how the chain rule propagates gradients through a simple neural network.
  • Core definition: the derivative of a composite function is the product of derivatives along the computational graph.
  • Common mistake: forgetting to multiply by the activation derivative or mixing up the order of matrix dimensions.
  • Why it matters: backpropagation is the backbone of training deep neural networks and relies on these same principles.
  • Toy disclaimer: this two-layer network is for illustration only; real models use batches, vectorized operations and more complex architectures.

This lab shows backprop “in the small”: a 2→2→1 network. You can step through the forward-pass values, inspect the computational graph, then run the backward pass to compute the partial derivatives. A finite-difference gradient check verifies one parameter.
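The whole lab fits in a few lines of code. Here is a sketch of the forward and backward passes for the 2→2→1 network, assuming the sigmoid hidden layer, linear output, and squared-error loss stated in the Math + Sources section; all parameter values are made up for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(params, x):
    """Forward pass: sigmoid hidden layer, linear output."""
    W1, b1, W2, b2 = params
    z1 = W1 @ x + b1        # hidden pre-activation, shape (2,)
    a1 = sigmoid(z1)        # hidden activation, shape (2,)
    y_hat = W2 @ a1 + b2    # linear output, scalar
    return z1, a1, y_hat

def loss(params, x, y):
    return 0.5 * (forward(params, x)[2] - y) ** 2

def backward(params, x, y):
    """Backward pass: the chain rule applied node by node."""
    W1, b1, W2, b2 = params
    z1, a1, y_hat = forward(params, x)
    delta2 = y_hat - y                    # dL/d(y_hat)
    dW2 = delta2 * a1                     # dL/dW2
    db2 = delta2                          # dL/db2
    delta1 = delta2 * W2 * a1 * (1 - a1)  # dL/dz1, via sigmoid'(z1)
    dW1 = np.outer(delta1, x)             # dL/dW1
    db1 = delta1                          # dL/db1
    return dW1, db1, dW2, db2

# Made-up numbers, purely for illustration.
x, y = np.array([0.5, -0.3]), 0.7
params = (np.array([[0.1, 0.2], [-0.4, 0.3]]),  # W1
          np.array([0.0, 0.1]),                 # b1
          np.array([0.5, -0.2]),                # W2
          0.05)                                 # b2
grads = backward(params, x, y)
```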

Computational graph

Backprop is the chain rule applied along this graph. See CS231n: optimization-2.

Numbers

The live panels display the inputs / target, the forward values, the current parameters, the gradients, the finite-difference grad check, and the loss.
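The grad check compares each analytic gradient against a central finite difference. The idea, sketched on a stand-in scalar loss (the quadratic here replaces the lab's network loss):

```python
def loss(w):
    # Stand-in scalar loss; the lab uses the network's squared error instead.
    return 0.5 * (3.0 * w - 0.7) ** 2

def analytic_grad(w):
    # d/dw [0.5 * (3w - 0.7)^2] = 3 * (3w - 0.7)
    return 3.0 * (3.0 * w - 0.7)

def finite_diff_grad(w, eps=1e-6):
    # Central difference: O(eps^2) accurate.
    return (loss(w + eps) - loss(w - eps)) / (2.0 * eps)

w = 0.4
rel_err = (abs(analytic_grad(w) - finite_diff_grad(w))
           / max(1e-12, abs(analytic_grad(w))))
```

A tiny relative error (on the order of eps²) is strong evidence that the analytic gradient is correct.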
Math + Sources

Hidden pre-activation: \(z^{(1)} = W^{(1)}x + b^{(1)}\), activation \(a^{(1)}=\sigma(z^{(1)})\). Output: \(\hat y = W^{(2)} a^{(1)} + b^{(2)}\) (linear). Loss: \(L=\frac{1}{2}(\hat y - y)^2\).

Backprop gradients follow from the chain rule; for a two-layer network, the standard vectorised formulas are summarized in many course notes, e.g. Stanford CS231n lecture slides and notes: CS231n Lecture 4 (PDF).
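For this particular network, the chain rule yields the following (a sketch in the notation above, where \(\delta\) denotes the error signal at each layer and \(\odot\) is the elementwise product):

\[
\delta^{(2)} = \frac{\partial L}{\partial \hat y} = \hat y - y,\qquad
\frac{\partial L}{\partial W^{(2)}} = \delta^{(2)} (a^{(1)})^\top,\qquad
\frac{\partial L}{\partial b^{(2)}} = \delta^{(2)},
\]
\[
\delta^{(1)} = \big((W^{(2)})^\top \delta^{(2)}\big) \odot \sigma'(z^{(1)}),\qquad
\sigma'(z) = \sigma(z)\,(1-\sigma(z)),
\]
\[
\frac{\partial L}{\partial W^{(1)}} = \delta^{(1)} x^\top,\qquad
\frac{\partial L}{\partial b^{(1)}} = \delta^{(1)}.
\]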

Collaboration Credits

These interactive labs are the result of a close collaboration between a human author and an AI assistant (ChatGPT). The AI contributed algorithmic refinements, numerical safeguards and visual improvements, while the human designed the pedagogical structure, reviewed all code, and ensured educational accuracy. Mathematical formulas and derivations are referenced to reputable course notes and textbooks. All code runs entirely in the browser; no data is sent to any server.