© 2026 Greg T. Chism · MIT License

Gradient Attribution — Interactive Explorer

See which input features drive model predictions — explore saliency maps, integrated gradients, and attribution comparison


Input Pattern
Click cells on the grid to paint a custom pattern
Model Settings
Target Class
Baseline (for IG)
Black = zeros, White = ones, Noise = random
Integration
Steps (m)
m 50
More steps = more accurate but slower
Playback
Step 0 / 50
Speed
What's happening?
Select an input pattern and press Play to step through gradient attribution. Each step shows how the gradient at that interpolated input contributes to the final attribution map.
Key Concepts
Saliency maps: compute ∂output/∂input — shows which pixels, if changed slightly, would most change the prediction. Fast but sensitive to noise and saturation near ReLU dead zones or sigmoid tails.
Integrated gradients: accumulate gradients along a straight path from a baseline to the input — IG(x) = (x−x′) × ∫₀¹ ∂f(x′+α(x−x′))/∂x dα. Captures contributions at all activation levels, not just the endpoint. Satisfies the completeness axiom.
Completeness axiom: attributions must sum to f(input) − f(baseline) — every unit of prediction difference is accounted for. Vanilla gradients do NOT satisfy this; integrated gradients do by construction.
SmoothGrad: average gradients over N noisy copies of the input x + ε, ε ~ N(0, σ²). Reduces visual noise and sharpens the attribution map without changing the fundamental gradient method.
GradCAM vs pixel attribution: GradCAM uses gradients of the class score with respect to feature map activations — gives coarser but more spatially coherent explanations than pixel-level gradients. Best for spatial localization in CNNs.
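The saliency idea above can be sketched with a tiny stand-in model. Everything in this snippet is an illustrative assumption, not the explorer's actual network: a single sigmoid unit over the flattened 8×8 grid, with random weights, for which ∂f/∂x has the closed form f(1−f)·W.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy stand-in for the explorer's model: one linear unit + sigmoid.
# Weights are illustrative, not the page's trained network.
rng = np.random.default_rng(0)
W = rng.normal(size=64)            # one weight per pixel of the 8x8 grid
b = 0.1

def f(x):
    """Scalar class score for a flattened 8x8 input."""
    return sigmoid(W @ x + b)

def saliency(x):
    """Vanilla gradient df/dx; for this model it equals f(1-f) * W."""
    p = f(x)
    return p * (1.0 - p) * W

x = np.zeros(64)
x[::9] = 1.0                       # diagonal "edge" pattern on the grid
heatmap = saliency(x).reshape(8, 8)
print(heatmap.shape)
```

Positive entries in `heatmap` mark pixels whose increase would raise the class score; negative entries would lower it, matching the red/blue legend in the panels below.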
Saliency Maps — Edge pattern · Neural Net · Class 0
Input Pattern
8 × 8 pixel grid
8×8 input pattern rendered by D3
pixel = 0 pixel = 1
Gradient Heatmap
∂f/∂x at this input
Saliency map rendered by D3
− (toward 0) 0 + (toward 1)
Values normalized to [−1, +1] for display
The saliency map shows ∂f/∂x — red pixels are those where increasing the pixel value would push the prediction toward class 1, blue pixels toward class 0. Magnitude shows sensitivity strength.
Integrated Gradients — step-by-step accumulation
Baseline x′
Black (all zeros)
Baseline input rendered by D3
Interpolated x′ + α(x − x′)
α = 0.50 (step 25 / 50)
Interpolated input rendered by D3
Accumulated Attribution
∑ gradients so far
IG attribution map builds step by step
0 +
Values normalized to [−1, +1] for display
Integration progress step 0 / 50
α = 0.00
Each step α ∈ [0, 1] evaluates ∂f/∂x at x′ + α(x − x′). After all steps the accumulated gradients are scaled by (x − x′)/m — the result satisfies the completeness axiom: attributions sum to f(x) − f(x′).
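The step loop described here can be sketched as a Riemann sum. The single-sigmoid model below is an illustrative assumption (not the page's network); the loop mirrors the interpolation x′ + α(x − x′) and the final (x − x′)/m scaling, and the residual checks completeness.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative one-unit model (assumption: not the explorer's actual net).
rng = np.random.default_rng(0)
W = rng.normal(size=64)
b = 0.1

def f(x):
    return sigmoid(W @ x + b)

def grad_f(x):
    p = f(x)
    return p * (1.0 - p) * W

def integrated_gradients(x, baseline, m=50):
    """Right-endpoint Riemann sum for IG along the straight path."""
    total = np.zeros_like(x)
    for k in range(1, m + 1):
        alpha = k / m                                   # alpha in (0, 1]
        total += grad_f(baseline + alpha * (x - baseline))
    return (x - baseline) * total / m                   # scale by (x - x')/m

x = np.zeros(64)
x[::9] = 1.0                       # diagonal pattern
baseline = np.zeros(64)            # black (all-zeros) baseline
ig = integrated_gradients(x, baseline, m=200)
residual = ig.sum() - (f(x) - f(baseline))   # completeness: should be ~ 0
print(abs(residual))
```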
Attribution Comparison — Vanilla Gradients · Integrated Gradients · SmoothGrad
Vanilla Gradients
Fast but can be noisy — gradient at input only
Vanilla gradient map rendered by D3
SNR: —
Integrated Gradients
Faithful — accumulates gradients from baseline to input
Integrated gradient map rendered by D3
Σ attr − Δf = —
SmoothGrad
Denoised — averages gradients over noisy samples
SmoothGrad map rendered by D3
SNR: —
Negative Neutral Positive
Values normalized to [−1, +1] for display
Total Attribution Magnitude per Method
Attribution magnitude comparison rendered by D3
Vanilla gradients can highlight irrelevant pixels near saturation. Integrated gradients satisfy the completeness axiom — the attribution sum equals the prediction gap. GradCAM is coarser but more spatially stable.
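SmoothGrad's averaging over noisy copies can be sketched the same way. The one-unit model and the choices of n and σ here are illustrative assumptions, not the explorer's settings.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Same illustrative one-unit model as a stand-in (assumption).
rng = np.random.default_rng(0)
W = rng.normal(size=64)
b = 0.1

def f(x):
    return sigmoid(W @ x + b)

def grad_f(x):
    p = f(x)
    return p * (1.0 - p) * W

def smoothgrad(x, n=50, sigma=0.15, seed=1):
    """Average vanilla gradients over n noisy copies x + eps, eps ~ N(0, sigma^2)."""
    noise_rng = np.random.default_rng(seed)
    g = np.zeros_like(x)
    for _ in range(n):
        g += grad_f(x + noise_rng.normal(0.0, sigma, size=x.shape))
    return g / n

x = np.zeros(64)
x[::9] = 1.0
sg = smoothgrad(x)
print(sg.shape)
```

Averaging leaves the underlying gradient method unchanged; it only suppresses sample-to-sample noise in the map.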
Prediction
Class
f(x)
f(x′)
Δf
Top Attributed Pixels
(—, —)
(—, —)
(—, —)
(—, —)
row, col · attribution value
Completeness (IG)
Σ IG(x) − Δf
Should be ≈ 0 when integration is complete. Nonzero = more steps needed.
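The "more steps needed" behavior can be demonstrated directly: the Riemann-sum residual shrinks roughly like 1/m. The model is the same illustrative one-unit assumption used elsewhere on this page's sketches.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative model (assumption) to show the residual shrinking with m.
rng = np.random.default_rng(0)
W = rng.normal(size=64)
b = 0.1

def f(x):
    return sigmoid(W @ x + b)

def grad_f(x):
    p = f(x)
    return p * (1.0 - p) * W

def ig_residual(x, baseline, m):
    """|sum(IG) - (f(x) - f(x'))| for an m-step right-endpoint sum."""
    total = np.zeros_like(x)
    for k in range(1, m + 1):
        total += grad_f(baseline + (k / m) * (x - baseline))
    ig = (x - baseline) * total / m
    return abs(ig.sum() - (f(x) - f(baseline)))

x = np.zeros(64); x[::9] = 1.0
baseline = np.zeros(64)
residuals = {m: ig_residual(x, baseline, m) for m in (5, 50, 500)}
print(residuals)
```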
Attribution Quality
Signal-to-noise ratio
High SNR = attribution signal concentrated on few pixels. Low SNR = diffuse, noisy map.
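One plausible way to score this concentration is the mass of the strongest pixels over the remaining mass. This is an assumption for illustration; the explorer may define its SNR differently.

```python
import numpy as np

def snr(attr, k=8):
    """Attribution mass of the k strongest pixels over the rest
    (one possible SNR proxy; assumption, not the page's exact formula)."""
    a = np.sort(np.abs(np.ravel(attr)))
    return a[-k:].sum() / (a[:-k].sum() + 1e-12)

concentrated = np.zeros(64); concentrated[:4] = 1.0   # sharp, sparse map
diffuse = np.full(64, 0.2)                            # spread-out map
print(snr(concentrated), snr(diffuse))
```

A map with all its mass on a few pixels scores far higher than a uniform one, matching the high-vs-low SNR description above.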