Saliency maps: compute ∂output/∂input (usually its absolute value) — shows which pixels, if changed slightly, would most change the prediction. Fast, but noisy and prone to saturation: gradients vanish in ReLU dead zones and sigmoid tails, so genuinely important features can receive near-zero attribution.
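A minimal NumPy sketch of saliency on a hypothetical toy "model" (logistic regression with made-up weights `w`; a real use case would backprop through a trained network instead of using the closed-form gradient):

```python
import numpy as np

# Hypothetical toy model: F(x) = sigmoid(w . x), weights chosen for illustration
w = np.array([2.0, -1.0, 0.5])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x):
    return sigmoid(w @ x)

def saliency(x):
    # Analytic gradient dF/dx = sigmoid'(w . x) * w; saliency is its magnitude
    s = predict(x)
    grad = s * (1.0 - s) * w
    return np.abs(grad)

x = np.array([1.0, 1.0, 1.0])
print(saliency(x))  # largest entry marks the most sensitive input feature
```

Note the saturation problem is visible here: when |w·x| is large, the factor s(1−s) goes to zero, so every feature's saliency collapses regardless of its importance.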
Integrated gradients: accumulate gradients along the straight path from a baseline x′ to the input x — per input dimension i, IGᵢ(x) = (xᵢ − x′ᵢ) × ∫₀¹ ∂F(x′+α(x−x′))/∂xᵢ dα. Captures contributions at all activation levels along the path, not just at the endpoint. Satisfies the completeness axiom.
Completeness axiom: attributions must sum to F(input) − F(baseline) — every unit of prediction difference is accounted for. Vanilla gradients do NOT satisfy this; integrated gradients do by construction.
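A sketch of both ideas, again using a hypothetical logistic-regression model with made-up weights: the integral is approximated by a midpoint Riemann sum, and completeness is checked by comparing the attribution sum to F(x) − F(baseline):

```python
import numpy as np

# Hypothetical toy model: F(x) = sigmoid(w . x)
w = np.array([2.0, -1.0, 0.5])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x):
    return sigmoid(w @ x)

def grad(x):
    # Analytic gradient of sigmoid(w . x) w.r.t. x
    s = predict(x)
    return s * (1.0 - s) * w

def integrated_gradients(x, baseline, steps=200):
    # Midpoint Riemann sum over the straight path baseline -> x
    alphas = (np.arange(steps) + 0.5) / steps
    total = np.zeros_like(x)
    for a in alphas:
        total += grad(baseline + a * (x - baseline))
    return (x - baseline) * total / steps  # elementwise (x - x') scaling

x = np.array([1.0, 1.0, 1.0])
baseline = np.zeros(3)
ig = integrated_gradients(x, baseline)
# Completeness check: attributions should sum to F(x) - F(baseline)
print(ig.sum(), predict(x) - predict(baseline))
```

With a common baseline like the all-zeros input, the two printed numbers agree up to the discretization error of the Riemann sum; more steps tighten the match.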
SmoothGrad: average gradients over N noisy copies of the input, x + ε with ε ~ N(0, σ²). Reduces visual noise and sharpens the attribution map without changing the underlying gradient method.
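SmoothGrad is a thin wrapper around whatever gradient computation is already in use; a sketch on the same hypothetical toy model (made-up weights, fixed RNG seed for reproducibility):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy model: F(x) = sigmoid(w . x)
w = np.array([2.0, -1.0, 0.5])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad(x):
    # Analytic gradient of sigmoid(w . x) w.r.t. x
    s = sigmoid(w @ x)
    return s * (1.0 - s) * w

def smoothgrad(x, sigma=0.1, n=50):
    # Average gradients over n noisy copies x + eps, eps ~ N(0, sigma^2 I)
    g = np.zeros_like(x)
    for _ in range(n):
        g += grad(x + rng.normal(0.0, sigma, size=x.shape))
    return g / n

x = np.array([1.0, 1.0, 1.0])
sg = smoothgrad(x)
```

σ and N are the two knobs: σ is usually set relative to the input's value range, and the averaged map stabilizes once N reaches a few dozen samples.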
GradCAM vs pixel attribution: GradCAM uses gradients of the class score with respect to feature map activations — gives coarser but more spatially coherent explanations than pixel-level gradients. Best for spatial localization in CNNs.
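The core GradCAM computation is a gradient-weighted sum of feature maps followed by a ReLU. A NumPy sketch with hand-built toy activations and gradients (in a real use these are hooked out of a CNN's chosen conv layer):

```python
import numpy as np

def grad_cam(feature_maps, grads):
    # feature_maps, grads: (K, H, W) — conv-layer activations and the gradient
    # of the class score w.r.t. them (supplied by hand in this toy example)
    weights = grads.mean(axis=(1, 2))                  # pool grads -> one weight per channel
    cam = np.tensordot(weights, feature_maps, axes=1)  # weighted channel sum -> (H, W)
    return np.maximum(cam, 0.0)                        # ReLU keeps only positive evidence

# Toy activations: channel 0 fires at (1,1), channel 1 at (2,2)
fmaps = np.zeros((2, 4, 4))
fmaps[0, 1, 1] = 1.0
fmaps[1, 2, 2] = 1.0
# Pretend the class score rises with channel 0 and falls with channel 1
g = np.stack([np.full((4, 4), 0.5), np.full((4, 4), -0.5)])
cam = grad_cam(fmaps, g)  # (1,1) is highlighted; (2,2) is zeroed by the ReLU
```

The resulting map lives at the feature-map resolution, which is why GradCAM is coarser than pixel attribution; in practice it is upsampled to the input size and overlaid on the image.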