YouTube Playlist

Vanishing gradient problem

  1. Deep Networks, Small Gradients
  2. Chain Rule and Backpropagation
  3. Culprit 1: Activation Functions
  4. Culprit 2: Weight Initialization
  5. Consequences for Learning
  6. Mitigations and Solutions
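
As a rough illustration of topics 2 to 4, the sketch below applies the chain rule to a toy network in which every layer is a single sigmoid unit sharing one scalar weight. The function names, the scalar layers, and the shared weight are assumptions made for brevity and are not taken from the playlist. Because the sigmoid derivative never exceeds 0.25, each layer multiplies the gradient by a factor of at most 0.25 times the weight magnitude, so with modest weights the gradient shrinks geometrically with depth.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(depth: int, weight: float = 1.0, x: float = 0.5) -> float:
    """Derivative of the output with respect to the input x for a toy
    network of `depth` scalar sigmoid layers, h_next = sigmoid(weight * h).

    Hypothetical setup: one unit per layer and one weight shared by all
    layers, chosen only to make the chain rule easy to follow.
    """
    h = x
    grad = 1.0
    for _ in range(depth):
        z = weight * h
        s = sigmoid(z)
        # Chain rule: d h_next / d h = weight * sigmoid'(z),
        # where sigmoid'(z) = s * (1 - s) <= 0.25.
        grad *= weight * s * (1.0 - s)
        h = s
    return grad


if __name__ == "__main__":
    # Print how the gradient changes with depth and with weight scale.
    for depth in (1, 5, 10, 20):
        print(depth, input_gradient(depth), input_gradient(depth, weight=0.5))
```

Running the script prints the gradient for several depths at two weight scales. The smaller weight makes the shrinkage faster, which is the link between activation choice and weight initialization in this toy setting.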