Backpropagation through a layer norm
Derivation of the backpropagation equations for layer normalization.
Pegasus by Satoshi Kamiya, folded by me
Transformers for natural language processing from first principles. This is a long post that details a full implementation of transformers and the mathematics behind them. The use case is predicting Amazon review stars based on the review text. The language of...
Interpolation using quaternions.
A detailed foundation of quaternion mathematics.
Rotations in 2D.
Introduction to quaternions and rotations in 3D.
Pinging the world from South Africa.
Wrapping up the Sudoku OCR reader series.
On convolutional neural networks, overly large models and the importance of understanding your data.
Identifying and extracting numbers for the Sudoku OCR Reader.