tensorgrad · Marwan Abouzeid

tensorgrad is a tiny tensor autograd engine for learning how deep learning works under the hood. It runs on NumPy, or on CuPy for GPU execution, and fits a complete training pipeline into under 600 lines of readable code, including broadcasting, four optimizers, and PyTorch-style APIs.

The goal isn’t speed or feature parity with PyTorch; it’s clarity. Every line of the automatic-differentiation core is meant to be read and hacked on, with full computation graphs you can trace to see exactly how gradients flow.

What’s inside

Reverse-mode autodiff — a Tensor type that records operations and back-propagates through them, with broadcasting and no_grad() semantics borrowed from PyTorch.
A unified optimizer interface — Adam, AdamW, SGD with momentum, and Nesterov accelerated gradient, all sharing the same zero_grad() → backward() → step() call pattern.
Optional GPU acceleration — drop in CuPy and select the backend with an environment variable; it falls back to NumPy when no CUDA driver is present.
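
To make the first two items concrete, here is a minimal NumPy-only sketch of reverse-mode autodiff with broadcasting-aware gradients, driven by an SGD-with-momentum optimizer through the zero_grad() → backward() → step() pattern. It is not tensorgrad's actual source: the helper unbroadcast, the attribute names, and the hyperparameters are all assumptions chosen for illustration, and no_grad() and the CuPy backend are left out.

```python
import numpy as np

def unbroadcast(grad, shape):
    """Sum grad over the axes that broadcasting added or stretched."""
    # Axes that broadcasting prepended on the left are summed away.
    while grad.ndim > len(shape):
        grad = grad.sum(axis=0)
    # Axes that were stretched from size 1 are collapsed back to size 1.
    for axis, size in enumerate(shape):
        if size == 1 and grad.shape[axis] != 1:
            grad = grad.sum(axis=axis, keepdims=True)
    return grad

class Tensor:
    """Illustrative tensor that records its parents (hypothetical layout)."""
    def __init__(self, data, parents=()):
        self.data = np.array(data, dtype=np.float64)  # private copy
        self.grad = np.zeros_like(self.data)
        self._parents = parents
        self._backward = lambda: None  # leaves have nothing to propagate

    def __add__(self, other):
        out = Tensor(self.data + other.data, (self, other))
        def _backward():
            self.grad += unbroadcast(out.grad, self.data.shape)
            other.grad += unbroadcast(out.grad, other.data.shape)
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Tensor(self.data * other.data, (self, other))
        def _backward():
            self.grad += unbroadcast(out.grad * other.data, self.data.shape)
            other.grad += unbroadcast(out.grad * self.data, other.data.shape)
        out._backward = _backward
        return out

    def sum(self):
        out = Tensor(self.data.sum(), (self,))
        def _backward():
            self.grad += out.grad * np.ones_like(self.data)
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        order, seen = [], set()
        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)
        visit(self)
        self.grad = np.ones_like(self.data)
        for node in reversed(order):
            node._backward()

class SGD:
    """SGD with classical momentum (illustrative hyperparameters)."""
    def __init__(self, params, lr=0.05, momentum=0.9):
        self.params, self.lr, self.momentum = list(params), lr, momentum
        self.velocity = [np.zeros_like(p.data) for p in self.params]

    def zero_grad(self):
        for p in self.params:
            p.grad = np.zeros_like(p.data)

    def step(self):
        for p, v in zip(self.params, self.velocity):
            v *= self.momentum
            v += p.grad
            p.data -= self.lr * v

# Fit a length-3 weight vector to four identical target rows; the (3,)
# weights broadcast against the (4, 3) targets in the forward pass.
target_row = np.array([1.0, -2.0, 0.5])
neg_targets = Tensor(-np.tile(target_row, (4, 1)))
weights = Tensor(np.zeros(3))
opt = SGD([weights])
for _ in range(200):
    diff = weights + neg_targets
    loss = (diff * diff).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The unbroadcast step is the part that is easy to get wrong: a gradient arriving with the broadcast shape (4, 3) has to be summed back down to the parameter's own shape (3,) before it is accumulated.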

End-to-end MNIST demo

A runnable Jupyter notebook (demo.ipynb) walks the full pipeline: download and preprocess MNIST, build a three-layer MLP from tensorgrad layers, train it with Adam (with a live loss curve), visualize the autograd graph with Graphviz, and evaluate accuracy on the test set.
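
For readers who want to see what the Adam training step in the notebook amounts to, the following is a rough NumPy sketch of the standard bias-corrected Adam update, applied to a toy quadratic rather than to MNIST. The function name adam_step and the hyperparameter values are assumptions for illustration and are not taken from tensorgrad or from demo.ipynb.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count used for bias correction."""
    m = beta1 * m + (1 - beta1) * grad        # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # correct the zero-initialization bias
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy problem: minimize sum((w - 3)^2), whose gradient is 2 * (w - 3).
w = np.zeros(2)
m = np.zeros_like(w)
v = np.zeros_like(w)
for t in range(1, 501):
    grad = 2.0 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.1)
```

On the very first step the bias correction makes the update roughly lr times the sign of the gradient, which is why Adam's early steps are insensitive to the gradient's scale.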

Correctness

A pytest suite (about 40 individual assertions) cross-checks every tensorgrad operator and optimizer against PyTorch as the oracle, so the “learning” version stays numerically honest.
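
The oracle pattern can be sketched roughly as follows. Because tensorgrad's test code is not shown here, this example substitutes a hand-derived NumPy gradient for the engine under test; in the real suite that role would be played by tensorgrad itself. The function names and tolerances are assumptions.

```python
import numpy as np

def manual_grads(x, w):
    """Hand-derived gradients of sum(tanh(x @ w)) with respect to x and w."""
    z = x @ w
    dz = 1.0 - np.tanh(z) ** 2  # derivative of tanh, times the upstream ones
    return dz @ w.T, x.T @ dz

def torch_grads(x, w):
    """The same gradients computed by PyTorch autograd, used as the oracle."""
    import torch  # imported lazily so the module loads without PyTorch
    xt = torch.tensor(x, requires_grad=True)
    wt = torch.tensor(w, requires_grad=True)
    torch.tanh(xt @ wt).sum().backward()
    return xt.grad.numpy(), wt.grad.numpy()

def test_matmul_tanh_matches_torch():
    rng = np.random.default_rng(0)
    x = rng.standard_normal((4, 3))
    w = rng.standard_normal((3, 2))
    for ours, reference in zip(manual_grads(x, w), torch_grads(x, w)):
        np.testing.assert_allclose(ours, reference, rtol=1e-6, atol=1e-8)
```

Running pytest on a file containing this function would collect test_matmul_tanh_matches_torch automatically; it requires PyTorch to be installed.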

Inspired by Andrej Karpathy’s micrograd. Licensed MIT.