2026-07-17 –, S2
Deep learning is often taught through large frameworks and large models, which is great for getting real projects out the door, but not always great for learning. This talk is about a different practice: building tiny, runnable versions of modern architectures with minimal dependencies (mostly Python and NumPy), so that you learn the ideas by applying them.
We’ll get our feet wet by building a small Transformer end-to-end and learn about the model architecture that started the craze. Then we switch perspectives and learn about other architectures, always staying small and nimble, focusing on applying the math and breathing life into formulas. We will look at multi-scale modelling (in a simplified version of Renormalizing Generative Models), state space models, and other scary concepts, until they are not scary at all anymore.
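To give a flavour of what "tiny and runnable" can mean, here is a minimal sketch of single-head causal self-attention, the core operation of a Transformer, in plain NumPy. It is an illustration rather than the code from the talk: the function names, the toy shapes, and the random weights are all assumptions made for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum first so exp() cannot overflow.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v, causal=True):
    """Single-head scaled dot-product self-attention.

    x: (seq_len, d_model) token embeddings.
    w_q, w_k, w_v: (d_model, d_head) projection matrices (hypothetical names).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Similarity of every query with every key, scaled by sqrt(d_head).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    if causal:
        # Mask out positions to the right so a token cannot see the future.
        future = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

# Toy input: 4 tokens with 8-dimensional embeddings (assumed sizes).
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(3))

out, weights = self_attention(x, w_q, w_k, w_v)
print(out.shape)                # one output vector per token
print(np.round(weights, 2))     # lower-triangular attention pattern
```

Each row of `weights` sums to one and is zero above the diagonal, which is the formula softmax(QKᵀ/√d)V with a causal mask written out line by line.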
You’ll leave with a model for turning papers into little prototypes that stay true to the ideas, and with a starting point for your own little lab to build models yourself.
Prerequisites: a basic understanding of NumPy and a willingness to look at Greek letters. No deep learning framework knowledge required.
I’m a technologist at heart, though life often leads me into managerial roles. For the last 15 years, I’ve built products, led and advised companies, published papers, and shared my insights at conferences and in articles on the web.