Piotr Rybak
Piotr Rybak has almost 15 years of experience in machine learning, gained in academia, start-ups, and larger companies. Currently, he focuses on using computer vision to recognize LEGO bricks. He is an active member of the Polish language models community and has co-authored works such as the KLEJ benchmark and the HerBERT and plT5 models. In his free time, he builds with LEGO bricks.
Session
With the rise of foundation models and zero-shot segmentation, it sometimes feels like fine-tuning classic object detection models is outdated. But is it? There are over 90,000 different LEGO bricks produced in almost 200 colors, and a single photo can easily contain hundreds of bricks. This makes LEGO recognition a perfect stress test for both traditional object detectors and the latest generation of vision models.
During this talk, I will walk you through a practical comparison of approaches to LEGO brick detection. I will start with the classic object detection pipeline: dataset creation, annotation, and training with models like NanoDet and RF-DETR. Then, I will put these detectors up against zero-shot approaches: SAM 3 (Segment Anything Model 3) and vision-language models, including both closed-source APIs like Gemini and open-source alternatives like Qwen-VL. Along the way, I will share the pitfalls, surprising results, and lessons learned, including cases where a fine-tuned lightweight detector still outperforms models orders of magnitude larger.