Learning Tracks
Structured deep-dives on LLMs, RAG, multimodal models, and more. Pick a track and start exploring.
NanoGPT Speedrun
Incremental improvements that push GPT pre-training efficiency to its limits — from baseline to SOTA in hours.
RAG Pipelines
From TF-IDF and BM25 to dense bi-encoders, hybrid fusion, rerankers, HNSW indexing, and fine-tuning with contrastive losses — everything you need to build and understand production-grade retrieval-augmented generation systems.
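To make the track's starting point concrete, here is a minimal sketch of Okapi BM25 scoring, the classic sparse baseline the pipeline builds from. The toy corpus, whitespace tokenization, and parameter values (k1=1.5, b=0.75, a common default) are illustrative, not from any particular library.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    N = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / N
    # document frequency: how many docs contain each term
    df = Counter()
    for d in tokenized:
        for term in set(d):
            df[term] += 1
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((N - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # term-frequency saturation (k1) and length normalization (b)
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            )
        scores.append(score)
    return scores

docs = [
    "retrieval augmented generation combines retrieval with generation",
    "dense retrieval uses bi-encoders",
    "cats sleep most of the day",
]
print(bm25_scores("dense retrieval", docs))
```

Documents sharing more (and rarer) query terms score higher; dense bi-encoders replace this lexical matching with learned embeddings, and hybrid fusion combines both.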
Decoder Models
Autoregressive language models from the ground up — architecture, attention, decoding strategies, and scaling laws.
Encoder Fine-tuning
BERT-style bidirectional models, masked language modelling, and efficient fine-tuning techniques like LoRA and adapters.
Vision-Language Models
Multimodal architectures that align visual and textual representations — CLIP, contrastive pre-training, and VQA.
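The contrastive pre-training objective behind CLIP can be sketched in a few lines: normalize paired image/text embeddings, compute all pairwise cosine similarities, and apply a symmetric cross-entropy that pulls matched pairs together. The 2-D embeddings and the temperature value here are illustrative toy inputs, not CLIP's actual parameters.

```python
import math

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE-style loss over a batch of paired embeddings."""
    def dot(a, b): return sum(x * y for x, y in zip(a, b))
    def norm(v):
        n = math.sqrt(dot(v, v))
        return [x / n for x in v]
    imgs = [norm(v) for v in image_emb]
    txts = [norm(v) for v in text_emb]
    # cosine-similarity logits, scaled by temperature
    logits = [[dot(i, t) / temperature for t in txts] for i in imgs]

    def cross_entropy(row, target):
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        return log_z - row[target]

    n = len(logits)
    # image->text: each row's correct text is on the diagonal
    loss_i2t = sum(cross_entropy(logits[i], i) for i in range(n)) / n
    # text->image: same over columns
    loss_t2i = sum(cross_entropy([logits[j][i] for j in range(n)], i)
                   for i in range(n)) / n
    return (loss_i2t + loss_t2i) / 2

# Matched pairs point the same way, so the diagonal dominates and loss is low.
imgs = [[1.0, 0.0], [0.0, 1.0]]
txts = [[0.9, 0.1], [0.1, 0.9]]
print(clip_contrastive_loss(imgs, txts))
```

Swapping the text batch so pairs mismatch drives the loss up, which is exactly the signal that aligns the two modalities during pre-training.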
Vision-Language-Action Models
VLAs for robotics — grounding language and vision into motor policies, from OpenVLA to diffusion-based action prediction.
RLHF & Alignment
Reinforcement learning from human feedback, reward modelling, PPO, DPO, and the techniques that align language models with human preferences.
Agents & Tool Use
LLM-powered agents, function calling, ReAct, multi-agent systems, and the infrastructure for autonomous task execution.
Diffusion Models
Score-based generative models, DDPM, DDIM, classifier-free guidance, and latent diffusion — the architecture behind modern image and video generation.
Mixture of Experts
Sparse MoE layers, routing algorithms, load balancing, and how models like Mixtral (and, reportedly, GPT-4) scale to hundreds of billions of parameters efficiently.
State Space Models
Mamba, S4, and the family of structured SSMs that achieve linear-time sequence modelling as a competitive alternative to the Transformer attention mechanism.
Benchmarks & Evaluation
MMLU, HumanEval, HELM, lm-evaluation-harness, and the methodology behind measuring, comparing, and stress-testing language model capabilities.
Inference Optimization
KV-cache, speculative decoding, quantization (GPTQ, AWQ, GGUF), continuous batching, and the engineering behind serving LLMs at scale with low latency.
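The KV-cache idea can be shown with a toy single-head attention step: during autoregressive decoding, each new token appends one key/value entry instead of recomputing K and V for the entire prefix. The 2-D vectors and the `KVCache` class name are illustrative, not any serving framework's API.

```python
import math

def attend(q, keys, values):
    """Single-query scaled dot-product attention over cached keys/values."""
    d = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

class KVCache:
    """Append-only key/value cache: each decode step adds one entry,
    so per-token cost is O(sequence length), not O(length^2)."""
    def __init__(self):
        self.keys, self.values = [], []

    def step(self, q, k, v):
        self.keys.append(k)
        self.values.append(v)
        return attend(q, self.keys, self.values)

cache = KVCache()
out1 = cache.step(q=[1.0, 0.0], k=[1.0, 0.0], v=[2.0, 0.0])
out2 = cache.step(q=[1.0, 0.0], k=[0.0, 1.0], v=[0.0, 2.0])
print(out1, out2)  # cache now holds two entries
```

Speculative decoding, quantization, and continuous batching all layer on top of this basic loop to cut latency and memory further.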