Stanford Online
5/11/2026

Stanford CS336 Language Modeling from Scratch | Spring 2026 | Lecture 10: Inference
Short summary
Stanford's CS336 Language Modeling from Scratch provides foundational training for building and optimizing large language models in production. Lecture 10 focuses specifically on inference optimization (techniques for reducing latency, memory requirements, and computational costs) and is taught by leading researchers Percy Liang and Tatsunori Hashimoto in an 86-minute technical deep-dive. It is aimed at engineers and founders building LLM-based applications who want rigorous technical grounding.
- Lecture 10 of Stanford's CS336 course, focusing on inference optimization techniques
- Taught by leading ML researchers Percy Liang and Tatsunori Hashimoto
- 86-minute technical deep-dive, suited to product engineers and founders building LLM applications