Stanford Online
5/11/2026
Stanford CS25: Transformers United V6 I From Next-Token Prediction to Next-Generation Intelligence

Short summary

Stanford's CS25 seminar covers recent advances in LLM pretraining, arguing that front-loading reasoning-rich data yields persistent reasoning gains that post-training alone cannot replicate. It formalizes a two-phase pretraining framework for data selection, blending, and sequencing. The session is led by Shrimai Prabhumoye (Mistral AI), with Stanford faculty including Christopher Manning and Michael C. Frank.

  • Recent advances show data ordering and reasoning-centric integration are critical in LLM pretraining
  • Front-loading reasoning-rich data yields persistent reasoning gains that post-training cannot replicate
  • Formalizes two-phase pretraining framework for data selection, blending, and sequencing

Generated with AI, which can make mistakes.
