NVIDIA AI Releases Star Elastic: One Checkpoint that Contains 30B, 23B, and 12B Reasoning Models with Zero-Shot Slicing

Short summary

NVIDIA's Star Elastic packs 30B, 23B, and 12B parameter reasoning models into a single checkpoint via post-training, using 360× fewer training tokens than training each model separately. Elastic budget control uses a smaller submodel for the reasoning phase and the full model for the final answer, achieving 16% higher accuracy and 1.9× lower latency than standard approaches. FP8 and NVFP4 quantization bring the entire model family to consumer RTX GPUs.

• Single checkpoint contains three nested model sizes (30B, 23B, 12B) trained efficiently in one run
• Elastic budget control improves accuracy 16% and reduces latency 1.9× compared to standard approaches
• Full model family becomes accessible on consumer RTX-class GPUs through quantization
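
To make the quantization point concrete, here is a rough back-of-the-envelope sketch of weight-only memory for each model size. The bit widths are nominal assumptions (16 bits for BF16, 8 for FP8, 4 for NVFP4), and the helper name `weight_gb` is hypothetical. The estimate ignores KV cache, activations, and the scale metadata that block-quantized formats carry, so real footprints will be somewhat larger.

```python
# Rough weight-only memory estimate (illustrative; not from the article).
# Assumption: nominal bits per weight; real formats add scale metadata.
BITS_PER_WEIGHT = {"bf16": 16, "fp8": 8, "nvfp4": 4}


def weight_gb(params_billions: float, fmt: str) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    bits = BITS_PER_WEIGHT[fmt]
    total_bytes = params_billions * 1e9 * bits / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # The three nested sizes named in the summary.
    for size in (30, 23, 12):
        row = ", ".join(
            f"{fmt}: {weight_gb(size, fmt):.1f} GB" for fmt in BITS_PER_WEIGHT
        )
        print(f"{size}B -> {row}")
```

Under these assumptions, halving the bits per weight halves the weight storage, which is why lower-precision formats matter for fitting the larger sizes into consumer GPU memory.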

Generated with AI, which can make mistakes.


Read full article at MarkTechPost
