Models & Research

NVIDIA AI Releases Star Elastic: One Checkpoint that Contains 30B, 23B, and 12B Reasoning Models with Zero-…

May 9, 2026

What changed

NVIDIA introduced Star Elastic, a post-training method that nests three language models inside a single checkpoint: 30-billion-, 23-billion-, and 12-billion-parameter variants, all trained together in one 160-billion-token run. Zero-shot slicing extracts each variant from the shared checkpoint, with no separate training cycles and no duplicate sets of weights to store. Compared with training each model individually from scratch, NVIDIA reports this cuts total training tokens by a factor of roughly 360.
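Star Elastic's actual slicing rules are not public; the sketch below only illustrates the general "nested weights" idea with plain NumPy. The `slice_linear` helper and the dimensions are hypothetical: smaller variants are assumed to live in the leading rows and columns of the larger model's parameter tensors, so a sub-model can be read out of the shared checkpoint with no extra training.

```python
import numpy as np

def slice_linear(weight: np.ndarray, out_dim: int, in_dim: int) -> np.ndarray:
    """Take the leading rows/columns of a weight matrix for a smaller variant."""
    return weight[:out_dim, :in_dim]

# One shared checkpoint: a single (toy) layer weight standing in for the full model.
full = np.arange(16, dtype=np.float32).reshape(4, 4)

# Zero-shot slicing: nested variants come out of the same tensor,
# with no retraining and no duplicated storage.
mid = slice_linear(full, 3, 3)    # stand-in for the 23B variant
small = slice_linear(full, 2, 2)  # stand-in for the 12B variant

print(small.shape)  # (2, 2)
```

Because the slices are views into the same array, "three models in one checkpoint" costs no more storage than the largest model alone, which is the storage claim the article makes.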

Why builders should care

Training large models is expensive and time-consuming. Star Elastic reduces both the computational load and the storage footprint of managing multiple model sizes. Builders and operators can maintain fewer model artifacts while still choosing among capacity variants to match resource constraints or task needs. This consolidation can lower infrastructure cost, simplify deployment pipelines, and accelerate iteration, especially when experimenting with multi-scale reasoning models.

The practical takeaway

Star Elastic embeds multiple reasoning models inside one checkpoint, so a smaller or larger variant can be "sliced" out for deployment without retraining. This makes it easier to trade off performance against efficiency on demand: operators can scale workloads up or down without maintaining separate checkpoints or retraining at each scale. That can speed up production workflows and cut storage costs, particularly for teams running models in the tens of billions of parameters.
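The on-demand trade-off described above can be sketched as a simple variant picker. The variant names and sizes follow the article; the selection helper itself (`pick_variant`, the budget logic) is an illustrative assumption, not part of Star Elastic's tooling.

```python
# Nested variants available in the single checkpoint, per the article.
VARIANTS = {"30B": 30e9, "23B": 23e9, "12B": 12e9}

def pick_variant(param_budget: float) -> str:
    """Return the largest nested variant whose parameter count fits the budget."""
    fitting = {name: size for name, size in VARIANTS.items() if size <= param_budget}
    if not fitting:
        raise ValueError("no variant fits the given parameter budget")
    return max(fitting, key=fitting.get)

print(pick_variant(25e9))  # 23B
```

Because every variant comes from the same checkpoint, switching the deployed size is a slicing decision at load time rather than a retraining or re-download step.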

What to watch next

Applying Star Elastic beyond NVIDIA's Nemotron Nano v3 could reshape how multi-scale models are trained and distributed across the industry. Watch for new releases that apply the framework to other model architectures or reasoning tasks, and track how quickly competing AI teams adopt or adapt this token-efficient approach to nested model training as a way to cut costs and streamline large-scale operations.

AI Quick Briefs Editorial Desk
