
NVIDIA Releases Nemotron-Labs-TwoTower: an Open-Weight Diffusion Language Model Built on a Frozen Autoregre…

July 1, 2026

What happened

NVIDIA released Nemotron-Labs-TwoTower, a language model that generates text by diffusion rather than strictly one token at a time. It is built on a frozen, pretrained autoregressive backbone, Nemotron-3-Nano-30B-A3B, and ships with open weights under NVIDIA’s Nemotron Open Model License. The model targets a well-known bottleneck of standard autoregressive models: because they produce one token per decoding step, generation speed is limited by the number of sequential steps.

Why it matters

Autoregressive (AR) models remain the dominant architecture in language modeling, but sequential token decoding limits their throughput. Each new token depends on the ones before it, which slows generation and raises compute costs for applications that need rapid or high-volume text output. Diffusion language models work differently: they generate multiple tokens at once through iterative refinement rather than strict token-by-token prediction. NVIDIA’s approach keeps the autoregressive backbone frozen, aiming to combine the strengths of pretrained AR features with diffusion’s potential for parallelism. Releasing the weights under the Nemotron Open Model License encourages experimentation and integration in projects that need faster generation without training from scratch. For builders and companies, this opens options to potentially lower latency, reduce operating costs, and rethink text generation workflows around diffusion-based architectures built on proven AR foundations.
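
The contrast between sequential decoding and iterative refinement can be made concrete with a rough count of sequential model calls. The sketch below is a toy illustration, not NVIDIA's method: the function names, the idea of refining fixed-size blocks, and the numbers in the usage lines are all assumptions chosen for the example, and nothing here reflects measured behavior of Nemotron-Labs-TwoTower.

```python
import math


def autoregressive_passes(num_tokens: int) -> int:
    """Sequential decoding: one forward pass per generated token."""
    return num_tokens


def diffusion_passes(num_tokens: int, block_size: int, steps_per_block: int) -> int:
    """Hypothetical block-wise refinement: every token in a block is
    updated together, and each block gets a fixed number of steps."""
    num_blocks = math.ceil(num_tokens / block_size)
    return num_blocks * steps_per_block


if __name__ == "__main__":
    # Illustrative settings only (assumed, not taken from the release).
    tokens, block, steps = 256, 32, 8
    ar = autoregressive_passes(tokens)
    dlm = diffusion_passes(tokens, block, steps)
    print(f"autoregressive sequential passes: {ar}")
    print(f"diffusion sequential passes:      {dlm}")
    print(f"ratio: {ar / dlm:.1f}x fewer sequential passes")
```

Fewer sequential passes do not translate directly into wall-clock speedup, since each refinement pass may cost more than a single autoregressive step and output quality can change with the number of steps. That is why independent benchmarks matter.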

What to watch next

Watch for independent deployments and benchmarks that test Nemotron-Labs-TwoTower in real-world applications and verify its throughput gains and quality trade-offs. The community’s response to the licensing terms will shape how quickly the approach reaches open source frameworks and commercial offerings. Also follow whether NVIDIA expands diffusion model support or applies similar frozen-backbone strategies to other AI tasks. Adoption will depend on clear demonstrations that the model delivers faster output at acceptable quality in scenarios ranging from chatbots to automated content creation.

AI Quick Briefs Editorial Desk
