Models & Research

Meet ‘North Mini Code’: Cohere’s 30B Open-Weight Mixture-of-Experts Model With 3B Active Parameters for Age…

· June 11, 2026
Meet ‘North Mini Code’: Cohere’s 30B Open-Weight Mixture-of-Experts Model With 3B Active Parameters for Age…

What changed

Cohere launched North Mini Code, a 30 billion parameter coding model that uses a mixture-of-experts (MoE) architecture. Instead of activating all 30 billion parameters, it selectively uses around 3 billion active parameters per request. The model runs efficiently on a single Nvidia H100 GPU and supports an unusually long 256,000 token context window, letting it handle very large codebases or long coding conversations in one pass. North Mini Code is Cohere’s first developer-focused coding model and is open-weight, meaning its underlying model weights are public.

Why builders should care

North Mini Code’s MoE approach lets it scale capacity significantly without needing proportionally more compute for each query. This can make large coding models more affordable to run, especially in complex coding tasks that require processing thousands of lines at once. The massive 256K token context length vastly extends what developers can feed the model as input, reducing the need to chunk large code projects artificially or lose context across interactions.

Open-weight access means developers can study, fine-tune, or customize the model without vendor lock-in or licensing feeds. For startups and teams building on top of coding AI, this opens a path to build smarter coding assistants or agentic AI that can act over extended codebases without loading multiple models or sessions.

The practical takeaway

If building or investing in developer tools, North Mini Code’s efficiency and capacity improvements could tighten competition around coding AI. It pressures incumbents that deliver smaller contexts and full-parameter activation models, which are more costly per token. The extended context window shifts expectations about what coding assistants should handle in a single go.

Operators running AI coding stacks should anticipate new options that reduce GPU overhead per query while increasing code comprehension depth and breadth. Open-weight releases may also raise the bar for transparency and customization in proprietary coding AI offerings.

What to watch next

Track how quickly Cohere’s MoE model adoption spreads in coding assistant projects. See if other providers follow suit with longer context windows and mixture-of-expert architectures tailored for developer tools. It will be important to monitor benchmarks comparing North Mini Code’s real-world coding task accuracy and latency versus existing models.

Also watch for ecosystem developments around open-weight coding models, such as fine-tuning tooling, plugins, or integrations with popular IDEs and code repositories. These factors will determine how much of a practical advantage the model brings to builders, not just theory.

AI Quick Briefs Editorial Desk

Stay ahead of AI Get the most important AI news delivered to your inbox — free.