Why LLMs should stop thinking out loud (and what comes after chain-of-thought)

AI Quick Briefs Editorial Desk · June 15, 2026

What changed

Chain-of-thought prompting for large language models is facing growing scrutiny. The method asks a model to write out explicit step-by-step reasoning before giving an answer. While it can improve transparency and accuracy, the emerging argument is that the approach is slow, expensive, and mostly an illusion: the generated text is not a reliable record of how the model reached its answer. On this view, the real reasoning happens in the model’s hidden latent space, its internal activations, where thoughts do not need to be spelled out.

Why builders should care

Chain-of-thought can slow both inference and development because the model spends compute generating intermediate reasoning tokens before it outputs an answer. Those extra tokens inflate API usage and latency, raising operational costs. More importantly, the visible steps can mislead users into thinking the model truly “understands” its answers step by step. For builders and AI operators, this means the widely promoted reasoning style sacrifices efficiency and fosters misconceptions about LLM capabilities.
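To make the cost argument concrete, here is a minimal back-of-the-envelope sketch in Python. It is not taken from any provider's pricing or API. The function name `cot_overhead`, the per-token price, and the decoding speed are all hypothetical. It also assumes output tokens are billed at a flat rate and that latency grows linearly with the number of generated tokens, which ignores prompt processing and batching effects.

```python
from dataclasses import dataclass


@dataclass
class Overhead:
    extra_cost_usd: float    # added spend per request from reasoning tokens
    extra_latency_s: float   # added decode time per request
    output_multiplier: float # total output tokens / answer-only tokens


def cot_overhead(answer_tokens: int,
                 reasoning_tokens: int,
                 usd_per_1k_output_tokens: float,
                 tokens_per_second: float) -> Overhead:
    """Estimate what reasoning tokens add to one request.

    Assumes flat per-token output pricing and a constant decode speed;
    both are simplifications chosen for illustration.
    """
    if answer_tokens <= 0 or tokens_per_second <= 0:
        raise ValueError("answer_tokens and tokens_per_second must be positive")
    extra_cost = reasoning_tokens / 1000 * usd_per_1k_output_tokens
    extra_latency = reasoning_tokens / tokens_per_second
    multiplier = (answer_tokens + reasoning_tokens) / answer_tokens
    return Overhead(extra_cost, extra_latency, multiplier)


# Hypothetical request: a 50-token answer preceded by 450 reasoning tokens,
# priced at $0.01 per 1,000 output tokens and decoded at 50 tokens per second.
result = cot_overhead(answer_tokens=50,
                      reasoning_tokens=450,
                      usd_per_1k_output_tokens=0.01,
                      tokens_per_second=50.0)
print(result)
```

With these made-up inputs, the reasoning tokens account for nine tenths of the output, so the request produces ten times the tokens of a direct answer. Plugging in your own token counts and rates is the point; the specific figures here carry no weight.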

The practical takeaway

The future of machine reasoning with LLMs lies in latent-space operations, where models “think” internally without verbalizing every step. This enables faster, cheaper inference and opens the door to new architectures that reason more efficiently. Operators can expect a shift toward optimizing internal representations rather than surface-level chain-of-thought outputs. Adjusting tooling and expectations now can reduce costs and improve deployment performance.

What to watch next

Stay alert for new model designs and frameworks prioritizing latent-space reasoning. Watch for API updates that let users access or leverage internal reasoning paths without verbose token generation. Also, monitor startups working on “silent” or “compressed” reasoning layers, which could reshape cost and speed dynamics for real-world AI applications. Chain-of-thought may remain useful for explainability, but it will probably lose ground as a practical default for machine reasoning.
