
Microsoft and Dell believe the answer to rising cloud token costs sits on every employee’s desk

May 27, 2026

The business move

Microsoft and Dell are pushing enterprises to shift AI workloads off expensive cloud tokens and onto AI PCs on every employee’s desk. The idea is to run AI processing locally on Copilot+ PCs, reducing reliance on costly cloud inference. The companies pitch this agentic AI PC strategy as a way to make AI workloads faster, more secure, and more predictable in cost through on-device processing. Both are promoting the approach to ease budget pressure from surging cloud usage fees as agentic AI moves into enterprise production.

Why it matters

Cloud AI token costs are rising rapidly, outpacing enterprise budgets and creating a new economic challenge for AI adoption. Routing every generative AI query through cloud APIs is becoming financially unsustainable at scale. Moving inference to powerful, secure endpoint devices offers a way to slow the cost curve while maintaining AI performance. The approach also supports data privacy by keeping sensitive information local, which is critical for regulated industries concerned about cloud exposure. For IT buyers juggling escalating cloud bills and rising AI expectations, agentic AI PCs present a clear tradeoff: higher upfront device investment in exchange for controlled, predictable operating expenses and better security.
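
To make that tradeoff concrete, here is a minimal back-of-the-envelope sketch of a break-even calculation. The function name, its parameters, and the sample figures are all hypothetical illustrations, not numbers from Microsoft, Dell, or any pricing sheet. It also assumes a flat monthly cloud bill per employee, which real usage rarely matches.

```python
import math


def breakeven_months(device_premium, monthly_cloud_spend, offload_fraction):
    """Months until an AI PC's extra cost is recovered by avoided cloud spend.

    All inputs are hypothetical placeholders:
      device_premium      -- extra upfront cost of the AI PC over a standard PC
      monthly_cloud_spend -- current cloud inference spend per employee per month
      offload_fraction    -- share of that spend assumed to move on-device (0 to 1)
    """
    if device_premium < 0 or monthly_cloud_spend < 0:
        raise ValueError("costs must be non-negative")
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be between 0 and 1")

    monthly_savings = monthly_cloud_spend * offload_fraction
    if monthly_savings == 0:
        # Nothing moves on-device, so the premium is never recovered.
        return math.inf
    return device_premium / monthly_savings


# Illustrative only: a 400 premium, 50 per month in cloud spend,
# and half of that spend moved to the device.
months = breakeven_months(400.0, 50.0, 0.5)
print(months)  # 16.0
```

Under these made-up inputs the premium pays back in 16 months. The sketch ignores device management overhead, hardware refresh cycles, and changes in token pricing, any of which could move the result in either direction.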

Who gains and who gets squeezed

Companies with large distributed workforces gain more control over AI spending and data governance by shifting AI inference to endpoints. PC vendors like Dell benefit by selling premium Copilot+ machines built for local AI workloads, which command higher margins. Cloud providers face pressure on token revenue as enterprise AI usage fragments between cloud and edge devices. IT departments take on the work of deploying and managing AI-capable PCs, but gain tools that scale AI without breaking budgets or compromising security. Some AI SaaS vendors that rely heavily on cloud inference fees may see slower growth, or face demand for hybrid solutions that incorporate local AI hardware.

What to watch next

Watch how rapidly enterprises adopt AI PCs with embedded local inference rather than continuing with all-in cloud AI consumption. Monitor whether Dell’s Copilot+ PC lineup gains traction as a standard enterprise endpoint, and whether Microsoft integrates local AI deeply into its ecosystem. The effect on cloud token spend will be critical to track, especially as AI workloads grow and budgets tighten. Future investment in on-device AI inference chips, software optimizations, and hybrid AI architectures will also shape how quickly this shift reduces cloud dependency and changes enterprise AI cost models.

AI Quick Briefs Editorial Desk
