OpenAI built a networking protocol with AMD, Broadcom, Intel, Microsoft, and NVIDIA to fix AI supercomputer…
OpenAI has partnered with major tech companies including AMD, Broadcom, Intel, Microsoft, and NVIDIA to create a new open-source network protocol called MRC. The protocol is designed to improve data transfer between GPUs in AI supercomputers by spreading traffic across hundreds of simultaneous paths rather than relying on a handful of fixed routes. MRC also reduces the number of switch layers needed to connect more than 100,000 GPUs from three or four to just two, which cuts power consumption and lowers costs. OpenAI is already using MRC on its Stargate supercomputer.
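To make the multipath idea concrete, here is a minimal sketch (not MRC itself, whose wire format is not described above): a transfer is striped round-robin across several paths, each chunk tagged with a sequence number so the receiver can reassemble it in order. Path count and chunk size are illustrative assumptions.

```python
# Illustrative sketch of multipath striping -- not the actual MRC protocol.

def stripe(payload: bytes, num_paths: int, chunk_size: int = 4):
    """Split a payload into chunks and assign them round-robin to paths."""
    chunks = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)]
    paths = {p: [] for p in range(num_paths)}
    for seq, chunk in enumerate(chunks):
        paths[seq % num_paths].append((seq, chunk))  # tag with sequence number
    return paths

def reassemble(paths) -> bytes:
    """Merge per-path chunks back into the original byte order."""
    tagged = [item for chunks in paths.values() for item in chunks]
    return b"".join(chunk for _, chunk in sorted(tagged))

data = b"all-reduce gradient shard"
striped = stripe(data, num_paths=8)
assert reassemble(striped) == data  # order survives the parallel split
```

The point of the sketch: because each chunk carries its own sequence number, no single path has to deliver everything in order, so traffic can fan out across whatever paths are least congested.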
This development is important because AI supercomputers require fast and efficient communication between thousands of GPUs to process massive datasets quickly. Network bottlenecks slow down performance and increase energy use, which can limit AI research and deployment. By optimizing how data flows within these vast GPU arrays, MRC can accelerate AI training and inference tasks. This could lead to faster development of AI models and more affordable supercomputing resources for businesses and researchers.
The need for this protocol stems from the rapid scaling of AI hardware. As AI models grow larger, demand for GPU compute skyrockets. Current network designs route data through multiple layers of switches, which adds latency and inefficiency at scale. MRC's approach sends data across many parallel paths simultaneously, enabling a flatter, more scalable network structure. This not only boosts speed but also reduces complexity and energy costs, addressing a key challenge in building the next generation of AI machines.
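The switch-layer claim can be sanity-checked with textbook fat-tree (folded-Clos) arithmetic: a two-tier network of radix-r switches reaches roughly r²/2 hosts, and a three-tier network roughly r³/4. These formulas are a back-of-envelope model, not MRC's actual topology.

```python
# Rough folded-Clos capacity arithmetic (textbook formulas, not MRC's design).

def max_hosts(radix: int, tiers: int) -> int:
    """Approximate host count a non-blocking fat-tree of the given depth supports."""
    if tiers == 2:
        return radix ** 2 // 2   # leaf-spine: r/2 hosts per leaf, r leaves
    if tiers == 3:
        return radix ** 3 // 4   # classic 3-tier fat-tree bound
    raise ValueError("only 2- and 3-tier fat-trees modeled here")

# With hypothetical 512-port switches, two tiers already exceed 100,000 GPUs:
print(max_hosts(512, 2))  # 131072
# A lower-radix design falls short at two tiers and needs a third:
print(max_hosts(64, 2))   # 2048
print(max_hosts(64, 3))   # 65536
```

The design takeaway matches the article's claim: pushing switch radix up (and using multipath to keep those wide layers busy) is what lets a 100,000-GPU cluster collapse from three or four switch tiers to two.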
Looking ahead, this collaboration signals a trend toward more open and shared infrastructure solutions in AI hardware. By pooling expertise from chipmakers, cloud providers, and AI specialists, the industry is tackling hardware bottlenecks collectively rather than in isolation. MRC could become a standard for AI supercomputing networks, helping new projects get off the ground faster. Keep an eye on how this protocol gets adopted beyond OpenAI’s systems and whether it can scale easily to other types of computing workloads.
— AI Quick Briefs Editorial Desk