
In any datacenter, whether it's for AI or not, fast networked communication across nodes is just as important as the speed of the nodes themselves. When doing AI work, developers are steered to vendor-specific networking libraries like Nvidia's NCCL or AMD's RCCL. Now, in a new paper, a group of South Korean scientists has proposed a new library called HetCCL, a vendor-agnostic approach that allows clusters composed of GPUs from both vendors to operate as one.
Although it can simply be used for communicating between multiple GPUs in one setup, a collective communications library (CCL) in a datacenter often ends up using good ol' Remote Direct Memory Access (RDMA) to let applications pass data to a GPU somewhere else in the network. Think of sending network packets directly into a device's memory (in this case GPU VRAM), rather than going through the driver, the TCP/IP stack, and the OS networking layer, burning a metric ton of CPU cycles in the process.
The paper's authors claim that HetCCL is a world-first drop-in replacement for the vendor-specific CCLs, accomplishing multiple feats at once by enabling cross-platform communication and load balancing. The first and greatest of these is that it can make multi-vendor deployments viable, letting developers use the aggregate compute capacity of Nvidia and AMD server racks for a given task.
Second, HetCCL purports to be a direct library replacement, apparently requiring only that developers link their application to the HetCCL code rather than their vendor's CCL. The best analogy here is changing a DLL in a game to inject fancy post-processing filters. This way, there should be no source code changes necessary anywhere, from the application all the way to drivers, a fact the HetCCL team proudly calls out.
Third, it implicitly adds support for any future GPU vendors: once linked to HetCCL, application code doesn't have to concern itself with whether its data transfer calls to, say, NCCL, will actually end up at Nvidia GPUs. And last but definitely not least, HetCCL accomplishes all of this with minimal overhead, sometimes even outperforming the original CCL thanks to better default tuning parameters.
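To picture the drop-in idea, here is a minimal Python sketch of a facade that keeps one call for the application and routes the work to per-vendor backends underneath. Everything in it is hypothetical: `UnifiedCCL`, `register`, and the toy `sum_vectors` backend are illustration-only names, the "GPUs" are plain Python lists, and nothing here reflects how HetCCL is actually implemented.

```python
from typing import Callable, Dict, List

Vector = List[float]
ReduceFn = Callable[[List[Vector]], Vector]


def sum_vectors(vectors: List[Vector]) -> Vector:
    """Toy stand-in for a vendor CCL's reduction: element-wise sum."""
    return [sum(values) for values in zip(*vectors)]


class UnifiedCCL:
    """Hypothetical facade: one all_reduce call, vendor backends behind it."""

    def __init__(self) -> None:
        self._backends: Dict[str, ReduceFn] = {}

    def register(self, vendor: str, reduce_fn: ReduceFn) -> None:
        # Supporting a new vendor means registering one more backend;
        # the application-facing call below does not change.
        self._backends[vendor] = reduce_fn

    def all_reduce(self, buffers: Dict[str, List[Vector]]) -> Vector:
        # buffers maps a vendor name to the vectors held by that vendor's GPUs.
        partials = [
            self._backends[vendor](per_gpu)
            for vendor, per_gpu in buffers.items()
        ]
        # Combine the per-vendor partial sums into one global result.
        return sum_vectors(partials)


ccl = UnifiedCCL()
ccl.register("nvidia", sum_vectors)
ccl.register("amd", sum_vectors)

result = ccl.all_reduce({
    "nvidia": [[1.0, 2.0], [3.0, 4.0]],  # two simulated GPUs
    "amd": [[10.0, 20.0]],               # one simulated GPU
})
print(result)  # [14.0, 26.0]
```

The application only ever calls `all_reduce`; which hardware does the work is decided behind that call, which is the property that lets a library swap happen without touching application source.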
To illustrate this, the scientists ran tests on a four-node cluster, with 2×4 Nvidia GPUs and 2×4 AMD GPUs. Do note that the results are not meant to be cross-vendor benchmarks, but rather an illustration of HetCCL's potential with meager test resources. After all, the Nvidia systems had PCIe 3.0 GPUs while the AMD systems had PCIe 4.0 units; all old hardware by now.
In many cases, the results reach the theoretical maximum you'd get by simply adding Nvidia and AMD computing power together, an impressive achievement, though naturally this will vary greatly across setups and workloads. Under the right conditions, HetCCL could lead to lower costs for training models, as efficiently using both Nvidia and AMD GPUs simultaneously means that tasks no longer have to be split up between clusters and ultimately wait on each other. There could also be man-hour savings in managing said tasks.
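A bit of arithmetic shows why load balancing matters for hitting that "sum of both" ceiling. The sketch below splits a batch in proportion to each GPU group's throughput and compares it with a naive even split. The throughput figures are made-up round numbers, not results from the paper, and `split_batch` and `step_time` are hypothetical helpers rather than anything HetCCL exposes.

```python
from typing import Dict


def split_batch(total: int, throughputs: Dict[str, float]) -> Dict[str, int]:
    """Split `total` work items in proportion to each group's throughput."""
    total_rate = sum(throughputs.values())
    shares = {
        name: int(total * rate / total_rate)
        for name, rate in throughputs.items()
    }
    # Rounding down can leave a few items unassigned; give them to the
    # fastest groups first.
    leftover = total - sum(shares.values())
    fastest_first = sorted(throughputs, key=throughputs.get, reverse=True)
    for name in fastest_first[:leftover]:
        shares[name] += 1
    return shares


def step_time(shares: Dict[str, int], throughputs: Dict[str, float]) -> float:
    """A step ends only when the slowest group has finished its share."""
    return max(shares[name] / throughputs[name] for name in shares)


# Assumed, illustrative throughputs in samples per second.
rates = {"nvidia": 300.0, "amd": 200.0}

balanced = split_batch(1000, rates)
even = {"nvidia": 500, "amd": 500}

print(balanced)                   # {'nvidia': 600, 'amd': 400}
print(step_time(balanced, rates))  # 2.0
print(step_time(even, rates))      # 2.5
```

With the proportional split, 1,000 samples finish in 2.0 seconds, or 500 samples per second, exactly the sum of the two groups. The even split takes 2.5 seconds because the faster group sits idle waiting for the slower one.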