Posts

Showing posts from April 2023

What is a DPU and What is a SmartNIC? What is the difference between a DPU and a SmartNIC?

Committed to the principles of open-source collaboration, Asterfusion is proud to share our DPU-based SmartNIC information and code with the public soon. We encourage professionals across industries and manufacturers within our ecosystem to connect with us and explore the innovative potential of the DPU in a variety of applications! Click for: What is a DPU and what does it do? What is a SmartNIC? What is the difference between a DPU and a SmartNIC? The widespread deployment of smart network cards in server clusters can significantly reduce data center construction and operating costs. Nevertheless, viewed purely from the perspective of the network card, this solution still leaves ample room for optimization. For example: Incomplete OVS offloading: The SmartNIC only offloads the OVS forwarding plane, while the control plane still has to be processed by the host CPU; and interfaces and related protocols need to...

Ultra-low-latency network behind “ChatGPT”: Besides InfiniBand, are there other options?

In 2019, Microsoft pledged to invest up to $1 billion in OpenAI and committed to building a giant supercomputer for the cutting-edge artificial intelligence startup. Not long ago, Microsoft detailed the unusually expensive supercomputer and the impressive upgrade to Azure in two posts on its official blog. Scott Guthrie, Microsoft’s vice president of cloud computing and artificial intelligence, revealed that the company poured hundreds of millions of dollars into the ambitious project, which combines the power of tens of thousands of Nvidia A100 GPUs with the Azure cloud computing platform. Undeniably, high-performance computing power is vital for deep learning models like ChatGPT. However, people often underestimate the significance of network transmission in accelerating AI training. The network plays a crucial role, particularly during large-scale distributed training across a cluster. To train expansive language models effectively, a high-throughput, low-...