A Low-Cost Alternative to the InfiniBand Switch: the Asterfusion Low-Latency Cloud Switch

With the increased use of technologies such as the Internet of Things (IoT), artificial intelligence (AI), and machine learning (ML), enterprises are producing huge quantities of data that they need to process and use in real time. In the past, enterprise users would choose InfiniBand switches to meet the network requirements of HPC scenarios; now, Asterfusion offers a more cost-effective option with the CX-N low-latency switch. Let's take a closer look.

What Is HPC (High-Performance Computing)

High-performance computing is the ability to process data and perform complex calculations at extremely high speeds. For example, a laptop or desktop computer with a 3 GHz processor can perform about 3 billion calculations per second, far faster than any human being, yet it still pales in comparison to HPC systems that can perform trillions of calculations per second. The general architecture of HPC consists of computing, storage, and networking. HPC achieves its speed through parallelism: it uses tens, hundreds, or even thousands of computers working together in parallel. These computers must communicate with one another and process tasks cooperatively, which requires a high-speed network with strict latency and bandwidth requirements.
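To make the "working in parallel" idea concrete, here is a minimal sketch using MPI, the message-passing library that most HPC clusters run over InfiniBand or RoCE fabrics. It is only an illustration (any MPI implementation such as Open MPI or MPICH will do; compile with mpicc): each process computes a partial sum, and the results are combined over the network, which is exactly the communication step that demands low latency and high bandwidth.

```c
/* A minimal sketch of HPC-style parallel computation with MPI.
 * Assumes an MPI library (e.g. Open MPI or MPICH); compile with mpicc. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's ID          */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes  */

    /* Each process computes a partial sum of 1..1e9 in parallel. */
    long long local = 0, total = 0;
    for (long long i = rank + 1; i <= 1000000000LL; i += size)
        local += i;

    /* The partial results are combined over the network; this is where a
     * low-latency fabric (InfiniBand or lossless Ethernet) matters most. */
    MPI_Reduce(&local, &total, 1, MPI_LONG_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum = %lld\n", total);

    MPI_Finalize();
    return 0;
}
```

Run with, for example, `mpirun -np 8 ./partial_sum`; the more frequently processes exchange results, the more the interconnect's latency dominates total runtime.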

What Is RDMA (Remote Direct Memory Access)

RDMA (Remote Direct Memory Access) was developed to eliminate the server-side data-processing delay in network transmission. It allows one host or server to access the memory of another directly, without involving either CPU. This frees the CPU to do what it is supposed to do, such as running applications and processing large amounts of data, and it both increases bandwidth and reduces latency, jitter, and CPU consumption.

  • InfiniBand is a network protocol designed specifically for RDMA. It guarantees a lossless network at the hardware level, with extremely high throughput and low latency. However, InfiniBand switches are dedicated products from specific vendors and use proprietary protocols, while most existing networks run IP over Ethernet, so InfiniBand cannot satisfy interoperability requirements. The closed architecture also brings the risk of vendor lock-in, which is especially problematic for business systems that will need large-scale expansion in the future.
  • iWARP carries RDMA over TCP and requires a network card that supports iWARP, so that RDMA can run over standard Ethernet switches. However, because of the limitations of the TCP protocol, it loses much of RDMA's performance advantage.
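Whichever transport carries the traffic, applications typically reach the RDMA NIC through the verbs API (libibverbs), which works over InfiniBand, RoCE, and iWARP hardware alike. The sketch below is a minimal illustration, not a complete transfer: it only registers a buffer so the NIC can DMA into it directly, which is the step that takes the CPU out of the data path. Error handling is trimmed, and queue-pair setup and the actual RDMA READ/WRITE are omitted.

```c
/* A minimal, illustrative sketch of preparing a buffer for RDMA with
 * libibverbs. Compile with -libverbs; assumes an RDMA-capable NIC. */
#include <infiniband/verbs.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int num;
    struct ibv_device **devs = ibv_get_device_list(&num);
    if (!devs || num == 0) { fprintf(stderr, "no RDMA device\n"); return 1; }

    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);           /* protection domain */

    /* Register a buffer so the NIC can read/write it directly (DMA),
     * bypassing the CPU on the data path. */
    size_t len = 4096;
    void *buf = malloc(len);
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ |
                                   IBV_ACCESS_REMOTE_WRITE);

    printf("registered %zu bytes, lkey=0x%x rkey=0x%x\n",
           len, mr->lkey, mr->rkey);

    /* The rkey is exchanged with the peer out of band; the peer can then
     * issue RDMA READ/WRITE to this memory without involving this CPU. */

    ibv_dereg_mr(mr);
    free(buf);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}
```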

Asterfusion CX-N Low-Latency Switches in HPC Scenarios

Based on its understanding of high-performance computing network requirements and RDMA technology, Asterfusion launched the CX-N series of ultra-low-latency cloud switches. They combine fully open, high-performance network hardware with a transparent, open network operating system (AsterNOS) to build a low-latency, zero-packet-loss, high-performance Ethernet network for HPC scenarios that is not bound to any vendor.

01. Ultra-Low-Latency Switching ASIC, Reducing Network Forwarding Delay

Extremely cost-effective, CX-N switches achieve a minimum port-to-port forwarding delay of 400 ns, and the delay stays the same at full rate across all port speeds (10G~400G).

  • Supports RoCEv2 to reduce transport-protocol delay;
  • Supports DCB, ECN, DCTCP, etc. to deliver low-latency, zero-packet-loss, non-blocking Ethernet (a host-side DCTCP sketch follows this list);
  • The AFC SDN controller provides unified management; it can be seamlessly integrated into an OpenStack-based cloud OS or deployed standalone, turning clusters of switches into a single virtual fabric.
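As a host-side complement to the switch features above, the sketch below shows how a Linux application could opt a TCP socket into DCTCP, the ECN-based congestion control that switch-side ECN marking is designed to work with. This is only an illustration of host configuration (it assumes a Linux kernel with the dctcp module available), not of the CX-N switch configuration itself.

```c
/* A minimal sketch (Linux-specific) of selecting DCTCP on a single socket.
 * Assumes the host kernel has the dctcp congestion-control module. */
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    const char *cc = "dctcp";
    if (setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, cc, strlen(cc)) < 0) {
        perror("TCP_CONGESTION");       /* dctcp not available on this host */
        return 1;
    }

    /* Read the setting back to confirm which congestion control is active. */
    char buf[16] = {0};
    socklen_t len = sizeof(buf);
    getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, buf, &len);
    printf("congestion control for this socket: %s\n", buf);
    return 0;
}
```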

02. Provides Lossless Networking Using PFC High-Priority Queues

03. Uses ECN To Eliminate Network Congestion

ECN (Explicit Congestion Notification) is an important means of building a lossless Ethernet, providing end-to-end flow control. With the ECN feature, once congestion is detected, the network device marks the ECN field in the IP header of the packet. When ECN-marked packets arrive at their destination, a congestion notification is fed back to the traffic sender. The sender then responds by limiting the rate of the offending flows, reducing network latency and jitter for high-performance computing clusters.
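To illustrate what "ECN-marked packets arrive at their destination" looks like on the host side, the sketch below is a minimal Linux UDP receiver that enables IP_RECVTOS and inspects the two ECN bits of each incoming packet; the value 0b11 (Congestion Experienced) is the mark a congested device sets. The port number is only a placeholder for a test setup; in practice the end-to-end feedback is handled by the transport (e.g. CNPs in RoCEv2 or ECE in TCP), not by application code.

```c
/* A minimal sketch of observing ECN marks at a receiving Linux host. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/uio.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    int on = 1;
    setsockopt(fd, IPPROTO_IP, IP_RECVTOS, &on, sizeof(on)); /* get TOS byte */

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(9000);              /* hypothetical test port */
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));

    char payload[2048], cbuf[64];
    struct iovec iov = { payload, sizeof(payload) };
    struct msghdr msg = {0};
    msg.msg_iov = &iov;  msg.msg_iovlen = 1;
    msg.msg_control = cbuf;  msg.msg_controllen = sizeof(cbuf);

    for (;;) {
        if (recvmsg(fd, &msg, 0) < 0)
            break;
        /* Walk ancillary data to find the TOS byte of this packet. */
        for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c;
             c = CMSG_NXTHDR(&msg, c)) {
            if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_TOS) {
                unsigned char tos = *(unsigned char *)CMSG_DATA(c);
                if ((tos & 0x03) == 0x03)     /* CE: congestion experienced */
                    printf("CE mark seen: the network signalled congestion\n");
            }
        }
        msg.msg_controllen = sizeof(cbuf);    /* reset for the next recvmsg */
    }
    return 0;
}
```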

04. Cooperates With the AFC SDN Controller To Ensure the Network Is Foolproof

Asterfusion follows the SDN design philosophy and fully embraces the strategy of open networking for high-performance cloud data centers. It launched the AFC SDN cloud network controller, which makes network management visual: AFC displays device status, link status, and alarm information in graphs classified by time, resource, and performance type. In addition, it supports statistics across multiple kinds of data, giving customers a comprehensive and intuitive view of overall network conditions.

Asterfusion CX-N Ultra-Low-Latency Switch vs. InfiniBand Switch

01 LAB TEST

In lab tests comparing the Asterfusion CX-N low-latency switch with a Mellanox InfiniBand switch, the CX-N proved more cost-effective.

In addition, the CX-N's latency fluctuation across all tested packet sizes is smaller, with test data stable at about 0.1 µs.

The Asterfusion Low-Latency Data Center Switch Is a Lower-Cost Alternative to InfiniBand

The ultra-low-latency, lossless network built with Asterfusion CX-N cloud switches achieves the performance of expensive InfiniBand switches using standard Ethernet.

Enterprise users with a limited budget but demanding latency requirements in HPC scenarios can choose the Teralynx-based Asterfusion CX-N low-latency switches. They offer a truly low-latency, zero-packet-loss, high-performance, and cost-effective network for high-performance computing clusters.
