FPGA Implementation of RoCEv2

FPGA Implementation of RoCEv2 Thumbnail

What is the RoCE v2?

RoCEv2, or RDMA over Converged Ethernet version2 is a network protocol that enable Remote Direct Memory Access (RDMA) over standard Ethernet networks using UDP/IP. It’s designed to deliver high-throughput, low-latency communication while minimizing CPU usage-making it ideal for data centers, high-performance computing, and storage systems.

The link below is a lecture on RDMA. It provides a lot of useful information. https://www.coursera.org/learn/the-fundamentals-of-rdma-programming/home/info


RoCEv2 Packet Format

  1. Ethernet Header
    • Standard Ethernet frame header.
  2. IP Header
    • IPv4 or IPv6 depending on the network configuration.
  3. UDP Header
    • Destination Port: 4791 (reserved for RoCE v2)
  4. InfiniBand Transport Headers
    • BTH (Base Transport Header): Contains OpCode, packet sequence number (PSN), and other control bits.
    • Optional Headers: Depending on the RDMA operation. you may see:
      • RETH (RDMA Extended Transport Header) for RDMA READ/WRITE
      • AETH (Acknowledgement Extended Header)
      • DETH (Datagram Extended Transport Header)
      • Atomic Headers for atomic operations
  5. Payload
    • The actual RDMA data being tranferred.
  6. ICRC (Invariant CRC)
    • Ensures data integrity across the RDMA transport layer.

RoCE v2 Implementation Technologies

🔹 Hardware (FPGA, NIC):

  • Mellanox (NVIDIA) ConnectX-4 and later support RoCE v2
  • AMD/Xilinx FPGAs can use the ERNIC IP (RoCE Engine) for RoCE v2 implementation

🔹 Software (Linux environment):

  • rdma-core library (includes libibverbs and librdmacm)
  • Use ibv_* APIs to create QPs (Queue Pairs) and perform data transfer

FPGA Implementation

For an FPGA Implementation, You need two IP cores: a 100G MAC and an ERNIC.

Integrated CMAC Block for 100 Gb/s Ethernet
  1. Integrated 100G Ethernet Subsystem https://www.amd.com/en/products/adaptive-socs-and-fpgas/intellectual-property/cmac_usplus.html
    • Supprots CAUI-10, CAUI-4, 100GAUI-2, 100GAUI-4
    • 100 Ethernet MAC, and RS-FEC logic + PCS Block
  2. ERNIC (Xilinx IP) https://www.amd.com/en/products/adaptive-socs-and-fpgas/intellectual-property/ef-di-ernic.html
    • RoCE v2 packet parsing/generation
    • Memory interface for DMA
    • UDP/IP encapsulation and decapsulation
    • ICRC generation/check

답글 남기기

이메일 주소는 공개되지 않습니다. 필수 필드는 *로 표시됩니다