
Part 1 – PVRDMA and how to test it in a home lab.

One of the members of the VMware User Community (VMTN) inspired me to build a configuration in which two VMs communicate using PVRDMA network adapters. The aim I wanted to achieve was to establish communication between the VMs without Host Channel Adapter (HCA) cards installed in the hosts. It’s possible, as stated in the VMware vSphere documentation:

For virtual machines on the same ESXi hosts or virtual machines using the TCP-based fallback, the HCA is not required.

To do this I prepared one ESXi host (6.7U1) managed by vCSA (6.7U1). One of the requirements for PVRDMA is a vSphere Distributed Switch (vDS), so first I configured a dedicated vDS for RDMA communication. I didn’t set anything special: just a simple vDS (DSwitch-DVUplinks-34) with a default port group (DPortGroup) and a single uplink.

In vSphere, a virtual machine can use a PVRDMA network adapter to communicate with other virtual machines that have PVRDMA devices. The virtual machines must be connected to the same vSphere Distributed Switch.
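As a side note, if you script your lab, the vDS and its port group can also be created with PowerCLI. This is only a minimal sketch under my naming assumptions: the datacenter name Lab and the host name are made up, while DSwitch and DPortGroup match the setup above.

# Create the vDS (assumed name "DSwitch") in the datacenter and add the host to it
$vds = New-VDSwitch -Name DSwitch -Location (Get-Datacenter -Name Lab)
Add-VDSwitchVMHost -VDSwitch $vds -VMHost esx01.lab.local
# Default port group used later for the PVRDMA traffic
New-VDPortgroup -VDSwitch $vds -Name DPortGroup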

Second, I created a VMkernel port (vmk1) dedicated to RDMA traffic in this DPortGroup. I didn’t even assign an IP address to this vmk port (No IPv4 settings).

Third, I set the advanced system setting Net.PVRDMAVmknic on the ESXi host and gave it a value pointing to the VMkernel port (vmk1), as described in Tag a VMkernel Adapter for PVRDMA.
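For reference, the same setting can also be applied from the ESXi shell with esxcli, vmk1 being the VMkernel port created above:

esxcli system settings advanced set -o /Net/PVRDMAVmknic -s vmk1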

Then I enabled the pvrdma firewall rule on the host in the Edit Security Profile window, as described in Enable the Firewall Rule for PVRDMA.
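This too has an esxcli equivalent; the ruleset id is pvrdma:

esxcli network firewall ruleset set -e true -r pvrdma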

The next steps are related to the configuration of the VMs. First, I created a new virtual machine. Then I added another network adapter to it and connected it to the DPortGroup on the vDS. For the adapter type I chose PVRDMA, with Device Protocol RoCE v2, as described in Assign a PVRDMA Adapter to a Virtual Machine.

Then I installed Fedora 29 on the first VM. I chose it because there are many tools that make it easy to test RDMA communication. After the OS installation another network interface showed up in the VM, and I addressed it in a different IP subnet. I used two network interfaces in the VMs: the first one for SSH access and the second one for testing RDMA communication.
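For the record, the addressing on Fedora can be done with nmcli along these lines; the interface name ens224 is my assumption (check yours with ip link), and 192.168.0.200/24 is the address used later on the server VM:

# Assumed interface name for the PVRDMA adapter; verify with "ip link" first
nmcli connection add type ethernet ifname ens224 con-name rdma ipv4.method manual ipv4.addresses 192.168.0.200/24
nmcli connection up rdma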

Then I set “Reserve all guest memory (All locked)” in VM’s Edit Settings window.
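If you automate the VM configuration, this checkbox corresponds, as far as I know, to the following entry in the VM’s .vmx file:

sched.mem.pin = "TRUE"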

From the infrastructure perspective, I now had VMs configured well enough to communicate using RDMA.

To do that I had to install the appropriate tools. I found them in the rdma-core project on GitHub and installed them first, following the procedure described on that project page.

dnf install cmake gcc libnl3-devel libudev-devel pkgconfig valgrind-devel ninja-build python3-devel python3-Cython	

Next I installed a git client using the following command.

yum install git

Then I cloned the git project to a local directory.

mkdir /home/rdma
git clone https://github.com/linux-rdma/rdma-core.git /home/rdma

I built it.

cd /home/rdma
bash build.sh
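At this point it is worth checking whether the guest actually sees the PVRDMA device. The build produces ibv_devinfo in build/bin; on a working setup it should report a vmw_pvrdma device (the exact device name may differ in your guest):

cd /home/rdma/build/bin
./ibv_devinfo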

Then I cloned the VM to get a communication partner for the first one. After cloning I did the appropriate IP reconfiguration in the cloned VM.

Finally, I could test the communication using RDMA.

On the VM that functioned as the server, I ran the listener service on the interface mapped to the PVRDMA virtual adapter:

cd /home/rdma/build/bin
./rping -s -a 192.168.0.200 -P

On the client side, I ran a command that connects to the server service:

./rping -c -I 192.168.0.100 -a 192.168.0.200 -v

It was working beautifully!


NSX-V VTEP, MAC, ARP Tables content mapping

It took me a while to figure out what information I was seeing while displaying the VTEP, MAC, and ARP tables on the Controller Cluster in NSX. The documentation tells you what information is included in those tables, but it is not obvious which field contains what kind of data. That’s why I decided to make a short reference for myself; maybe it will help someone else too.

To understand those tables, I started with the Central CLI to display the content of each table.
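If you want to reproduce it, the tables can be pulled from the NSX Manager Central CLI with commands along these lines, where 6502 is the VNI of my logical switch:

show logical-switch controller master vtep-table vni 6502
show logical-switch controller master mac-table vni 6502
show logical-switch controller master arp-table vni 6502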

[Screenshot: VTEP, MAC, and ARP table output from the Central CLI]

Now let’s consider what kind of information we’ve got in each table and how they map to particular components in the environment.

VTEP Table – segment to VTEP IP bindings:

VNI – Logical Switch ID based on configured Segment pool

IP – VTEP IP (VMkernel IP) of the host on which a VM in VNI 6502 is running

Segment – VTEP segment; in my case there is only one L3 network in use

MAC – MAC address of physical NIC configured for VTEP

MAC Table – VM MAC address to VTEP IP (host) mapping:

VNI – Logical Switch ID based on configured Segment pool

MAC – MAC address of a VM accessible through the VTEP IP displayed in the column on the right.

VTEP-IP – IP of the host VTEP on which the VM with the MAC address from the previous column is running.

ARP Table – Virtual Machine MAC to IP mapping:

VNI – Logical Switch ID based on configured Segment pool

IP – IP address of a Virtual Machine connected to the Logical Switch with the given VNI

MAC – MAC address of the Virtual Machine

To make it even easier, here is a summary diagram of those mappings.

[Diagram: mapping of the VTEP, MAC, and ARP table fields to the environment components]

If you want to dig deeper into the details of how those tables are populated, I strongly recommend this video from VMworld 2017, which clearly explains it step by step.

Alternative methods to create a virtual switch.

Creating a virtual switch through the GUI is well described in the documentation and pretty intuitive. However, sometimes it might be useful to know how to do it with the CLI or PowerCLI, thus making the process part of a script that automates the initial configuration of an ESXi host after installation.

Here you will find the commands necessary to create and configure a standard virtual switch using the CLI and PowerCLI. The examples describe the process of creating a vSwitch for vMotion traffic, which also involves VMkernel adapter creation.

I. vSwitch configuration through CLI

  1. Create a vSwitch named “vMotion”

esxcli network vswitch standard add -v vMotion

  2. Check whether your newly created vSwitch is configured and appears on the list.

esxcli network vswitch standard list

  3. Add a physical uplink (vmnic) to your vSwitch

esxcli network vswitch standard uplink add -u vmnic4 -v vMotion

  4. Designate an uplink to be used as active.

esxcli network vswitch standard policy failover set -a vmnic4 -v vMotion

  5. Add a port group named “vMotion-PG” to the previously created vSwitch

esxcli network vswitch standard portgroup add -v vMotion -p vMotion-PG

  6. Add a VMkernel interface to the port group (optional – not necessary if you are creating a vSwitch just for VM traffic)

esxcli network ip interface add -p vMotion-PG -i vmk9

  7. Configure the IP settings of the VMkernel adapter.

esxcli network ip interface ipv4 set -i vmk9 -t static -I 172.20.14.11 -N 255.255.255.0

  8. Tag the VMkernel adapter for the vMotion service. NOTE – the service tag is case sensitive.

esxcli network ip interface tag add -i vmk9 -t vmotion

Done, your vSwitch is configured and ready to service vMotion traffic.
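As a quick sanity check you can source a ping from the new VMkernel interface. The target address below is just an assumed vMotion IP of another host in the same subnet:

vmkping -I vmk9 172.20.14.12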


II. vSwitch configuration through PowerCLI

  1. First, connect to the vCenter Server.

Connect-VIServer -Server vcsa.vclass.local -User administrator@vsphere.local -Password VMware1!

  2. Indicate the specific host and create a new virtual switch, assigning a vmnic at the same time.

$vswitch1 = New-VirtualSwitch -VMHost sa-esx01.vclass.local -Name vMotion -NIC vmnic4

  3. Create a port group and add it to the new virtual switch.

New-VirtualPortGroup -VirtualSwitch $vswitch1 -Name vMotion-PG

  4. Create and configure the VMkernel adapter.

New-VMHostNetworkAdapter -VMHost sa-esx01.vclass.local -PortGroup vMotion-PG -VirtualSwitch vMotion -IP 172.20.11.11 -SubnetMask 255.255.255.0 -VMotionEnabled $true
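To confirm the result, you can list the host’s VMkernel adapters and check that vMotion is enabled on the new one; a quick sketch:

Get-VMHostNetworkAdapter -VMHost sa-esx01.vclass.local -VMKernel | Select-Object Name, IP, VMotionEnabled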