Mellanox dual-port HCA, can ping if ib0 pair connected but not if only ib1 pair connected

infiniband ping

I have installed two Mellanox FDR dual-port ConnectX-3 HCA cards (CX354A), one in each of two machines. The machines are connected directly to each other (switchless configuration). Both ports on the cards are cabled so that port1 connects to port1 and port2 connects to port2.
The ports are configured as follows:

HCA1 port1: ib0    inet addr:192.168.10.13  Bcast:192.168.10.255  Mask:255.255.255.0
     port2: ib1    inet addr:192.168.10.15  Bcast:192.168.10.255  Mask:255.255.255.0

HCA2 port1: ib0    inet addr:192.168.10.24  Bcast:192.168.10.255  Mask:255.255.255.0
     port2: ib1    inet addr:192.168.10.26  Bcast:192.168.10.255  Mask:255.255.255.0

I run two opensm instances on HCA1 as shown below, and ibstat shows that all 4 ports are up and active.

root@HCA1# opensm -g <ib0 GUID> --daemon
root@HCA1# opensm -g <ib1 GUID> --daemon

With the above configuration, I can ping from any of these IPs to any of the others.

HOWEVER, when I disconnect the cable for the port1 pair, ping does not work between the still-connected port2 pair.
When I disconnect the port2 pair and connect only the port1 pair, ping works fine, even for the disconnected port2 IPs (?)
What could be the reason for this, and how can I fix the problem? Please mention what extra info I should post.

What I'm trying to achieve is a totally isolated link for each port pair, so that I can run separate OpenMPI processes to test and compare the bandwidth of the two InfiniBand cables at the same time. Could anyone advise on how this could be done?

From what I have learnt, I think I need to create a different partition key for each port pair (currently they are both using the default pkey 0xffff).
However, this default pkey cannot be changed once InfiniBand has been configured during boot-up. Any suggestion or advice?

Both machines are running CentOS 6.4 and I have installed Mellanox OFED 1.5.3.

This is the ifconfig output for the IPoIB interfaces on both machines:

[root@HCA1 Desktop]# ifconfig ib0  
ib0       Link encap:InfiniBand  HWaddr   80:00:00:48:FE:81:00:00:00:00:00:00:00:00:00:00:00:00:00:00  
          inet addr:192.168.10.13  Bcast:192.168.10.255  Mask:255.255.255.0  
          inet6 addr: fe80::202:c903:21:8f11/64 Scope:Link  
          UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1  
          RX packets:4144160 errors:0 dropped:0 overruns:0 frame:0  
          TX packets:4141376 errors:0 dropped:2 overruns:0 carrier:0  
          collisions:0 txqueuelen:1024  
          RX bytes:702746349 (670.1 MiB)  TX bytes:719570861 (686.2 MiB)  


[root@HCA1 Desktop]# ifconfig ib1  
ib1       Link encap:InfiniBand  HWaddr   80:00:00:49:FE:82:00:00:00:00:00:00:00:00:00:00:00:00:00:00  
          inet addr:192.168.10.15  Bcast:192.168.10.255  Mask:255.255.255.0  
          inet6 addr: fe80::202:c903:21:8f12/64 Scope:Link  
          UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1  
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0  
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0  
          collisions:0 txqueuelen:1024  
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)  


[root@HCA2 Desktop]# ifconfig ib0  
ib0       Link encap:InfiniBand  HWaddr   80:00:00:48:FE:81:00:00:00:00:00:00:00:00:00:00:00:00:00:00  
          inet addr:192.168.10.24  Bcast:192.168.10.255  Mask:255.255.255.0  
          inet6 addr: fe80::202:c903:21:8f51/64 Scope:Link  
          UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1  
          RX packets:4141382 errors:0 dropped:0 overruns:0 frame:0  
          TX packets:4144161 errors:0 dropped:2 overruns:0 carrier:0  
          collisions:0 txqueuelen:1024  
          RX bytes:703005597 (670.4 MiB)  TX bytes:719323129 (685.9 MiB)  


[root@HCA2 Desktop]# ifconfig ib1  
ib1       Link encap:InfiniBand  HWaddr   80:00:00:49:FE:82:00:00:00:00:00:00:00:00:00:00:00:00:00:00  
          inet addr:192.168.10.26  Bcast:192.168.10.255  Mask:255.255.255.0  
          inet6 addr: fe80::202:c903:21:8f52/64 Scope:Link  
          UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1  
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0  
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0  
          collisions:0 txqueuelen:1024  
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)  

The loaded modules are as below:

[root@HCA1 Desktop]# /etc/init.d/openibd status

  HCA driver loaded

Configured IPoIB devices:
ib0 ib1

Currently active IPoIB devices:
ib0
ib1

The following OFED modules are loaded:

  rdma_ucm  
  rdma_cm  
  ib_addr  
  ib_ipoib  
  mlx4_core  
  mlx4_ib  
  mlx4_en  
  ib_mthca  
  ib_uverbs  
  ib_umad  
  ib_ucm  
  ib_sa  
  ib_cm  
  ib_mad  
  ib_core  
  iw_cxgb3  
  iw_nes  

Best Answer

OK, I'm not entirely familiar with the setup on CentOS, but here is what I think is happening: one or both copies of opensm are serving the ib0 link but not the other one, ib0 being the default for OpenSM.

As I understand it, you need two copies of opensm running in this particular setup: without a switch binding all the HCAs together, it is essentially two fabrics, and you need to run a subnet manager on each of them. You've correctly picked up on that, but you haven't actually run them correctly (specifically the second instance).

Ping appears to work when both pairs are connected because Linux answers for both IPs regardless of which interface the request arrives on. All of that traffic is going over ib0 (pair 1), which matches the zero RX/TX packet counters on ib1 in your ifconfig output.
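To illustrate why everything ends up on ib0, here is a small sketch that checks whether two addresses fall in the same IPv4 subnet. The helper names `ip_to_int` and `same_subnet` are made up for this example. All four addresses in the question share 192.168.10.0/24, so both interfaces carry an identical connected route and the kernel has no reason to send port2 traffic out of ib1.

```shell
#!/bin/sh
# Hypothetical helpers (not part of OFED): check whether two IPv4
# addresses belong to the same subnet for a given netmask.

# Convert dotted-quad A.B.C.D to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# same_subnet IP1 IP2 NETMASK -> exit status 0 if both share the network.
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ $(( a & m )) -eq $(( b & m )) ]
}

# Example with the addresses from the question:
#   same_subnet 192.168.10.13 192.168.10.15 255.255.255.0 && echo "ib0 and ib1 overlap"
```

If the goal is one isolated link per cable, a common approach (an assumption on my part, not something I have tested on this hardware) is to give each port pair its own subnet, for example keeping ib0 on 192.168.10.0/24 and moving ib1 to a different /24, so that the routing table can tell the two links apart.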

Under Ubuntu, which I'm familiar with, there is a config file, /etc/default/opensm.

It sounds like it's different on CentOS. On Ubuntu, that file tells the init script which ports to run opensm on, because you need an opensm subnet manager on each port.

Basically what you want to do is not run

opensm -g --daemon

twice, but instead first list the port GUIDs with

/usr/sbin/ibstat -p

Which will give output like:

0x001a4bffff0c34e5
0x001a4bffff0c34e6

Then run

opensm -g 0x001a4bffff0c34e5 --daemon 
opensm -g 0x001a4bffff0c34e6 --daemon 
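The two steps above can be combined into a small loop. This is only a sketch: the function name `start_opensm_per_port` and the `DRY_RUN` variable are made up for this example, and it assumes `ibstat -p` prints one port GUID per line as shown above.

```shell
#!/bin/sh
# Hypothetical helper: start one opensm instance per port GUID.
# Reads GUIDs (one per line, as printed by `ibstat -p`) from stdin.
# With DRY_RUN=1 it only prints the commands instead of running them.
start_opensm_per_port() {
    while read -r guid; do
        # Skip anything that does not look like a hex GUID.
        case "$guid" in
            0x*) ;;
            *) continue ;;
        esac
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "opensm -g $guid --daemon"
        else
            opensm -g "$guid" --daemon
        fi
    done
}

# Usage on a live system (run as root):
#   /usr/sbin/ibstat -p | start_opensm_per_port
```

Setting DRY_RUN=1 first lets you confirm that each GUID gets its own command before anything is actually started.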

Under Ubuntu, the init script actually automates that process for ports=ALL (read from /etc/default/opensm), where ALL is a keyword picked up by the init script.

There is likely an init script for opensm under CentOS. In the meantime, the above commands can be used, or you can write your own startup script.


UPDATE: I'm not sure whether it will make a difference, but I also have the following two kernel modules loaded, which you don't.

ib_ipath
ib_qib

Have you also flashed your HCAs with the latest firmware? This is actually quite important. Don't assume they come with the latest firmware from the factory.
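To see which firmware the cards are currently running, the version can be pulled out of the ibstat output. A minimal sketch, assuming ibstat prints a "Firmware version: ..." line for each CA; the function name `firmware_versions` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the firmware version of each HCA,
# parsed from `ibstat` output supplied on stdin.
firmware_versions() {
    awk -F': ' '/Firmware version/ { print $2 }'
}

# Usage on a live system:
#   ibstat | firmware_versions
```

Compare the result on both machines against the latest release listed for the CX354A before flashing anything.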
