Linux – How to get IP packets forwarded/routed to/from the Infiniband network

infiniband, linux, routing, ubuntu

I have two networks.

One is a standard Ethernet network running IP.
The second network is Infiniband, which in addition to some custom protocols can talk IPoIB (IP over Infiniband).

The router that sits between these networks can ping both the Infiniband hosts (10.10.10.x IP addresses) and the local xx.xxx.79.x addresses.

The problem I'm having is that machines on the Ethernet network cannot ping or access machines on the Infiniband network, even though the router has IPv4 forwarding turned on.

Can you forward packets to/from IP and IPoIB Infiniband networks?

As requested, here is my routing table.

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         xx.xxx.66.61    0.0.0.0         UG    100    0        0 eth2
10.10.10.0      0.0.0.0         255.255.255.0   U     0      0        0 ib0
xx.xxx.66.60    0.0.0.0         255.255.255.252 U     0      0        0 eth2
xx.xxx.79.128   0.0.0.0         255.255.255.128 U     0      0        0 bond0
255.255.255.255 0.0.0.0         255.255.255.255 UH    0      0        0 bond0

eth2 is the public internet, bond0 is a private LAN, ib0 is the infiniband network.
This machine is the default router for both networks.

Note: this link seems to suggest it's possible.
http://www.spinics.net/lists/linux-rdma/msg06784.html
However, I've since set this up on a test machine with old DDR hardware and it still does not route. You also cannot bridge ib0 devices.


Update and answer.

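The problem I was having was actually related to routing, so yes, forwarding between the two networks does work. In my particular setup, some of the hosts had Infiniband and Ethernet cards on the same network, so the replies were being passed back to the pinging client via a different route than the requests had taken. With strict reverse path filtering, Linux drops packets that arrive on such an asymmetric path. The answer is to set reverse path filtering to loose mode (value 2):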

net.ipv4.conf.default.rp_filter = 2
net.ipv4.conf.all.rp_filter = 2

Then it works.
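
For anyone wanting to try this without a reboot, the two settings above can be applied at runtime and then persisted. This is only a sketch: it assumes root access, a distribution that reads /etc/sysctl.d (the file name below is an arbitrary choice), and it does not say which machine needs the change, since that depends on where the replies are being dropped. Note that the kernel uses the higher of the "all" and per-interface values, and that "default" only affects interfaces created afterwards.

```shell
# Switch reverse path filtering to loose mode (2) at runtime; run as root.
sysctl -w net.ipv4.conf.all.rp_filter=2
sysctl -w net.ipv4.conf.default.rp_filter=2

# Persist across reboots (the file name is an arbitrary choice).
printf '%s\n' \
  'net.ipv4.conf.default.rp_filter = 2' \
  'net.ipv4.conf.all.rp_filter = 2' > /etc/sysctl.d/90-rp-filter.conf

# Check the values now in effect for "all" and for the Infiniband interface.
sysctl net.ipv4.conf.all.rp_filter net.ipv4.conf.ib0.rp_filter
```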

Best Answer

Your routes are configured correctly.

Check that the hosts on the infiniband network are using this router as their default gateway, or that their default gateway has a route for your ethernet LAN prefix configured via this router.
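
A rough way to check this from an Infiniband host is sketched below. The addresses are placeholders: the Ethernet prefix is redacted in the question and the router's ib0 address is not given, so substitute real values.

```shell
# On an Infiniband host: which default gateway is configured?
ip route show default

# Which route would a reply to an Ethernet-side client take?
# CLIENT_IP is a placeholder for a real address on the Ethernet LAN.
CLIENT_IP="xx.xxx.79.x"
ip route get "$CLIENT_IP"

# If the default gateway is a different machine, a route like this on that
# gateway (run as root) sends the Ethernet LAN prefix via this router.
# ROUTER_IB_IP is a placeholder for the router's ib0 address.
LAN_NET="xx.xxx.79.128/25"
ROUTER_IB_IP="10.10.10.x"
ip route add "$LAN_NET" via "$ROUTER_IB_IP"
```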

Also check that no firewall rules on the router block the connection, and that you aren't attempting to do NAT between the two subnets. You shouldn't need to: the constraint with RFC1918 IPs is that you cannot announce them to the public Internet, not that you cannot use them internally alongside routable IPs. If you want the Infiniband IPs to have Internet access, NAT them at the border gateway.
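
Assuming the router uses iptables (an assumption; the question does not say which firewall tooling is in place), the checks above might look like this:

```shell
# Run as root on the router.
# Rules and default policy of the FORWARD chain, with packet counters.
iptables -L FORWARD -n -v

# NAT rules that might rewrite traffic between the two subnets.
iptables -t nat -L -n -v

# Confirm that IPv4 forwarding is really enabled (1 = on).
sysctl net.ipv4.ip_forward
```

If the packet counters on a DROP or REJECT rule in FORWARD increase while you ping across the router, that rule is a likely culprit.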

It looks like this router might in fact be your border gateway. If that is the case and you are using NAT to provide internet access for the RFC1918 subnet, you should add -j RETURN rules at the beginning of your NAT prerouting and postrouting default chains where the source is an RFC1918 address and the destination is one of your other subnets, to prevent NAT from occurring in this case. Alternatively, only SNAT those packets with an output interface of eth2 and an RFC1918 source address.
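
As a sketch of both options, assuming iptables and the interface names from the question: LAN_NET below is the redacted bond0 prefix from the routing table and has to be replaced with the real value. The second option uses MASQUERADE as a stand-in for an explicit SNAT rule, since the public address is not given in the question.

```shell
# Run as root on the router.
IB_NET="10.10.10.0/24"
LAN_NET="xx.xxx.79.128/25"   # redacted in the question; substitute the real prefix

# Option 1: leave the NAT chains early for traffic from the RFC1918 subnet
# to the internal LAN, so that no later NAT rule touches it.
iptables -t nat -I PREROUTING 1 -s "$IB_NET" -d "$LAN_NET" -j RETURN
iptables -t nat -I POSTROUTING 1 -s "$IB_NET" -d "$LAN_NET" -j RETURN

# Option 2: only source-NAT RFC1918 traffic that leaves via the
# Internet-facing interface.
iptables -t nat -A POSTROUTING -o eth2 -s "$IB_NET" -j MASQUERADE
```

Either option alone should be enough; the first is useful when an existing broad NAT rule cannot easily be changed.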

You may also want to check the MTU settings on your interfaces, though this shouldn't give problems with the default ping size. The MTU should be set to the largest packet size which the interface can forward on the layer 2 medium without experiencing packet loss. If it is too low, it might also cause connectivity problems. The typical setting for ethernet is 1500 bytes.
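
One way to check the MTU is sketched below. IB_HOST is a placeholder for a real Infiniband host address, and the -M option assumes the iputils version of ping.

```shell
# On the router: show the configured MTU of each interface.
ip link show dev bond0
ip link show dev ib0

# From an Ethernet host: send full-size pings with fragmentation prohibited.
# 1472 bytes of payload + 8 (ICMP header) + 20 (IPv4 header) = 1500 bytes.
IB_HOST="10.10.10.x"
ping -M do -s 1472 -c 3 "$IB_HOST"
```

If the small default ping works but the full-size one fails, an MTU mismatch somewhere along the path is a likely cause.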

Other than this, of course you can forward packets between IP networks regardless of the layer 2 medium, thanks to the layered model. Most of the Internet actually works this way (think cable modems, which are layer 2 DOCSIS to Ethernet bridges, and also think of all the modular Cisco routers with MPLS and Ethernet interfaces, and a billion other types of devices).

You don't need a bridge for this.

Also, consider not using route from net-tools, because it is as old as time and lacks many features (such as correctly identifying your blackhole route for global broadcast as such). Instead, use ip route from iproute2.
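
For reference, a few iproute2 commands that cover what route was being used for here. The destination address in the last command is a placeholder.

```shell
# Main routing table; the closest equivalent of `route -n`.
ip route show

# The local table holds the kernel-generated local and broadcast routes.
ip route show table local

# Ask the kernel which route it would pick for a given destination.
DEST_IP="10.10.10.x"
ip route get "$DEST_IP"
```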
