Switch – Connecting two ethernet switches for higher inter-switch bandwidth

switch

I am a novice. Please bear with me.

I have a cluster with 4 racks (A,B,C,D). Each rack consists of 25 machines and an ethernet switch.
Switch-A and Switch-B are connected through Switch-D (the network trunk).

I found that the inter-switch bandwidth is limited by the slow trunk connections.
(The speed of the network between Switch-A and Switch-B is limited by the ports connected to Switch-D.)

However, each switch still has unused ports.
Can I increase the network bandwidth between Switch-A and Switch-B by running more cables directly between them (plugging Cat 6 cables into free ports on both switches)?

I have no idea how to configure the switch. The model is IBM System Networking RackSwitch G8000.

Where should I start to resolve this issue?

Best Answer

I'm going to assume that the servers in question are all in the same subnet (i.e. the same L2 domain). If this isn't the case then you'll need a different approach.

If they are, you'll want to logically bundle several links between the switches using link aggregation. Based on a quick Google search, the switch does apparently support 802.3ad link aggregation (see here on page 111).

This will let the traffic moving between the two switches be roughly balanced across the available links. I don't know the upper limit for your particular switch, but a bundle needs at least 2 links and the switch can likely support as many as 8. All links in the bundle should run at the same speed. Because of how the hashing algorithms map traffic onto links, the best balance tends to happen at powers of 2, so 2, 4 or 8 links will usually be used more evenly than, say, 3 or 5.
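To make the powers-of-2 point concrete, here is a rough sketch in Python. It assumes a simplified model in which the switch hash produces one of 8 buckets (a 3-bit result) and each bucket is assigned to a link by modulo. The bucket count and the modulo mapping are assumptions for illustration only; I don't know what the G8000 actually does internally.

```python
from collections import Counter

# Assumption for illustration: the hash yields a 3-bit value, i.e. 8 buckets.
HASH_BUCKETS = 8

def buckets_per_link(num_links, buckets=HASH_BUCKETS):
    """Return how many hash buckets land on each link when buckets are
    assigned to links by simple modulo."""
    counts = Counter(b % num_links for b in range(buckets))
    return [counts[i] for i in range(num_links)]

for n in (2, 3, 4, 5, 8):
    print(n, "links ->", buckets_per_link(n))
```

In this model 2, 4 and 8 links each get an equal share of the buckets, while 3 links split them 3/3/2 and 5 links split them 2/2/2/1/1, so some links carry a larger share of the traffic than others.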

Please also keep in mind that a single flow between two servers can't go any faster than one of the links in the bundle, because each flow is hashed onto a single link. Traffic also isn't going to be spread perfectly among the links. It gets more efficient with a larger number but will always be somewhat unbalanced. The hashing algorithm in use may be tunable, but I don't know the IBM platform well enough to comment on particulars.
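The single-flow limit and the uneven spread can both be seen in a small Python simulation. This is only a sketch: CRC32 over the source and destination address stands in for whatever hash the switch really uses, and the IP addresses and the 4-link bundle are made up for the example.

```python
import zlib
from collections import Counter

def pick_link(src_ip, dst_ip, num_links):
    """Map a flow to a link index with a deterministic hash.
    CRC32 is a stand-in here, not the switch's real algorithm."""
    key = f"{src_ip}-{dst_ip}".encode()
    return zlib.crc32(key) % num_links

NUM_LINKS = 4  # hypothetical bundle size

# Every frame of one flow hashes to the same link, so that flow
# never gets more than one link's worth of bandwidth.
flow = ("10.0.1.5", "10.0.2.9")  # hypothetical addresses
links_used = {pick_link(*flow, NUM_LINKS) for _ in range(1000)}
print("links used by one flow:", links_used)

# 25 hosts in one rack, each talking to one host in another rack.
flows = [(f"10.0.1.{i}", f"10.0.2.{i}") for i in range(1, 26)]
spread = Counter(pick_link(s, d, NUM_LINKS) for s, d in flows)
print("flows per link:", dict(sorted(spread.items())))
```

The first print always shows exactly one link, however many times the flow is hashed. The second shows how the 25 flows happen to fall across the 4 links, which will generally not be an even split.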