So, we finally received 2x HPE SN2100M switches for internal use. They will be the new switches for the 3x ProLiant DL325 Gen10 Plus v2 servers we recently got for our VMware ESXi cluster. The switches are “slightly” overkill for what we need, but they are what I got.
The initial plan was to get 10GbE switches, something like the Aruba 6300M Switch 24p SFP+ (JL658A), but for whatever reason, it kept escalating to these. I am not going to complain.
Anyway.
Here is a topology of what I am going to build in our shop, though it’s not really relevant to this post. I just like topologies. 🙂
These are 16-port 100GbE QSFP28 switches (you will need a license to enable 8 of the ports) that support speeds of 1, 10, 25, 40, 50, and 100GbE. The switches are only half-width, which allows two of them to sit side by side in 1U as a cluster/stack. There is a special rack mount kit for this.
Here is the rack mount kit.
We can split each 100GbE QSFP28 port into 4x 25GbE SFP28 with a breakout DAC, which we will need for our servers. Those (luckily) have 25GbE ports.
If we only had 10GbE available on our servers, I think we could use a 40GbE QSFP+ to 4x 10GbE SFP+ DAC.
To connect the switches to each other, I will use a single (don’t ask why) 100GbE DAC. This will be for the MLAG IPL.
For the connection to our existing switches, we will use a QSFP28 to SFP+ adapter, since our existing switches, 3x Netgear M4300-52G-PoE+, only support up to 10GbE. I have never tried one of these before, so it will be interesting to see if it works.
We plug these into the switch and an SFP+ module into the adapter.
One of the neat features of this switch is the option to run Docker containers and virtual machines, though I couldn’t figure out how to get it running. It is not very likely that we will have a use case for it, but I might write another post if I figure it out.
There is also support for a few popular automation tools like Ansible, SaltStack, and Puppet. NVIDIA publishes a list of Ansible modules and a short getting-started guide in its documentation.
So, what’s the plan? First, I need to figure out how to connect to these switches. Then we will do the base configuration, set up MLAG, and create a port-channel / LAG spanning both switches for the uplink to our existing switches. Lastly, we will figure out how to set up the switch for the 4x SFP28 breakout DACs.
Let’s start with the connection to the switches.
Setting up the Switch
Connecting to the Switch
I will use a serial cable for the initial connection. We have an RJ45 and Micro-USB port. The baud rate is 115200.
To actually connect, I will use “screen” on Linux (for example, screen /dev/ttyUSB0 115200 — the device path depends on your serial adapter); on Windows you can use PuTTY/KiTTY.
Once in, we have to get through the initial setup. Unfortunately, I no longer have the steps (I forgot to take screenshots), but it should be self-explanatory: set a password, and so on.
The default login is “admin/admin”.
Base setup
OK, let’s begin. First, let’s set the hostname, an IP address and default gateway for the mgmt0 interface, the DNS server, and the search domain.
## Switch 1

# Switch into enable and configuration mode
switch > enable
switch # configure terminal

# Set the hostname
switch (config) # hostname RZ-CORE01

# Switch into the mgmt0 interface and set the IP
RZ-CORE01 (config) # interface mgmt0
RZ-CORE01 (config interface mgmt0) # ip address 172.16.0.230 255.255.0.0
RZ-CORE01 (config interface mgmt0) # exit

# Set the default gateway for the mgmt interface
RZ-CORE01 (config) # ip route vrf mgmt 0.0.0.0/0 172.16.0.254

# Set the DNS server
RZ-CORE01 (config) # ip name-server vrf mgmt 172.16.0.130

# Set the search domain
RZ-CORE01 (config) # ip domain-list example.local
Do the same for the second switch.
## Switch 2

# Switch into enable and configuration mode
switch > enable
switch # configure terminal

# Set the hostname
switch (config) # hostname RZ-CORE02

# Switch into the mgmt0 interface and set the IP
RZ-CORE02 (config) # interface mgmt0
RZ-CORE02 (config interface mgmt0) # ip address 172.16.0.231 255.255.0.0
RZ-CORE02 (config interface mgmt0) # exit

# Set the default gateway for the mgmt interface
RZ-CORE02 (config) # ip route vrf mgmt 0.0.0.0/0 172.16.0.254

# Set the DNS server
RZ-CORE02 (config) # ip name-server vrf mgmt 172.16.0.130

# Set the search domain
RZ-CORE02 (config) # ip domain-list example.local
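Since the base configuration of the two switches only differs in hostname and IP, it lends itself to templating. Here is a minimal Python sketch that renders the CLI commands from a few parameters; the render_base_config helper and its parameter names are my own invention, not an official tool — the emitted commands are the ones shown above.

```python
# Sketch: render the per-switch base configuration from a few parameters.
# The helper and its defaults are hypothetical; the emitted CLI matches
# the commands used in this post.

def render_base_config(hostname, ip, netmask="255.255.0.0",
                       gateway="172.16.0.254", dns="172.16.0.130",
                       domain="example.local"):
    return [
        f"hostname {hostname}",
        "interface mgmt0",
        f"ip address {ip} {netmask}",
        "exit",
        f"ip route vrf mgmt 0.0.0.0/0 {gateway}",
        f"ip name-server vrf mgmt {dns}",
        f"ip domain-list {domain}",
    ]

switches = {"RZ-CORE01": "172.16.0.230", "RZ-CORE02": "172.16.0.231"}
for name, ip in switches.items():
    print("\n".join(render_base_config(name, ip)))
```

The output can be pasted into configuration mode on each switch, or pushed with an automation tool later.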
Great. Let’s check it.
# Check the interface IP
RZ-CORE01 (config) # show interfaces mgmt0 brief

Interface mgmt0 status:
  Comment          :
  VRF              : mgmt
  Admin up         : yes
  Link up          : yes
  DHCP running     : no (Static IP is configured)
  IP address       : 172.16.0.230
  Netmask          : 255.255.0.0
  IPv6 enabled     : no
  Speed            : 1000Mb/s (auto)
  Duplex           : full (auto)
  Interface type   : ethernet
  Interface source : bridge
  Bonding master   : vrf_mgmt
  MTU              : 1500
  HW address       : 00:00:00:00:00:00
Next, the route and the name server entries.
# Check the route on VRF mgmt
RZ-CORE01 (config) # show ip route vrf mgmt

VRF Name mgmt:
------------------------------------------------------------------------------------------------------
Destination       Mask              Flag    Gateway           Interface    Source     AD/M
------------------------------------------------------------------------------------------------------
default           0.0.0.0                   172.16.0.254      mgmt0        static     1/1
172.16.0.0        255.255.0.0               0.0.0.0           mgmt0        direct     0/0
# Check name servers
RZ-CORE01 (config) # show hosts

Hostname: RZ-CORE01

Name servers VRF: mgmt
Name servers:
  172.16.0.130 configured
  172.16.0.132 configured

Domain names:
  example.local configured

Static IPv4 host mappings:
  127.0.0.1 --> localhost
Static IPv6 host mappings:
  ::1 --> localhost6

Automatically map hostname to loopback address     : yes
Automatically map hostname to IPv6 loopback address: no
Now we can patch the management interface and connect to the devices via SSH.
fedora-kde :: ~ » ssh admin@172.16.0.230
Set NTP
I also want to set NTP, to make sure the logs have the correct timestamp.
## Switch 1

# Set the NTP servers
RZ-CORE01 (config) # ntp server 0.de.pool.ntp.org
RZ-CORE01 (config) # ntp server 1.de.pool.ntp.org
Let’s check the configuration.
# Check NTP server settings
RZ-CORE01 (config) # show ntp

NTP is administratively            : enabled
VRF name                           : mgmt
NTP Authentication administratively: disabled
NTP server role                    : enabled

Clock is synchronized:
  Reference: 144.76.59.106
  Offset   : -0.701 ms

Active servers and peers:
  144.76.59.106:
    Configured as      : 0.de.pool.ntp.org
    Conf Type          : serv
    Status             : sys.peer(*)
    Stratum            : 2
    Offset(msec)       : -0.701
    Ref clock          : 131.188.3.223
    Poll Interval (sec): 64
    Last Response (sec): 50
    Auth state         : none
  162.159.200.1:
    Configured as      : 1.de.pool.ntp.org
    Conf Type          : serv
    Status             : candidat(+)
    Stratum            : 3
    Offset(msec)       : -0.734
    Ref clock          : 10.214.8.6
    Poll Interval (sec): 64
    Last Response (sec): 50
    Auth state         : none
Create VLANs
Next, we need a few VLANs. Internally we mainly use them for customers (internal deployments), test environments, and IoT stuff, but we also need one for the MLAG inter-peer link (IPL). This will be used only on these switches.
For the IPL VLAN, I will use the ID 4000, since it’s the one from the documentation, and we do not use it internally, but you can choose whatever you want.
## Switch 1

# Create VLANs
RZ-CORE01 (config) # vlan 4000
RZ-CORE01 (config vlan 4000) # name MLAG-IPL
RZ-CORE01 (config vlan 4000) # exit
RZ-CORE01 (config) # vlan 5
RZ-CORE01 (config vlan 5) # name Technik
RZ-CORE01 (config vlan 5) # exit
Do the same on the second switch.
## Switch 2

# Create VLANs
RZ-CORE02 (config) # vlan 4000
RZ-CORE02 (config vlan 4000) # name MLAG-IPL
RZ-CORE02 (config vlan 4000) # exit
RZ-CORE02 (config) # vlan 5
RZ-CORE02 (config vlan 5) # name Technik
RZ-CORE02 (config vlan 5) # exit
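With more than a handful of VLANs, typing the same commands on both switches gets tedious. A short Python sketch can emit the commands for pasting — the vlan_commands helper is hypothetical, and the VLAN list is just the one from this post.

```python
# Sketch: emit the VLAN creation commands used above, for pasting into
# the CLI on both switches. The VLAN IDs/names are the ones from this post.

vlans = {4000: "MLAG-IPL", 5: "Technik"}

def vlan_commands(vlans):
    lines = []
    for vid, name in sorted(vlans.items()):
        lines += [f"vlan {vid}", f"name {name}", "exit"]
    return lines

print("\n".join(vlan_commands(vlans)))
```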
Again, let’s check.
RZ-CORE01 (config) # show vlan

------------------------------------------------------------------------------
VLAN   Name       Ports
------------------------------------------------------------------------------
1      default    Eth1/1,Eth1/2,Eth1/3,Eth1/4,Eth1/5,Eth1/6,
                  Eth1/7,Eth1/9,Eth1/10,Eth1/11,Eth1/12,
                  Eth1/13,Eth1/14,Eth1/15,Eth1/16
5      Technik
4000   MLAG-IPL
Setting up MLAG
OK, now we can begin with the MLAG setup. I will stay very close to the NVIDIA documentation. For the port-channel ID, I will use 128, to avoid accidental changes to this port-channel and to make clear that it is special.
The interface I will be using is the last available one (1/8, since we did not license the rest). I tend to use the last ports for uplinks.
## Both Switches

# Enable IP routing
RZ-CORE01 (config) # ip routing

# Enable LACP
RZ-CORE01 (config) # lacp

# Enable QoS to avoid congestion on the IPL
RZ-CORE01 (config) # dcb priority-flow-control enable force

# Enable the MLAG protocol
RZ-CORE01 (config) # protocol mlag

## Set up the IPL

# Configure the port-channel
RZ-CORE01 (config) # interface port-channel 128
RZ-CORE01 (config interface port-channel 128) # exit

# Map a port to the port-channel
RZ-CORE01 (config) # interface ethernet 1/8 channel-group 128 mode active

# Activate the IPL on this port-channel
RZ-CORE01 (config) # interface port-channel 128
RZ-CORE01 (config interface port-channel 128) # ipl 1

# Enable QoS on this interface
RZ-CORE01 (config interface port-channel 128) # dcb priority-flow-control mode on force
RZ-CORE01 (config interface port-channel 128) # exit

# Create a VLAN interface and set the MTU to 9216
RZ-CORE01 (config) # interface vlan 4000
RZ-CORE01 (config interface vlan 4000) # mtu 9216
From here on, the configuration is slightly different on each switch.
## Switch 1

# Set an IP address for the IPL link
RZ-CORE01 (config interface vlan 4000) # ip address 192.168.255.1 /30

# Map the VLAN interface to the IPL and set the peer IP address
# (this is the IP of the second switch)
RZ-CORE01 (config interface vlan 4000) # ipl 1 peer-address 192.168.255.2
## Switch 2

# Set an IP address for the IPL link
RZ-CORE02 (config interface vlan 4000) # ip address 192.168.255.2 /30

# Map the VLAN interface to the IPL and set the peer IP address
# (this is the IP of the primary switch)
RZ-CORE02 (config interface vlan 4000) # ipl 1 peer-address 192.168.255.1
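As a sanity check on the addressing: a /30 leaves exactly two usable host addresses, which is all a point-to-point IPL needs. Python's standard ipaddress module confirms the pair:

```python
import ipaddress

# The /30 used for the IPL has exactly two usable host addresses,
# one per switch.
ipl_net = ipaddress.ip_network("192.168.255.0/30")
hosts = [str(h) for h in ipl_net.hosts()]
print(hosts)  # ['192.168.255.1', '192.168.255.2']
```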
Now we set a virtual IP (VIP) and a virtual system MAC for the MLAG. This is optional but recommended. The VIP should be an address from the mgmt0 subnet.
## Switch 1

# Create the VIP
RZ-CORE01 (config) # mlag-vip my-vip ip 172.16.0.229 /16

# Create the virtual MAC
RZ-CORE01 (config) # mlag system-mac 0e:00:00:00:00:01
## Switch 2

# Join the VIP
RZ-CORE02 (config) # mlag-vip my-vip

# Create the virtual MAC
RZ-CORE02 (config) # mlag system-mac 0e:00:00:00:00:01
For the last step, we need to enable MLAG.
## Switch 1

# Enable MLAG
RZ-CORE01 [my-vip: master] (config) # mlag
RZ-CORE01 [my-vip: master] (config mlag) # no shutdown
## Switch 2

# Enable MLAG
RZ-CORE02 [my-vip: standby] (config) # mlag
RZ-CORE02 [my-vip: standby] (config mlag) # no shutdown
We are done with the MLAG setup. Let’s check if everything is up and running.
RZ-CORE01 [my-vip: master] (config) # show mlag

Admin status: Enabled
Operational status: Up
Reload-delay: 30 sec
Keepalive-interval: 1 sec
Upgrade-timeout: 60 min
System-mac: 0e:00:00:00:00:01

MLAG Ports Configuration Summary:
  Configured: 0
  Disabled  : 0
  Enabled   : 0

MLAG Ports Status Summary:
  Inactive      : 0
  Active-partial: 0
  Active-full   : 0

MLAG IPLs Summary:
------------------------------------------------------------------------------------------------------------------
ID   Group          Vlan        Operational   Local           Peer            Up Time            Toggle Counter
     Port-Channel   Interface   State         IP address      IP address
------------------------------------------------------------------------------------------------------------------
1    Po128          4000        Up            192.168.255.1   192.168.255.2   0 days, 00:17:35   7

MLAG Members Summary:
---------------------------------------------------------------------
System-id            State   Hostname
---------------------------------------------------------------------
00:00:00:00:00:00    Up      <RZ-CORE01>
00:00:00:00:00:00    Up      RZ-CORE02
Seems up and operational.
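If you ever want to monitor this, you could capture that output over SSH and check the two status lines. A small Python sketch of the parsing part — the mlag_healthy helper is my own, and the sample string is abbreviated from the output above; in practice you would feed in the full captured output.

```python
import re

# Sketch: check 'show mlag' output for a healthy state.
# 'sample' is abbreviated from the output shown above; in practice you
# would capture the full output over SSH.
sample = """\
Admin status: Enabled
Operational status: Up
Reload-delay: 30 sec
"""

def mlag_healthy(text):
    admin = re.search(r"Admin status:\s*(\S+)", text)
    oper = re.search(r"Operational status:\s*(\S+)", text)
    return bool(admin and oper
                and admin.group(1) == "Enabled"
                and oper.group(1) == "Up")

print(mlag_healthy(sample))  # True
```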
Creating an MLAG interface
At this point, we can create a LAG interface that spans both switches. I will also set the switchport mode (VLAN tagging). Since we need to use the QSFP28 to SFP+ adapters, we also need to adjust the port speed on the interface before mapping it to the port-channel.
## Switch 1

# Create an MLAG interface
RZ-CORE01 [my-vip: master] (config) # interface mlag-port-channel 1
RZ-CORE01 [my-vip: master] (config interface mlag-port-channel 1) # exit

# Bind a port to the MLAG interface
RZ-CORE01 [my-vip: master] (config) # interface ethernet 1/6
RZ-CORE01 [my-vip: master] (config interface ethernet 1/6) # shutdown
RZ-CORE01 [my-vip: master] (config interface ethernet 1/6) # speed 10G
RZ-CORE01 [my-vip: master] (config interface ethernet 1/6) # mlag-channel-group 1 mode active   # enables LACP in active mode
RZ-CORE01 [my-vip: master] (config interface ethernet 1/6) # no shutdown
RZ-CORE01 [my-vip: master] (config interface ethernet 1/6) # exit

# Enable the MLAG interface
RZ-CORE01 [my-vip: master] (config) # interface mlag-port-channel 1
RZ-CORE01 [my-vip: master] (config interface mlag-port-channel 1) # no shutdown

# Set the switchport mode
RZ-CORE01 [my-vip: master] (config interface mlag-port-channel 1) # switchport mode hybrid
RZ-CORE01 [my-vip: master] (config interface mlag-port-channel 1) # switchport hybrid allowed-vlan all
RZ-CORE01 [my-vip: master] (config interface mlag-port-channel 1) # exit
Do the same for the second switch.
## Switch 2

# Create an MLAG interface
RZ-CORE02 [my-vip: standby] (config) # interface mlag-port-channel 1
RZ-CORE02 [my-vip: standby] (config interface mlag-port-channel 1) # exit

# Bind a port to the MLAG interface
RZ-CORE02 [my-vip: standby] (config) # interface ethernet 1/6
RZ-CORE02 [my-vip: standby] (config interface ethernet 1/6) # shutdown
RZ-CORE02 [my-vip: standby] (config interface ethernet 1/6) # speed 10G
RZ-CORE02 [my-vip: standby] (config interface ethernet 1/6) # mlag-channel-group 1 mode active   # enables LACP in active mode
RZ-CORE02 [my-vip: standby] (config interface ethernet 1/6) # no shutdown
RZ-CORE02 [my-vip: standby] (config interface ethernet 1/6) # exit

# Enable the MLAG interface
RZ-CORE02 [my-vip: standby] (config) # interface mlag-port-channel 1
RZ-CORE02 [my-vip: standby] (config interface mlag-port-channel 1) # no shutdown

# Set the switchport mode
RZ-CORE02 [my-vip: standby] (config interface mlag-port-channel 1) # switchport mode hybrid
RZ-CORE02 [my-vip: standby] (config interface mlag-port-channel 1) # switchport hybrid allowed-vlan all
RZ-CORE02 [my-vip: standby] (config interface mlag-port-channel 1) # exit
Let’s check the interfaces.
RZ-CORE01 [my-vip: master] (config) # show interfaces status

---------------------------------------------------------------------------------------------------------------
Port              Operational state   Admin        Speed             MTU    Description
---------------------------------------------------------------------------------------------------------------
mgmt0             Up                  Enabled      1000Mb/s (auto)   1500   -
Po128             Up                  Enabled                        9216   -
Mpo1              Up                  Enabled                        9216   -
Eth1/1            Down                Enabled      Unknown           9216   -
Eth1/2            Down                Enabled      Unknown           9216   -
Eth1/3            Down                Enabled      Unknown           9216   -
Eth1/4            Down                Enabled      Unknown           9216   -
Eth1/5            Down                Enabled      Unknown           9216   -
Eth1/6 (Mpo1)     Up                  Enabled      10G               9216   -
Eth1/7            Down                Enabled      Unknown           9216   -
Eth1/8 (Po128)    Up                  Enabled      100G              9216   -
Eth1/9            Down                Unlicensed   Unknown           9216   -
Eth1/10           Down                Unlicensed   Unknown           9216   -
Eth1/11           Down                Unlicensed   Unknown           9216   -
Eth1/12           Down                Unlicensed   Unknown           9216   -
Eth1/13           Down                Unlicensed   Unknown           9216   -
Eth1/14           Down                Unlicensed   Unknown           9216   -
Eth1/15           Down                Unlicensed   Unknown           9216   -
Eth1/16           Down                Unlicensed   Unknown           9216   -
Changing the Module Type to Split Mode for the Breakout Cables
Next on the list are the breakout cables. We bought four of them, but for now, we will only use two. To make the individual 25GbE ports available, we need to change the module type.
RZ-CORE01 [my-vip: master] (config) # interface ethernet 1/1
RZ-CORE01 [my-vip: master] (config interface ethernet 1/1) # shutdown
RZ-CORE01 [my-vip: master] (config interface ethernet 1/1) # module-type qsfp-split-4
The following interfaces will be unmapped: 1/1
Type 'YES' to confirm split: YES
Once that’s done, the interface will look like this.
RZ-CORE01 [my-vip: master] (config) # show interfaces status

----------------------------------------------------------------------------------------------------------------
Port              Operational state   Admin        Speed             MTU    Description
----------------------------------------------------------------------------------------------------------------
mgmt0             Up                  Enabled      1000Mb/s (auto)   1500   -
Po128             Up                  Enabled                        9216   -
Mpo1              Up                  Enabled                        9216   -
Eth1/1/1          Down                Enabled      Unknown           9216   -
Eth1/1/2          Down                Enabled      Unknown           9216   -
Eth1/1/3          Up                  Enabled      25G               9216   -
Eth1/1/4          Down                Enabled      Unknown           9216   -
Eth1/2            Down                Enabled      Unknown           9216   -
Eth1/3            Down                Enabled      Unknown           9216   -
Eth1/4            Down                Enabled      Unknown           9216   -
Eth1/5            Down                Enabled      Unknown           9216   -
Eth1/6 (Mpo1)     Up                  Enabled      10G               9216   -
Eth1/7            Down                Enabled      Unknown           9216   -
Eth1/8 (Po128)    Up                  Enabled      100G              9216   -
Eth1/9            Down                Unlicensed   Unknown           9216   -
Eth1/10           Down                Unlicensed   Unknown           9216   -
Eth1/11           Down                Unlicensed   Unknown           9216   -
Eth1/12           Down                Unlicensed   Unknown           9216   -
Eth1/13           Down                Unlicensed   Unknown           9216   -
Eth1/14           Down                Unlicensed   Unknown           9216   -
Eth1/15           Down                Unlicensed   Unknown           9216   -
Eth1/16           Down                Unlicensed   Unknown           9216   -
And here is the result.
A nice 25Gbps link. This is the third host, which isn’t in our production cluster right now. The second interface will be patched later.
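As the output shows, the split turns a physical port into four subinterfaces named Eth1/&lt;port&gt;/1 through Eth1/&lt;port&gt;/4. If you script against these interfaces, a tiny Python helper (my own, just mirroring that naming scheme) can generate the names:

```python
# The split turns a physical port into four subinterfaces; the naming
# scheme below matches the 'show interfaces status' output above.
def breakout_names(port, lanes=4):
    return [f"Eth1/{port}/{lane}" for lane in range(1, lanes + 1)]

print(breakout_names(1))  # ['Eth1/1/1', 'Eth1/1/2', 'Eth1/1/3', 'Eth1/1/4']
```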
If you want to combine the ports back into a single 100GbE port, do the following.
RZ-CORE01 [my-vip: master] (config) # interface ethernet 1/1/4
RZ-CORE01 [my-vip: master] (config interface ethernet 1/1/4) # shutdown
RZ-CORE01 [my-vip: master] (config interface ethernet 1/1/4) # exit
RZ-CORE01 [my-vip: master] (config) # interface ethernet 1/1/3
RZ-CORE01 [my-vip: master] (config interface ethernet 1/1/3) # shutdown
RZ-CORE01 [my-vip: master] (config interface ethernet 1/1/3) # exit
RZ-CORE01 [my-vip: master] (config) # interface ethernet 1/1/2
RZ-CORE01 [my-vip: master] (config interface ethernet 1/1/2) # shutdown
RZ-CORE01 [my-vip: master] (config interface ethernet 1/1/2) # exit
RZ-CORE01 [my-vip: master] (config) # interface ethernet 1/1/1
RZ-CORE01 [my-vip: master] (config interface ethernet 1/1/1) # shutdown
RZ-CORE01 [my-vip: master] (config interface ethernet 1/1/1) # module-type qsfp
The following interfaces will be unmapped: 1/1/1 1/1/2 1/1/3 1/1/4
Change the password and save the configuration
If you, like me, didn’t change the default password in the initial setup, this is how we can do it.
RZ-CORE01 [my-vip: master] (config) # username admin password 0 SECRET-PASSWORD
Also, do not forget to save the configuration on both switches.
RZ-CORE01 [my-vip: master] (config) # write memory
Done
Alright, that’s it. Here is the end result. The first port connects to the servers. The last licensed port (8) is used for the MLAG IPL, and the purple fiber optic cable is our uplink to the switches across the room. Port 7 is currently unused; it will provide redundancy if I ever get another 100GbE DAC.
The cabling isn’t great, but you have never seen the server room. 🙂 It’s actually horrendous.
At first, I wasn’t too happy about having yet another switch OS in our network (I think we have 5 different manufacturers by now), but I do like these switches. The commands are mostly easy to understand, and the documentation is OK. I have seen better, but it’s serviceable.
By the way, I am aware that having 25GbE links to the ESXi hosts is kind of pointless if the uplinks are only 2x10GbE. But even 10GbE is more than we need — we don’t generate that much traffic, mostly remote sessions and ERP connections. The main use cases are backups, testing, and vMotion.
Well, that’s not entirely true. We did receive an HPE Alletra 6000 Series, if I remember correctly, which should be iSCSI. I’m probably not going to configure the storage, but it is very likely that I will handle the networking. Also, very much overkill.