Hello there.
Today I want to write about a project I had over the weekend (from Friday to Monday). This won’t have much, if any, useful technical information. It will probably just be a short, badly written story.
Over the last weekend, I was at a customer’s site to replace the entire network excluding the Router/Firewall. This customer is located about 230km / 140 miles from me, so they booked a hotel for me to stay in over the weekend.
Friday
The arrival
I arrived there around 11 in the morning and was greeted by one of the two IT guys I would spend the next 4 days with. Both were very nice people, which made the whole thing much more enjoyable.
I had been there before to plan this project, of course, but that was over a year ago, so a lot of the details were a bit fuzzy for me.
While we were going through the steps for the next few days, the electricians started to rip everything out of the racks and reorganize the patch panels. The old patch panels were replaced by 1U 24-port panels, leaving a 1U gap between the first and second panel for the new switches.
This was done to give the network racks a cleaner look. Gone are the days of 5m / 16ft patch cables. Here is a picture of one of the almost-finished racks.
Building the Lab
After going through everything, we talked a bit. Small talk basically. At around 2 pm we moved to a large empty room, where we would spend most of our time. We prepared a few tables to test the equipment and configure the switches. This was probably one of the smoothest experiences I have had building and testing network equipment. There were enough tables, power sockets, and a lot of empty space, which allowed us to build the whole network in a single room.
This went on for a few hours (with a lot of talking in between), configuring and tweaking things the way the customer wanted them. We made sure to test the inter-VLAN routing, device profiles, etc. before we started installing all the switches into the racks.
After having something to eat (an excellent burger with sweet potato fries) and finishing up the switch configuration (we added a few more switches, which required more stack configuration and testing), I went to the hotel for the night. This was around midnight.
The Network
A bit of background on the network. This customer has two buildings interconnected with single-mode and multi-mode fiber links. We were going to use the single-mode links, but unfortunately the wrong cables were ordered, so we had to opt for multi-mode temporarily.
The previous network was a mess. They had several links interconnecting every switch with each other and with the core switches. It was an STP nightmare. We reduced a total of 18 links down to 4 in one building and 8 in the other. You can see the previous links in the picture above: the fiber ports at the top of the rack, without the protective caps.
For the new setup, we went with 4 Aruba 3810M switches for the core switching, 2 in a stack in each building. The stacks are connected to each other with 4x 10GbE links in a LAG. 2 of the switches also had a flex module with 4x 10GbE copper interfaces, which were used for the server uplinks.
This whole setup was sold with VRRP in mind, but with this topology it didn’t really make sense to me, so I abandoned the idea and left the routing to the Sophos SG firewalls, of which they had two in an active-passive HA cluster.
The firewall has a lot of RJ45 ports (primarily used for the internet uplinks in the new setup), 2x 1GbE SFP slots, and 2x 10GbE SFP+ slots. We used the 10GbE links for the primary traffic, like servers and clients, and the 1GbE links for less important stuff like the guest Wi-Fi.
For the access switches, we used Aruba 2930F switches. Every stack has a 2x 10GbE LAG uplink to the Aruba 3810M core switches.
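To give a rough idea of what such an uplink configuration looks like, here is a minimal sketch using Python and netmiko against an ArubaOS-Switch stack. The management IP, credentials, port numbers, and VLAN IDs are made up for illustration, and the exact commands can vary between firmware releases; the actual configuration was done in the lab beforehand.

```python
from netmiko import ConnectHandler

# Hypothetical management IP and credentials -- not the customer's real values.
access_stack = {
    "device_type": "hp_procurve",  # netmiko driver commonly used for ArubaOS-Switch (2930F/3810M)
    "host": "192.0.2.10",
    "username": "manager",
    "password": "example-password",
}

# Illustration only: a 2x 10GbE LACP trunk (trk1) towards the core stack,
# with two tagged VLANs on it. Ports and VLAN IDs are invented.
uplink_config = [
    "trunk 1/49,2/49 trk1 lacp",
    "vlan 10",
    "name SERVERS",
    "tagged trk1",
    "exit",
    "vlan 20",
    "name CLIENTS",
    "tagged trk1",
    "exit",
]

with ConnectHandler(**access_stack) as conn:
    print(conn.send_config_set(uplink_config))
    # Quick sanity check that both links actually joined the LAG.
    print(conn.send_command("show lacp"))
```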
Saturday
Installing the hardware and adjusting the Sophos configuration
The next day we started around 9am, beginning with the important stuff: coffee and small talk. The plan for the day was to configure the Sophos SG, install all the switches, and get the infrastructure back online. The IT manager had already started with the installation and cabling of the access switches.
I installed the core switches together with the other employee, testing the connectivity and tweaking the configuration a bit. This is when they noticed that the electricians had also ripped out the phone system. So one of them worked on that, which took several hours and really messed with the schedule.
Powering up the hardware
After lunch (a very nice gnocchi casserole) we started to connect the hardware (storage, servers, and so on) and powered it on. It took some time until we got every system back online, mainly because of some weirdness with the old EMC storage. The goal was to reserve the next day just for testing, so it was important to have a running baseline.
We had most of the systems running around (again) midnight and decided to call it a day.
Sunday
Testing the environment
I arrived early, around 8am, to get a few tests running, like network stability and performance. The network was running stably for the most part, needing only a few adjustments here and there. We verified that the VMs were running and checked a few strange behaviors on some of the Windows systems. Since I had a lot of time that day, I decided to spend a bit of it creating a few Ansible playbooks to set up simple stuff like VLANs and custom device profiles.
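Those playbooks didn’t make it into this post, so as a rough stand-in (Python with netmiko here instead of Ansible, with made-up addresses, VLAN IDs, and profile names, and device-profile syntax that depends on the firmware release), this is the kind of repetitive per-switch change they were meant to take care of:

```python
from netmiko import ConnectHandler

# Placeholder inventory -- the management IPs and credentials are invented.
switches = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]
credentials = {"username": "manager", "password": "example-password"}

# One VLAN plus an AP device profile: the kind of "simple stuff" worth automating.
config = [
    "vlan 40",
    "name WIFI-APS",
    "exit",
    "device-profile name ap-profile",
    "untagged-vlan 40",
    "exit",
    "device-profile type aruba-ap",
    "associate ap-profile",
    "enable",
    "exit",
]

for host in switches:
    with ConnectHandler(device_type="hp_procurve", host=host, **credentials) as conn:
        conn.send_config_set(config)
        conn.send_command("write memory")  # persist the change
        print(f"{host}: done")
```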
A lot of time was spent talking, since everything seemed to run smoothly. To give me some peace of mind, I did a round of every network rack and booted up a few PCs to check connectivity.
We went for lunch (spaetzle with beef and mushrooms), checked a few more things, and decided to end it around 8pm. After a few more hours of just talking in a restaurant, we left for the day.
Monday
Trouble in the morning
The previous day, we had decided to arrive at 8:30am, but I received a call around 7. Apparently, most of the users couldn’t connect to the network, which was strange since we had tested a lot of systems the day before.
I arrived at 7:30 and went straight to one of the systems that had this issue. It took a bit, but we figured out the reason, and the network itself wasn’t the problem. During the configuration of the phone system (after it was ripped out), the phones rebooted and temporarily grabbed IPs from the internal network. This filled up the DHCP pool, which meant none of the client systems could get a lease. We removed the orphaned leases and expanded the range a bit.
Another issue was a very old proxy configuration that wasn’t even in use anymore but apparently had never caused any problems before. Until we changed the network, that is.
After the storm
Once the issues died down a bit (some phones still didn’t connect), we visited every department to check that everything was running. The rest of the day we spent adjusting a few settings, setting up the Wi-Fi (since we had time), and showing one of the IT guys how to navigate the Aruba CLI.
We had lunch (kebab this time), and at around 4pm I started the trip back home.
This was overall a fun project, primarily because of the people I worked with. It makes a huge difference in how one perceives longer workdays and the stress of getting things done in a set amount of time.
Till next time.