It’s been a while between posts <insert contrite apology and excuses>. Hopefully the length and breadth of this one will make up for the lag.
How many of you out there have had challenges getting a maintenance window to perform a software upgrade on your wireless infrastructure? We all know that the wireless network is becoming as mission-critical as the wired one, and that means getting an outage to perform preventative maintenance like a software update is becoming even more difficult.
Enter the ArubaOS 8 feature: In-Service Upgrades. Using the Cluster feature introduced in ArubaOS software version 8, the resiliency and redundancy that a controller cluster offers also grants the ability to perform software updates to the infrastructure without any outage or service impact.
Quick primer on Clustering:
Two or more Mobility Controllers (referred to as Managed Devices, or MDs, in the AOS8 architecture) can be formed into a cluster to provide network resiliency, with up to 12 MDs per cluster. Access points and client devices form tunnels to both a primary and a standby MD; should the primary MD become unreachable, they fail over to the pre-established tunnel on the standby MD. This topology provides the obvious network resiliency and hitless client failover, but also seamless campus roaming, client load balancing, and In-Service Upgrades.
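To make the primary/standby idea concrete, here is a minimal sketch in Python. It is a toy model, not Aruba's implementation: the `AccessPoint` class, the round-robin assignment, and the MD names are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessPoint:
    name: str
    active_md: str               # MD currently terminating this AP
    standby_md: Optional[str]    # MD holding the pre-established standby tunnel


def assign_tunnels(ap_names, md_names):
    """Give each AP an active and a standby MD, round-robin (an assumed policy)."""
    if len(md_names) < 2:
        raise ValueError("a cluster needs at least two MDs")
    aps = []
    for i, name in enumerate(ap_names):
        active = md_names[i % len(md_names)]
        standby = md_names[(i + 1) % len(md_names)]
        aps.append(AccessPoint(name, active, standby))
    return aps


def fail_md(aps, failed_md, md_names):
    """Promote the standby tunnel for every AP whose active MD has failed."""
    survivors = [md for md in md_names if md != failed_md]
    for ap in aps:
        if ap.active_md == failed_md:
            # The standby tunnel already exists, so failover is just a role swap.
            ap.active_md = ap.standby_md
            ap.standby_md = None
        if ap.standby_md == failed_md:
            ap.standby_md = None
        if ap.standby_md is None:
            # Pick a replacement standby from the surviving MDs, if there is one.
            candidates = [md for md in survivors if md != ap.active_md]
            ap.standby_md = candidates[0] if candidates else None
    return aps
```

In a two-MD cluster like my lab, failing one MD leaves every AP active on the survivor with no standby until the failed MD returns.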
Here’s why this feature can be so important, besides the convenience of no more CAB meetings begging for a maintenance window at 2am: this week, on Monday, we had the public revelation of the WPA2 vulnerability known as KRACK (for reference: https://www.krackattacks.com/ ). Every vendor, of infrastructure and client devices alike, has been working behind the scenes for weeks and months to craft a software patch to mitigate this particular threat. With an AOS8 cluster, on Monday morning, while sipping your first coffee, you could have downloaded the patch Aruba had available last week and updated all your WLAN infrastructure to mitigate the KRACK infrastructure risk. So instead of begging your boss for an emergency outage that night, you could have said “no worries boss, got it covered, we’re good, already patched everything.”
All right, now that we’re level-set on what clustering does, why it’s good and how it facilitates in-service upgrades, here’s the meat and potatoes of the procedure. I ran through the full upgrade in my home lab to capture screenshots along the way. The lab setup:
- Software – upgrading from the currently installed release to 8.1.0.4 (KRACK-proof)
- Single Mobility Master (the admin and orchestration part of the network) – running in VM
- One 7010 and one 7008 Mobility Controller, configured to be in a cluster.
- Various models of APs, the majority running in Local mode for clients
- Several clients connected to the “corporate” SSID
First step is to get that Mobility Master (MM) upgraded, which is a straightforward process. The MM doesn’t terminate APs or clients, so it can really be taken out of service at any time and upgraded without any end-user impact – be as rough as you want.
A simple file transfer, a reboot into the partition holding the new code, and voila, an MM running the new code:
With that part done, we now focus on the Managed Devices, the controllers, and APs. AOS8 now has a hierarchical structure for managing all the nodes, so for this process, you’ll want to initiate the cluster upgrade from the Managed Network folder:
Pick your cluster:
Define the FTP server info and the software version – **note the syntax of the Upgrade to version field**:
Last step, define the partition to write the new code to. Aruba controllers have two boot partitions, so you can load this to the one that is not currently in use, giving you a back-out:
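The back-out logic is simple enough to sketch in a few lines of Python. This is only an illustration under assumptions: the `Controller` class and both helper functions are hypothetical, and I am assuming the two boot partitions are numbered 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    booted_partition: int   # partition the controller is currently running from
    images: dict            # partition number -> version string stored there


def stage_upgrade(ctrl, new_version):
    """Write the new image to the partition that is NOT booted; return it."""
    if ctrl.booted_partition not in (0, 1):
        raise ValueError("expected boot partition 0 or 1")
    target = 1 - ctrl.booted_partition
    ctrl.images[target] = new_version   # the running image is left untouched
    return target


def reboot_into(ctrl, partition):
    """Boot from the given partition and return the version now running."""
    ctrl.booted_partition = partition
    return ctrl.images[partition]
```

Because the old image stays on the original partition, backing out is just a reboot into that partition.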
And there you go. That’s literally all you do. Once you click the little blue button “Upgrade Cluster,” there’s no operator interaction, everything is orchestrated and automated.
But rather than just take my word for it, here are some screenshots and narrative:
The APs need to be moved off one MD so that it can download the new software, apply it, and reboot without taking any clients or APs down. Once that’s done (in this case, all the APs and clients shifted to the 7008), the 7010 started downloading the 8.1.0.4 code from my FTP server:
That download completed successfully, so the 7008 then downloaded the code to its inactive boot partition:
With both MDs getting the new software downloaded, the one without any APs or clients, the 7010, now reboots back into service with the partition containing the new code loaded:
While that’s occurring, you can see the active MD, the 7008, has all the APs and clients terminating to it:
Once the 7010 has rebooted, it’s running the new code, so it’s time to move all the APs and clients over to it so the 7008 can also get the upgrade. But those APs first have to be upgraded too! Here’s where an Aruba feature, ClientMatch, helps. The system uses this feature to find the best “alternate” AP for a client to move to, and the client is encouraged (read: forced) to roam to that AP without dropping its connection. Once an AP is unloaded of clients, it reboots using the preloaded 8.1.0.4 image:
Now that the 7010 controller and all the APs are upgraded to the new code, the 7008 will reboot back into service using the 8.1.0.4 code that was previously downloaded:
Abracadabra, that’s it. The Mobility Master CLI shows the cluster upgrade status as “Successfully completed,” and the GUI shows the same message. The MM, the MDs, and all the APs are upgraded to the latest code, all without dropping a single client or interrupting a connection:
Once I’d high-fived myself for a job well done and the applause had died down, it’s worth noting that the whole process does take some time to fully complete; this small environment took 37 minutes. From my perspective, it could take 8 hours and I wouldn’t really care if it meant I patched my infrastructure without taking it out of service.
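To recap the order of operations, here is a toy Python simulation of the sequence I watched. It is a sketch under assumptions, not how the Mobility Master actually orchestrates things: the `ManagedDevice` class, the `rolling_upgrade` function, and the round-robin placement are all made up for illustration. The one rule it enforces is the one that matters: an MD only reboots when nothing is terminating on it.

```python
from dataclasses import dataclass, field


@dataclass
class ManagedDevice:
    name: str
    version: str
    aps: list = field(default_factory=list)  # names of APs terminated here


def rolling_upgrade(mds, new_version):
    """Upgrade every MD (and its APs) to new_version, one MD at a time.

    Returns (log, ap_versions). An MD is rebooted only while it is empty.
    """
    if len(mds) < 2:
        raise ValueError("need at least two MDs to upgrade in service")
    log = []
    ap_versions = {ap: md.version for md in mds for ap in md.aps}

    # Step 1: drain the first MD onto the others so it can reboot harmlessly.
    first, rest = mds[0], mds[1:]
    for i, ap in enumerate(first.aps):
        rest[i % len(rest)].aps.append(ap)
    first.aps = []
    log.append(f"drained {first.name}")

    # Step 2: the empty MD reboots into the new image.
    first.version = new_version
    log.append(f"rebooted {first.name} into {new_version}")

    # Step 3: for each remaining MD, upgrade its APs and move them to an
    # already-upgraded MD, then reboot the now-empty MD.
    upgraded = [first]
    for md in rest:
        for i, ap in enumerate(md.aps):
            ap_versions[ap] = new_version            # AP reboots on the new image
            upgraded[i % len(upgraded)].aps.append(ap)
        md.aps = []
        md.version = new_version
        upgraded.append(md)
        log.append(f"moved APs off {md.name} and rebooted it into {new_version}")
    return log, ap_versions
```

Run against a two-MD cluster, this ends with every AP on the first MD. The sketch stops there and does not model any rebalancing afterwards.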
Wouldn’t you know it though, Aruba just dropped software version 8.2 with some really cool new features I could sure use. Guess I’ll have to go to CAB and get a maintenance window. Wait a minute….
**Post script:** If you want to see this in-service upgrade in action on a much larger scale (2000 or so clients), check out the video in this link below from the Aruba Atmosphere conference in Nashville this past spring (best part starts around the 12 minute mark).