by Bryon D Beilman
"I Am Jack's Complete Lack of Surprise." This is a quote from Fight Club, the cult film (and book), which features many sayings that start with "I am Jack's ...". This particular quote came to mind during a recent network outage that we worked on for a customer. Why was I surprised, but then later not surprised? Because someone did not follow the basics of change management, and it caused quite a few issues.
We received alerts that certain sites were down and, while debugging, noticed that the internal monitors could not contact the outside world either, which pointed to a much bigger issue. This customer's revenue depends on serving content from their sites, so they have iuvo take care of their servers and internal networks, but use their data center provider to manage the firewall, IDS, and external network connections with a dedicated team of security professionals.
As much as I would love to bash them, because this long event took 18 hours out of our team's schedule, I want to focus on what went wrong and let this be a lesson to others. The cause of this network issue boils down to the following sequence of events.
I have given technical talks on Change Management, and I will admit that it is difficult to keep people engaged, because the topic is not that interesting and much of it is common sense. Yet so many companies and teams do it poorly. Strangely, the contract and SLA process for this managed security service requires the customer to submit any change to the provider as a change control event, and it has to be reviewed and recorded before it is made. It appears that the same does not apply in reverse: nothing requires the customer to approve the changes that the security team makes.
Here are three things that will hopefully help you avoid this type of issue.
In this case, if the shift engineers had known what had changed, when, and why, they could have easily reverted the change with much less impact. If they had documented what was going to change during the upgrade (turning on a new feature), they would have received feedback, and the problem would have been caught before it was rolled into production.
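To make the "what, when, and why" idea concrete, here is a minimal sketch in Python of a change log that an on-call engineer could query during an outage. Everything in it is hypothetical: the `ChangeRecord` fields, the `changes_before` helper, and the sample entries and timestamps are illustrations, not the provider's actual process or tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class ChangeRecord:
    """One entry in a change log (hypothetical field names)."""
    what: str       # what was changed
    when: datetime  # when it was applied
    why: str        # reason for the change
    rollback: str   # how to revert it


def changes_before(log: List[ChangeRecord],
                   outage_start: datetime,
                   window: timedelta) -> List[ChangeRecord]:
    """Return changes applied within `window` before the outage, newest first."""
    earliest = outage_start - window
    recent = [c for c in log if earliest <= c.when <= outage_start]
    return sorted(recent, key=lambda c: c.when, reverse=True)


# Made-up sample data for illustration only.
log = [
    ChangeRecord(
        what="Rotate TLS certificate on web tier",
        when=datetime(2024, 1, 8, 9, 0),
        why="Certificate close to expiry",
        rollback="Restore previous certificate",
    ),
    ChangeRecord(
        what="Enable new feature during upgrade",
        when=datetime(2024, 1, 10, 2, 0),
        why="Scheduled upgrade",
        rollback="Disable the feature",
    ),
]

outage_start = datetime(2024, 1, 10, 3, 30)
suspects = changes_before(log, outage_start, timedelta(hours=24))

for change in suspects:
    print(f"{change.when:%Y-%m-%d %H:%M}  {change.what}  (revert: {change.rollback})")
```

Even a log this simple answers the first question anyone asks during an outage: what changed recently, and how do we undo it?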
Change Management is a required weapon in your arsenal for running IT environments. Insist that your vendors do the same.