4 Steps for Effective Incident Management

by Richard Turkel

You can’t protect against every failure. If you could, we’d never get to use the #fail hashtag. However, you can prepare your team to react well to an incident.

You’ve probably already created your incident management plan, but what about the following four points? Did you consider them, or are they going to cause a problem or holdup when you’re trying to handle the incident?

Perhaps you should have a look, just in case.

Create a Communication Plan

Easy to say, but a little harder to execute. Let’s talk about exactly what this means.

First, it doesn’t mean that you just “plan” to let employees know what’s going on. It means you have concrete, actionable steps to implement. This could mean a mass or targeted email system, or it could mean a 10-minute conference with the heads of affected departments so that they can tell their teams. Whatever it is, it must be concrete and immediately actionable.

Second, offer a workaround if at all possible. Along with that, or especially if you don’t have one, explain in high-level language what the incident is and how long it will take to fix. Employees who know only that there’s a problem get frustrated. Employees who understand what it takes to solve the problem at least have an appreciation for the work that goes into fixing it.

Last, be sure to give a timeline—and be honest. You might have determined that the incident is not the most important thing on your list (see the next section). You need to communicate this to the employees, explain why the fix will take the time it will, and provide the workaround for the interim. Even if employees are upset, you’ve at least done your best to communicate and do what’s best for them and the company.
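If it helps to make the plan concrete, here is a minimal Python sketch of an incident notice that always carries the three things discussed above: a plain-language summary, an honest timeline with the reason behind it, and a workaround if one exists. The class name, field names and wording are my own illustrative assumptions, not part of any particular tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IncidentNotice:
    """One status update for affected employees (all field names are hypothetical)."""
    summary: str                      # high-level description of what is broken
    estimated_fix: str                # honest timeline, e.g. "by 3 pm today"
    reason_for_timeline: str          # why the fix takes as long as it does
    workaround: Optional[str] = None  # interim option, if one exists


def render_notice(notice: IncidentNotice) -> str:
    """Format the notice as plain text suitable for an email or chat post."""
    lines = [
        f"What is happening: {notice.summary}",
        f"Expected fix: {notice.estimated_fix}",
        f"Why it takes this long: {notice.reason_for_timeline}",
    ]
    if notice.workaround:
        lines.append(f"Workaround: {notice.workaround}")
    else:
        # Say so explicitly rather than leaving employees guessing.
        lines.append("Workaround: none available yet; we will update you if that changes.")
    return "\n".join(lines)


if __name__ == "__main__":
    notice = IncidentNotice(
        summary="The reporting database is unavailable.",
        estimated_fix="within a few hours",
        reason_for_timeline="a higher-priority issue is being handled first",
        workaround="use yesterday's exported reports in the meantime",
    )
    print(render_notice(notice))
```

Whether this goes out as a mass email or gets read aloud in a 10-minute call with department heads, filling in the same fields every time keeps the message concrete.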

Organize by Mission/Vision/Implementation

Every incident, and your response to it, should be filtered through mission, vision and implementation. It’s easier to work backward on this, though, so let’s start with implementation.

Does the incident prevent employees from implementing—that is, doing their job? For example, employees may still be able to do their work without access to a database for a few hours, while no Internet access brings the company to a virtual standstill. Incidents that interrupt implementation need to be addressed quickly. Those that don’t interrupt it can wait a bit.

How does the incident fit with the company’s mission and vision? It may not be a big problem if your website is down for a day—unless you’ve promised customers 24-hour access to their accounts.
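The two questions above can be sketched as a tiny triage function. This is only one possible reading, assuming that an incident needs quick attention if it either stops employees from doing their jobs or breaks a promise tied to the company’s mission. The names `Incident`, `Priority` and `triage` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Priority(Enum):
    FIX_NOW = "fix now"
    CAN_WAIT = "can wait a bit"


@dataclass
class Incident:
    description: str
    blocks_daily_work: bool        # implementation: can employees still do their jobs?
    breaks_customer_promise: bool  # mission/vision: does it violate something you promised?


def triage(incident: Incident) -> Priority:
    """Filter an incident through implementation first, then mission and vision."""
    if incident.blocks_daily_work or incident.breaks_customer_promise:
        return Priority.FIX_NOW
    return Priority.CAN_WAIT


if __name__ == "__main__":
    examples = [
        Incident("Database unavailable for a few hours", False, False),
        Incident("No Internet access", True, False),
        Incident("Website down, 24-hour account access promised", False, True),
    ]
    for incident in examples:
        print(f"{incident.description}: {triage(incident).value}")
```

A real plan would likely need more than two levels, but even a yes/no filter like this forces you to answer the questions before an incident happens rather than during one.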

Assign a Cost Analysis Reference Beforehand

Based on your mission/vision/implementation analysis, how much money can you spend fixing an incident? As you know, fixing a problem quickly often requires more money and resources than addressing it more slowly (that’s a generalization, but it holds in many cases). Develop some sort of sliding scale for what you should spend on which sorts of incidents. For example, you’re much more likely to take drastic, and expensive, action during a security threat than when your database goes down for a few hours.
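A sliding scale can be as simple as a lookup table agreed on in advance. The sketch below assumes four incident categories and dollar caps that I made up purely for illustration; substitute your own categories and figures.

```python
# Hypothetical pre-approved spending caps, in dollars, per incident category.
# Both the categories and the amounts are placeholders, not recommendations.
SPENDING_CAPS = {
    "security_threat": 50_000,
    "company_wide_outage": 20_000,
    "partial_outage": 5_000,
    "minor": 500,
}


def spending_cap(category: str) -> int:
    """Return the cap for a category; unknown categories get the smallest cap."""
    return SPENDING_CAPS.get(category, min(SPENDING_CAPS.values()))


def within_budget(category: str, proposed_cost: int) -> bool:
    """Check a proposed fix against the cap agreed on before the incident."""
    return proposed_cost <= spending_cap(category)


if __name__ == "__main__":
    print(within_budget("security_threat", 30_000))
    print(within_budget("partial_outage", 30_000))
```

Defaulting unknown categories to the smallest cap is a deliberately cautious choice: anything the table doesn’t cover needs a human decision before serious money is spent.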

Choose Your Team and Remove Ambiguity

Finally—and the most obvious of these steps—choose and brief your response team long before something happens. It’s no fun to be the team member who isn’t sure what they should be doing during an incident.

This also empowers your team to communicate with other employees. For example, imagine you’re having a major issue and you designate one team member to handle low-priority issues and/or maintenance tasks. That person can explain to other employees that they’re keeping the company up to par while everyone else works on fixing the problem.
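One way to remove ambiguity is to write the roster down and check it for gaps before anything goes wrong. The following sketch assumes four roles of my own invention, including the person who keeps routine maintenance going during the incident. Adjust the list to match your team.

```python
from typing import Dict, List

# Hypothetical roles; the last one covers low-priority issues and maintenance
# while everyone else works on the incident itself.
REQUIRED_ROLES = ["incident_lead", "communications", "fix_team", "routine_maintenance"]


def unassigned_roles(roster: Dict[str, str]) -> List[str]:
    """Return every required role that has no named person in the roster."""
    return [role for role in REQUIRED_ROLES if not roster.get(role)]


if __name__ == "__main__":
    # Placeholder names for illustration only.
    roster = {
        "incident_lead": "Person A",
        "communications": "Person B",
        "fix_team": "Person C",
    }
    print(unassigned_roles(roster))
```

Running a check like this whenever the team changes means nobody discovers mid-incident that a role was never filled.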

Conclusion

How did you measure up? Does your incident management plan incorporate these strategies, or do you need to revisit and revise?

Wherever you stand in developing your plan, remember that it’s an ongoing process. As the technology you use grows and changes, you may need to revise and tweak these and other strategies. Not doing so could leave you in hot water without a workable plan.

Richard Turkel is a technology blogger who writes about business technology solutions. He currently writes for knowledge management software provider BMC. In his free time, Rich can be found on the mountain shredding some powder.
