Microsoft’s Simple Slip Crashes Azure Worldwide

On Friday, a lapsed security certificate brought down Microsoft’s Azure worldwide. Would you be shocked to learn that the same thing happened less than a year ago? On Feb. 29, 2012, Azure went down worldwide. The reason … wait for it … a certificate issue.

Cloud outages like this strike fear in the hearts of IT directors and CIOs. And for good reason: When Azure goes down, critical business applications and data become unavailable, and every minute costs companies untold amounts of revenue and opportunities.

Despite the cloud’s growth, IT leaders still ask themselves whether they should make the transition. It’s a difficult question. There are many advantages to moving business functionality onto the Web and having it securely accessible from any device and any location. At the same time, there are clear risks, as shown by these outages. But when a client asks me whether they should move to the cloud, I can only give one response: Yes*. Note the asterisk.

Why the Asterisk?

The answer about moving to the cloud sounds something like a pharmaceutical commercial: Cures your problem. Side effects could include nausea and paralysis. To many, the need to take advantage of the cloud’s benefits trumps the risks. Companies don’t make the transition just because the cloud’s popular. The move strengthens their ability to do business. It can save money, and lots of it.

Recently, I recommended that a client hold off on migrating their on-premises SharePoint implementation to Office 365. I had a couple of reasons, but the most important was that Office 365 wasn’t ready for them.

Not only are Office 365’s SharePoint offerings slim compared to the on-premises version, but my client has invested in integrating SharePoint with another line-of-business system. On-site, SharePoint can handle this type of integration just fine. It’s more limited on Office 365. According to Microsoft, the way to integrate an on-premises application with Office 365 is via Azure. Remember what happened to Azure on Friday?

Even businesses with SharePoint on-premises aren’t isolated from problems when Azure goes down. Some companies use the service for BLOB storage, meaning remote hosting of extremely large SharePoint-managed files. When Azure goes offline, they feel the impact as well.

Microsoft Should Have Known Better

The cause of Friday’s problem was simple: Microsoft didn’t renew a certificate. Some argue that this is an easy mistake to make, as if Microsoft were a small business. When you have customers around the world relying on, and paying handsomely for, your ability to provide enterprise-level solutions, there’s really no good excuse. These services require extensive knowledge and competence. Yet Microsoft failed to complete some basic housekeeping.
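To give a sense of how basic that housekeeping can be, here is a minimal Python sketch of an expiration check. It is an illustration only, not a description of Microsoft's tooling: the function names, the 30-day warning window, and the sample date are all assumptions made up for this example.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter timestamp.

    `not_after` uses the format returned by ssl.SSLSocket.getpeercert(),
    for example "Jan  5 09:34:43 2018 GMT".
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def needs_renewal(not_after: str, now: datetime, warn_days: int = 30) -> bool:
    """True when the certificate expires within `warn_days` (hypothetical
    threshold) or has already expired."""
    return days_until_expiry(not_after, now) <= warn_days


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read the notAfter field from a live server's certificate.

    Note: with the default context, an already-expired certificate makes the
    handshake raise ssl.SSLCertVerificationError, which is itself a signal.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


if __name__ == "__main__":
    # Made-up sample values; in practice `not_after` would come from
    # fetch_not_after("your-host.example") and `now` from the clock.
    sample_not_after = "Jan  5 09:34:43 2018 GMT"
    sample_now = datetime(2017, 12, 20, tzinfo=timezone.utc)
    print(needs_renewal(sample_not_after, sample_now))
```

Run on a schedule against every certificate an organization depends on, a check along these lines would flag an upcoming expiration weeks ahead of time. A real deployment would also need an inventory of certificates and someone who owns the alert.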

Did the company understand that security certificates could bring down its entire Azure cloud? Yes. Note there’s no asterisk this time. It’s made worse by the fact that a similar issue did the same thing last year.

Cloud services don’t have a reputation for being completely trustworthy. Notable outages still occur, and in some instances large amounts of data are lost. For years, providers have worked to convince people that their services are reliable. And yet here we are, with a major platform from a major provider failing because of the company’s inability to perform a basic task.

Moving to the Cloud

You undoubtedly know all about disaster recovery, the plan and processes that you follow if a critical system goes down. Nowadays we understand that it’s actually about business continuity, not simply disaster recovery. It’s akin to the old-fashioned credit-card swipers some stores keep as a backup, so they have a way to continue accepting payments even when the electronic system is down.

Business continuity planning is an important part of moving to the cloud, and you have to take the responsibility for it onto your own shoulders. Within the next few years, most businesses will move to one extent or another. When they do, they’ll still need to rely on their staff and partners to develop and implement continuity plans for critical functions, cloud-based or not. Which is why my usual advice to organizations considering a move to the cloud is Yes. With an asterisk.

Comments

  1. BY Fred Bosick says:

    I’m a contractor for a major durable goods manufacturer in the Midwest, in a mostly outsourced IT department. Our wslx certificates expire all the time! Luckily most of the outages are for internal websites. They all have a well defined expiration date, and we have these things called “computers” where one can set reminders and they’re all on NTP, so keeping time isn’t a problem. There is no excuse!

    And our ticketing system is down, which is why I have time to write this. It’s pathetic.
