After a fierce storm on the East Coast recently took down one part of the Amazon cloud along with several of its high-profile customers, a post-mortem on the incident offers some lessons for competing cloud providers.
In this episode:
- How this Amazon cloud failure is likely to affect what all enterprises -- Amazon customers or otherwise -- expect from cloud providers by way of disaster recovery (DR) planning;
- A new quiz that tests your knowledge on how to control virtual machine sprawl in the cloud.
The following is a transcript of the podcast.
Jessica Scarpati: You're listening to CloudCast Weekly, a podcast by SearchCloudProvider.com. I'm Jessica Scarpati, site editor of SearchCloudProvider.com, and with me in the studio is news writer Gina Narcisi. Hey, Gina, how's it going?
Gina Narcisi: Good, Jessica. How are you?
Scarpati: Great, thanks. We're here to give you a quick wrap-up of what's new on SearchCloudProvider.com this week. So, Gina, let's talk about your story on the Amazon Cloud outage.
Narcisi: Great, Jessica. My story this week was on the Amazon cloud failures, how they highlight reliability issues for cloud providers, and how those providers can basically learn from Amazon's mistakes. I talked a little bit about the background of the most recent cloud outage that Amazon suffered in its Virginia data centers due to thunderstorms in that area. And it was interesting because other cloud providers in the same area didn't have their service go down. One of them was Joyent.
Scarpati: Yeah, they were quick to point that out on Twitter. They retweeted several people pointing that out.
Narcisi: Right. And I talked to a couple of analysts as well about how, you know, disaster recovery is very important, and when you're using a cloud environment you really need your services to be resilient. Companies like Instagram and Netflix actually went down in the most recent outage, and they rely on their Internet services. For companies that rely on services like that, whether during the week or on the weekend, they need to be up all the time.
Scarpati: Yeah, what I found interesting about this was that what went down was two data centers supporting one availability zone. The whole idea of the availability zone is that it's supposed to be built so that if something goes down, it fails over. But the whole thing came apart; the idea of the availability zones just kind of went to hell after they started getting those backlogs on the control plane. So it just showed, I thought, that you can only plan for so much. I mean, how much disaster recovery planning can you do?
Narcisi: It's true. And one of the ideas that one of the analysts I spoke to had was a multi-vendor cloud strategy, basically meaning that big companies like Netflix might look into using two different cloud providers, just to make doubly sure that this won't happen again and their services won't be knocked out. It's definitely not realistic for everybody, especially smaller operations, and especially since the cloud is still so new for so many people. If they find a cloud they like, they're probably just going to stick with it, and they probably don't want to, you know, look too much further and have different environments going on. I think that could get very confusing, but it's an idea. But, yeah, basically on Amazon's log, even on their site when they were talking about the outages, it just seemed like it was one thing after another.
Scarpati: Yeah, yeah, I think the word I saw a lot of the media using was "cascading" failures. And that perfectly describes it. So the multi-cloud strategy thing, it's interesting. I'm still not sure myself whether that's going to be a good thing or a bad thing for providers. I mean, if you weren't getting a certain amount of business, could you get more now from, you know, these customers who are maybe using Amazon and are looking for an alternative as a kind of backup? In the article, Nirvanix, I think, was talking about this. Or does that mean you're going to be splitting your business? Will you lose some of it, or are providers going to have to do more work on federation and integration, you know, supporting customers with that strategy?
Narcisi: Yeah, and it definitely seems like cloud providers sort of differ in their opinions of this, like Nirvanix, like you said. They were kind of all about this idea. Sure, they're happy with their customers just having their own copy of their own data in somebody else's cloud and also using theirs as well. So I guess it depends on the cloud provider. You can potentially, you know, gain more business that way, but you also could lose it that way too. So I feel like it's definitely going to be different for everyone.
Scarpati: Damned if you do, damned if you don't. All right, Gina, well, thank you so much as always for joining us.
Narcisi: Thank you, Jessica.
Scarpati: Even though school's out for the summer, we have a pop quiz this week on controlling VM sprawl in the cloud. Don't worry, you don't have to go in cold. This is based on a piece that we ran not too long ago from Amy Larsen DeCarlo over at Current Analysis. She did a great breakdown of myth versus fact, the four myths about VM sprawl in the cloud. So get your pencils out. There will be a test.
Well, that is all we have for this week. Thank you for tuning in, and be sure to check out all the articles we talked about and more on SearchCloudProvider.com. See you next week.