Solutions provider takeaway: Cloud-based storage is capable of more than just backup. Factors such as developer support play a role in the decision as well. The following chapter excerpt helps you guide customers in their cloud storage decisions.
Different clouds for different applications
The definition that we're working with is that clouds are abstraction layers, hiding system details from end users. The evolving goal that most cloud providers seem to be heading toward is an OS-agnostic system, where users can choose applications from a vast library. A key element to be resolved is billing for these library items, but it's not hard to imagine a scenario in which billing would be handled like a mobile-phone account, but with items such as the following:
- Application use charges (since users don't buy apps anymore; perhaps this is what IBM had in mind when it bought SPSS?)
- Temporary and long-term storage charges
- Throughput (input/output) charges
- CPU or compute time
- Idle time (application in memory but suspended)
We all have to keep in mind that most users really don't care about where or even how their processing is done, just that they can do their task with the least amount of hassle. We've been around for a while and witnessed firsthand the transition from Hollerith cards to terminals, PCs and, now, the Internet. Other than initial user resistance, each major technology change has swept through the business landscape and then has become part of the environment.
The direction of the evolution is all about computing turning from a world of customized solutions to the ubiquitous environment of a utility. The U.S. Department of Defense mandated that DoD systems move from one-of-a-kind "stovepipe" systems to commercial off-the-shelf systems that can take advantage of economies of scale. A new Navy submarine launched in 2009 is a good example of a "boat" that uses many commercial off-the-shelf components rather than custom-built systems (unlike the legendary $1,000 toilet seat in the Air Force B-1 bomber).
Jim Staten of Forrester Research Inc. provided an example of how the New York Times leverages the cloud. The Times wanted to make its historic archives available for online access. The company needed to process 11 million articles and turn them into PDF files. Initial estimates outlined that hundreds of servers and about 4 terabytes (TB) of storage would be necessary. The IT organization at the Times estimated several months of delay before beginning, the need for a significant budget and the difficulty of locating the computing resources. The project manager gave Amazon Web Services a try and kicked off 100 Elastic Compute Cloud (EC2) instances and 4 TB of S3 storage. The job was finished the next day with a total cost of $240.
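The arithmetic behind that figure is easy to sanity-check. A back-of-the-envelope sketch, assuming the era's small-instance rate of roughly $0.10 per instance-hour and a roughly 24-hour run (neither figure is stated in the excerpt):

```python
# Rough cost check for the New York Times EC2 job described above.
# Assumptions (not from the source): ~$0.10/hour per small EC2 instance
# (2008-era pricing) and roughly a 24-hour run.
INSTANCES = 100
HOURS = 24
RATE_PER_INSTANCE_HOUR = 0.10  # USD, assumed

compute_cost = INSTANCES * HOURS * RATE_PER_INSTANCE_HOUR
print(f"Estimated compute cost: ${compute_cost:.2f}")  # prints $240.00
```

At those assumed rates the instance time alone accounts for the full $240 reported, which is consistent with EC2's pay-by-the-hour model.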
Another example comes from the Washington Post. Peter Harkins, a senior engineer at the Washington Post, used Amazon EC2 to launch 200 server instances to process 17,481 pages of nonsearchable PDF images into a searchable online library. With a processing speed of approximately 60 seconds per page, the job was completed within nine hours and provided Web portal access to the public 26 hours later. As Harkins put it, "EC2 made it possible for this project to happen at the speed of breaking news. I used 1,407 hours of virtual machine time for a final expense of $144.62." (The result was a publicly searchable database of Hillary Clinton's 1993-2001 schedule.)
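Those numbers can be cross-checked the same way. A small sketch, again assuming the era's roughly $0.10-per-hour small-instance rate (the excerpt gives the VM-hours and the final cost, but not the rate):

```python
# Sanity check of the Washington Post figures quoted above.
PAGES = 17_481
SECONDS_PER_PAGE = 60      # "approximately 60 seconds per page"
VM_HOURS = 1_407           # virtual-machine hours Harkins reports
RATE = 0.10                # USD per instance-hour, assumed 2008-era price

processing_hours = PAGES * SECONDS_PER_PAGE / 3600
compute_cost = VM_HOURS * RATE

print(f"{processing_hours:.2f} CPU-hours of page processing")  # 291.35
print(f"${compute_cost:.2f} of instance time")                 # $140.70
```

The roughly $140.70 of instance time lands close to the reported $144.62; the remainder presumably covers storage and bandwidth. Note also that 1,407 VM-hours is far more than the 291 CPU-hours of actual page conversion, which presumably reflects instance startup, idle time, and the other stages of the pipeline.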
Examples like this show how cloud computing techniques can revolutionize processing, exploitation, and dissemination (PED) workflows. By increasing the use of automation and focusing analyst effort on higher-level exploitation tasks, cloud computing may enable near-real-time exploitation and dissemination of critical intelligence products in the very near term.
(Source: http://kevinljackson.blogspot.com/2008/10/why-cloudprocessing-exploitation-and.html)
Amazon Web Services is just the tip of the proverbial iceberg when it comes to a cloud specifically purposed to bring on-demand computing cycles to organizations and users that need them. The trend we're starting to see is for companies to use cloud computing and storage to smooth out usage spikes and avoid upgrading data centers to size capacity for spikes rather than "normal" usage.
In the above-mentioned examples, those EC2 applications were still virtual machines that forced you to choose one operating system over another during the EC2 configuration phase. Some of the conditions for being a true cloud implementation were fulfilled, but not all: There was no sense of a seamless movement of processing from one platform to another and no escape from a deep awareness of where the processing platform was located (in a virtual, if not physical, sense).
To give a flavor of the variety of operating systems Amazon offers, and of just how fast this list is growing, we thought we'd offer a snapshot of what's offered and also indicate where you can find the current listing.
Amazon Machine Images (AMIs) are preconfigured with an ever-growing list of operating systems. We work with our partners and community to provide you with the most choice possible. You are also empowered to use our bundling tools to upload your own operating systems. The operating systems currently available to use with your Amazon EC2 instances include the following:
- Red Hat Enterprise Linux
- Windows Server 2003
- Windows Server 2008
- Oracle Enterprise Linux
- CentOS Linux
- openSUSE Linux
- Ubuntu Linux
- Fedora Linux
- Gentoo Linux
- Debian Linux
Note: While it is rumored that Apple's Mac OS X will run under VMware, there is still considerable debate over whether doing so would put you in violation of the end-user license agreement.
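Choosing one of these AMIs is just a parameter of the instance-launch call. A minimal sketch using boto3, the current AWS SDK for Python (which postdates this excerpt); the AMI ID, region, and instance count below are placeholders, not real values:

```python
# Sketch: launching EC2 instances from a chosen Amazon Machine Image.
# boto3 postdates this excerpt; ami-00000000 is a placeholder ID.

def launch_params(ami_id: str, count: int = 1,
                  instance_type: str = "m1.small") -> dict:
    """Build keyword arguments for the EC2 RunInstances call."""
    return {
        "ImageId": ami_id,        # which operating-system image to boot
        "MinCount": count,
        "MaxCount": count,
        "InstanceType": instance_type,
    }

if __name__ == "__main__":
    import boto3  # third-party SDK; needs AWS credentials to actually run
    ec2 = boto3.client("ec2", region_name="us-east-1")
    response = ec2.run_instances(**launch_params("ami-00000000", count=100))
    print([i["InstanceId"] for i in response["Instances"]])
```

Launching 100 instances, as in the New York Times example, is just a matter of the count parameter; that elasticity is the whole point of EC2.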
A great amount of work still needs to be done on a job description language of some sort before cloud computing reaches the sort of state that the Web began to enter with the development of the Common Gateway Interface (CGI) in 1993. Reuven Cohen, co-founder and CTO of Enomaly Inc., is one of the people looking at the question of how to develop standards for cloud computing. He has approached everything from cloud resource description to cloud identity federation in his blog, ElasticVapor.
We've said before that it's all about abstraction layers and whether you can see through the floor into the inner workings of the environment. Offerings are popping up everywhere in all the shades of gray. Some, such as Amazon's EC2, are close to the foundation hardware; some, such as AppNexus, only partially obscure the foundation; and a few, such as Google, fully obscure the foundation. What we really have today is a market in transition, with vendors feeling around in a speculative arena trying to figure out what consumers really want.
Data storage space in any organization is like physical space, in that nature abhors a vacuum. Anytime we've been involved with adding data storage space to an organization, we've been amazed at just how quickly it disappears. So therein lies the rub: Where do you find enough temporary storage to do huge projects? In the case of the New York Times' PDF indexing project, the company estimated that it needed 4 TB of storage; so instead of trying to temporarily expand its data center, it turned to Amazon Web Services. This temporary boost in storage capacity is one of the leading applications for storage clouds for enterprise use. For personal use, cloud storage for off-site backup and remote access to critical files has led to acceptance of the idea of cloud-based storage.
For many individual users, the first experience with a storage cloud will come through an encounter with one of the remote storage or backup clouds. Commonly used storage clouds include Boxee, Dropbox, Microsoft's Mesh, Apple's MobileMe, and Amazon S3. A frequent encounter with these might include using Amazon's S3 (Simple Storage Service) to back up traveling laptops.
While there is a single Amazon Simple Storage Service (S3) service, and a single programmatic interface to the service, to say that there is a bit of variety in S3 backup tools is an understatement. With names like Jungle Disk, S3 Backup, Brackup, Duplicity, S3Sync, and others, Amazon S3-based backup tools are available for just about every desktop operating system available today.
However, backup is just scratching the surface:
Amazon S3 provides a simple Web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the Web. It gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of websites. The service aims to maximize benefits of scale and to pass those benefits on to developers.
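A minimal sketch of that store-and-retrieve interface, again using the modern boto3 SDK (which postdates this excerpt); the bucket name, hostname, and file path are hypothetical, and the key-naming helper is our own illustration of the bookkeeping a backup tool might add:

```python
import datetime

def backup_key(hostname: str, path: str, when: datetime.date) -> str:
    """Build a date-stamped S3 object key for a backed-up file."""
    return f"backups/{hostname}/{when.isoformat()}/{path.lstrip('/')}"

if __name__ == "__main__":
    import boto3  # third-party SDK; needs AWS credentials to actually run
    s3 = boto3.client("s3")
    key = backup_key("laptop-01", "/home/alice/notes.txt",
                     datetime.date.today())
    # Store any amount of data, at any time, from anywhere on the Web...
    s3.put_object(Bucket="example-backup-bucket", Key=key, Body=b"contents")
    # ...and retrieve it again later, from anywhere.
    body = s3.get_object(Bucket="example-backup-bucket", Key=key)["Body"]
    print(body.read())
```

The backup tools named above are, at bottom, wrappers around calls like these, plus bookkeeping such as the key-naming scheme sketched here.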
It's all about developer support, and Amazon has poured a huge amount of money into creating a collection of developer support tools that we've not seen since the days of the IBM programmers' library collection. With examples, documents, best-practice guides, a knowledge base, and tools all freely downloadable, Amazon seems determined to make friends with the developer community instead of taxing it with fees, as other systems do.
Another good move by Amazon has been its eclectic approach to programming library support. Instead of just going for the Microsoft "low-hanging fruit" and sticking with C# and the .NET environment, the AWS Software Development Kit (SDK) collection is a smorgasbord of languages and developer systems. Interestingly enough, Amazon also provides support for the Open MPI interface in batch-processing mode to attract Beowulf users. More information on Amazon's system and its programming tools is available on the AWS developer site.
On the flip side of this coin is Google's Web-minded approach. By taking a much simpler tack, Google has neatly sidestepped the huge support requirements that Amazon had to build. By concentrating primarily on Python as the development language, Google's approach gives unparalleled integration into the world of Google Services while also leveraging the huge number of Python programmers in the world. Instead of offering everything under the sun, as Amazon does, Google has been building its library of apps over the years as part of an all-encompassing Google environment. Instead of providing a simple storage facility, Google is concentrating on providing storage through the apps in the system. The world of Google is already tied together, already tightly integrated, and already well understood. Google seems to be saying to the market, "Not only have we built it, we're also making it inexpensive to play by pushing you into a single development environment while at the same time opening the entire Google world to you." What we're expecting to see is a collection of personal productivity tools to round out the office automation applications that are already part of the Google desktop.
In an effort to make their cloud solutions ubiquitous, all the major players are making inroads into blurring the line between mobile and desktop. In the past it was clearly computing power that separated the CPU-light mobile world from the bigger, faster computing capabilities of the desktop. Clouds place additional computing capability anywhere, allowing CPU-hungry apps to run even on CPU-light mobile platforms by separating computing from the user interface. This client/server model has been used for years for network applications, where user interfaces on client machines communicate back to larger applications running on back-end servers to handle the heavy lifting. The key to this approach will be how fast and how far 3G, and then 4G, wireless networks extend Internet connectivity so that these new mobile platforms can stay connected to the back-end cloud computing environment.
An approach similar to Google's foray into the mobile world with Android has been used by Microsoft and Apple with their My Phone and MobileMe services for mobile devices. The key differences are that the offerings from Microsoft and Apple are much more tightly tied to the operating system, making only minimal user interaction required after the initial relationships between desktop and cloud-based files are established. The downside of this tighter integration is that cross-platform performance is either not available or available only on a minimally functional basis. One of the great unknowns about Google's foray into a cloud-based operating system for mobile platforms is whether it will lead ultimately to wider availability on a variety of platforms or to tighter integration with (and therefore more exclusive ties to) Google's own products and services.
Users can hope that, as the market develops and more open definitions of cloud processes and procedures gain acceptance, it will be easier to find application, processor, and storage cloud services that are tightly integrated into operating environments and available on a greater number of platforms. There have been promising signs of this direction, but the market is, as of this writing, still too immature for users to know for sure which direction will predominate.
With a market in all the shades of gray, only time will tell which approach best represents the consumer.
Email protection clouds
It's funny how sometimes things happen so slowly over time that they slip by your notice. The world of anti-spam has become so cumbersome that almost no one handles his or her own blacklist maintenance anymore. Even if you're using a small firewall that has a checkbox for anti-spam, you're almost certainly already using a cloud service. Blacklisting services in the anti-spam world are numerous and varied, but the most successful anti-spam systems seem to combine several blacklisting services and, in some cases, multiple technologies that filter for spam, fraud, phishing, and other email-borne malware. We saw a product from Check Point around 2004 that provided this type of service in a cloudlike arrangement, but it wasn't until late 2009 that it reappeared in firewall product lines from vendors such as Cisco Systems Inc.
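Under the hood, most of those blacklist checks are ordinary DNS queries: the mail gateway reverses the octets of a connecting IP address and looks the result up as a hostname under the blacklist's zone, and a positive answer means the address is listed. A minimal sketch; the zone zen.spamhaus.org is one widely used example, and the live lookup naturally requires network access:

```python
import socket

def dnsbl_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Reverse the IPv4 octets and append the blacklist zone."""
    return ".".join(reversed(ip.split("."))) + "." + zone

def is_listed(ip: str, zone: str = "zen.spamhaus.org") -> bool:
    """True if the address resolves in the blacklist zone (i.e., is listed)."""
    try:
        socket.gethostbyname(dnsbl_name(ip, zone))
        return True
    except socket.gaierror:  # no DNS answer means the address is not listed
        return False
```

The cloud aspect is that the blacklist operator, not the firewall owner, maintains the data; the firewall's anti-spam checkbox simply turns lookups like this on.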
Printed with permission from CRC Press. Copyright 2010. Cloud Computing: Technologies and Strategies of the Ubiquitous Data Center by Brian J. S. Chee and Curtis Franklin Jr. For more information about this title and other similar books, please visit http://www.crcpress.com/.
This was first published in August 2010