Another important consideration is the volatility of the disk storage used by EC2. You cannot rely on it to hold data after an instance is shut down. When an EC2 instance starts, its disk storage is empty; when the instance stops, any changes made are cleared. However, to help with storage management, changes to the disk are retained if an instance unexpectedly restarts. If you anticipate running a tool that requires changes to be saved (such as MySQL), you must plan to save those files to S3 before stopping the instance.
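As a rough illustration of saving files to S3 before an instance stops, the following Python sketch walks a directory and uploads each file. It assumes an S3 client object that exposes upload_file(Filename, Bucket, Key), as the boto3 S3 client does; the function name backup_directory and the prefix parameter are hypothetical and chosen only for this example.

```python
import os


def backup_directory(s3_client, directory, bucket, prefix=""):
    """Upload every file under `directory` to `bucket` and return the keys written.

    Assumption: `s3_client` exposes upload_file(Filename, Bucket, Key),
    as the boto3 S3 client does. Nothing here is specific to MySQL.
    """
    uploaded = []
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            path = os.path.join(root, name)
            # Build the object key from the path relative to the backup root,
            # using forward slashes regardless of the local path separator.
            relative = os.path.relpath(path, directory).replace(os.sep, "/")
            key = f"{prefix}{relative}"
            s3_client.upload_file(Filename=path, Bucket=bucket, Key=key)
            uploaded.append(key)
    return uploaded
```

With boto3 installed and credentials configured, this could be called as backup_directory(boto3.client("s3"), data_dir, bucket_name, "db/") from a shutdown script, where data_dir and bucket_name are placeholders for your own values. For a database, you would normally write a consistent dump to the directory first rather than copy live data files.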
A third facet of EC2 you will need to anticipate is that the IP addresses of all instances are assigned dynamically at startup. This is not typically a problem, but if the instance is intended to serve as a web server, for example, you will likely need to configure the AMI to register newly started EC2 instances with a dynamic DNS provider.
Lastly, in order to secure EC2 instances, network access to newly started EC2 instances is locked down by default. Using the EC2 tools, you may set network-access parameters appropriately once the instance has started. If the instance will host a web server, an SSH server, or a remote desktop application, you will need to open each of the ports required to enable those services.
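The article refers to the EC2 command-line tools for this step. As an alternative sketch, the same idea can be expressed in Python, assuming an EC2 client that exposes authorize_security_group_ingress(GroupId, IpPermissions), as the boto3 EC2 client does. The helper name open_tcp_ports and the SERVICE_PORTS table are hypothetical; the port numbers are the conventional defaults for SSH, HTTP, and Remote Desktop.

```python
# Conventional default ports for the services mentioned in the text.
SERVICE_PORTS = {"ssh": 22, "http": 80, "rdp": 3389}


def open_tcp_ports(ec2_client, group_id, ports, cidr="0.0.0.0/0"):
    """Allow inbound TCP traffic on each port in `ports` for one security group.

    Assumption: `ec2_client` exposes authorize_security_group_ingress,
    as the boto3 EC2 client does. Returns the rules that were requested.
    """
    permissions = [
        {
            "IpProtocol": "tcp",
            "FromPort": port,   # a single-port range: FromPort == ToPort
            "ToPort": port,
            "IpRanges": [{"CidrIp": cidr}],
        }
        for port in ports
    ]
    ec2_client.authorize_security_group_ingress(
        GroupId=group_id, IpPermissions=permissions
    )
    return permissions
```

The default of 0.0.0.0/0 opens a port to every address, which may suit a public web server; for SSH or remote desktop access, passing a narrower cidr is usually the safer choice.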
Two other offerings from AWS are the Simple Queue Service (SQS) and SimpleDB, which are beyond the scope of this article. However, understand that they are important building blocks for EC2-hosted enterprise applications. SQS is a basic queuing service that you can use to send messages between applications. SimpleDB is a simple structured data access server, similar in most respects to a modern relational database but with a reduced feature set.
AWS in Action
By combining the flexibility of S3 and EC2 (along with SQS and SimpleDB), Amazon has built the foundation for a powerful new server environment. Referring back to the previously mentioned media-sharing site of Company B, one requirement is to prevent users from sharing copyrighted content. To do this, the Company B developers build an application that processes a file and decides whether the media is likely to be copyrighted. This process is not only compute intensive, but it can also easily be overwhelmed during peak periods. To satisfy the company's goal of scaling costs linearly with demand, the company decides to host the media-sniffing service on two EC2 instances (two are started for high availability). When users upload media, the files are stored in S3 and a request to process each file is queued using SQS. The first application (running on the EC2 servers) to dequeue the work request processes the file and posts the results either to SQS or to SimpleDB. The client application then retrieves the results and responds accordingly. As more and more files are uploaded, more EC2 media-sniffer instances are started to satisfy the increased workload. Similarly, as the load declines, unnecessary EC2 instances are shut down. In this case, using S3, SQS, and SimpleDB provides a scalable and cost-effective way to handle surges in load. Further, because S3, SQS, and SimpleDB are used to communicate between applications, no secondary storage is needed, which makes this an ideal EC2 environment.
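The flow described above can be sketched locally in Python without any AWS calls. In this simplified model, a standard-library queue.Queue stands in for SQS, and plain dictionaries stand in for S3 and for the SimpleDB results store. Every name here (sniff_media, process_pending, instances_needed, files_per_instance) is hypothetical, and the copyright check is a trivial placeholder rather than a real detection method.

```python
import math
import queue


def sniff_media(data):
    """Hypothetical placeholder for the compute-intensive copyright check."""
    return b"COPYRIGHT" in data


def process_pending(work_queue, file_store, results):
    """Drain the queue: fetch each file, run the check, record the verdict.

    work_queue stands in for SQS, file_store for S3, results for SimpleDB.
    Returns the number of work requests handled.
    """
    processed = 0
    while True:
        try:
            key = work_queue.get_nowait()
        except queue.Empty:
            return processed
        results[key] = sniff_media(file_store[key])
        processed += 1


def instances_needed(queue_depth, files_per_instance, minimum=2):
    """Scale linearly with the backlog, never dropping below the minimum.

    The default minimum of 2 mirrors the two instances kept for high
    availability; files_per_instance is an assumed capacity figure.
    """
    return max(minimum, math.ceil(queue_depth / files_per_instance))
```

Because each worker only reads from the queue and the shared stores, any number of workers can run the same loop, which is what allows instances to be added or removed as the backlog changes. A supervising process could compare instances_needed(backlog, capacity) with the number of running instances to decide when to start or stop one.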
EC2 and its complementary AWS tools represent a major advance in cloud-based computing and its potential for high levels of service and low-cost scalability. Although Amazon has some work to do to remedy some of the existing difficulties of building tools based exclusively on AWS, it will be interesting to see the upcoming tools released under the AWS umbrella. To get an idea about the possibilities, explore the AWS EC2 site to see some of the tools that have already been built.