This paper describes cloud computing, a computing platform for the next generation of the Internet. The paper defines clouds, explains the business benefits of cloud computing, and outlines cloud architecture and its major components. Readers will discover how a business can use cloud computing to foster innovation and reduce IT costs.

1. Introduction

Enterprises strive to reduce computing costs. Many start by consolidating their IT operations and later introduce virtualization technologies. Cloud computing takes these steps to a new level and allows an organization to further reduce costs through improved utilization, reduced administration and infrastructure costs, and faster deployment cycles. The cloud is a next-generation platform that provides dynamic resource pools, virtualization, and high availability.
Cloud computing describes both a platform and a type of application. A cloud computing platform dynamically provisions, configures, reconfigures, and deprovisions servers as needed. Cloud applications are applications that are extended to be accessible through the Internet. These cloud applications use large data centers and powerful servers that host Web applications and Web services.
Cloud computing infrastructure accelerates and fosters the adoption of innovations. Enterprises are increasingly making innovation their highest priority. They realize they need to seek new ideas and unlock new sources of value. Driven by the pressure to cut costs and grow simultaneously, they realize that it is not possible to succeed simply by doing the same things better. They know they have to do new things that produce better results. Cloud computing enables innovation. It alleviates the need of innovators to find resources to develop, test, and make their innovations available to the user community. Innovators are free to focus on the innovation rather than the logistics of finding and managing the resources that enable it. Cloud computing helps leverage innovation as early as possible to deliver business value to IBM and its customers. Fostering innovation requires unprecedented flexibility and responsiveness. The enterprise should provide an ecosystem where innovators are not hindered by excessive processes, rules, and resource constraints. In this context, a cloud computing service is a necessity: an automated framework that can deliver standardized services quickly and cheaply. Servers in the cloud can be physical machines or virtual machines, and advanced clouds typically include other computing resources such as storage area networks (SANs), network equipment, firewalls, and other security devices. Anyone with a suitable Internet connection and a standard browser can access a cloud application.
A cloud is a pool of virtualized computer resources. A cloud can:

• Host a variety of different workloads, including batch-style back-end jobs and interactive, user-facing applications
• Allow workloads to be deployed and scaled out quickly through the rapid provisioning of virtual machines or physical machines
• Support redundant, self-recovering, highly scalable programming models that allow workloads to recover from many unavoidable hardware/software failures
• Monitor resource use in real time to enable rebalancing of allocations when needed

Cloud computing environments support grid computing by quickly providing physical and virtual servers on which grid applications can run. Cloud computing should not be confused with grid computing, however. Grid computing involves dividing a large task into many smaller tasks that run in parallel on separate servers. Grids require many computers, typically in the thousands, and commonly use servers, desktops, and laptops. Clouds also support nongrid environments, such as a three-tier Web architecture running standard or Web 2.0 applications. A cloud is more than a collection of computer resources because a cloud provides a mechanism to manage those resources. Management includes provisioning, change requests, reimaging, workload rebalancing, deprovisioning, and monitoring.
Cloud computing infrastructures can allow enterprises to achieve more efficient use of their IT hardware and software investments. They do this by breaking down the physical barriers inherent in isolated systems and automating the management of the group of systems as a single entity. Cloud computing is an example of an ultimately virtualized system, and a natural evolution for data centers that employ automated systems management, workload balancing, and virtualization technologies. A cloud infrastructure can be a cost-efficient model for delivering information services, reducing IT management complexity, promoting innovation, and increasing responsiveness through real-time workload balancing. The cloud makes it possible to launch Web 2.0 applications quickly and to scale up applications as much as needed, when needed.
The platform supports traditional Java and Linux, Apache, MySQL, PHP (LAMP) stack-based applications, as well as new architectures such as MapReduce and the Google File System, which provide a means to scale applications across thousands of servers instantly.

4. Cloud Computing Architecture
4.1. Cloud Computing Application Architecture
This section gives the basic architecture of a cloud computing application. Cloud computing is the shift of computing to a host of hardware infrastructure that is distributed in the cloud. The commodity hardware infrastructure consists of the various low-cost data servers that are connected to the system and provide their storage, processing, and other computing resources to the application. Cloud computing involves running applications on virtual servers that are allocated on this distributed hardware infrastructure available in the cloud. These virtual servers are made in such a way that the different service-level agreements and reliability requirements are met. There may be multiple instances of the same virtual server accessing different parts of the hardware infrastructure, so that there are multiple copies of the application ready to take over on another's failure.
The virtual server distributes the processing across the infrastructure; the computation is carried out and the result returned. A workload distribution management system, also known as the grid engine, manages the different requests coming to the virtual servers. This engine takes care of creating multiple copies and of preserving the integrity of the data stored in the infrastructure. It also adjusts itself so that, even under heavier load, the processing completes as required.
The different workload management systems are hidden from the users: for the user, the processing is simply done and the result obtained, with no question of where or how it was done. Users are billed based on their usage of the system; as noted before, the commodity is now cycles and bytes. Billing is usually on the basis of usage per CPU-hour or per GB of data transferred.
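A usage-based bill of this kind is straightforward to model. The sketch below uses hypothetical rates (no real provider's prices) just to make the cycles-and-bytes billing idea concrete:

```python
# Sketch of usage-based cloud billing. The rates are hypothetical,
# not any real provider's prices.
def monthly_bill(cpu_hours, gb_transferred, cpu_rate=0.10, gb_rate=0.15):
    """Charge per CPU-hour consumed plus per GB of data transferred."""
    return cpu_hours * cpu_rate + gb_transferred * gb_rate

# A workload that used 200 CPU-hours and moved 50 GB of data:
cost = monthly_bill(200, 50)   # 200 * 0.10 + 50 * 0.15, about 27.50
```

The point of the model is that the customer pays for consumption, not for owning the hardware.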
4.2. Server Architecture
Cloud computing makes use of a large physical resource pool in the cloud. As described above, cloud computing services and applications use virtual server instances built on this resource pool. Two applications help in managing the server instances, the resources, and the use of those resources by the virtual server instances. One is the Xen hypervisor, which provides an abstraction layer between the hardware and the virtual OS so that the distribution of resources and processing is well managed. Another widely used application is the Enomalism server management system, which is used for managing the infrastructure platform. When Xen is used for virtualization of the servers, a thin software layer known as the Xen hypervisor is inserted between the server's hardware and the operating system. This abstraction layer allows each physical server to run one or more virtual servers, effectively decoupling the operating system and its applications from the underlying physical server.
The Xen hypervisor is a unique open source technology, developed collaboratively by the Xen community and engineers at over 20 of the most innovative data center solution vendors, including AMD, Cisco, Dell, HP, IBM, Intel, Mellanox, Network Appliance, Novell, Red Hat, SGI, Sun, Unisys, Veritas, Voltaire, and Citrix. Xen is licensed under the GNU General Public License (GPL2) and is available at no charge in both source and object format. The Xen hypervisor is also exceptionally lean: fewer than 50,000 lines of code. That translates to extremely low overhead and near-native performance for guests. Xen reuses existing device drivers (both closed and open source) from Linux, making device management easy. Moreover, Xen is robust to device driver failure and protects both guests and the hypervisor from faulty or malicious drivers.

The Enomalism virtualized server management system is a complete virtual server infrastructure platform. Enomalism helps in effective management of the resources, and can be used to tap into the cloud just as you would into a remote server. It brings together features such as deployment planning, load balancing, and resource monitoring. Enomalism is an open source application with a very simple, easy-to-use Web-based user interface. Its module architecture allows the creation of additional system add-ons and plugins. It supports one-click deployment of distributed or replicated applications on a global basis, and it supports the management of various virtual environments including KVM/QEMU, Amazon EC2, Xen, OpenVZ, Linux Containers, and VirtualBox. It has fine-grained user permissions and access privileges.
4.3. Map Reduce
MapReduce is a software framework developed at Google in 2003 to support parallel computations over large (multiple-petabyte) data sets on clusters of commodity computers. The framework is largely inspired by the 'map' and 'reduce' functions commonly used in functional programming, although the actual semantics of the framework are not the same. It is a programming model and an associated implementation for processing and generating large data sets, and many real-world tasks are expressible in this model. MapReduce implementations have been written in C++, Java, and other languages. Programs written in this functional style are automatically parallelized and executed on the cloud. The run-time system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system. The computation takes a set of input key/value pairs and produces a set of output key/value pairs. The user of the MapReduce library expresses the computation as two functions: Map and Reduce. Map, written by the user, takes an input pair and produces a set of intermediate key/value pairs. The MapReduce library groups together all intermediate values associated with the same intermediate key I and passes them to the Reduce function.
The Reduce function, also written by the user, accepts an intermediate key I and a set of values for that key. It merges these values together to form a possibly smaller set of values; typically just zero or one output value is produced per Reduce invocation. The intermediate values are supplied to the user's Reduce function via an iterator, which makes it possible to handle lists of values that are too large to fit in memory.
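The Map and Reduce functions just described can be illustrated with the classic word-count task. The following is a minimal in-memory sketch of the programming model, not Google's implementation: a real framework would partition the input and distribute the map and reduce phases across many machines.

```python
# In-memory sketch of the MapReduce programming model (word count).
from collections import defaultdict

def map_fn(key, value):
    """Map: for a (document, text) pair, emit (word, 1) per word."""
    for word in value.split():
        yield (word, 1)

def reduce_fn(key, values):
    """Reduce: merge all intermediate counts for one word."""
    yield sum(values)

def map_reduce(inputs, map_fn, reduce_fn):
    groups = defaultdict(list)
    for key, value in inputs:                # map phase
        for ikey, ivalue in map_fn(key, value):
            groups[ikey].append(ivalue)      # group by intermediate key
    return {k: next(reduce_fn(k, vs))        # reduce phase
            for k, vs in groups.items()}

counts = map_reduce([("doc1", "the cloud"), ("doc2", "the grid")],
                    map_fn, reduce_fn)
# counts == {"the": 2, "cloud": 1, "grid": 1}
```

Because map and reduce are pure functions over key/value pairs, the run-time system is free to rerun any fragment on another node after a failure, which is exactly the fault-tolerance property described below.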
MapReduce achieves reliability by parceling out a number of operations on the set of data to each node in the network; each node is expected to report back periodically with completed work and status updates. If a node falls silent for longer than that interval, the master node records the node as dead and sends out the node's assigned work to other nodes. Individual operations use atomic operations for naming file outputs as a double check to ensure that there are no parallel conflicting threads running; when files are renamed, it is possible to also copy them to another name in addition to the name of the task (allowing for side effects).
4.4. Google File System
Google File System (GFS) is a scalable distributed file system developed by Google for data-intensive applications. It is designed to provide efficient, reliable access to data using large clusters of commodity hardware. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients. Files are divided into chunks of 64 megabytes, which are only extremely rarely overwritten or shrunk; files are usually appended to or read.
GFS is also designed and optimized to run on computing clusters whose nodes consist of cheap, commodity computers, which means precautions must be taken against the high failure rate of individual nodes and the subsequent data loss. Other design decisions select for high data throughput, even when it comes at the cost of latency. The nodes are divided into two types: one Master node and a large number of Chunkservers. Chunkservers store the data files, with each individual file broken up into fixed-size chunks (hence the name) of about 64 megabytes, similar to clusters or sectors in regular file systems. Each chunk is assigned a unique 64-bit label, and logical mappings of files to constituent chunks are maintained. Each chunk is replicated several times throughout the network, with a minimum of three replicas, and even more for files that are in high demand or need more redundancy.

The Master server does not usually store the actual chunks, but rather all the metadata associated with them: the tables mapping the 64-bit labels to chunk locations and the files they make up, the locations of the copies of the chunks, what processes are reading or writing to a particular chunk, or a record of taking a snapshot of a chunk pursuant to replicating it (usually at the instigation of the Master server when, due to node failures, the number of copies of a chunk has fallen beneath the set number). All this metadata is kept current by the Master server periodically receiving updates from each chunkserver (heartbeat messages). Permissions for modifications are handled by a system of time-limited, expiring leases, whereby the Master server grants permission to a process for a finite period of time, during which no other process will be granted permission by the Master server to modify the chunk. The modifying chunkserver, which is always the primary chunk holder, then propagates the changes to the chunkservers holding the backup copies. The changes are not saved until all chunkservers acknowledge, thus guaranteeing the completion and atomicity of the operation. Programs access the chunks by first querying the Master server for the locations of the desired chunks; if the chunks are not being operated on (if there are no outstanding leases), the Master replies with the locations, and the program then contacts and receives the data from the chunkserver directly. Unlike many file systems, GFS is not implemented in the kernel of an operating system but is instead accessed through a library, to avoid overhead.
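The fixed 64 MB chunk size makes the client's side of this lookup simple arithmetic. The sketch below shows only that addressing step, with illustrative function names; resolving a (file, chunk index) pair into a 64-bit label and replica locations is the Master's job and is not modeled here.

```python
# Sketch of GFS-style chunk addressing: a byte offset within a file
# maps to a fixed-size 64 MB chunk index. Function names are
# illustrative, not from any real GFS client library.
CHUNK_SIZE = 64 * 1024 * 1024   # 64 MB, the chunk size used by GFS

def chunk_index(offset):
    """Which chunk of the file holds this byte offset."""
    return offset // CHUNK_SIZE

def byte_range(index):
    """The (first, last) byte offsets covered by one chunk."""
    start = index * CHUNK_SIZE
    return (start, start + CHUNK_SIZE - 1)

# A read at byte 200,000,000 falls in chunk 2 (the third chunk),
# so the client would ask the Master for replicas of chunk 2.
idx = chunk_index(200_000_000)
```

A client asks the Master only for the chunk locations; the data itself then flows directly from a chunkserver, keeping the Master off the data path.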
4.5. Hadoop

Hadoop is a framework for running applications on large clusters built of commodity hardware. The Hadoop framework transparently provides applications with both reliability and data motion. Hadoop implements the computation paradigm named MapReduce, which was explained above. The application is divided into many small fragments of work, each of which may be executed or re-executed on any node in the cluster. In addition, Hadoop provides a distributed file system that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both MapReduce and the distributed file system are designed so that node failures are automatically handled by the framework. Hadoop is implemented in Java. In Hadoop, the combination of all the JAR files and classes needed to run a MapReduce program is called a job. All of these components are collected into a JAR, which is usually referred to as the job file. To execute a job, it is submitted to a jobTracker and then executed.

Tasks in each phase are executed in a fault-tolerant manner. If nodes fail in the middle of a computation, the tasks assigned to them are redistributed among the remaining nodes. Since MapReduce is used, having many map and reduce tasks enables good load balancing and allows failed tasks to be re-run with small runtime overhead. The Hadoop MapReduce framework has a master/slave architecture: a single master server, or jobTracker, and several slave servers, or taskTrackers, one per node in the cluster. The jobTracker is the point of interaction between the users and the framework. Users submit jobs to the jobTracker, which puts them in a queue of pending jobs and executes them on a first-come, first-served basis. The jobTracker manages the assignment of MapReduce tasks to the taskTrackers. The taskTrackers execute tasks upon instruction from the jobTracker and also handle data motion between the 'map' and 'reduce' phases of the MapReduce job.

Hadoop has received wide industry adoption, and it is used along with other cloud computing technologies, like the Amazon services, to make better use of resources. Amazon uses Hadoop for processing millions of sessions for analytics, on clusters of about 1 to 100 nodes. Facebook uses Hadoop to store copies of internal logs and dimension data sources, and uses it as a source for reporting/analytics and machine learning. The New York Times used Hadoop for large-scale image conversions. Yahoo uses Hadoop to support research for advertisement systems and Web search tools, and also uses it for scaling tests to support the development of Hadoop itself.
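The jobTracker's first-come, first-served queue described above can be sketched in a few lines. This is a toy model of only the queueing behavior; the real Hadoop jobTracker also tracks taskTrackers, assigns individual map and reduce tasks, and handles failures.

```python
# Toy sketch of the jobTracker's first-come, first-served job queue.
# Class and method names are illustrative, not Hadoop's actual API.
from collections import deque

class JobTracker:
    def __init__(self):
        self.pending = deque()   # FIFO queue of submitted jobs
        self.completed = []

    def submit(self, job_name):
        """A user submits a job; it waits in the pending queue."""
        self.pending.append(job_name)

    def run_next(self):
        """Execute the oldest pending job (first come, first served)."""
        job = self.pending.popleft()
        self.completed.append(job)
        return job

jt = JobTracker()
jt.submit("log-analytics")
jt.submit("image-conversion")
first = jt.run_next()   # "log-analytics" runs before "image-conversion"
```

The FIFO discipline is what the text means by "first-come, first-served": submission order, not job size or priority, decides execution order.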
5. Cloud Computing Services
Even though cloud computing is a fairly new technology, many companies already offer cloud computing services. Companies like Amazon, Google, Yahoo, IBM, and Microsoft are all players in the cloud computing services industry, but Amazon is the pioneer, with services like EC2 (Elastic Compute Cloud) and S3 (Simple Storage Service) dominating the industry; Amazon's expertise gives it an advantage over the others. Microsoft has good knowledge of the fundamentals of cloud science and is building massive data centers. IBM, the king of business computing and traditional supercomputers, has teamed up with Google to get a foothold in the clouds. Google itself, built from the ground up on commodity hardware, is a leader in cloud computing.
5.1. Amazon Web Services
'Amazon Web Services' is the set of cloud computing services offered by Amazon. It comprises four different services: Elastic Compute Cloud (EC2), Simple Storage Service (S3), Simple Queue Service (SQS), and Simple Database Service (SDB).
1. Elastic Compute Cloud (EC2)
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale computing easier for developers by providing on-demand processing power. Amazon EC2's simple web service interface allows you to obtain and configure capacity with minimal friction. It provides you with complete control of your computing resources and lets you run on Amazon's proven computing environment. Amazon EC2 reduces the time required to obtain and boot new server instances to minutes, allowing you to quickly scale capacity, both up and down, as your computing requirements change. Amazon EC2 changes the economics of computing by allowing you to pay only for capacity that you actually use, and it provides developers the tools to build failure-resilient applications and isolate themselves from common failure scenarios.

Amazon EC2 presents a true virtual computing environment, allowing you to use web service interfaces to requisition machines for use, load them with your custom application environment, manage your network's access permissions, and run your image using as many or as few systems as you desire. To set up an Amazon EC2 node, we first create a node configuration consisting of all our applications, libraries, data, and associated configuration settings. This configuration is then saved as an AMI (Amazon Machine Image). There are also several stock Amazon AMIs available which can be customized and used. We can then start, terminate, and monitor as many instances of the AMI as needed. Amazon EC2 enables you to increase or decrease capacity within minutes. You can commission one, hundreds, or even thousands of server instances simultaneously; thus applications can automatically scale themselves up and down depending on their needs. You have root access to each instance, and you can interact with them as you would with any machine. You have a choice of several instance types, allowing you to select a configuration of memory, CPU, and instance storage that is optimal for your application. Amazon EC2 offers a highly reliable environment where replacement instances can be rapidly and reliably commissioned, and it provides web service interfaces to configure firewall settings that control network access to and between groups of instances. You are charged at the end of each month for the EC2 resources you actually consumed, so charging is based on actual usage.
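The save-an-image, then launch/monitor/terminate workflow described above can be sketched as follows. This is an in-memory stand-in: the class and method names are illustrative, not Amazon's actual web service API, and no real instances are involved.

```python
# Toy sketch of the EC2 workflow: save a machine image, then launch,
# monitor, and terminate instances of it. Names are illustrative only.
import itertools

class MachineImage:
    """Stands in for an AMI: a saved bundle of applications,
    libraries, data, and configuration settings."""
    def __init__(self, name):
        self.name = name

class ComputeCloud:
    _ids = itertools.count(1)   # instance id generator

    def __init__(self):
        self.instances = {}     # instance id -> (image name, state)

    def run_instance(self, image):
        """Boot a new virtual server from a saved image."""
        iid = f"i-{next(self._ids):04d}"
        self.instances[iid] = (image.name, "running")
        return iid

    def terminate(self, iid):
        """Shut an instance down; billing for it stops here."""
        name, _ = self.instances[iid]
        self.instances[iid] = (name, "terminated")

cloud = ComputeCloud()
ami = MachineImage("my-web-stack")
ids = [cloud.run_instance(ami) for _ in range(3)]  # scale up to 3
cloud.terminate(ids[0])                            # scale back down
```

The key property the sketch captures is that the image is the unit of deployment: any number of identical instances can be started from it and discarded when demand falls.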
2. Simple Storage Service (S3)
S3, or Simple Storage Service, is Amazon's cloud storage offering: storage for the Internet, designed to make web-scale computing easier for developers. It provides a highly available large-store database and has been designed for interactive online use. S3 provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites. Amazon S3 allows writes, reads, and deletes of objects containing from 1 byte to 5 gigabytes of data each; the number of objects that you can store is unlimited. Each object is stored in a bucket and retrieved via a unique developer-assigned key. A bucket can be located in Europe or the Americas but can be accessed from anywhere. Authentication mechanisms are provided to ensure that the data is kept secure from unauthorized access: objects can be made private or public, and rights can be granted to specific users for particular objects. The S3 service also works on a pay-only-for-what-you-use basis.
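The bucket/key/object data model described above is simple enough to sketch in memory. This is a stand-in for illustration only: there are no network calls, and the method names are illustrative rather than Amazon's actual API.

```python
# In-memory sketch of S3's data model: objects live in named buckets
# and are stored and retrieved by a developer-assigned key.
class Bucket:
    MAX_OBJECT = 5 * 1024**3   # the 5 GB per-object limit cited above

    def __init__(self, name):
        self.name = name
        self._objects = {}     # key -> object bytes

    def put(self, key, data):
        """Write an object of 1 byte to 5 GB under the given key."""
        if not 1 <= len(data) <= self.MAX_OBJECT:
            raise ValueError("objects hold 1 byte to 5 GB of data")
        self._objects[key] = data

    def get(self, key):
        """Read an object back by its key."""
        return self._objects[key]

    def delete(self, key):
        del self._objects[key]

photos = Bucket("my-photos")
photos.put("2008/vacation.jpg", b"...jpeg bytes...")
data = photos.get("2008/vacation.jpg")
```

Everything beyond put/get/delete by key (replication, authentication, public/private rights) is layered on top of this flat model rather than on a hierarchical file system.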
3. Simple Queue Service (SQS)
Amazon Simple Queue Service (SQS) offers a reliable, highly scalable, hosted queue for storing messages as they travel between computers. By using SQS, developers can simply move data between distributed components of their applications that perform different tasks, without losing messages or requiring each component to be always available. With SQS, developers can create an unlimited number of SQS queues, each of which can send and receive an unlimited number of messages. Messages can be retained in a queue for up to 4 days. SQS is simple, reliable, secure, and scalable.
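The decoupling SQS provides can be sketched with a plain in-memory queue: a producer component enqueues work and a consumer picks it up later, so neither needs the other to be online. This stand-in omits everything that makes the real service durable and distributed (hosted storage, retention, web service access).

```python
# In-memory sketch of a hosted message queue decoupling two components.
# Names are illustrative, not Amazon's actual SQS API.
from collections import deque

class SimpleQueue:
    def __init__(self, name):
        self.name = name
        self._messages = deque()

    def send(self, body):
        """Producer side: enqueue a message."""
        self._messages.append(body)

    def receive(self):
        """Consumer side: take the oldest message, or None if empty."""
        return self._messages.popleft() if self._messages else None

orders = SimpleQueue("orders")
orders.send("order-1001")     # the front-end component enqueues work
orders.send("order-1002")
msg = orders.receive()        # a worker component dequeues it later
```

The queue is the only shared contract between the two components, which is what lets either side fail or scale independently.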
4. Simple Database Service (SDB)
Amazon SimpleDB is a web service for running queries on structured data in real time. This service works in close conjunction with Amazon S3 and EC2, collectively providing the ability to store, process, and query data sets in the cloud. These services are designed to make web-scale computing easier and more cost-effective for developers. Traditionally, this type of functionality is accomplished with a clustered relational database, which requires a sizable upfront investment and often a DBA to maintain and administer it. Amazon SimpleDB provides the same functionality without the operational complexity: it requires no schema, automatically indexes your data, and provides a simple API for storage and access. Developers gain access to these capabilities from within Amazon's proven computing environment, are able to scale instantly, and pay only for what they use.
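The schema-less model is the interesting contrast with a relational database: items are just bags of attribute/value pairs, and two items in the same domain may carry different attributes. The sketch below illustrates that idea in memory; the class, method names, and query form are illustrative, not Amazon's actual API or query language.

```python
# In-memory sketch of a SimpleDB-style schema-less store: items hold
# arbitrary attribute/value pairs and are queried without any declared
# schema. Names are illustrative, not Amazon's actual API.
class Domain:
    def __init__(self):
        self._items = {}   # item name -> {attribute: value}

    def put(self, item_name, **attributes):
        """Store attributes for an item; no schema is declared."""
        self._items.setdefault(item_name, {}).update(attributes)

    def query(self, **criteria):
        """Names of items whose attributes match all the criteria."""
        return [name for name, attrs in self._items.items()
                if all(attrs.get(k) == v for k, v in criteria.items())]

books = Domain()
books.put("item1", title="1984", author="Orwell", year="1949")
books.put("item2", title="Dune", author="Herbert")  # fewer attributes
hits = books.query(author="Orwell")
```

Because nothing forces the two items to share attributes, adding a new field never requires a schema migration, which is the operational simplicity the service advertises.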
5.2. Google App Engine
Google App Engine lets you run your web applications on Google's infrastructure. App Engine applications are easy to build, easy to maintain, and easy to scale as your traffic and data storage needs grow. You can serve your app using a free domain name on the appspot.com domain, or use Google Apps to serve it from your own domain. You can share your application with the world, or limit access to members of your organization. App Engine costs nothing to get started: sign up for a free account, and you can develop and publish your application at no charge and with no obligation. A free account can use up to 500 MB of persistent storage and enough CPU and bandwidth for about 5 million page views a month. Google App Engine makes it easy to build an application that runs reliably, even under heavy load and with large amounts of data. The environment includes the following features:
• dynamic web serving, with full support for common web technologies
• persistent storage with queries, sorting and transactions
• automatic scaling and load balancing
• APIs for authenticating users and sending email using Google Accounts
• a fully featured local development environment that simulates Google App Engine on your computer
Google App Engine applications are implemented using the Python programming language. The runtime environment includes the full Python language and most of the Python standard library. Applications run in a secure environment that provides limited access to the underlying operating system. These limitations allow App Engine to distribute web requests for the application across multiple servers, and to start and stop servers to meet traffic demands. App Engine includes a service API for integrating with Google Accounts.
Your application can allow a user to sign in with a Google account and access the email address and displayable name associated with that account. Using Google Accounts lets the user start using your application faster, because the user may not need to create a new account. It also saves you the effort of implementing a user account system just for your application.
App Engine provides a variety of services that enable you to perform common operations when managing your application, with APIs to access each of them. Applications can access resources on the Internet, such as web services or other data, using App Engine's URL fetch service. Applications can send email messages using App Engine's mail service, which uses Google infrastructure to deliver them. The Image service lets your application manipulate images: with this API, you can resize, crop, rotate, and flip images in JPEG and PNG formats. In theory, Google claims App Engine can scale nicely, but Google currently places a limit of 5 million hits per month on each application. This limit undercuts App Engine's scalability, because any small dedicated server can deliver that performance. Google will eventually allow webmasters to go beyond this limit (if they pay).
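The "dynamic web serving" feature amounts to hosting Python request handlers like the one below. This is a plain WSGI sketch of the kind of handler such a platform runs and scales, not App Engine's own webapp framework; the platform's job is to route incoming requests to copies of this function across however many servers traffic demands.

```python
# Minimal sketch of a dynamic Python request handler (plain WSGI).
from urllib.parse import parse_qs

def application(environ, start_response):
    """Greet the visitor named in the query string, e.g. ?name=cloud."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]
    body = f"Hello, {name}!".encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]
```

Because the handler is a stateless function of the request, the platform can run any number of copies behind a load balancer, which is exactly the automatic scaling listed among App Engine's features.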
6. Conclusion

Cloud computing is a powerful new abstraction for large-scale data processing systems that is scalable, reliable, and available. In cloud computing, large self-managed server pools reduce overhead and eliminate management headaches, and cloud computing services can grow and shrink according to need. Cloud computing is particularly valuable to small and medium businesses, where effective and affordable IT tools are critical to helping them become more productive without spending lots of money on in-house resources and technical equipment. It is also a newly emerging architecture needed to expand the Internet to become the computing platform of the future.