Dissecting the Cloud – Part 1

“The Cloud brings with it the promise of utility-style computing and the ability to pay according to usage.

Cloud Computing provides elasticity, or the ability to grow and shrink based on traffic patterns.

Cloud Computing does away with CAPEX and the need to buy infrastructure upfront and replaces it with an OPEX model, and so on.”

All this is old news and has been repeated many times. But what exactly constitutes cloud computing? What brings about the above features? What are the building blocks of the cloud that enable one to realize them?

This post tries to look deeper into the innards of the Cloud to determine what the cloud really is.

Before we get to this, I would like to dwell on an analogy to understand the Cloud better.

Let us assume Mr. A owns a large building of about 15,000 sq feet and about 100 feet tall, and that he wants to rent it out.

Now, assume that the door of this building opens to a single, large room on the inside!

Mr. X comes to rent this building. If this were the case, poor Mr. X would have to pay through his nose for the entire building, even though his requirement was only for a small room of about 600 sq feet. Imagine the waste of space. Moreover, this would also result in an enormous waste of electricity; imagine the lighting needed. An inordinate amount of water would also have to be used if this single, large room needed to be cleaned. The cost of all this would have to be borne by Mr. X.

This is clearly not a pleasant state of affairs for either Mr. X or for Mr. A, the owner of the building.

The solution to this is easy. What Mr. A needs to do is partition the building into self-contained rooms (of about 600 sq feet each) with all the amenities. Each self-contained unit would need to have its own electricity and water meters.

Now Mr. A can rent rooms to different tenants based on their needs. This is a win-win situation for both Mr. A and Mr. X. The tenants only need to pay for the rooms they occupy and the electricity and water they consume.
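To put rough numbers on the analogy, here is a small back-of-the-envelope sketch in Python. All the figures (rents, utility rates, room sizes) are assumptions made up purely for illustration.

```python
# Back-of-the-envelope comparison for the building analogy.
# Every figure below is an assumption, chosen only for illustration.

BUILDING_SQ_FT = 15_000      # total area of Mr. A's building
ROOM_SQ_FT = 600             # area Mr. X actually needs
RENT_PER_SQ_FT = 2.0         # assumed monthly rent per sq ft
UTILITIES_PER_SQ_FT = 0.5    # assumed monthly electricity + water per sq ft

def monthly_cost(area_sq_ft: float) -> float:
    """Monthly cost of renting, lighting and cleaning a given area."""
    return area_sq_ft * (RENT_PER_SQ_FT + UTILITIES_PER_SQ_FT)

whole_building = monthly_cost(BUILDING_SQ_FT)   # before partitioning: pay for everything
single_room = monthly_cost(ROOM_SQ_FT)          # after partitioning: pay for what you use

print(f"Whole building: {whole_building:,.0f} per month")
print(f"Single room   : {single_room:,.0f} per month")
print(f"Without partitioning Mr. X pays ~{whole_building / single_room:.0f}x more")
```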

This is exactly the principle behind cloud computing and is known as ‘virtualization’.

There are 3 computing components that one must consider: CPU, network and storage. The picture below shows the virtualization of CPU, RAM, NIC (network card) and disk (storage).

[Figure: Server virtualization – logical view]

The Cloud is essentially made up of anywhere between 100 and 100,000 servers. Each server is akin to the large building. Running a single OS and application(s) on an entire server is a waste of computing, storage and network resources.

Virtualization abstracts the hardware, storage and network through the use of software known as the ‘hypervisor’. On top of the hypervisor several ‘guest OSes’ can run. Applications can then run on these guest OSes.

Hence, over the CPU (single, dual or multi-core) of the server, multiple guest OSes can run, each with its own set of applications.

This is similar to partitioning the large CPU resource of the server into smaller units.
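As an illustration of this partitioning, the sketch below lists a server's physical capacity and the slice of vCPUs and RAM each guest OS has been allotted. It assumes a Linux host running the KVM/QEMU hypervisor with the libvirt Python bindings installed; the connection URI and the presence of running guests are assumptions.

```python
# A minimal sketch: inspect how a hypervisor carves one physical server
# into guest OS slices. Assumes a KVM/QEMU host with libvirt-python installed
# (pip install libvirt-python); the URI below is the usual local default.
import libvirt

conn = libvirt.openReadOnly("qemu:///system")   # read-only connection to the hypervisor

model, mem_mb, cpus, *_ = conn.getInfo()        # physical host capacity
print(f"Host: {cpus} CPUs, {mem_mb} MB RAM ({model})")

# Each guest OS runs in isolation with its own share of CPU and memory
for dom in conn.listAllDomains():
    state, max_mem_kib, mem_kib, vcpus, cpu_time = dom.info()
    print(f"Guest '{dom.name()}': {vcpus} vCPUs, {max_mem_kib // 1024} MB RAM")

conn.close()
```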

There are 3 main virtualization technologies, namely VMware, Citrix Xen and Microsoft Hyper-V.

Here is a diagram showing the 3 main virtualization technologies.

[Figure: The 3 main server virtualization technologies]

To be continued …



The data center paradox

In today’s globalized environment, organizations are spread geographically across the globe. Such globalization brings multiple advantages, ranging from quicker penetration into foreign markets to the cost advantage of local workforces. It also results in the organization having data centers spread across different geographical areas. Besides, mergers and acquisitions of businesses spread across the globe result in hardware and server sprawl.

Applications on these dispersed servers tend to be siloed, with legacy hardware, different OSes and disparate software executing on them.

The cost of maintaining multiple data centers can be a prickly problem. There are several costs in managing a data center, chief among them operational costs, real estate costs, and power and cooling costs. Hardware and server sprawl is a real problem, and the enterprise must look for ways to solve it.

There are two techniques to manage hardware and server sprawl.

The first method is to use virtualization technologies so that hardware and server sprawl can be reduced. Virtualization techniques abstract the raw hardware through the use of special software called the hypervisor. Any guest OS, such as Windows, Linux or Solaris, can execute over the hypervisor. The key benefit that virtualization brings to the enterprise is that it abstracts the hardware, storage and network and creates a shared pool of compute, storage and network resources for the different applications to utilize. Hence server sprawl can be mitigated to some extent through the use of virtualization software such as VMware, Citrix XenServer, Hyper-V, etc.

The second method requires rationalization and server consolidation. This essentially requires taking a hard look at the hardware infrastructure, the applications and their computing needs, and coming up with a solution in which more powerful mainframes or servers replace the existing, less powerful infrastructure. Consolidation has multiple benefits. Many distributed data centers can be replaced with a single consolidated data center built on today’s powerful multi-core, multi-processor servers. This results in greatly reduced operational costs, easier management, savings from reduced power and cooling requirements, real estate savings, etc. Consolidation truly appears to be the “silver bullet” for server sprawl.
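Before turning to the catch, a rough, purely illustrative estimate of what consolidation can mean in numbers may help. Every input below (server counts, utilization, power draw, tariffs) is an assumption for the sketch, not real data.

```python
# Illustrative consolidation estimator. All inputs are assumptions.
import math

legacy_servers = 400           # assumed server count across dispersed data centers
avg_utilization = 0.10         # assumed average utilization of legacy servers
consolidation_ratio = 16       # assumed: one modern multi-core server ~ 16 legacy boxes
target_utilization = 0.60      # assumed utilization we are willing to run at

power_per_server_kw = 0.4      # assumed average power draw per server (kW)
pue = 1.8                      # assumed Power Usage Effectiveness (cooling overhead)
cost_per_kwh = 0.10            # assumed electricity cost per kWh

# Useful work being done by the sprawl, in "legacy server" units
useful_work = legacy_servers * avg_utilization

# Servers needed after consolidation to carry the same work
new_servers = math.ceil(useful_work / (consolidation_ratio * target_utilization))

def annual_power_cost(count: int) -> float:
    """Yearly electricity + cooling cost for `count` servers."""
    return count * power_per_server_kw * pue * cost_per_kwh * 24 * 365

print(f"Servers before/after : {legacy_servers} -> {new_servers}")
print(f"Annual power+cooling : {annual_power_cost(legacy_servers):,.0f} -> "
      f"{annual_power_cost(new_servers):,.0f}")
```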

However, this brings us to what I call “the data center paradox”. While a consolidated data center can do away with the operational expenses of multiple data centers, reduce power and cooling costs and save on real estate, it introduces WAN latencies. When geographically dispersed data centers across the globe are replaced with a consolidated data center in a single location, the access times from different geographical areas can result in poor response times. Besides, there is also an inherent cost to data access over the WAN.
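To see why the latency matters, here is a simple estimate of the round-trip times a single consolidated data center imposes on far-away users. The distances, the assumed data center location and the "round trips per page" figure are illustrative assumptions; the only physics used is that light in fibre covers roughly 200 km per millisecond.

```python
# Illustrative WAN latency estimate for a single consolidated data center.
# Distances and the round-trips-per-page figure are assumptions.

FIBRE_KM_PER_MS = 200.0       # light in fibre travels roughly 200,000 km/s

# Assumed rough distances (km) from a data center on the US east coast
distances_km = {
    "Same metro area": 50,
    "US west coast":   4_000,
    "Europe":          6_500,
    "India":           13_000,
    "Australia":       16_000,
}

ROUND_TRIPS_PER_PAGE = 5      # assumed: handshakes plus a few sequential requests

for region, km in distances_km.items():
    rtt_ms = 2 * km / FIBRE_KM_PER_MS                 # one network round trip
    page_penalty_ms = rtt_ms * ROUND_TRIPS_PER_PAGE   # added delay per page load
    print(f"{region:15s}: RTT ~ {rtt_ms:6.1f} ms, added page latency ~ {page_penalty_ms:6.1f} ms")
```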

The WAN introduces latencies that are difficult to eliminate. There are technologies that can lessen the bandwidth problem to some extent; WAN optimization is one such technology.

In fact, e-commerce sites and many web applications intentionally spread themselves across geographical regions to provide better response times.

So while on the one hand consolidation results in cost savings, more efficient management of a single data center, reduced power and cooling costs and real estate savings, on the other it results in WAN latencies and associated bandwidth costs.

Unless there is a breakthrough innovation in WAN technologies this will be a paradox that architects and CIOs will have to contend with.


Software Defined Networks (SDNs): A glimpse of tomorrow

Published in Telecom Asia, Jul 28, 2011 – A glimpse into the future of networking

Published in Telecoms Europe, Jul 28, 2011 – SDNs are new era for networking

Networks and networking, as we know them, are on the verge of a momentous change, thanks to a path-breaking technological concept known as Software Defined Networks (SDN). SDN is the result of pioneering efforts by Stanford University and the University of California, Berkeley. It is based on the OpenFlow Protocol and represents a paradigm shift in the way networking elements operate.

Networks and network elements of today have been largely closed and based on proprietary architectures. In today’s networks, the switching and routing of data packets happen in the same network element, e.g. the router.

Software Defined Networks (SDN) decouple the routing and switching of data flows and move the control of the flows to a separate network element, namely the flow controller. The motivation for this is that the flow of data packets through the network can be controlled in a programmatic manner. A Flow Controller can typically be implemented on a standard PC. In some ways this is reminiscent of Intelligent Networks and the Intelligent Network Protocol, which delinked the service logic from switching and moved it to a network element known as the Service Control Point.

The OpenFlow Protocol has 3 components: the Flow Controller that controls the flows, the OpenFlow switch with its Flow Table, and a secure connection between the Flow Controller and the OpenFlow switch. The OpenFlow Protocol is an open API specification for modifying the flow table that exists in routers and Ethernet switches. The ability to securely control the flow of traffic programmatically opens up amazing possibilities.
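To make the controller / switch / flow table split more concrete, here is a toy Python sketch. It is not the actual OpenFlow wire protocol; the match fields and action strings are simplified assumptions chosen for illustration.

```python
# Toy model of the OpenFlow idea: the controller installs match -> action
# entries into a switch's flow table; the switch forwards by table lookup.
from dataclasses import dataclass, field

@dataclass
class FlowEntry:
    match: dict                  # e.g. {"ip_dst": "10.0.0.5", "tcp_dst": 80}
    actions: list                # e.g. ["output:3"] or ["drop"]
    priority: int = 0

@dataclass
class OpenFlowSwitch:
    flow_table: list = field(default_factory=list)

    def install_flow(self, entry: FlowEntry) -> None:
        """Called by the Flow Controller over the secure channel."""
        self.flow_table.append(entry)
        self.flow_table.sort(key=lambda e: e.priority, reverse=True)

    def forward(self, packet: dict) -> list:
        """Return the actions of the highest-priority matching entry."""
        for entry in self.flow_table:
            if all(packet.get(k) == v for k, v in entry.match.items()):
                return entry.actions
        return ["send_to_controller"]    # table miss: ask the controller

# The controller programs the data path ...
switch = OpenFlowSwitch()
switch.install_flow(FlowEntry({"ip_dst": "10.0.0.5", "tcp_dst": 80}, ["output:3"], priority=10))
switch.install_flow(FlowEntry({"ip_dst": "10.0.0.9"}, ["drop"], priority=5))

# ... and the switch forwards purely by flow-table lookup.
print(switch.forward({"ip_dst": "10.0.0.5", "tcp_dst": 80, "ip_src": "192.168.1.7"}))
print(switch.forward({"ip_dst": "172.16.0.1"}))
```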

[Figure: The OpenFlow specification]

Alternatively, vendors can implement the OpenFlow Protocol as an added feature on their existing routers and Ethernet switches. This will enable these routers and Ethernet switches to support both production traffic and research traffic using the same set of network resources.

The single greatest advantage of separating the control and data planes of network routers and Ethernet switches is the ability to modify and control different traffic flows through a set of network resources. In addition to this benefit, Software Defined Networks (SDNs) also include the ability to virtualize network resources. Virtualized network resources are known as a “network slice”. A slice can span several network elements, including the network backbone, routers and hosts.

Computing resources can be virtualized through the use of the hypervisor, which abstracts the hardware and enables several guest OSes to run in complete isolation. Similarly, when a network element called a FlowVisor (which has been experimentally demonstrated) is used along with the OpenFlow Controller, it is possible to virtualize the network resources. Each traffic flow then gets its own combination of bandwidth, routers and computing resources. Hence Software Defined Networks (SDNs) are also known as Virtualized Programmable Networks, owing to the ability of different traffic flows to co-exist in perfect isolation of one another while being controlled by programs in the Flow Controller.
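A minimal sketch of the slicing idea follows, assuming a FlowVisor-like policy that assigns traffic to slices by VLAN id and hands each slice to its own controller. The VLAN ranges and controller addresses are made up for the example.

```python
# Toy FlowVisor-style slicing policy: each slice owns a flowspace (a VLAN range
# here) and is managed by its own controller. Values are illustrative only.
from typing import Optional

slices = {
    "production": {"vlans": range(1, 100),   "controller": "tcp:10.0.0.1:6633"},
    "research":   {"vlans": range(100, 200), "controller": "tcp:10.0.0.2:6633"},
}

def owning_slice(vlan_id: int) -> Optional[str]:
    """Decide which slice (and hence which controller) owns this traffic."""
    for name, policy in slices.items():
        if vlan_id in policy["vlans"]:
            return name
    return None    # traffic outside every slice stays isolated / is dropped

for vlan in (42, 150, 300):
    s = owning_slice(vlan)
    target = slices[s]["controller"] if s else "none (dropped)"
    print(f"VLAN {vlan:3d} -> slice: {s or 'unassigned':11s} controller: {target}")
```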

The ability to manage different types of traffic flows across network resources opens up endless possibilities. SDNs have been successfully demonstrated in wireless handoffs between networks and in running multiple flows through a common set of resources. SDNs in public and private clouds allow appropriate resources to be pooled at different times of the day based on the geographical location of the requests. Telcos could optimize the usage of their backbone network based on peak and lean traffic periods in the core network.

The OpenFlow Protocol has already gained widespread support in the industry and has resulted in the formation of the Open Networking Foundation (ONF). The members of the ONF range from behemoths like Google, Facebook, Yahoo and Deutsche Telekom to networking giants like Cisco, Juniper, IBM and Brocade. Currently the ONF has around 43 member companies.

Software Defined Networks are a tectonic shift in the way networks operate and truly represent the dawn of a new networking era. A related post of interest is “Adding the OpenFlow variable in the IMS equation”.
