Re-imagining the Web portal

Published in Telecom Asia, Mar 16, 2012 – Re-imagining the web portal

Web portals had their heyday in the mid-1990s. Remember Lycos, AltaVista, Yahoo and Excite – portals which had neatly partitioned the web into compartments, e.g. Autos, Beauty, Health, Games and so on. Enter Google, with a webpage carrying nothing but a single search bar. With a single stroke Google pushed all the portals into virtual oblivion.

It became obvious to the user that all information was just a “search away”. There was no longer any need for a neat categorization of all the information on the web, and no need to work your way through links only to find your information at the “bottom of the heap”. The user was content to search their way to the needed information.

That was then, in the late 1990s. But much has changed since. Countless pages have been uploaded to the millions of servers that make up the internet, and there is vastly more information on the World Wide Web: news articles, wikis, blogs, tweets, webinars, podcasts, photos, YouTube content, social networks and so on.

Here are some fun facts about the internet: it contains 8.11 billion indexed pages (Worldwidewebsize) and has more than 1.97 billion users and 266 million websites (State of the Internet). We can expect the size to keep growing as the rate of information generation and our thirst for information keep increasing.

In this world of exploding information the “humble search” will no longer be sufficient. As users we would like to browse the web in a much more efficient, effective and personalized way. Neither will site aggregators like StumbleUpon, Digg, Reddit and the like be adequate. We need a smart way to navigate through this information deluge.

It is here, I think, that there is a great opportunity for re-imagining the Web Portal. It would be great if the user were shown a view of the web personalized to his own tastes and interests. What I am proposing is a Web portal that metamorphoses dynamically, with the user’s click stream, browsing preferences, interests and inclinations as its focal center. Besides the user’s own interests, the web portal would also analyze the click streams of the user’s close friends, colleagues and associates. Finally, the portal would include inputs from what the world at large is interested in and following. The web portal would analyze these preferences and then compose a page based on its analysis of what the user would like to see.
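To make this concrete, here is a minimal sketch, purely illustrative, of how such a portal might blend the three click-stream sources described above. The weights, topics and counts are assumptions of mine, not part of the proposal itself.

```python
# A purely illustrative sketch of blending three click-stream signal sources:
# the user's own clicks, the user's friends' clicks, and global trends.
from collections import Counter

def personalized_topics(user_clicks, friends_clicks, global_trends,
                        w_user=0.6, w_friends=0.3, w_world=0.1, top_n=5):
    """Score topics by a weighted blend of the three click-stream sources."""
    scores = Counter()
    for weight, clicks in ((w_user, user_clicks),
                           (w_friends, friends_clicks),
                           (w_world, global_trends)):
        total = sum(clicks.values()) or 1
        for topic, count in clicks.items():
            scores[topic] += weight * count / total   # normalize each source
    return [topic for topic, _ in scores.most_common(top_n)]

# Assumed toy data: click counts per topic
user = Counter({"cricket": 12, "telecom": 8, "cloud": 5})
friends = Counter({"movies": 20, "cricket": 10})
world = Counter({"elections": 50, "movies": 30})
print(personalized_topics(user, friends, world))
```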

This can be represented in the diagram below

We have all heard of Google Zeitgeist, which is in effect a massive database of the world’s inclinations and tendencies. Similar databases are probably also held by Yahoo, Microsoft, Facebook, Twitter and others.

The Web portal in its new incarnation would present content tailored specifically to each user’s browsing patterns. A single page would include all the news, status updates, latest YouTube videos, tweets and so on that he would like to see.

In fact this whole functionality could be integrated into the Web browser. In its new avatar the Web Portal would have content that is dynamic, current and personalized to each individual user. Besides, every user would also be able to view what his friends, colleagues and the world at large are browsing.

A few years down the line we may see “the return of the dynamic, re-invented Web Portal”.


The promise of predictive analytics

Published in Telecom Asia – Feb 20, 2012 –  The promise of predictive analytics

Published in Telecoms Europe – Feb 20, 2012 – Predictive analytics gold rush due

We are headed towards a more connected, more instrumented and more data-driven world. This fact is underscored once again in Cisco’s latest Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2011–2016. The statistics from this report are truly mind-boggling.

By 2016, 130 exabytes (130 * 2^60 bytes) will rip through the internet annually. The number of mobile devices will exceed the human population this year, 2012, and by 2016 the number of connected devices will touch almost 10 billion.

The devices connected to the net range from mobiles, laptops and tablets to sensors and the millions of devices that make up the “internet of things”. All these devices will constantly spew data onto the internet, and business and strategic decisions will be made by determining patterns, trends and outliers among mountains of data.

Predictive analytics will be a key discipline in our future and its experts will be much sought after. Predictive analytics uses statistical methods to mine information and patterns from structured data, unstructured data and data streams. The data can be anything from click streams and browsing patterns to tweets and sensor readings, and it can be static or dynamic. Predictive analytics will have to identify trends in data streams from mobile call records, retail store purchasing patterns and the like.
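As a toy illustration of the idea, the sketch below fits a simple trend to a synthetic stream of historical observations and predicts the next few values. The data and the model are assumptions for illustration only; real predictive analytics would use far richer models and data.

```python
# A minimal sketch: fit a simple model to historical observations and
# use it to predict the next values. The data here is synthetic.
import numpy as np

rng = np.random.default_rng(42)
days = np.arange(60)
# Assumed toy signal: an upward trend plus noise (e.g. daily data traffic in TB)
traffic = 5.0 + 0.3 * days + rng.normal(0, 1.5, size=days.size)

# Fit a linear trend (ordinary least squares) to the observed history
slope, intercept = np.polyfit(days, traffic, deg=1)

# Predict the next week of traffic from the fitted trend
future = np.arange(60, 67)
forecast = intercept + slope * future
print("forecast (TB/day):", np.round(forecast, 1))
```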

Predictive analytics will be applied across many domains – banking, insurance, retail, telecom, energy and more. In fact predictive analytics will be the new language of the future, akin to what C was a couple of decades ago. The C language was used in all sorts of applications spanning the whole gamut from finance to telecom.

In this context it is worthwhile to mention the R language, which is used for statistical computing and graphics. Wikipedia notes that “R provides a wide variety of statistical and graphical techniques, including linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, and others”.

Predictive analytics is already being used in traffic management for identifying and preventing traffic gridlocks. Applications have also been identified for energy grids and water management, besides determining user sentiment by mining data from social networks.

One very ambitious undertaking is the Data-Scope Project, which holds that the universe is made of information and that a “new eye” is needed to look at this data. The Data-Scope is described as “a new scientific instrument, capable of ‘observing’ immense volumes of data from various scientific domains such as astronomy, fluid mechanics, and bioinformatics. The system will have over 6PB of storage, about 500GBytes per sec aggregate sequential IO, about 20M IOPS, and about 130TFlops. The Data-Scope is not a traditional multi-user computing cluster, but a new kind of instrument, that enables people to do science with datasets ranging between 100TB and 1000TB”. The project is based on the premise that new discoveries will come from the analysis of large amounts of data. Analytics is all about analyzing large datasets, and predictive analytics takes it one step further by making intelligent predictions based on the available data.

Predictive analytics does open up a whole new universe of possibilities and the applications are endless.  Predictive analytics will be the key tool that will be used in our data intensive future.

Afterthought

I started to wonder whether predictive analytics could be used for some of the problems confronting the world today. Here are a few problems where analytics could be employed:

–          Can predictive analytics be used to analyze outbreaks of malaria, cholera or AIDS and help prevent outbreaks in other places?

–          Can analytics analyze economic trends and predict an upward or downward turn ahead of time?


The data center paradox

In today’s globalized environment organizations are spread geographically across the globe. Such globalization brings multiple advantages, ranging from quicker penetration into foreign markets to the cost advantage of the local workforce, but it also results in the organization having data centers spread over different geographical areas. Besides, mergers and acquisitions of businesses spread across the globe result in hardware and server sprawl.

Applications in these dispersed servers tend to be siloed, with legacy hardware, different OSes and disparate software executing on them.

The cost of maintaining different data centers can be a prickly problem. There are several costs in managing a data center, chief among them operational costs, real estate costs, and power and cooling costs. Hardware and server sprawl is a real problem and the enterprise must look at ways to solve it.

There are two techniques to manage hardware and server sprawl.

The first method is to use virtualization so that hardware and server sprawl can be reduced. Virtualization abstracts the raw hardware through the use of special software called the hypervisor. Any guest OS, namely Windows, Linux or Solaris, can execute on top of the hypervisor. The key benefit that virtualization brings to the enterprise is that it abstracts the hardware, storage and network and creates a shared pool of compute, storage and network for the different applications to utilize. Hence server sprawl can be mitigated to some extent through the use of virtualization software such as VMware, Xen or Hyper-V.

The second method requires rationalization and server consolidation. This essentially means taking a hard look at the hardware infrastructure, the applications and their computing needs, and coming up with a solution in which more powerful mainframes or servers replace the existing, less powerful infrastructure. Consolidation has multiple benefits. Many distributed data centers can be replaced with a single consolidated data center built on today’s powerful multi-core, multi-processor servers. This results in greatly reduced operational costs, easier management, savings from reduced power and cooling requirements, and real estate savings. Consolidation truly appears to be the “silver bullet” for server sprawl.

However this brings us to what I call “the data center paradox”. While a consolidated data center does away with the operational expenses of multiple data centers, reduces power and cooling costs and saves on real estate, it introduces WAN latencies. When geographically dispersed data centers across the globe are replaced with a consolidated data center in a single location, the access times from distant geographical areas can result in poor response times. Besides, there is also an inherent cost of data access over the WAN.
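A back-of-the-envelope sketch of the paradox, using assumed distances and round-trip counts, shows how quickly WAN latency adds up once users sit far from a consolidated data center.

```python
# Illustrative figures only: light travels roughly 200 km per millisecond in fiber,
# and a typical page load involves several request/response round trips.
SPEED_OF_LIGHT_FIBER_KM_PER_MS = 200

def added_latency_ms(distance_km, round_trips=5):
    """Extra delay when a user must cross the WAN to a distant consolidated DC."""
    one_way = distance_km / SPEED_OF_LIGHT_FIBER_KM_PER_MS
    return 2 * one_way * round_trips   # each request/response pair is a round trip

# Compare a nearby regional DC with a consolidated DC on another continent
for name, km in (("regional DC", 500), ("consolidated DC", 12000)):
    print(f"{name}: ~{added_latency_ms(km):.0f} ms added per page (5 round trips)")
```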

WAN latencies are difficult to eliminate. There are technologies that can lessen the bandwidth problem to some extent; WAN optimization is one of them.

In fact e-commerce sites and many web applications intentionally spread their deployments across geographical regions to provide better response times.

So while consolidation brings cost savings, the efficiency of managing a single data center, reduced power and cooling costs and real estate savings, it also brings WAN latencies and the associated bandwidth costs.

Unless there is a breakthrough innovation in WAN technologies this will be a paradox that architects and CIOs will have to contend with.


Technological hurdles: 2012 and beyond

Published in Telecom Asia, Jan 11,2012 – Technological hurdles – 2012 and beyond

You must have heard it all by now – the technological trends for 2012 and the future. The predictions range over BigData, cloud computing, internet of things, LTE, semantic web, social commerce and so on.

In this post, I thought I should focus on what seem to be significant hurdles as we advance into the future. So for a change, I wanted to play the doomsayer rather than the soothsayer. The positive trends are bound to continue and in our exuberance we may lose sight of the hurdles before us. Besides, “problems are usually opportunities in disguise”. So here is my list of the top issues facing the industry now.

Bandwidth shortage: A key issue of the computing infrastructure of today is data affinity, which is the result of the dual issues of data latency and the economics of data transfer. Jim Gray (Turing Award, 1998), in his paper “Distributed Computing Economics”, states that programs need to be migrated to the data on which they operate rather than transferring large amounts of data to the programs. In this paper Jim Gray tells us that the economics of today’s computing depends on four factors, namely computation, networking, database storage and database access. He then equates one dollar as follows.

One dollar equates to

≈ 1 GB sent over the WAN

≈ 10 Tops (tera CPU operations)

≈ 8 hours of CPU time

≈ 1 GB of disk space

≈ 10 M database accesses

≈ 10 TB of disk bandwidth

≈ 10 TB of LAN bandwidth

As can be seen from the above breakup, there is a disproportionate contribution from WAN bandwidth in comparison to the others. In other words, while the processing power of CPUs and storage capacities have multiplied, accompanied by dropping prices, the cost of bandwidth has remained high. Moreover, the available bandwidth is insufficient to handle the explosion of data traffic.
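A rough worked example using the $1 equivalences above makes the point: moving a terabyte to the computation costs orders of magnitude more than a day of computation on it.

```python
# A rough worked example based on Jim Gray's $1 equivalences quoted above.
WAN_COST_PER_GB = 1.0        # $1 buys ~1 GB over the WAN
CPU_HOURS_PER_DOLLAR = 8     # $1 buys ~8 hours of CPU time

data_tb = 1                                  # move 1 TB of data
wan_cost = data_tb * 1024 * WAN_COST_PER_GB  # ~$1024 just to move it
compute_cost = 24 / CPU_HOURS_PER_DOLLAR     # ~$3 for a full day of CPU time

print(f"Moving {data_tb} TB over the WAN: ~${wan_cost:.0f}")
print(f"Crunching it for 24 CPU-hours:   ~${compute_cost:.0f}")
```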

In fact it has been found that the “cheapest and fastest way to move a Terabyte cross country is sneakernet” (i.e. the transfer of electronic information by physically carrying removable media such as magnetic tape, compact discs, DVDs, USB flash drives or external drives from one computer to another).

With the burgeoning of bandwidth-hungry applications it is obvious that we are going to face a bandwidth shortage. The industry will have to come up with innovative solutions to provide what I would like to refer to as “bandwidth-on-demand”.

The Spectrum Crunch: Powerful smartphones, extremely fast networks, content-rich applications and increasing user awareness have together resulted in a virtual explosion of mobile broadband data usage. There are two key drivers behind this phenomenal growth in mobile data. The first is the explosion of devices – smartphones, tablet PCs, e-readers and laptops with wireless access – all delivering high-speed content and web browsing on the move. The second is video: over 30% of overall mobile data traffic is video streaming, which is extremely bandwidth-hungry. The rest of the traffic is web browsing, file downloads and email.

The growth in mobile data traffic has been exponential. According to a report by Ericsson, mobile data is expected to double annually through 2015. Mobile broadband will see a billion subscribers this year (2011), and possibly touch 5 billion by 2015.

According to a report from IDATE (a consulting firm), total mobile data traffic will exceed 127 exabytes (an exabyte is 10^18 bytes, or 1 million terabytes) by 2020, a more than 33-fold increase over 2010.

Given current usage trends, coupled with the theoretical limits of the available spectrum, the world will run out of spectrum for the growing army of mobile users. The current spectrum availability cannot support the surge in mobile data traffic indefinitely, and demand for wireless capacity will outstrip spectrum availability by the middle of this decade, around 2014.

This is a really serious problem – serious enough for the White House to issue a memo titled “Unleashing the Wireless Broadband Revolution”. The US Federal Communications Commission (FCC) has now taken steps to meet the demand by letting wireless users access content via unused airwaves on the broadcast spectrum known as “White Spaces”. Google and Microsoft are already working on this technology, which will allow laptops, smartphones and other wireless devices to transfer data in gigabytes instead of megabytes over Wi-Fi.

But spectrum shortage is a pressing problem that needs to be addressed immediately.

IPv4 exhaustion: IPv4 address space exhaustion has been an issue for quite some time and warrants serious attention in the not too distant future. This problem may be even more serious than the Y2K problem. The issue is that IPv4 can address only 2^32, or about 4.3 billion, devices. The pool has already been exhausted because of new technologies like IMS, which uses an all-IP core, and the Internet of Things, with ever more devices and sensors connected to the internet, each identified by an IP address. The solution to this problem was worked out long ago and requires that the internet adopt the IPv6 addressing scheme. IPv6 uses 128-bit addresses and allows 3.4 x 10^38, or 340 trillion trillion trillion, unique addresses. However the conversion to IPv6 is not happening at the required pace and will soon have to be taken up on a war footing. It is clear that while the transition takes place both IPv4 and IPv6 will co-exist, so there will be an additional requirement for devices on the internet to be able to convert from one to the other.
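A quick back-of-the-envelope check of the address-space figures quoted above:

```python
# Verify the IPv4 vs IPv6 address-space sizes mentioned in the text.
ipv4_addresses = 2 ** 32          # ~4.3 billion
ipv6_addresses = 2 ** 128         # ~3.4 x 10^38

print(f"IPv4: {ipv4_addresses:,} addresses")            # 4,294,967,296
print(f"IPv6: {ipv6_addresses:.2e} addresses")          # ~3.40e+38
print(f"IPv6/IPv4 ratio: {ipv6_addresses // ipv4_addresses:.2e}")
```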

We are bound to run into a wall if organizations and enterprises do not upgrade their devices to be able to handle IPv6.

Conclusion: These are some of the technological hurdles that confront the computing industry.  Given mankind’s ability to come up with innovative solutions we may find new industries being spawned in solving these bottlenecks.


Tomorrow’s wireless ecosystem

The wireless networks of today had their humble beginnings in 1924, when the first mobile radio was demonstrated. It took many years from that beginning for a completely functional cellular network to be established. The earliest systems were the analog 1G systems, demonstrated with great success in the US in 1978. The initial mobile systems were primarily used for making voice calls, and this continued for the next two decades as the networks evolved to digital 2G systems.

 

It was around 1999-2000 that ETSI standardized GPRS, or 2.5G technology, to use the cellular network for data. Though the early data rates of 144 kbps were modest, the entry of GPRS proved to be a turning point in technological history. GPRS provided the triple benefits of wireless connectivity, mobility and internet access. Technological advancement enabled faster and higher speeds of wireless, mobile access to the internet. The deployments of 3G enabled speeds of up to 2 Mbps for fixed access, while LTE promises speeds of almost 56 Mbps coupled with excellent spectral efficiency.

 

The large increase in bandwidth, along with mobility, has allowed different technologies to take advantage of the wireless infrastructure for their own purposes. While Wi-Fi networks based on 802.11 and WiMAX based on 802.16 will play a part in the wireless ecosystem, this post looks at the role that will be played by cellular networks from 2G to 4G.

 

The cellular network, with its wireless access, mobility and ability to handle voice, video and data calls, will host multiple disparate technologies as we move into the future. Listed below are some of the major users of the wireless network in the future.

 

Mobile Phones:  The cellular network was created to handle voice calls originating from mobile phones, and a large part of mobile traffic will still be mobile-to-mobile calls. As the penetration of cellular networks grows in emerging economies we can expect considerable traffic from voice calls. It is likely that as the IP Multimedia Subsystem (IMS) finds widespread acceptance, the mobile phone will also be used for making video calls; with the advent of the smartphone this is a distinct possibility in the future.

Smartphones, tablets and laptops: These devices will be the next major users of the cellular network. Smartphones, besides being able to make calls, also allow for many new, compelling data applications. Exciting apps on tablets like the iPad, and on laptops, consume a lot of bandwidth and use the GPRS, 3G or LTE network for data transfer. In fact a recent report found that the majority of data traffic in the wireless network is video, as consumers use the iPad and the laptop for watching videos on YouTube and for browsing over the wireless network.

Internet of Things (IoT):  The Internet of Things, also known as M2M, envisages a network in which passive or intelligent devices are spread throughout the network and collect and transmit data to back-end databases. RFIDs were the early enablers of this technology. These sensors and intelligent devices will collect data and transmit it over the wireless network. Applications for the Internet of Things range from devices that monitor and transmit data about the health of cardiac patients to devices that monitor the structural integrity of bridges.

Smart Grid: The energy industry is delicately poised for a complete transformation with the evolution of the smart grid concept. There is now an imminent need for increased efficiency in power generation, transmission and distribution, coupled with a reduction of energy losses. In this context many leading players in the energy industry are coming up with a connected, end-to-end digital grid to smartly manage energy transmission and distribution. The digital grid will have smart meters, sensors and other devices distributed throughout the grid, capable of sensing, collecting, analyzing and distributing data to devices that can act on it. The huge volume of collected data will be sent to intelligent devices which will use wireless 3G networks to transmit the data, after which appropriate action like alternate routing and optimal energy distribution can take place. The Smart Grid will be a major user of the cellular wireless network in the future.

Hence it can be seen that the users of the wireless network will increase dramatically as we move into the future, and multiple technologies will compete for the available bandwidth. Handling this exponential growth in traffic requires not only faster networks but also sufficient spectrum, and it is necessary that the ITU addresses the spectrum needs on a war footing.

It is thus clear that the telecom network will have to become more sophisticated and more technologically advanced as we move forward into the future.



The Future is C-cubed: Computing, Communication and the Cloud

We are on the verge of the next great stage of technological evolution. The trickle of different trends clearly points to what I would like to term C-cubed (C3): the merger of computing technologies, communication advances and the cloud.

There are no surprises in this assessment. It clearly does not fall into the category of chaos theory’s “butterfly effect”, where a seemingly unrelated cause has a far-reaching effect, the proverbial fluttering of a butterfly in Puerto Rico being enough to cause an earthquake in China.

The C-cubed future that seems very probable is based on advances in mobile broadband, advances in communication and the emergence of cloud computing. Sun Microsystems’ Scott McNealy long maintained that “the network is the computer”, and with the introduction of Google’s Chromebook this trend will soon catch on. In fact I can easily visualize a ubiquitous device which I would like to call the “cloudbook”.

The cloudbook would be a device resembling a tablet like the iPad or PlayBook but carrying little or no hard disk. Local storage would be through USB devices or SD cards, which these days offer capacities of 80GB and above. The cloudbook would have no operating system of its own; it would simply have a bootstrap program allowing the user to choose from several operating systems, namely Windows, Linux, Solaris, Mac OS and so on, which would execute on the cloud. All applications would be executed directly on the cloud, and the user would also store all his programs and data on the cloud. Some amount of offline storage would be possible in portable storage devices like memory sticks and SD cards.

The cloudbook will be a ubiquitous device. It will access the internet through mobile broadband, over a GPRS, WCDMA or LTE connection. With the blazing speeds of 56 Mbps promised by LTE, accessing the public cloud to execute programs and store data is extremely feasible, and access should be almost instantaneous. Using mobile broadband for access and the cloud for computing and storage will be the trend in the future.

Besides its use for computing, the cloudbook will also be used for making voice and video calls. This is the promise of the IP Multimedia Subsystem (IMS), a technology that has been waiting in the wings for quite some time. IMS envisages an all-IP core network that will be used for transporting voice, data and video. As the IP pipes become faster and the algorithms to iron out QoS issues are worked out, the complete magnificence of the IMS vision will become a reality and high-speed video applications will become commonplace.

The cloudbook will use the WCDMA (3G) network to make voice and video calls to others. The 3G RNCs or the 4G eNodeBs will enable the transmission and reception of voice, data and video to and from the core network. LTE networks will use either Circuit Switched Fallback (CSFB) or VoLTE (Voice over LTE) to carry voice and video over the 3G network or over the Evolved Packet Core (EPC). In the future, high-speed video-based calls and applications will be extremely prevalent and a device like the cloudbook will improve the user experience manifold.

Besides this, IMS also envisions Application Servers (AS) spread across the network providing other services like video-on-demand and real-time multiplayer gaming. It is quite likely that these AS will actually be instances running on the public cloud.

Hence the future clearly points to a marriage of computing, communication and the cloud, in which each has a symbiotic relationship with the others. The network can be visualized as one large ambient network of IMS Call Session Control Function servers (CSCFs), virtualized servers on the cloud and Application Servers (AS).

Mobile broadband will become commonplace and all computing and communication will be through 3G or 4G networks.

The future is almost here and the future is C-cubed (C3)!!!

Published in Telecom Asia, Jul 8 2011 – The Future is C-cubed


Managing Multi-Region Deployments

If there is one lesson from this year’s major Amazon EC2 outage, it is “don’t deploy all your application instances in a single region”. The outage has clearly demonstrated that entire regions are not immune to disasters. It has thus become imperative for designers and architects to deploy applications spanning major regions. Currently there are 4 major regions – US-West, US-East, Europe and APAC.

Both fundamentally and from a strategic point of view it makes sense to deploy web applications in different regions, e.g. both in US-East and US-West. This builds a certain amount of geographical resiliency into the application. In this way you are protected from major debacles like the Amazon EC2 outage of April 2011, or from a meteor crashing and burning in one of the data centers.

Deploying instances in different regions is much like minimizing risk by diversifying your portfolio. The design of the application, besides including other methods of fault tolerance, should also incorporate geographical resilience.

Currently Amazon’s ELB does not support load balancing across regions; it can only distribute traffic among instances in different availability zones of a single region. The solution is to go for other DNS services like UltraDNS, DNSMadeEasy or DynDNS.

These DNS services provide GeoIP-based load balancing that can distribute traffic based on the region from which it originated. Besides balancing the load by origin, GeoIP-based traffic distribution has the added benefit of directing users to the application deployment closest to them, thus reducing latencies.

The GeoIP-based traffic distributor routes traffic to the closest region, and an Amazon ELB can then internally distribute the traffic among the instances within that region. For a look at some typical problems in multi-region cloud deployments do take a look at my post “Cache-22”.
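A minimal sketch of this two-level scheme is shown below. The region names and endpoints are purely hypothetical, and this is not any DNS provider's actual API.

```python
# GeoIP-style routing sketch: map the client's region to the nearest deployment
# region, then let a per-region load balancer (e.g. an ELB) spread traffic
# among the instances within that region. Endpoints below are hypothetical.
NEAREST_REGION = {
    "NA-WEST": "us-west", "NA-EAST": "us-east",
    "EU": "europe", "ASIA": "apac",
}
REGION_ENDPOINTS = {
    "us-west": "lb.us-west.example.com",
    "us-east": "lb.us-east.example.com",
    "europe":  "lb.eu.example.com",
    "apac":    "lb.apac.example.com",
}

def route(client_region: str) -> str:
    """Return the load-balancer endpoint closest to the client's region."""
    region = NEAREST_REGION.get(client_region, "us-east")  # assumed default
    return REGION_ENDPOINTS[region]

print(route("EU"))    # -> lb.eu.example.com
```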


Deploying across regions


Singularity

Pete Mettle felt drowsy. He had been working for days on his new inference algorithm. Pete had been in the field of Artificial Intelligence (AI) for close to three decades and had established himself as the father of “semantics”. He was particularly renowned for his three principles of Artificial Intelligence. He had postulated the Principles of Learning as follows.

The Principle of Knowledge Acquisition: This principle laid out the guidelines for knowledge acquisition by an algorithm. It clearly laid out the rules for what was knowledge and what was not, and could clearly separate the wheat from the chaff in any textbook or research article.

The Principle of Knowledge Assimilation: This law gave the process for organizing the acquired knowledge into facts, rules and underlying principles. Knowledge assimilation involved storing the individual rules and the relations between them, and provided the basis for drawing conclusions from them.

The Principle of Knowledge Application: This principle, according to Pete, was the most important. It showed how all the knowledge acquired and assimilated could be used to draw inferences and conclusions. In fact it also showed how knowledge could be extrapolated to make safe conclusions.

Zengine: The above three principles of Pete’s were hailed as a major landmark in AI. Pete started to work on an inference engine known as the “Zengine”, based on these three principles, and was almost finished fine-tuning his algorithm. Pete wanted to test the Zengine on the World Wide Web, which had grown to gigantic proportions. A report in the May 2025 issue of the Wall Street Journal mentioned that the total data held on the internet had crossed 400 zettabytes and that the daily data stored on the web was close to 20 terabytes. It was a well-known fact that there was an enormous amount of information on the web – wikis, blogs, articles, ideas, social networks and so on – on almost every conceivable topic under the sun.

Pete was given special permission by the governments of the world to run his Zengine on the internet. It was Pete’s theory that it would take the Zengine close to a year to process the information on the web and make any reasonable inferences from it. Accompanied by worldwide publicity, the Zengine started its work of assimilating the information on the World Wide Web. It was programmed to periodically give Pete a status update on its progress.

A few months passed. The Zengine kept giving updates on the number of sites, periodicals and blogs it had condensed into its knowledge database. After about 10 months Pete received a mail. It read: “Markets will crash in March 2026. Petrol prices will skyrocket – Zengine.” Pete was surprised at the forecast, so he invoked the API to check on what basis the claim had been made. To his surprise and amazement he found that a lot of events happening in the world had been used to make the claim, and they clearly seemed to point in that direction. A couple of months down the line there was another terse statement: “Rebellion very likely in Mogadishu in Dec 2027 – Zengine.” The Zengine also came up with corollaries to Fermat’s last theorem. It was becoming clear to Pete and to everybody else that the Zengine was growing smarter by the day. The only question was when the Zengine would become more powerful than human beings.

Celestial events: Around this time peculiar events were observed all over the world. There was a lot of celestial activity, and phenomena like the aurora borealis became commonplace. On Dec 12, 2026 there was an unusual amount of electrical activity in the sky; everywhere there were streaks of lightning. By evening, slivers of lightning hit the earth in several parts of the world. In fact, viewed from outer space the earth would have resembled a “nebula sphere”, with lightning streaks racing towards it from all directions. This went on for many days. Simultaneously the Zengine was getting more and more powerful; it had even learnt to spawn off multiple processes to gather information and return it.

Time-space discontinuity: People everywhere were petrified by these strange phenomena. On the one hand there was the fear of the takeover of the web by the Zengine, and on the other was this increased celestial activity. Finally, one morning in Jan 2028, there was a powerful crack followed by a sonic boom, and everywhere people experienced a moment of discontinuity. In the briefest of moments there was a natural time-space discontinuity and mankind progressed to the next stage in evolution.

The unconscious, the subconscious and the conscious all became a single faculty of super-consciousness. It has been held from the time of Plato that man knows everything there is to know: according to the Platonic doctrine of Recollection, human beings are born with a soul possessing all knowledge, and learning is just discovering or recollecting what the soul already knows. Similarly, according to Hindu philosophy, behind the individual consciousness of the Atman is the reality known as the Brahman, a universal consciousness attained in a deep state of mysticism through self-inquiry.

However, this evolution, by some strange quirk of coincidence, seemed to coincide with the development of the world’s first truly learning machine. In this super-conscious state a learning machine was not something to be feared but something that could be used to benefit mankind. Just as cranes lift and earthmovers perform tasks that are beyond our physical capacity, so too a learning machine was a useful invention that could be used to harness the knowledge in mankind’s storehouse – the World Wide Web.


Design Principles of Scalable, Distributed Systems

Designing scalable, distributed systems involves a completely different set of principles and paradigms when compared to regular monolithic client-server systems. The typical large distributed systems of Google, Facebook or Amazon are made up of commodity servers. These servers are expected to fail, have disk crashes, run into network issues or be struck by natural disasters.

Rather than assuming that failures and disasters will be the exception, these systems are designed assuming the worst will happen; the principles and protocols assume that failures are the rule rather than the exception. Designing distributed systems to accommodate failures is the key to a good design of distributed, scalable systems. A key consideration in distributed systems is the need to maintain consistency, availability and reliability. This, of course, is limited by the CAP theorem postulated by Eric Brewer, which states that a system can fully provide only two of consistency, availability and partition tolerance.

Some key techniques in distributed systems

Vector Clocks: An obvious issue in distributed systems with hundreds of servers is that each server has its own clock, running at a slightly different rate, so it is difficult to get a view of a global time. How does one determine causality in such a distributed system? A solution is provided by logical clocks, devised by Leslie Lamport, and by their generalization, vector clocks, which provide a way of determining the causal ordering of events. Each system maintains its own logical timestamp (in the vector-clock case, an array with one entry per node) which it keeps incrementing on local events. When a system sends an event to another system, it attaches the timestamp to the message. If the receiver’s own timestamp is already ahead of the sender’s, no adjustment is needed; in the figure, the event sent from System 1 at timestamp 2 and received by System 2 at timestamp 15 is fine, since 2 < 15. However, when System 3 sends an event with timestamp 40 to System 2, whose local timestamp is only 35, System 2 advances its clock to one more than the received timestamp, i.e. 40 + 1 = 41, and continues incrementing from there as before. This ensures that a partial (causal) ordering of events is maintained across systems.
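A minimal vector-clock sketch along these lines, assuming node ids 0..n-1 and one counter per node:

```python
# Minimal vector-clock sketch: one integer entry per node.
class VectorClock:
    def __init__(self, node_id, n_nodes):
        self.node_id = node_id
        self.clock = [0] * n_nodes

    def tick(self):                       # local event
        self.clock[self.node_id] += 1

    def send(self):                       # attach a timestamp to an outgoing message
        self.tick()
        return list(self.clock)

    def receive(self, remote):            # merge: element-wise max, then tick
        self.clock = [max(a, b) for a, b in zip(self.clock, remote)]
        self.tick()

def happened_before(a, b):
    """True if the event with clock `a` causally precedes the event with clock `b`."""
    return all(x <= y for x, y in zip(a, b)) and a != b

n1, n2 = VectorClock(0, 3), VectorClock(1, 3)
msg = n1.send()        # n1: [1, 0, 0]
n2.receive(msg)        # n2: [1, 1, 0]
print(happened_before(msg, n2.clock))   # True: the send causally precedes the receive
```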

Vector clocks have been used in Amazon’s e-retail website to reconcile updates; their use in managing consistency is described in Amazon’s Dynamo architecture.

Distributed Hash Table (DHT): A distributed hash table uses a hash function (128 bits in Dynamo’s case) to distribute keys over several nodes that can be conceptually assumed to reside on the circumference of a circle. The hash space wraps around, so the position after the largest hash value is the smallest one. There are several algorithms used to locate keys on this conceptual circle; one such algorithm is the Chord system. These algorithms try to reach the node owning a key in the smallest number of hops by storing a small amount of routing data locally at each node. The Chord system maintains a finger table that allows it to reach the destination node in O(log n) hops, while other algorithms try to reach the desired node in O(1) hops. Databases like Cassandra and Amazon’s Dynamo use this kind of consistent hashing; Cassandra spreads the keys of records over distributed servers by using a 128-bit hash key.
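A minimal consistent-hashing sketch, with no virtual nodes or replication so that the basic idea stays visible:

```python
# Nodes and keys share one hash ring; a key is stored on the first node
# clockwise from its hash position.
import bisect
import hashlib

def ring_hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)   # 128-bit position

class HashRing:
    def __init__(self, nodes):
        self.ring = sorted((ring_hash(n), n) for n in nodes)

    def node_for(self, key: str) -> str:
        h = ring_hash(key)
        idx = bisect.bisect(self.ring, (h, "")) % len(self.ring)  # wrap around
        return self.ring[idx][1]

ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
for key in ("user:42", "order:7", "cart:1001"):
    print(key, "->", ring.node_for(key))
```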

Quorum Protocol:  Since systems are essentially limited to choosing two of the three parameters of consistency, availability and partition tolerance, tradeoffs are made based on cost, performance and user experience. Google’s BigTable chooses consistency over availability, while Amazon’s Dynamo chooses availability over consistency. While the CAP theorem maintains that only two of the three parameters are fully achievable, it does not mean that Google’s system provides no availability or that Dynamo provides no consistency; in fact Dynamo provides “eventual consistency”, by which data becomes consistent after a period of time.

Since failures are inevitable and a number of servers will fail at any instant of time, writes are replicated across many servers. With the data replicated across N replicas, a write is considered successful if it is acknowledged by a write quorum of W servers (commonly a majority, N/2 + 1), and similarly a read is considered successful when a read quorum of R servers responds. Typical designs use W + R > N as their design criterion, where N is the number of replicas; this ensures that any read quorum overlaps the latest write quorum, so that one can read one’s own writes consistently. Amazon’s Dynamo uses a “sloppy quorum” technique in which data is replicated on the first N healthy nodes, as opposed to the N nodes obtained strictly through consistent hashing.
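The quorum arithmetic itself is simple enough to sketch; the numbers below are illustrative:

```python
# W + R > N guarantees that any read quorum overlaps the most recent write quorum.
def quorum_ok(n_replicas: int, w: int, r: int) -> bool:
    return w + r > n_replicas

def majority(n_replicas: int) -> int:
    return n_replicas // 2 + 1

N = 5
W = R = majority(N)               # 3 write acks and 3 read acks out of 5 replicas
print(f"N={N}, W={W}, R={R}, read-your-writes guaranteed: {quorum_ok(N, W, R)}")
print(f"Weaker setting W=1, R=1 overlaps? {quorum_ok(N, 1, 1)}")   # False
```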

Gossip Protocol: This is the most preferred protocol for letting the servers in a distributed system become aware of server crashes or of new servers joining the system. Membership changes and failure detection are performed by propagating the changes to a set of randomly chosen neighbors, who in turn propagate them to another set of neighbors. This ensures that after a certain period of time the membership view becomes consistent.
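A toy gossip simulation, with an assumed node count and fan-out, shows how quickly such epidemic propagation converges:

```python
# Each node that already knows about a membership change forwards it to a few
# randomly chosen peers per round, so the news spreads epidemically.
import random

def gossip_rounds(n_nodes=50, fanout=3, seed=1):
    random.seed(seed)
    informed = {0}                       # node 0 notices the change (e.g. a crash)
    rounds = 0
    while len(informed) < n_nodes:
        rounds += 1
        for node in list(informed):
            for peer in random.sample(range(n_nodes), fanout):
                informed.add(peer)
    return rounds

print("rounds until every node has a consistent view:", gossip_rounds())
```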

Hinted Handoff and Merkle trees: To handle server failures, replicas are sometimes sent to a healthy node if the node for which they were destined is temporarily down. For example, data destined for Node A may be delivered to Node D, which keeps a hint in its metadata that the data is to be eventually handed off to Node A when it becomes healthy again. Merkle trees are used to synchronize replicas amongst nodes; they minimize the amount of data that needs to be transferred for synchronization.
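A minimal Merkle-tree sketch (hash the leaves, then hash pairs up to a root) shows how two replicas can detect divergence by comparing a single root hash and only then drill down into the differing subtrees:

```python
# Two replicas compare root hashes first; only subtrees whose hashes differ
# need to be walked, so little data crosses the network.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                       # duplicate last hash if odd count
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

replica_a = [b"k1=v1", b"k2=v2", b"k3=v3", b"k4=v4"]
replica_b = [b"k1=v1", b"k2=v2", b"k3=STALE", b"k4=v4"]
print("in sync:", merkle_root(replica_a) == merkle_root(replica_b))   # False
```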

These are some of the main design principles used while designing scalable, distributed systems. A related post is “Designing a scalable architecture for the cloud”.


Latency, throughput implications for the Cloud

The key considerations for any website are latency and throughput. These two parameters are extremely important to web designers, as the response time of the web site and its ability to handle large amounts of traffic directly affect the user experience and the loyalty of returning users.

What are these two parameters and why are they significant? Before looking at latency we need to understand the response time of the web application, which ideally could be defined as the time between the receipt of the HTTP request and the emitting of the corresponding response. Unfortunately, for any web site hosted on the World Wide Web the user sees a good deal more delay than this response time. This extra delay is the latency of the web site and is primarily due to propagation and transmission delays on the internet; there are many contributors to it, from the DNS lookup to the bandwidth of the links along the way.

Throughput, on the other hand, represents the maximum simultaneous queries or transactions per second that the web application is capable of handling. This is usually measured in transactions per second (tps) or queries per second (qps).

A good way to understand response time and throughput is to use the oft-used example of a retail store handling customers. Assuming that there are 5 counter clerks who each take 1 minute to check out a customer, we can readily see that as the number of customers in the store increases, the throughput increases from 1 customer/minute to a maximum of 5 customers/minute. Since each cashier processes a customer in 1 minute, the response time per customer is 1 minute. If a 6th customer arrives and needs to check out while the 5 clerks are busy with 5 other clients, he or she will have to wait, say 1 minute, so the response time becomes 1 minute (waiting) + 1 minute (service) = 2 minutes. As further clients arrive to check out, the queue grows and with it the response time. Clearly the throughput cannot increase beyond 5 customers/minute, while the response time increases non-linearly as clients enter the store faster than they can be checked out by the counter clerks.

This is precisely the behavior of web applications. As the traffic to a web site increases, the throughput increases linearly and eventually reaches a throughput “plateau”; beyond this point, as the load increases the throughput remains saturated at that level. The response time, on the other hand, is low at low traffic, starts to increase non-linearly with increasing load, and keeps growing as the application maxes out system resources like CPU and memory.
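A crude fluid approximation of the checkout-counter analogy (using the assumed 5 clerks and 1-minute service time from the example above) reproduces this behavior: throughput plateaus at capacity, while response time grows once arrivals exceed it.

```python
# Rough deterministic approximation of the checkout-counter example, observed
# over a fixed window; not a full queueing model.
def throughput_and_wait(arrival_rate, clerks=5, service_time=1.0, duration=60):
    """Approximate throughput and response time over `duration` minutes."""
    capacity = clerks / service_time            # max customers served per minute
    throughput = min(arrival_rate, capacity)    # throughput plateaus at capacity
    # Customers arriving beyond capacity pile up in the queue over time.
    backlog_growth = max(arrival_rate - capacity, 0.0)
    avg_backlog = backlog_growth * duration / 2  # queue grows linearly with time
    wait = avg_backlog / capacity                # time to drain the average backlog
    return throughput, wait + service_time

for rate in (1, 3, 5, 6, 8):
    tput, rt = throughput_and_wait(rate)
    print(f"arrival {rate}/min -> throughput {tput:.1f}/min, response {rt:.1f} min")
```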

When deploying applications on the cloud, latency and throughput are the key considerations needed to determine the kind of computing resources required. Assuming the web application has been optimized and performance-tuned, what needs to be done is to load test the application on the cloud using different CPU instances. For example, assume the application is load tested on a small CPU instance; we obtain the response-time and throughput plots with increasing load. We then deploy the web application on a medium instance and plot the response times and the throughput plateau for the medium instance as well.

Now the choice between a small CPU instance and a medium CPU instance can be made as follows. Assuming the requirement for the web application is a response time of ‘t’ seconds, we determine the corresponding traffic-handling capacity at that response time: say ‘c’ for the small CPU instance and ‘C’ for the medium CPU instance. If the web site has to handle a total traffic of T, then the number of instances needed is, for the

small CPU instance, n = (T/c) + 1

and for

the medium CPU instance, N = (T/C) + 1.

Now we compute the relative costs of the small and medium CPU instances and identify which is more economical. For example, if r1 is the cost per hour of the small CPU instance and R1 is the cost per hour of the medium CPU instance, we choose

the small CPU instance if r1 * n < R1 * N (per hour),

while on the other hand if R1 * N < r1 * n we choose the medium instance.
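A minimal sketch of this instance-count and cost comparison, with purely illustrative capacities and prices:

```python
# All numbers are illustrative assumptions, not benchmark results or real prices.
import math

T = 10000            # required traffic, requests/sec
c, C = 400, 1500     # measured capacity at the target response time (small, medium)
r1, R1 = 0.08, 0.32  # assumed hourly prices for small and medium instances

n = math.ceil(T / c) + 1   # small instances needed (the text's n = T/c + 1)
N = math.ceil(T / C) + 1   # medium instances needed

small_cost, medium_cost = n * r1, N * R1
print(f"small:  {n} instances at ${small_cost:.2f}/hour")
print(f"medium: {N} instances at ${medium_cost:.2f}/hour")
print("choose:", "small" if small_cost < medium_cost else "medium")
```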

Hence the choice of CPU instance and the configuration of the web application on the cloud depend on appropriate performance tuning and proper load testing on the cloud. Do also read my other posts on latency, namely “The Many faces of latency” and “The Anatomy of Latency”.

Also see latency and throughput in action in the following series of posts

– Bend it like Bluemix, MongoDB with autoscaling – Part 1

– Bend it like Bluemix, MongoDB with autoscaling – Part 2

– Bend it like Bluemix, MongoDB with autoscaling – Part 3
