Scaling out

Web applications have challenges that are unique to the domain of the web. The key differentiating factor between internet technologies and other technologies is the need to scale to handle sudden and large increases in traffic. While telecommunication and data communication systems also need to handle high traffic, those products can be dimensioned against some upper threshold. Typical web applications need to provide low latencies, handle large throughput and also be extremely scalable to changing demands.

The ability to scale seems trivial. It appears that one could just add more CPU horsepower and throw in memory and bandwidth in large measure to get the performance we need. Unfortunately it is not as simple as it appears. For one, adding more CPU horsepower may not necessarily result in better performance; a badly designed application will only improve marginally.

Some applications are “embarrassingly parallel”: there is no dependency of one task on another. For example, if the task is to search for a particular string in a set of documents, or to convert files from AVI to MPEG format, then it can be done in parallel. In a public cloud this could be achieved simply by running more instances.
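To make this concrete, here is a minimal Python sketch of such an embarrassingly parallel search, where each worker scans one document independently of the others; the search term and the *.txt corpus it scans are purely illustrative.

import glob
from multiprocessing import Pool

SEARCH_TERM = "cloud"   # illustrative search string

def contains_term(path):
    # Each worker scans one document independently; no task depends on another.
    with open(path, encoding="utf-8", errors="ignore") as f:
        return path if SEARCH_TERM in f.read() else None

if __name__ == "__main__":
    documents = glob.glob("*.txt")          # hypothetical corpus of text files
    with Pool() as pool:
        matches = [p for p in pool.map(contains_term, documents) if p]
    print("Documents containing the term:", matches)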

However, most real-world applications are more sequential than parallel, with a lot of data dependency between the modules of the application. When multiple instances have to run in a public cloud, the design considerations can be quite daunting.

For example, assume we have an application composed of parts 1, 2, 3 and 4 that depend on one another.

Now let us further assume that this is a web application and that thousands of requests come to it. For simplicity's sake, assume that for each request a counter has to be incremented. How does one keep track of the total number of requests coming to the application? The challenge is how to manage such a global counter when there are multiple instances. In a monolithic application this does not pose a problem, but with multiple instances handling web requests, each having its own copy of this counter, the design becomes a challenge.

Each instance has its own copy of the counter, which it updates based on the requests that come to it through a load balancer. However, how does one compute the total number of requests that come to all the instances?

One possible solution is to use the memcached approach. Memcached was developed as a solution by Danga Interactive for LiveJournal. Memcached is a distributed caching mechanism that stores data across multiple participating servers. It has some simple API calls like get(key) and set(key, value). Memcached uses a consistent hashing mechanism that hashes each key to one of the servers, a method known as a Distributed Hash Table (DHT). Consistent hashing is able to handle servers crashing and new servers joining the distributed cache: since data is distributed among the participating servers, when a server crashes its share of the keys is remapped to the remaining servers, and a new server joining the cache takes over a share of the keys. Memcached has been used at Facebook, Zynga and LiveJournal.
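As a rough sketch of how the global counter could be kept this way, the Python snippet below assumes the python-memcached client library and two illustrative server addresses; memcached's incr operation is atomic on the server, which is what lets every instance safely bump the same key.

# Minimal sketch: a global request counter shared across instances via memcached.
# Assumes the python-memcached library and illustrative server addresses.
import memcache

# The client hashes each key to one of the participating servers.
mc = memcache.Client(["10.0.0.1:11211", "10.0.0.2:11211"])

def handle_request():
    # add() creates the key only if it does not already exist.
    mc.add("total_requests", 0)
    # incr() is atomic on the memcached server, so concurrent instances
    # can all increment the same key safely.
    return mc.incr("total_requests")

def total_requests():
    return mc.get("total_requests")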


The Future of Programming Languages

How will the computing landscape evolve in the coming decades? Clearly computer architectures will evolve towards more parallelism, with multiple CPUs each handling a part of the problem in parallel. However, programming parallel architectures is no simple task and will challenge the greatest minds.

In a future where the problems and the architectures are extremely complex, the programming language itself will evolve towards simplicity. The programming language will be based on the natural language that we use to define problems. Behind the scenes of the natural language interface will be complex Artificial Intelligence algorithms which will perform the difficult task of translating the problem definition into a high-level programming language like C++ or Java.

The Artificial Intelligence interface will handle the tasks of creating variables, forming loops, defining classes and handling errors. The code generated by the machine will have far fewer syntactic errors than code written by human beings. However, while a large set of problems can be solved by the AI interface, there will be a certain class of problems which will still require human intervention and the need to work in the languages of today.

One of the drivers for this natural language of programming of the future, besides the complexity of the computer architecture, is the need to bring a larger section of domain experts to solve the problems in their fields without the need to learn all the complex syntax and semantics of the current programming languages.

This will allow astrophysicists, geneticists, historians and statisticians alike to utilize the computer to solve the difficult problems in their domains.


Cloud Computing – Design Considerations

Cloud Computing is definitely turning out to be the proverbial carrot for enterprises to host their applications on the public cloud. The cloud promises many benefits to users of the cloud. Cloud Computing obviates the need for upfront capital expenses for computing infrastructure, real estate and maintenance personnel. This technology allows for scaling up or scaling down as demand on the application fluctuates.

While the advantages are many, migrating an application onto the cloud is no trivial task. The cloud is essentially composed of commodity servers. The cloud creates multiple instances of the application and runs them on the same or on different servers. The benefit of executing in parallel is that the same task can be completed faster. The cloud thus offers enterprises the ability to quickly scale to handle increasing demands.

But deploying applications on the cloud requires that the application be re-architected to take advantage of this parallelism, and handling parallelization is no simple task. The key attributes that need to be handled by distributed systems are consistency and availability. If there are variables that need to be shared across the parallel instances, then the application must make special provisions to handle this and ensure consistency. Similarly, the application must be designed to handle failures.

Applications that are intended to be deployed on the cloud must be designed to scale out rather than to scale up. Scaling up refers to the process of adding more horsepower by way of faster CPUs, more RAM and faster throughput. Applications deployed on the cloud instead need the ability to scale out, or scale horizontally, where more commodity servers are added rather than making each individual server more powerful. Designing for horizontal scalability is the key to cloud computing architectures.

One of the key principles to keep in mind while designing for the cloud is to ensure that the application is composed of loosely coupled processes, preferably based on SOA principles. While a multi-threaded architecture with resource sharing through mutexes works in a monolithic application, such an architecture is of no help when there are multiple instances of the same application running on different servers. How does one maintain consistency of a shared resource across instances? This is a tough problem to solve. Ideally the application should be thread safe and based on a shared-nothing kind of architecture. One technique is to use the queues that the cloud provides as a means of sharing across instances, although this may impact the performance of the system. Another method is to use memcached, which has been used successfully by Facebook, Twitter, LiveJournal and Zynga on cloud deployments. Still another method is to use the map-reduce approach, where the 'map' step processes data independently in each instance and the 'reduce' step aggregates the partial results into a consistent global view.
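As a rough illustration of the map-reduce idea applied to the earlier counter example, the Python sketch below runs an independent 'map' over each instance's request log and a 'reduce' that aggregates the partial counts; the log contents are purely illustrative.

# Minimal map-reduce sketch for the per-instance counter: each "map" runs
# independently over one instance's request log (shared-nothing), and the
# "reduce" step aggregates the partial counts into a global total.
from functools import reduce
from multiprocessing import Pool

def map_count(instance_log):
    # Map step: count the requests seen by one instance.
    return len(instance_log)

def reduce_counts(total, partial):
    # Reduce step: combine the partial counts.
    return total + partial

if __name__ == "__main__":
    instance_logs = [["/a", "/b"], ["/a"], ["/c", "/d", "/e"]]   # hypothetical
    with Pool() as pool:
        partial_counts = pool.map(map_count, instance_logs)
    print(reduce(reduce_counts, partial_counts, 0))              # -> 6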

Another key consideration is the need to support availability requirements. Since the cloud is made up of commodity hardware, there is every possibility of servers failing. The application must be designed with inbuilt resilience to handle such failures. This could be done by designing an active-standby architecture or by providing for checkpointing so that the application can restart from a known previous point.
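The following Python sketch illustrates the checkpointing idea under simple assumptions: state is periodically persisted to a local file (the file name and state layout are illustrative) so that a restarted instance can resume from the last saved point.

# Minimal checkpoint/restore sketch so a restarted instance can resume
# from a known point. The file name and state structure are illustrative.
import json
import os

CHECKPOINT_FILE = "checkpoint.json"

def save_checkpoint(state):
    # Write to a temporary file first, then rename, so a crash mid-write
    # never leaves a corrupt checkpoint behind.
    tmp = CHECKPOINT_FILE + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, CHECKPOINT_FILE)

def load_checkpoint():
    if os.path.exists(CHECKPOINT_FILE):
        with open(CHECKPOINT_FILE) as f:
            return json.load(f)
    return {"last_processed": 0}          # fresh start

state = load_checkpoint()
for item in range(state["last_processed"], 1000):
    # ... process item ...
    state["last_processed"] = item + 1
    if item % 100 == 0:                   # checkpoint periodically
        save_checkpoint(state)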

Hence, while cloud computing is the way of the future, applications need to be carefully designed so that full advantage of the cloud can be taken.

Ramblings on Lisp

In the world of programming languages Lisp can be considered truly ancient, along with its companion FORTRAN. However, it has survived to this day. Lisp had its origins as early as 1958, when John McCarthy of MIT published the design of the language in an ACM paper titled “Recursive Functions of Symbolic Expressions and Their Computation by Machine”. Lisp was invented by McCarthy as a mathematical notation for computer programs and is based on the lambda calculus described by Alonzo Church in the 1930s.

Lisp has not had the popularity of more recent languages like C, C++ and Java, partly because it has an unfamiliar syntax and a very steep learning curve. The Lisp syntax can be particularly intimidating to beginners with its series of parentheses. However, it remains one of the predominant languages used in the AI domain.

Some of the key characteristics of Lisp are:

First, Lisp derives its name from LISt Processing. While most programming languages compute on data, Lisp computes on a data structure, namely the list. A list can be visualized as a collection of elements which can be data, functions or lists themselves. Lisp's power comes from the fact that the language includes in its core some of the operations that can be performed on lists and lists of lists. In fact, many key features of the Lisp language have found their way into more current languages like Python, Ruby and Perl.
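As a small illustration of this lineage, the Python snippet below uses anonymous functions, map and filter, and lists that freely mix data, nested lists and functions; the values are purely illustrative.

# Lisp-influenced, list-oriented features as they appear in Python:
# anonymous functions, map/filter, nested lists, and lists of functions.
squares = list(map(lambda x: x * x, [1, 2, 3, 4]))        # -> [1, 4, 9, 16]
evens   = list(filter(lambda x: x % 2 == 0, squares))     # -> [4, 16]
nested  = [1, [2, 3], [4, [5, 6]]]                        # lists of lists
mixed   = [len, max, min]                                 # a list of functions
results = [f(squares) for f in mixed]                     # -> [4, 16, 1]
print(squares, evens, nested, results)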

Second, Lisp is a symbol-processing language. This ability to manipulate symbols gives Lisp a powerful edge over other programming languages in AI domains like theorem proving and natural language processing.

Third, Lisp encourages a recursive style of programming, which often makes the code much shorter than in other languages. Recursion expresses a problem as a combination of a terminating condition and a self-similar sub-problem. A notable advantage of Lisp dialects such as Scheme is tail-call optimization: when the recursive call is the last operation in a function, the implementation reuses the current stack frame, so the space required is O(1) rather than the O(n) common with naive recursion in languages like C, C++ and Java, where the stack grows with each recursive call.
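The sketch below illustrates the difference in shape between naive and tail recursion. It is written in Python, which does not itself eliminate tail calls, so it only shows the accumulator-passing form that a Scheme or Lisp implementation can run in constant stack space.

def factorial_naive(n):
    # Not tail recursive: the multiplication happens *after* the recursive
    # call returns, so each call needs its own stack frame -> O(n) space.
    return 1 if n == 0 else n * factorial_naive(n - 1)

def factorial_tail(n, acc=1):
    # Tail recursive: the recursive call is the very last operation, carrying
    # the partial result in an accumulator. A tail-call-optimizing compiler
    # can reuse the current stack frame -> O(1) space.
    return acc if n == 0 else factorial_tail(n - 1, acc * n)

print(factorial_naive(10), factorial_tail(10))   # 3628800 3628800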

Lisp blurs the distinction between functions and data. Functions take other functions as arguments during computations, and Lisp programs are themselves lists, so code can be manipulated as data in a self-similar fashion.

The closest analogy is machine code, which is a sequence of 32-bit binary words. Both the logic and the data on which it operates are 32-bit binary words and cannot be distinguished unless one knows where the program is supposed to start executing. If one were to take a snapshot of consecutive memory locations, one would encounter 32-bit binary words representing either logical or arithmetic operations or the data they operate on.

Lisp is a malleable language and allows programmers to tailor the language to their own convenience, manipulating it so that it suits their programming style. Lisp programs evolve bottom-up rather than in the top-down style adopted in other languages. The design methodology of Lisp programs takes an inside-out approach rather than an outside-in method.

Lisp has many eminent die-hard adherents who swear by the elegance and beauty of being able to solve difficult problems concisely. On the other hand there are those to whom Lisp represents an ancient Pharaoh’s curse that is difficult to get rid of.

However with Lisp, “Once smitten, you remain smitten”.


Accelerating growth through M-Banking & M-Health

While only roughly 5% of the world's population has access to computers, more than 50% has mobile phones. Mobile phones have become cheaper and more ubiquitous. Given this penetration, it makes sense to use mobile phones to improve the lives of those in emerging economies. Two such technologies which hold enormous potential are m-banking and m-health, described below.

M-banking: Financial services, a key driver of economic growth, are either negligible or completely absent in remote rural areas. Regular banking services are unviable in these areas: the small deposits and loans held by the rural poor make it unprofitable for traditional banks to operate there through traditional delivery methods. In these areas m-banking is truly a godsend.

M-banking refers to financial services offered by Service Providers to the unbanked poor in rural areas. The people in these villages can purchase either pre-paid or post-paid units from the operator and then use these units to pay for goods and services. M-banking offers the unbanked poor a safe and secure method of sending and receiving payments through SMS messages. Making m-banking a reality requires the coming together of three major players: the Service Provider, the application developer and the financial institution which can regulate and disburse units of money.

A recent report by McKinsey with the GSMA covering 147 countries shows that more than 1.7 billion people in emerging economies will have a mobile phone without access to banking services. The report also states that by 2012 m-banking would generate $5 billion annually in direct revenue from financial transactions and $3 billion in indirect revenue through reduced customer churn and higher ARPU for traditional voice and SMS services.

Some examples of success are M-Pesa in Kenya, Wizzit in South Africa and Globe in the Philippines. M-banking provides a 24×7 service in the village and does not require any complicated infrastructure. One could imagine services where the unbanked poor could receive instant payment for their farm produce, save money on a regular basis and pay electricity bills instantaneously through SMS. This increases both the sense of security and personal well-being.

M-banking provides for tremendous socio-economic growth in the villages. With the increasing penetration and ubiquity of mobile phones, m-banking represents a sure way of ensuring all-round economic transformation in the villages. M-banking helps in reducing risk and brings true convenience to financial transactions. However, appropriate authentication and authorization procedures must be used.

Mobile banking does not need the expensive infrastructure that banks require, such as a network of ATMs for depositing and withdrawing money. M-banking is convenient, secure, easy to use and can be quickly deployed. While the Service Providers are the facilitators of m-banking, it is the financial institutions that will regulate and provide banking facilities to the unbanked poor.

Hence m-banking is a complete win-win for all the players involved, namely the CSPs, the financial institutions and the unbanked poor. Besides providing convenience, m-banking will be a key driver of all-round economic growth in the villages.

M-health: Another companion technology with enormous potential in emerging markets is m-health. M-health relates to the provision of health services in rural areas where there is an acute shortage of qualified health workers. In these areas the use of mobile communication can help address the key health needs of the poor, thanks to the explosive growth of mobile phones.

Among the key benefits of m-health is the ability to spread timely health-related information and diagnoses to health workers in the villages, enabling the spread of diseases and epidemics to be tracked and contained quickly. Other applications include remote data collection and monitoring of health-related issues.

Recent estimates indicate that half of the population in remote areas will have a mobile phone by 2012. This gives the inhabitants of even the remotest villages instant access to important health-related information. M-health along with m-banking can also allow health organizations to transfer funds which the needy can use for health checkups.

As with m-banking, SMS is a key enabler of m-health. SMS messages can be sent to educate and spread awareness of diseases, to transfer funds and to inform people of the availability of health services. Health workers can use mobile phones or PDAs to collect and send disease-related data.

M-health also provides a unique opportunity for Service Providers, Health institutions, insurance companies and the patients themselves.

Conclusion: With the increasing penetration of mobiles both m-banking and m-health are particularly relevant today. Both these technologies are capable of not only transforming the economic landscape but also providing the CSPs, financial and health institutions with a sound business and strategic advantage.


Singularity

Pete Mettle felt drowsy. He had been working for days on his new inference algorithm. Pete had been in the field of Artificial Intelligence (AI) for close to three decades and had established himself as the father of “semantics”. He was particularly renowned for his three principles of Artificial Intelligence. He had postulated the Principles of Learning as follows:

The Principle of Knowledge Acquisition: This principle laid out the guidelines for knowledge acquisition by an algorithm. It clearly laid out the rules of what was knowledge and what was not, and could separate the wheat from the chaff in any textbook or research article.

The Principle of Knowledge Assimilation: This law gave the process for organizing the acquired knowledge into facts, rules and underlying principles. Knowledge assimilation involved storing the individual rules and the relations between them, and provided the basis for drawing conclusions from them.

The Principle of Knowledge Application: This principle, according to Pete, was the most important. It showed how all knowledge acquired and assimilated could be used to draw inferences and conclusions. In fact it also showed how knowledge could be extrapolated to make safe conclusions.

Zengine: The above three principles were hailed as a major landmark in AI. Pete started to work on an inference engine known as “Zengine” based on these principles and was almost finished fine-tuning his algorithm. Pete wanted to test his Zengine on the World Wide Web, which had grown to gigantic proportions. A report in the May 2025 issue of the Wall Street Journal mentioned that the total data held on the internet had crossed 400 zettabytes and that the daily data stored on the web was close to 20 terabytes. It was well known that there was an enormous amount of information on the web on a wide variety of topics: wikis, blogs, articles, ideas, social networks and so on covered almost every conceivable topic under the sun.

Pete was given special permission by the governments of the world to run his Zengine on the internet. It was Pete's theory that it would take the Zengine at least a year to process the information on the web and make any reasonable inferences from it. Accompanied by worldwide publicity, Zengine started its work of assimilating the information on the World Wide Web. The Zengine was programmed to periodically give Pete a status update of its progress.

A few months passed. Zengine kept giving updates on the number of sites, periodicals and blogs it had condensed into its knowledge database. After about 10 months Pete received a mail. It read “Markets will crash in March 2026. Petrol prices will skyrocket – Zengine”. Pete was surprised at the forecast, so he invoked the API to check on what basis the claim had been made. To his surprise and amazement he found that a lot of events happening in the world had been used to make that claim, and they clearly seemed to point in that direction. A couple of months down the line there was another terse statement: “Rebellion very likely in Mogadishu in Dec 2027” – Zengine. The Zengine also came up with corollaries to Fermat's last theorem. It was becoming clear to Pete and everybody else that the Zengine was indeed getting smarter by the day, and it was apparent that before long Zengine would become more powerful than human beings.

Celestial events: Around this time peculiar events were observed all over the world. There were a lot of celestial events happening. Phenomena like the aurora borealis became commonplace. On Dec 12, 2026 there was an unusual amount of electrical activity in the sky. Everywhere there were streaks of lightning, and by evening slivers of lightning hit the earth in several parts of the world. In fact, if anybody had viewed the earth from outer space it would have resembled a “nebula sphere” with lightning streaks racing towards the earth in all directions. This seemed to happen for many days. Simultaneously the Zengine was getting more and more powerful. In fact it had learnt to spawn multiple processes to gather information and return it.

Time-space discontinuity: People everywhere were petrified by these strange phenomena. On the one hand there was the fear of the takeover of the web by the Zengine, and on the other was this increased celestial activity. Finally, on a morning in Jan 2028, there was a powerful crack followed by a sonic boom, and everywhere people had a moment of discontinuity. In the briefest of moments there was a natural time-space discontinuity and mankind had progressed to the next stage in evolution.

The unconscious, subconscious and the conscious all became a single faculty of super-consciousness. It has always been known from the time of Plato that man knows everything there is to know. According to the Platonic doctrine of recollection, human beings are born with a soul possessing all knowledge, and learning is just discovering or recollecting what the soul already knows. Similarly, according to Hindu philosophy, behind the individual consciousness of the Atman is the reality known as the Brahman, a universal consciousness attained in a deep state of mysticism through self-inquiry.

However, this evolution by some strange quirk of coincidence seemed to coincide with the development of the world's first truly learning machine. In this super-conscious state a learning machine was not something to be feared but something which could be used to benefit mankind. Just as cranes lift and earthmovers perform tasks that are beyond our physical capacity, so also a learning machine was a useful invention that could be used to harness the knowledge from mankind's storehouse – the World Wide Web.
