Wednesday, December 29, 2010

Carbon: Towards a Server Building Framework for SOA Platform

Following are the slides I presented at the 5th International Workshop on Middleware for Service Oriented Computing (co-located with Middleware 2010, India). The paper can be found at http://portal.acm.org/citation.cfm?id=1890914 and here.

The paper presents Carbon, the underlying platform for all WSO2 products. Based on OSGi, Carbon provides a server-building framework for SOA with a kernel that handles most of the details of building SOA servers. The paper discusses the design decisions, potential impact, and its relationship to the state of the art in Component-Based Software Engineering.

BISSA: Empowering Web gadget Communication with Tuple Spaces

Following is a talk I did at the 8th International Workshop on Middleware for Grids, Clouds and e-Science (co-located with Supercomputing 2010 at New Orleans). The research paper with the same title can be found at http://portal.acm.org/citation.cfm?id=1890809.

The paper presents BISSA, a scalable tuple space implementation on top of a Distributed Hash Table, and its application as an inter-gadget communication medium. This is the outcome of a University of Moratuwa final year project done by Pradeep Fernando, Charith Wickramarachchi, Dulanjanie Sumanasena, and Udayanga Wickramasinghe, supervised by Dr. Sanjiva Weerawarana and myself. I am elated that MRT final year projects can go this far!

A Scaling Pattern: Life beyond Distributed Transactions: an Apostate’s Opinion

The paper “Life beyond Distributed Transactions: an Apostate’s Opinion” describes an interesting scaling pattern for coping without transactions at very large scale.

The idea can be described using three things: entities, activities, and workflows in place of transactions.
  • An entity is a single collection of data that lives within a single scope of serializability (e.g. a cluster). Each entity has a unique key, and transactions cannot span multiple entities.
  • An activity is a relationship between two entities (it holds all state about such a relationship).
  • Finally, the key idea is that entities cope by not doing transactions with each other, but by handling the uncertainty arising from the lack of transactions through a workflow.
  • The author makes an interesting observation that the real world does not have transactions either, but copes with uncertainty through time limits, compensations, cancellations, etc.; in other words, it copes through a workflow. So he argues that activities should be carried out with a tentative-message, confirmation, or cancellation model.
So does that scale? Yes: handling serializability within one entity is very much possible, and the rest is data partitioning, which is well understood. The key question is whether we can model the uncertainties that arise from the lack of transactions through workflows. My gut feeling is that most of the time we can, but there certainly are exceptions.
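As a toy illustration (entirely my own, not from the paper; all names are made up), the tentative/confirm/cancel model might look like this:

```python
# Toy sketch of the entity/activity pattern: each entity is its own
# serializability scope, and cross-entity work uses tentative messages
# plus confirm/cancel (compensation) instead of a distributed transaction.

class Entity:
    def __init__(self, key, stock):
        self.key = key           # unique key; transactions never span entities
        self.stock = stock
        self.activities = {}     # per-partner state (the "activity")

    def reserve_tentative(self, partner, qty):
        """Tentatively hold stock for a partner; later confirmed or cancelled."""
        if self.stock >= qty:
            self.stock -= qty
            self.activities[partner] = ("tentative", qty)
            return True
        return False

    def confirm(self, partner):
        state, qty = self.activities.pop(partner)
        assert state == "tentative"

    def cancel(self, partner):
        """Compensation: release the hold instead of rolling back a transaction."""
        state, qty = self.activities.pop(partner)
        self.stock += qty

warehouse = Entity("warehouse-1", stock=10)
ok = warehouse.reserve_tentative("order-42", 3)   # tentative message
if ok:
    warehouse.confirm("order-42")                 # or warehouse.cancel("order-42")
print(warehouse.stock)  # 7
```

Note that the uncertainty window (reserved but not yet confirmed) is tracked explicitly in the activity state, which is exactly what the workflow manages.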

Tuesday, November 16, 2010

SC keynote: How to Create New Growth Businesses in a Risk-Minimizing Environment

Clayton M. Christensen (author of The Innovator's Dilemma) talked about disruptive technology: new products that simplify and drastically change the user experience, and that go on to kill the big players.

He discussed DEC as an example: how the Apple PC, and then the PC in general, killed all the mainframe players. He stressed that at the start the PC looked like a joke; none of the users thought the PC was useful, so listening to users did not help those companies at all. Managers did all the right things (from a business-school point of view), but it did not work. Other examples: steel (integrated plants -> mini plants) and vacuum tubes -> transistors.

The point is that listening to your customers is not going to help you, because they neither understand the process they are following nor how to improve it. Instead, try to understand what the customer is actually trying to do, and improve that. He argued that this understanding should be based on the functional, emotional, and social aspects of the product.

He also stressed that chasing only the activities that maximize profit, and outsourcing the rest, can let other players enter the market and eventually topple you (the big player). It is more or less the local vs. global (short term vs. long term) optimization argument. He gave Dell and Azure as examples.

Apparently IBM is one of the few companies that made it through from the mainframe era to the PC era and beyond. He argued that they did so by shutting down business units and letting new business units take over, so that overall the company could survive.

A final note: in one survey, 93% of entrepreneurs succeeded with a different idea than the one they started with, which suggests that success depends on the ability to understand the market and reposition oneself.

Thursday, October 7, 2010

You Can't Sacrifice Partition Tolerance?

There is a very nice post on http://codahale.com/you-cant-sacrifice-partition-tolerance/.

One complication to that argument is replication implemented with group communication, which is a reasonable approximation to a system with both availability and consistency (just thinking aloud, as this stuff needs a lot of thinking to be sure). What happens in group communication is that if a node fails, the system creates a new group excluding the failed node; although operations have to wait until the new group is defined (rather than until the failed node is repaired), after that things work fine, given that you have enough replicas left. So does group communication contradict the above assertion, or work around it?
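To make the view-change behavior concrete, here is a minimal sketch (my own simplification; real group-communication systems, e.g. virtual-synchrony implementations, are far more involved):

```python
# Minimal sketch of group membership with view changes: writes go to all
# members of the current view; when a member fails, a new view excluding
# it is installed, and the group keeps serving as long as replicas remain.

class Group:
    def __init__(self, members):
        self.view = 1
        self.members = set(members)
        self.data = {m: {} for m in members}

    def write(self, key, value):
        if not self.members:
            raise RuntimeError("no replicas left")
        for m in self.members:          # replicate to every live member
            self.data[m][key] = value

    def fail(self, member):
        """Install a new view excluding the failed member."""
        self.members.discard(member)
        self.view += 1

g = Group(["a", "b", "c"])
g.write("x", 1)
g.fail("b")          # view change: operations pause until the new view exists
g.write("y", 2)      # then the group continues with the remaining replicas
print(g.view, sorted(g.members))  # 2 ['a', 'c']
```

The pause during the view change is exactly the window where availability is briefly sacrificed, which is why this only approximates, rather than contradicts, the CAP argument.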

Wednesday, October 6, 2010

NBQSA 2010 - Overall Gold award goes to WSO2 ESB

Copied from Ruwan's Blog, NBQSA 2010 - Overall Gold award goes to WSO2 ESB: "

NBQSA (National Best Quality Software Award) is an annual award ceremony held in Sri Lanka to evaluate and award the software developed in the country. It is hosted by BCSSL (British Computer Society - Sri Lanka).

The WSO2 marketing team decided to submit 3 of our products for this award evaluation, and we had just a few days to prepare the presentation. I initially objected to submitting WSO2 ESB (the product that I am responsible for at WSO2) so quickly, as my intention was to win first prize if we were to submit. Kushlani from our marketing team, together with Asanka, drove this and motivated us a lot to submit.

Well, we spent a few hours; Miyuru, Kasun, and a few others got together with me and prepared a plan to execute, with a presentation script and demonstration. Miyuru did the first round presentation and we got through to the second round easily. Hiranya took it over and delivered the second presentation, to which we got a very good response from the judges.

Then we were waiting to see the results, and we apparently had to wait till the day before yesterday (1st of October 2010) for that.

From WSO2, 8 of us got invited, and we all went on time with a lot of enthusiasm, as all 3 of our products were in the winning circle. The award ceremony started, and the first of the 3 products to get an award was the WSO2 Gadget Server, which got the Silver award in the R&D software category.

Then I was eagerly waiting for the Infrastructure and Tools category to be awarded, since both WSO2 Data Services and WSO2 ESB were in that category.

They had only 2 awards in that category: a Bronze and a Gold. When WSO2 DS was announced as the Bronze winner, I knew that ESB was getting the Gold award. I was not that excited; to be frank, I sort of knew that we were going to win our category :-) I was expecting the overall gold award. ;-)

Then they came to the final overall awards, before which they gave out some special awards, but ESB was nowhere among those special awards. I was waiting and waiting and waiting... as were all of us from WSO2.

Finally they announced the Bronze and Silver awards, which went to the folks who had won some special awards, and I was now 90% sure about my expectation. Then they finally announced WSO2 ESB as the Overall Gold winner at NBQSA 2010. I was so excited to accept the best award of the ceremony as the product manager of WSO2 ESB. I got down from the stage holding the most valuable award of the night as the proudest man there, though I must say that this prize is for the whole WSO2 team, including the past members of our team.

All of us who participated in the event, from left: Samisa, Asanka, Kushlani, Miyuru (holding the Gadget Server silver award), me (holding the overall gold for ESB), Hiranya (holding the ESB Gold award in the Infrastructure and Tools category), Sumedha (holding the Data Services Server bronze award), and Azeez.

Now we are about to head to APICTA next week in Kuala Lumpur, Malaysia. Will update on that as well. :-)

"

Saturday, September 18, 2010

Zero Copy pass through in WSO2 ESB

There has been some confusion about the zero-copy pass-through scenario (we call it Binary Relay) implementation in WSO2 ESB. Let me try to clarify that.



To implement Binary Relay, we use the Axis2 architecture. Axis2 works only with a SOAPEnvelope as the input to its pipeline, but users can register Formatters and Builders against different content types. The job of a Builder is to build a SOAP envelope from anything (e.g. an HTTP SOAP message, fast XML, a JSON message, etc.) and pass it into the Axis2 pipeline, and the job of a Formatter is to take an envelope and write it out in any expected form (as a text message, fast XML, JSON, etc.).

To implement Binary Relay, we wrote a Builder that takes the input stream and hides it inside a fake SOAP message without reading it, and a Formatter that takes that input stream and writes it directly to the output stream. (Of course, we use a small buffer to move the data between the two streams.)

Now, if you want to understand how it works, the best way is to look at the code.

When you look at the code, it is important to note the following. Pass-through works only if no one accesses the content of the message. However, there are many cases where the user wants the content cached (e.g. logging or any other intermediate use), and we detect and handle that. If you look at BinaryRelayBuilder.java, you will see that we create a DataHandler, hide the input stream inside the DataHandler, and pass it in. To understand how we stream the data, you should look at StreamingOnRequestDataSource.java. There, if this is the last use, we just take the input stream and stream it.

If you just look at BinaryRelayBuilder.java, the readAllFromInputSteam(..) method can be misleading; the real code that does the streaming is in StreamingOnRequestDataSource.java. We do cache, but ONLY IF something tries to access the SOAP body. If nothing reads the SOAP body, it is zero copy, as we pass the input stream as-is to the next code.
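The core idea, holding the stream unread and copying only when someone touches the body, can be sketched language-neutrally as follows (a simplification of mine in Python; the actual Axis2/WSO2 classes differ):

```python
import io

class LazyPayload:
    """Holds an input stream without reading it; copies only if accessed."""
    def __init__(self, stream):
        self.stream = stream
        self.cached = None

    def content(self):
        # Someone touched the body: cache it so it can be read and re-sent.
        if self.cached is None:
            self.cached = self.stream.read()
        return self.cached

    def write_to(self, out, bufsize=8192):
        if self.cached is not None:
            out.write(self.cached)       # cached path (body was accessed)
            return
        while True:                      # zero-copy path: small buffer, no parsing
            chunk = self.stream.read(bufsize)
            if not chunk:
                break
            out.write(chunk)

payload = LazyPayload(io.BytesIO(b"<binary or soap bytes>"))
out = io.BytesIO()
payload.write_to(out)                    # nothing read the body -> streamed as-is
print(out.getvalue() == b"<binary or soap bytes>")  # True
```

The Builder's role corresponds to wrapping the stream in LazyPayload, and the Formatter's role to write_to; content() is the "someone accessed the SOAP body" case.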

Sunday, September 12, 2010

WSO2 Con Tomorrow

The first WSO2 Con will be starting tomorrow, and to make things better, this is also our 5th year celebration. Refer to Sanjiva's blog to find out more about the topic. There is a lot to say, but I am going to put that off for later. Let me just say that, in my opinion, what we have done so far is nothing compared to the potential of the next five years to come.

I will be doing a session on Doing Enterprise Business with Processes and Rules, and the abstract of the talk is given below.

Business logic describes how a business functions and how it reacts to the different conditions that arise within the organization and the market. It is typically carefully developed and refined, and often holds the competitive advantage of an organization. The ability to track and change business logic in response to changing conditions is an invaluable asset to any agile organization. In this talk, Srinath Perera presents Business Processes and Business Rules, two alternative approaches to representing and managing business logic instead of embedding it within programming logic, and discusses when each of these three modes should be used within the enterprise.

Friday, July 23, 2010

Multi-Tenant SOA Middleware for Cloud Computing

Following are the slides for my talk on the WSO2 Carbon multi-tenancy architecture at Cloud 2010, two weeks back. The paper describes the WSO2 Stratos multi-tenancy architecture.

Multi-Tenant SOA Middleware for Cloud Computing


The full paper can be found here. The citation of the paper is given below.
Afkham Azeez, Srinath Perera, Dimuthu Gamage, Ruwan Linton,  Prabath Siriwardana, Dimuthu Leelaratne, Sanjiva Weerawarana, Paul Fremantle, "Multi-Tenant SOA Middleware for Cloud Computing" 3rd International Conference on Cloud Computing, Florida, 2010.
Abstract
Enterprise IT infrastructure incurs many costs ranging from hardware costs and software  licenses/maintenance costs to the costs of monitoring, managing, and maintaining IT infrastructure. The recent advent of cloud computing offers some tangible prospects of reducing some of those costs; however, abstractions provided by cloud computing are often inadequate to provide major cost savings across the IT infrastructure life-cycle. Multi-tenancy, which allows a single application to emulate multiple application instances, has been proposed as a solution to this problem. By sharing one application across many tenants, multi-tenancy attempts to replace many small application instances with one or few large instances thus bringing down the overall cost of IT infrastructure. In this paper, we present an architecture for achieving multi-tenancy at the SOA level, which enables users to run their services and other SOA artifacts in a multi-tenant SOA framework as well as provides an environment to build multi-tenant applications. We discuss architecture, design decisions, and problems encountered, together with potential solutions when applicable. Primary contributions of this paper are motivating multitenancy, and the design and implementation of a multitenant SOA platform which allows users to run their current applications in a multi-tenant environment with minimal or no modifications.

Tuesday, July 13, 2010

Towards Improved Data Dissemination of Publish-Subscribe Systems

Last week I presented the paper Towards Improved Data Dissemination of Publish-Subscribe Systems at ICWS 2010. It is based on work done with Rmaith and Dinesh on improving WS-Messenger from Indiana, as part of the Open Grid Computing Environments project.

Abstract: With the proliferation of internet technologies, publish/subscribe systems have gained wide usage as middleware. However, for this model, catering to a large number of publishers and subscribers while retaining acceptable performance is still a challenge. Therefore, this paper presents two parallelization strategies to improve the message delivery of such systems. Furthermore, we discuss other techniques which can be adopted to increase the performance of the middleware. Finally, we conclude with an empirical study, which establishes the comparative merit of those two parallelization strategies in contrast to serial implementations.

Mainly it discusses how to implement parallel message delivery in a Pub/Sub broker while keeping the partial order of events. Slides are given below.
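One common way to deliver in parallel while keeping partial order (a generic sketch of mine, not necessarily the paper's exact strategy) is to hash all events from the same publisher to the same worker queue:

```python
# Sketch of parallel delivery that preserves partial order: events from the
# same publisher always hash to the same worker queue, so their relative
# order is kept, while different publishers can be delivered in parallel.
from collections import defaultdict

NUM_WORKERS = 4
queues = defaultdict(list)   # worker id -> ordered event list (stand-in for threads)

def dispatch(publisher, event):
    worker = hash(publisher) % NUM_WORKERS
    queues[worker].append((publisher, event))

for i in range(3):
    dispatch("sensor-A", f"a{i}")
    dispatch("sensor-B", f"b{i}")

# Within any single queue, each publisher's events are still in order,
# even if the two publishers landed on the same worker interleaved.
for q in queues.values():
    for pub in ("sensor-A", "sensor-B"):
        events = [e for p, e in q if p == pub]
        assert events == sorted(events)
```

Only a total order across publishers is given up; the per-publisher (partial) order survives by construction.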

Wednesday, June 9, 2010

Amazon and 11 nines

Amazon has claimed 11 nines of availability. That is a very, very hard feat to accomplish, and if they have done it (I would love to know how they decided on that number), it is a groundbreaking achievement.

To see why, let's see what it means. Availability is measured as MTTF (Mean Time To Failure) / (MTTF + MTTR (Mean Time To Recovery)), expressed as a percentage; in other words, the fraction of time the system is up between failures and recoveries. Reliability classes are described in terms of the number of nines in availability. So with 11 nines, Amazon S3 would be down for only one second in every 10^11 seconds, and 10^11/(365*24*60*60) is about 3,170 years!!

In their seminal paper "High Availability Computer Systems", Jim Gray and Daniel Siewiorek defined availability classes as follows:

unmanaged 90% - 50,000 mins/year downtime
managed 99% - 5,000 mins/year downtime
well-managed 99.9% - 500 mins/year downtime
fault-tolerant 99.99% - 50 mins/year downtime
high-availability 99.999% - 5 mins/year downtime
very-high-availability 99.9999% - 0.5 mins/year downtime
ultra-availability 99.99999% - 0.05 mins/year downtime

As you will notice, even they defined only 7 nines, so we do not even have a name for what Amazon has claimed.
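The downtime figures in that table follow from simple arithmetic; here is a small sketch that approximately reproduces them (365-day year; the paper rounds its numbers):

```python
# Downtime per year implied by n nines of availability.
def downtime_per_year(nines):
    unavailability = 10 ** -nines          # e.g. 5 nines -> 0.00001
    return unavailability * 365 * 24 * 60  # minutes per year

for n in (1, 2, 3, 4, 5, 11):
    print(n, "nines ->", downtime_per_year(n), "mins/year")
# 5 nines -> ~5.26 mins/year; 11 nines -> ~5e-6 mins/year (about 0.3 ms)
```

At 11 nines, the allowed downtime is about a third of a millisecond per year, which shows why the claim is so extraordinary.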


Tuesday, June 1, 2010

WSO2 Stratos Services Released

WSO2 Stratos, which offers the WSO2 SOA platform as a service, is now live, up and running. Stratos opens up a new deployment choice for our servers, by enabling users to have them as a service. For example, if a user needs a Governance Registry, he previously had to run it on real hardware, in a virtual machine, or in the cloud. Now there is a fourth option: get it as a service through WSO2 Stratos. Of course there is much more to Stratos; see http://wso2.com/cloud/stratos/ and The Six Weeks and 12 People Magic for some more details.

You can find it at https://cloud.wso2.com/, and you can try it for free.



Saturday, May 29, 2010

E-Science: What and Why?

E-Science facilitates scientific research through applications of computer science. Although it has its roots in High Performance Computing and supercomputing, which focus on number crunching, it has evolved into a more general role. In the last few years, E-Science has appeared in the spotlight, attracted a significant amount of funding, and been able to attract researchers of the caliber of Jim Gray.

Years ago, the role of computing in research was number crunching and helping scientists keep track of their data. Now, however, computing has become an inseparable part of scientific research, and almost all research disciplines have major dependencies on computer science. Let us briefly look at some of those areas and the reasons why computers play such an integral role in the sciences.

Science is said to stand on two pillars: empirical methods and analytical methods. In the first, scientists use data collected over a sufficient period of time to find new trends and patterns in nature. In the second, based on formal models of the world and current knowledge, they try to derive new results through formal logic. More often than not, these two methods have been used in tandem, one helping the other. However, with the advent of computers, a third pillar--computation--has arrived. Since many results derived from empirical and analytical methods do not have closed-form answers, scientists often solve such problems through simulations. For example, PDEs (Partial Differential Equations), which arise from many real-life calculations, often do not have closed-form answers and therefore have to be solved through numerical analysis. Likewise, state-of-the-art weather models predict weather by simulating a model rather than solving it analytically. There are such examples in every field of engineering and the physical sciences. This aspect covers most HPC use cases of E-Science.
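As a tiny illustration of the numerical route (a toy example of mine, nothing like a real weather model), the 1-D heat equation u_t = alpha * u_xx can be marched forward with finite differences instead of being solved in closed form:

```python
# Explicit finite-difference (forward Euler) solution of the 1-D heat
# equation u_t = alpha * u_xx, with fixed zero-temperature boundaries.
alpha, dx, dt = 1.0, 0.1, 0.004          # dt <= dx^2/(2*alpha) for stability
u = [0.0] * 11
u[5] = 1.0                               # initial heat spike in the middle

for _ in range(100):                     # march the simulation forward in time
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * dt / dx**2 * (u[i+1] - 2*u[i] + u[i-1])
    u = new

print(round(max(u), 4))  # the peak has diffused well below the initial 1.0
```

Each loop iteration replaces a derivative with a difference of neighboring values; this is the essence of what large weather and engineering simulations do at vastly greater scale.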

On the other hand, most scientific calculations are easily beyond single computers. For example, high-resolution weather predictions can easily use 1000 CPUs, and space telescopes can easily generate terabytes of data in a relatively short time. Handling such problems requires multiple computers and distributed systems knowledge. Also, building efficient solutions and exploiting the parallel nature of these problems needs high performance computing (HPC) and parallel computing.

Furthermore, this reliance on computer science has forced scientists and graduate students from the sciences to learn computer science. Although some of those scientists have made significant contributions to computer science, often this is a roadblock that keeps many scientists from adopting IT to the fullest extent in their research. Consequently, making computing transparent, or in other words, building tools that allow scientists to perform science with minimal computing knowledge, is another interesting challenge being tackled by E-Science.

Moreover, efficient scientific research requires a high level of communication and collaboration among scientists. Although IT plays a significant role in that arena even now, there is a greater potential role it can play. For example, IT has greatly simplified the dissemination of scientific research and has significantly reduced the time and effort required to conduct a literature survey. However, we still lack the infrastructure for collaboration on an ongoing basis, which would allow scientists to collaboratively perform large experiments. More and more grand challenges require collaboration across multiple disciplines, and that increases the importance of such collaborations and, consequently, of the tools to enable them.

Finally, given the reduced cost of sensors and the ubiquity of information technology, there are vast amounts of data available to a researcher from the natural world. One of the challenges of our time is to learn how to make sense of that data, which is more or less the goal of science itself. In the world we live in, it is much easier to obtain data than to make sense of it. Therefore, computer science can play a major role in enabling and streamlining the process of getting from data to knowledge, which includes collecting raw data, generating metadata, archiving, searching, visualizing, generating information by processing, deriving knowledge from information, and preserving data for the future.

Current E-Science includes traditional computational topics like building supercomputers, high performance computing, parallel programming, and multi-core and GPU programming, as well as more general topics like data-intensive computing, processing systems such as workflow systems, and large-scale data storage systems. In general, E-Science tries to facilitate scientific discovery through applications of computer science, and it tries to do that in as transparent a manner as possible, hiding the details of CS as much as possible from the end users.

Given the significant interest in E-Science among systems researchers, it is interesting to ask why. The answer is twofold. On one hand, the funding available for pure computer science has greatly decreased, while the funding allocated for nation-wide cyberinfrastructure has greatly increased. On the other hand, E-Science brings into focus very large scales, in terms of both computation and data. The resulting problems are challenging even for computer scientists, and the tools and systems we have are often inadequate to handle them. Therefore, E-Science continues to push the boundaries of computer science.

There are multiple E-Science initiatives in the U.S., the UK, and Europe, each receiving millions of dollars and attracting top scientists. Microsoft Research has also made significant investments and has a major presence in E-Science, and the IEEE E-Science Conference will be holding its 6th annual edition this year. Among the venues are the annual IEEE E-Science Conference, the annual Supercomputing Conference, the annual TeraGrid conference, and the annual Microsoft E-Science Workshop.

To summarize, computing has the potential to facilitate scientific research, enabling humans to take giant leaps, and E-Science is the field of study whose goal is to make that a reality. It has attracted scientists from both the sciences and computer science, receives millions of dollars in funding, and is currently running many multi-disciplinary research projects to build the next generation of research infrastructure.




Friday, May 28, 2010

Paul on getting maximum out of Cloud:Cloud Native

Paul has written a nice blog on cloud nativity.

Well, what is it? When a new technology comes around, you can use it, but it is possible that you are only using it to handle your old use case. Often, new technologies come with new strengths and powerful features. To get the best out of one, you should be using all its strengths, not just implementing your old scenario.

Let's try to be concrete. With the cloud, you can move your apps to the cloud, and you might get some benefits through economies of scale, as your computing provider might be able to give you computing power cheaper than running your own servers. But there may still be a lot of other potential benefits of the cloud you are not getting, like elasticity. It is like buying an iPhone and only using it to make calls: even though it is cool and covers my old use case (making calls), I am getting only 10% of what the iPhone can give me. In the same way, if you are going to use the cloud, you have to look beyond your current use case and be aware of the wonderful new scenarios the cloud can enable. On the flip side, if you go around saying "I bought an iPhone, but the call quality is the same", obviously you are off the mark by a lot.

On his blog, Paul tries to define some of the features middleware needs if it is to extract the best out of the cloud on your behalf. They are: distributed/dynamically wired, elastic, multi-tenant, self-service, granularly metered and billed, and incrementally deployed and tested. Refer to Paul's blog for more details.

Shelter from Rain


Shulter
Originally uploaded by Srinath Perera

This couple is waiting till they get dry. About 10 of these guys live in very close proximity to our house. They almost come into the house, especially just after rain.

Beauty of Nature


Beautiful Nature
Originally uploaded by Srinath Perera

It is amazing how beautiful the world around us is.

Wednesday, May 26, 2010

Fixing Vista endless reboot

Recently my Vista OS got into an endless reboot, most probably because it was restarted while installing a live update. When that happens, nothing works, including safe mode. I also figured out that if you have a Linux partition, Vista installations fail with a blue screen.

So, rule number 1: make sure you stop installing live updates automatically.

The above problem can be fixed by deleting the pending.xml file (which lists the TODOs to run when the machine boots up), located at C:\Windows\winsxs\pending.xml. Simply boot with an Ubuntu live CD and delete it. This and this give more details.

Wednesday, May 12, 2010

Webinar: Making the hybrid cloud a reality

I am doing a webinar on "Making the hybrid cloud a reality" on the 12th at 10 AM PST (in an hour or so) and on the 13th at 9 AM GMT. Check here for more details.

An outline is given below.

The hybrid cloud option leverages the security of a private cloud solution and the elasticity/scalability of a public cloud. Architects designing hybrid cloud solutions need to reconcile these competing goals. With WSO2’s Cloud Services Gateway, organizations can now effectively mediate between their public and private clouds without compromising existing network firewall infrastructure.

Saturday, May 8, 2010

SOAP is 10 years old

A bit too late to blog, but Dr. Sanjiva has written a very nice post on the topic. A must read.

Friday, March 26, 2010

On Research and Writing Research Papers

I did a talk, "On Research and Writing Research Papers", for the final year batch of the Computer Science & Engineering department of the University of Moratuwa. Slides are below.



Abstract
A firm grasp of the scientific method and the ability to write clearly and convincingly is a great asset to any professional in the sciences. Conducting research and publishing peer-reviewed papers trains professionals in both the scientific method and writing. Moreover, having research papers on your resume is considered a huge plus in both industry and academia. However, conducting research and getting it published requires professionals to approach the problem and present their solutions from a unique angle. The talk will address research in general and writing research papers. Specifically, it will cover the peer review process, what constitutes a contribution, and the basic composition of a research paper, describing potential pitfalls.

Thursday, February 25, 2010

ESP vs CEP

From Kau's Blog: ESP vs CEP: "A nice introductory-level preso that explains ESP and CEP with examples. Usually these terms are used wrongly. Mythbusters: Event Stream Processing v. Complex Event Processing"

Saturday, January 23, 2010

"Why I Write" by George Orwell

In this famous essay, "Why I Write", George Orwell discusses reasons why people write, and I found the article rather interesting. He said that "All writers are vain, selfish, and lazy, and at the very bottom of their motives there lies a mystery", which is a phrase that is often quoted.

Friday, January 8, 2010

Icons for your Architecture Diagrams

If you have been wondering where to find icons for your architecture diagrams/presentations, here are a few options.

  1. Try Open Clipart, but get 0.18 (in the newer version the categories are not good, and it is very hard to find anything)
  2. If you are on Ubuntu, try /usr/share/icons/gnome/scalable; there are lots of .svg files there
  3. Look for the Wikipedia entry on the subject; the images there are reusable (same for Wikimedia)
  4. Google with the Creative Commons search settings

Sunday, January 3, 2010

Fourth Paradigm, an E-Science book by MSR

Microsoft Research has published the book "Fourth Paradigm", which describes how to handle very large data sets. If you are interested in E-Science use cases, this is very good reading. It is partly based on Jim Gray's last work. The book is free and can be downloaded from here.