Wednesday, February 27, 2013

The Holy Grail in Clouds: Scientific users

Cloudscape V is proudly showing the successes of the cloud IT community over the last five years of effort in standardization, piloting and cooperation. The impact of cloud provisioning models in industry, at the level of both large companies and SMEs, is beyond doubt. However, it seems surprising that in science, where many of the technologies and procedures emerged, the same breakthrough has not yet taken place. Several studies, such as the Magellan report or the VENUS-C project, have examined the requirements and expectations of scientific disciplines and identified gaps that hinder take-off. Last year at Cloudscape IV I blogged on whether the introduction of clouds has required users to acquire additional skills to act as system administrators.
Scientists are an interesting market for cloud providers, and several initiatives have been created in recent years to pave the way for the entry of scientific communities into that market. Cloud4Science is one such initiative, funded by Microsoft. Its design principle is to be attractive to both resource providers and users. It aims to create an open-source, self-sustaining community on top of a set of components and data sources maintained by the cloud providers. Users will benefit from the availability of data, sharing capabilities and PaaS components released under open source licenses.
The first target of Cloud4Science is one of the most prominent scientific communities in the cloud: bioinformatics. Bioinformatics experiments face exponential growth in resource requirements, in terms of both data and computing. Bioinformaticians are used to, or even forced to, work under the “open availability” paradigm, and multidisciplinarity (biology, computer science, chemistry) is the norm for most research groups. And the key is data. The boom in genomics and proteomics would not have been possible without the wide sharing of results in a standardized and open way. It is one of the communities where the “Open Science” model is best developed. At the same time, however, this exponential growth of data makes exploiting it and keeping it up to date difficult for the mass of bioinformaticians (members of the so-called “long tail of science”) who rely on local computers. This is exactly where the business opportunity lies for Microsoft and other market players, who will ensure that data and tools are kept up to date, making the platform attractive.
I envisage Cloudscape VI featuring new demonstrations of the massive use of cloud resources by the bioinformatics community.

Ignacio Blanquer
