Monday, April 13, 2009

Sustaining the software (cyber/e-) infrastructure

A few weeks ago I attended a Cyberinfrastructure Software Sustainability and Reusability Workshop organised by Indiana University for the National Science Foundation (NSF). The participants were mainly academics from the USA, with a scattering of people from Europe and even further afield.

The NSF supports an Office of CyberInfrastructure which has as its remit:

coordinates and supports the acquisition, development and provision of state-of-the-art cyberinfrastructure resources, tools and services essential to the conduct of 21st century science and engineering research and education.

Cyberinfrastructure (in the USA) is frequently equated with e-infrastructure (in Europe). While this is broadly true, e-infrastructure generally focuses (in my view) on the provision of capability and the software needed to access it. Cyberinfrastructure, on the other hand, encompasses e-infrastructure but also looks at the more application-oriented tools and libraries needed to support computation and data analysis. This is a much broader remit!

Anyway, with this broader remit the NSF understandably has a larger software base that it needs to maintain and support. How best to do this was the primary subject of the workshop, and it is of interest to the EGEE community as it looks to see how its various software outputs could be sustained.

A collection of position papers prepared by the participants before the workshop is available online. While all endorsed the open source model, it was recognised as not being a solution in its own right. Having dedicated staff contributing to open source development was seen as essential. In addition, it was recognised that software at different stages of its life (prototyped vs. developed vs. sustained vs. maintained) needs different approaches and levels of effort.

All of this work needs to be guided and prioritised by the community it is being devised to serve (see for example this recent announcement from the Globus Alliance). Other more direct models were discussed during the breakout sessions - including a model used within the UK HPC community (amongst others) where investigators are given the ability to 'spend' allocated credits with designated support units. Another issue raised at the workshop was how the (mostly academic) providers, ranging from large multi-site projects to smaller single-site development teams, should be incentivised and rewarded.

Moving beyond the mechanics of keeping the software running, the issue of what software to keep running came up. There seemed to be a strong consensus on having a defined 'platform' with a declared roadmap that would provide a basis for the community to build upon - both for tool and application developers. Exactly what to sustain was a can of worms we decided not to open! One approach that was discussed was to decompose the desired functionality into areas, prioritise these areas, and then make sure that areas at the bottom of the stack are sustained before the upper layers are filled - going up the stack as far as the budget would stretch!
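
To make that last idea a little more concrete, here is a minimal sketch of 'going up the stack as far as the budget would stretch'. It is purely illustrative: the layer names, the cost figures and the `fund_bottom_up` helper are my own assumptions and nothing like this was presented at the workshop.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str           # hypothetical functional area
    annual_cost: float  # hypothetical cost, in arbitrary budget units


def fund_bottom_up(layers, budget):
    """Fund layers in stack order (lowest first).

    Stops at the first layer that cannot be afforded, so an upper
    layer is never funded while a layer beneath it is left unfunded.
    """
    funded = []
    remaining = budget
    for layer in layers:
        if layer.annual_cost > remaining:
            break
        funded.append(layer.name)
        remaining -= layer.annual_cost
    return funded, remaining


# Illustrative stack, listed from the bottom layer upwards.
stack = [
    Layer("core services", 40.0),
    Layer("data management", 30.0),
    Layer("workflow tools", 25.0),
    Layer("application portals", 20.0),
]

funded, left_over = fund_bottom_up(stack, budget=100.0)
print(funded, left_over)
```

The sketch deliberately stops at the first unaffordable layer rather than skipping it and funding something cheaper higher up, since the point of the approach was that upper layers should not be sustained on top of an unsustained base.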

So how does this relate back to EGEE and the provision of a sustainable e-infrastructure within Europe? A group made up of the leading grid middleware providers in Europe (gLite, ARC and UNICORE) has been meeting regularly since December to understand how middleware will be provisioned within the EGI (European Grid Infrastructure). Working from the framework given within the EGI_DS Blueprint, we are filling in the details of how the software will be integrated from the middleware providers and moved through to being installed on the National Grid Infrastructures.

What needs to be maintained from the current software base, and where there are areas that still need development, are critical questions that still need to be answered... but splitting maintenance and development into two clearly distinct activities is one option that has been discussed. So is defining which components need to be maintained as part of a core European software distribution... and which do not!

More discussions on this are taking place later this month... watch this space!
