After a sojourn in Lisbon last week for eChallenges, it’s back to southern regions for the first EUDAT conference in Barcelona. For an Amsterdam resident it’s a pleasant experience to be able to sit outside in October without huddling inside a winter coat, and perhaps this is one of the reasons behind the high attendance at the event – another being, of course, the draw of EUDAT’s first annual gathering for the growing ‘big data’ community.
With the Horizon 2020 funding programme moving into its next phase of development in January 2013, this is an area in which the European Commission takes a very keen interest. Kostas Glinos, Head of e-Infrastructures at DG-CONNECT, brought us up to date with the plans so far. First he reminded us that big data is itself an infrastructure: it is used by a multiplicity of scientific research teams, covers a broad area of interest, and needs to be open, accessible and available 24/7, while providing high degrees of reliability and trust. These are all characteristics of infrastructures such as roads, electricity and grid computing.
Over the last five or six years, the Commission has spent a not inconsiderable 100 million euros on data-related projects such as EUDAT. The Commission’s aim is to put the user at the centre, provide end-to-end services, break down barriers between disciplines and offer services that are as broad as possible. The researcher increasingly finds herself working in a digital environment; what she needs is a digital data shop. To provide this, there is a need for coordination at the European level: an e-infrastructure both of and for data. And not only at the European level but internationally: there are joint initiatives between the EC and the US National Science Foundation through the iCORDI infrastructures call, plus collaboration between the US, Australia and Canada through the Research Data Alliance.
Essentially, says Kostas, there is a tension between taking too narrow a focus and spreading the efforts too thinly, and there are many areas that need addressing in the data arena. The growing data deluge means that a service-driven e-infrastructure is needed to take research into the exascale. Funding should support community-driven initiatives to meet community needs. There needs to be a data management plan and comprehensive policies for data curation and preservation. Neelie Kroes has committed to providing all data from Horizon 2020-funded projects through open access, across institutional, disciplinary and national boundaries. But there also needs to be a way to link data to authors and funders, a sort of “scholar’s passport”, plus an educational framework for data scientists. Coordination is needed on a global scale, both top down and bottom up.
The challenges to all this, as ever, are around sustainability and funding: it is generally easier to get money to set things up than to drip-feed support for ongoing development and archiving. For the European Commission, a Digital European Research Area means data-centric science and engineering; a robust computational infrastructure spanning grids, clouds and supercomputing; accessible research and education networks; and a thriving set of virtual research communities and e-scientists. Innovation will also be supported through pre-commercial procurement and through services both for, and provided by, industry and SMEs.
For Kostas, the trends in research are towards large-scale global collaborations and the use of rare or remote resources, generating data-intensive, exponentially growing datasets. Virtual research communities rely on in silico experimentation and high-performance computing simulations. The discussions during the rest of the meeting will no doubt focus on the role of the data community in supporting this vision.