Thursday, March 22, 2012

Taking data into the next 1000 years at ICRI2012

What will life be like in 1000 years? Will we all still be crammed onto the same planet, or will the human race have spread out to the stars? What languages will we be discussing our research in? Or instead will we all be permanently plugged into augmented virtual reality in a tank somewhere? It’s possible we won’t even exist at all, victims of a catastrophic side swipe by a passing asteroid on its way to the edges of the Milky Way.

These are some of the questions that spring to mind when considering life in the distant future... but perhaps not your first thought when it comes to talking about data. However, at this morning’s session at ICRI2012, Alan Blatecky of the Office of Cyberinfrastructure in the US told us that data should be a thousand-year effort, not just something that lasts the 2 or 3 years of a project. We should start to think in terms of ‘data as a service’, and not forget that we still need to work out how to store the data (super) long term and make sure that we recognise the impact of the data that we have.

Sergio Bertolucci of CERN coined a new term when talking about the characteristics of global research infrastructures. He told us that globalisation has happened in a few fields of science, but that it has been shaped by two opposing drivers: competition and collaboration. Together these give us global research, or ‘coopetition’. We need to strengthen the network of research infrastructures in the European Research Area so that it can play a major role on the global stage, and back it up with visionary global policies… that nevertheless take a pragmatic approach to handling coopetition.

Janet Thornton, of EMBL-EBI and ELIXIR, talked about her experiences of drawing dispersed databanks together, in contrast to CERN, which generates the data at source. EMBL has been setting up databanks in the areas of genome sequencing, protein expression and structure from all over Europe, often establishing the de facto data standards in the process. They have 20 major databases – when they started out they could have stored the lot on the equivalent of a smartphone. Now they have 4 million users per year and a healthy 14 petabytes of data. Thornton called for long-term core funding, and pointed out that there is currently a lack of financial support for pan-European components. We need real political support behind research infrastructures – after all, they are excellent value for money, seeing as sharing data costs only 1% of what it costs to generate it.

Jacqueline McGlade, of the European Environment Agency, introduced us to a mashup website called Eye on Earth, a global public information service for sharing data from all over the world, built through a collaboration between industry and academia. You can easily manipulate data in a way that previously might have needed PhD-level training – potentially a great tool for citizen scientists. Seeing as we can measure everything that moves, as McGlade put it, data has suddenly become THE currency for governments. It’s all about getting the best access to the best data.

So all this might start to put us in a good position to welcome in the next 1000 years – for some additional ideas, try Arthur C Clarke. It certainly got me thinking during the coffee break...
