Monday, March 11, 2013

From Galileo to LIGO: Edward Seidel

Galileo's career was marked by many discoveries, multiple inventions and major contributions to physics and observational astronomy. It's difficult to imagine that back in the 1600s, Galileo worked alone. Over the last four centuries (and more notably in the last 15 years), the culture of how science is carried out has changed rapidly. Science is now more collaborative; it is, of course, also increasingly data-driven and computational.

Astrophysicist Edward Seidel, Senior Vice President of Research and Innovation at the Skolkovo Institute of Science and Technology, gave a thought-provoking keynote at the 10th e-Infrastructure Concertation meeting last week. He provided some recommendations on how best to develop an efficient e-infrastructure ecosystem.

Modern science and society are being fundamentally transformed by data. Seidel's own field, gravitational physics, has been completely transformed from being highly mathematical and theoretical to being highly computational and data-driven. Gamma ray astrophysics has been for four centuries a relatively small-scale science, but over the last two decades there has been radical change in both data volumes (growing by factors of 1000 every 5 years) and collaborations (e.g. the Laser Interferometer Gravitational-Wave Observatory, or LIGO). Seidel describes how a new field called transient data-intensive astronomy is emerging, and how the complexity of the universe can only be tackled by globally distributed multidisciplinary collaborations involving experts in relativity, hydrodynamics, nuclear physics, radiation, neutrinos etc. Libraries will still remain important for archiving and curating data. The Johns Hopkins Library recently curated 40 terabytes of data from the Sloan Digital Sky Survey.

"There is an increasing pressure to make data available in a much more global sense, so different levels of data policy are paramount," said Seidel. "As different instruments (radio telescopes, optical telescopes) are required to understand each fundamental force, any solution will require integration across disciplines end to end. Communities will also need to share data, software, and knowledge in real time".

Seidel also pointed out that open and sharable software environments enable new science that was not possible before, and the software infrastructure is as important as hardware in the e-infrastructure landscape. The Community Einstein Toolkit is an example of this. It is open source software, which is now used by 67 members on 29 sites across 11 countries. It was first developed by the astrophysics network in 2000 to enable Einstein’s equations to be solved more routinely.

There are also many differences in how data is gathered between big science experiments and the long tail of science. Many big data projects (~1%) are special and highly organised, but data creation in the long tail of science (the other 99%) is more heterogeneous, sometimes hand generated, often not curated etc. So the question arises: how best do you harness the power of this long tail of data?

Edward Seidel identified five crises/issues/questions that science will have to deal with:

1.    Computing technology – What new models (clouds, grids, GPUs) are available?
2.    Data, provenance and visualisation – What is an international data infrastructure? How do we create data scientists?
3.    Software should be treated as an integral part of e-Infrastructure
4.    Modern science requires new university structures, interdisciplinary in design
5.    How do we educate people in this environment, and help universities in transition? Advances in campus culture and policy on data are just as important as the underlying e-infrastructure.

As grand challenge communities begin to converge to solve complex problems, Seidel's main message is that researchers collaborate and work by sharing data, which places requirements on e-Infrastructure: software, networks, collaborative environments, data etc. This raises many questions about reproducibility, access, university structures, and what counts as a publication. "E-Infrastructure ecosystem must support computing, instrumentation and data services, but requires strategic thinking not only at the campus level, but from the campus level to the international level", concludes Seidel.
