After nearly 20 hours of travel, I arrived in beautiful
Amsterdam at the Academic Medical Center (AMC, or Academisch Medisch Centrum in
Dutch) for the International Workshop on Science Gateways for Life Sciences.
The event was held in conjunction with the HealthGrid 2012 conference.
The overlapping events and the location at the AMC meant there was a real draw from
the medical community. Thank you to the organizer, Sandra Gesing, for inviting me
to give a talk at this event.
The AMC is one of the largest hospitals in the Netherlands
and the conference was held in a conference room right off the main atrium
inside the hospital.
Presentations featured speakers from many countries – the
Netherlands, Switzerland, Ukraine, Italy, Hungary, Germany, the UK, the US,
Poland and more. The gateways presented handled medical imaging, analysis
of millions of samples and genomic analysis. Speakers described using gateways to
solve problems involving disparate workflows, interoperability, huge varieties
of back-end codes, sharing data (and protected data, too) across institutions,
data curation and community building. As the session progressed, presenters
observed parallels between other projects and their own work, and I noticed
many references to the work of previous presenters. To me that is one of the
primary benefits of such a workshop – to learn about the related work of others
and to be able to speak with an author immediately after a presentation. I plan
to investigate the use of HDF5 containers for many small files on a parallel
filesystem, as presented by Vincent Rouilly (ETH Zurich) in his iBRAIN2 work.
Over and over, speakers mentioned the very rapid pace of
development in the life sciences and warned that any gateway-building effort
with a long development cycle was doomed.
Roberto Barbera (University of Catania and INFN, Italy) made
many wonderful points in his keynote talk. He illustrated the rapid advances in
Web technologies with vintage screenshots of browsers some of us remember,
like Lynx, Mozilla and Netscape. He mentioned that gateways feature prominently
in the eResearch 2020 final report. In
fact, gateways are a centerpiece of Europe’s GRDI2020 vision (http://www.grdi2020.eu)
for a global research data infrastructure in 2020.
Roberto contrasted the number of users of social networks with the number
of EU users and the number of EGI users
to make the point that we can increase the scientific user base through the use
of Web technologies. Through Lego imagery, he highlighted the importance
of standards and building blocks for constructing both simple and complex structures. Catania
gateways are used by 114 organizations in 41 countries, with an increased focus
on Latin America.
Personally, I was impressed by the job launching modules
presented by both Roberto and Peter Kacsuk (see next paragraph), which interface
to many different grid infrastructures (Globus, UNICORE, gLite, OurGrid, GARUDA,
GOS). For gateways that use modules such as these, middleware upgrades and
infrastructure changes would be relatively straightforward for gateway
developers.
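To make the idea concrete, here is a minimal Python sketch of such a module.
It is not the Catania or WS-PGRADE code; every class and method name below is
hypothetical, and the adapters only return a placeholder string where a real
one would call the middleware's client. The point is that the gateway talks to
a single launcher, so a middleware upgrade touches only one adapter.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Job:
    """A job description as the gateway sees it (hypothetical, simplified)."""
    executable: str
    arguments: tuple = ()


class MiddlewareAdapter(ABC):
    """Hypothetical common interface: one subclass per grid middleware."""
    name = ""

    @abstractmethod
    def submit(self, job):
        """Submit the job and return an identifier for it."""


class GlobusAdapter(MiddlewareAdapter):
    name = "globus"

    def submit(self, job):
        # Placeholder: a real adapter would call the Globus client here.
        return f"globus:{job.executable}"


class UnicoreAdapter(MiddlewareAdapter):
    name = "unicore"

    def submit(self, job):
        # Placeholder: a real adapter would call the UNICORE client here.
        return f"unicore:{job.executable}"


class JobLauncher:
    """The only class gateway code needs to know about."""

    def __init__(self, adapters):
        # Index the adapters by name so a target can be chosen per job.
        self._adapters = {adapter.name: adapter for adapter in adapters}

    def submit(self, job, target):
        if target not in self._adapters:
            raise ValueError(f"no adapter registered for {target!r}")
        return self._adapters[target].submit(job)


launcher = JobLauncher([GlobusAdapter(), UnicoreAdapter()])
job_id = launcher.submit(Job("blast", ("-query", "seqs.fasta")), "unicore")
```

Supporting another infrastructure then means writing one more adapter class
and registering it; the gateway code that calls `launcher.submit` stays the same.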
Peter Kacsuk (MTA SZTAKI, Hungary) spoke about his SCI-BUS
(SCIentific gateway Based User Support) program, a 3-year EU-funded project
that not only creates a WS-PGRADE-based framework for gateway building but also
provides support for those developing gateways. Eleven gateways were created
in the first project year, several of which were presented at the IWSG workshop, and
an exciting summer school is being held in Budapest this summer. My
initial reaction was to look into joining SCI-BUS as an associated member,
provided my organization agrees to the terms of the MOU.
It was tremendously exciting to me to observe so many
overlapping areas of interest between presentations at IWSG and my own work in
the NSF XSEDE program. The BioAssist project from the Netherlands
Bioinformatics Centre mirrors work we in XSEDE are trying to do to extend the
capabilities of Galaxy. Many of the codes Bhanu Rekapalli is addressing in his
super-scaling work in the Systems Biology Workbench (BLAST, HMMER, ClustalW,
MUSCLE, PhyML, GARLI, RAxML, MrBayes, DOCK6, AutoDock, AMBER, NAMD) were also
mentioned by presenters at IWSG.
The MoSGrid presentation was the rare presentation from a
user of gateways – a real scientist (Ines dos Santos Vieira) rather than a
gateway developer. It was interesting to hear her perspective on what aspects
of MoSGrid improved life for her as a scientist.
I participated in a very interesting panel session on the
use of clouds and a session where attendees discussed the EGI roadmap going
forward. It was interesting to function as an observer and consider our own
planning activities.
Aaron Golden from the Albert Einstein College of Medicine in
New York City gave a very NYC-paced presentation on the Einstein Science
Gateway for genome sequencing on campus. The college is both a teaching and a
clinical center, seeing 80,000 patients per year. Aaron’s background is in
astronomy, and it was interesting to hear the parallels between his work on the
Sloan Digital Sky Survey and the Einstein gateway. While both involve web
interfaces to large data collections, Aaron pointed out that there is only one universe,
while in biology the universe is re-created each time a sequencer is run. It’s
funny that Aaron traveled from NYC and I traveled from San Diego to Amsterdam so
I could learn how successful his group has been using XSEDE with little
assistance from my team.
Finally, I was excited to learn that the High-Performance
Computing Infrastructure for South East Europe’s Research Communities (HP-SEE)
has two BlueGene machines among their 20,000 combined cores. Truly, I learned a
lot at this workshop and enjoyed my time on a small houseboat in Amsterdam.