Friday, December 12, 2008

Final day at e-science 2008

This morning I attended two sessions - one on bioinformatics and the other on cheminformatics, which were handily in rooms across from each other. (Everything I wanted to go to today clashed with everything else - all the strongly user-focused sessions seemed to be today!)

I found the bioinformatics session particularly interesting, especially the talk by Simon Lin, Director of Bioinformatics Consulting (Robert H. Lurie Comprehensive Cancer Center & Biomedical, Northwestern University). He is using Amazon cloud computing (which is turning out to be affordable) rather than the traditional computing clusters he also has access to. Part of the reason is that it is easier to wipe images on cloud systems than on university resources. There is also no queue on cloud resources, unlike many university systems. Lin is using Amazon to demonstrate proof of concept, as it is the first commercially available cloud computing system. However, he is currently investigating running an internal cloud using Eucalyptus, something which was discussed earlier in the week at this conference.

An interesting question was asked at the end about the difference between cloud and grid. Lin pointed out that the two are closely related but differ in some important ways: a grid joins distributed clusters together using an extra layer, whereas a cloud uses virtual images.

Lin's talk was followed by Jake Chen from Indiana University – Purdue University (IUPUI), who spoke about "Bio-computing and Knowledge Discovery of Molecular Networks". He described a method for finding potential drugs to treat Alzheimer's disease by searching the PubMed abstract database for specific proteins. His group has found many novel compounds that have been used for treating other diseases but have not yet been applied to Alzheimer's.

This afternoon's keynote was given by Ed Seidel, Director, Office of Cyberinfrastructure, who spoke about where cyberinfrastructure (CI) is going over the next 2 years. Nothing is set in stone yet, but plans are beginning to be formed. Seidel is aggressively trying to increase the budget for CI and it seems to be working. He also talked about the lack of suitably trained computational scientists coming out of universities. No one seems to be teaching the skills required to deal with the infrastructure that is now in place. The lack of funding for software was also discussed: funding is very much focused on the actual machines, as a machine in the Top 500 is better understood than a piece of software by the people who matter in the Senate. He also spoke of the new machines coming onto TeraGrid over the next few years.

Overall, his talk brought home to me that every grid faces the same problems at the moment. We are all really at the beginning of implementing and building a system for researchers (I'm not going to include the particle physicists in this!) and we're all in the same boat. There was plenty of food for thought in his presentation, and I'd recommend watching it when it comes online at the end of next week.