Monday, February 27, 2012

Managing data with iRODS: ISGC2012 workshop

Yesterday morning I headed over to ASGC for the Sunday workshops, which lead up to the main ISGC 2012 programme. Trying to ignore the steady drizzle and the jetlag (both my computer and body clock were telling me it was 2 o’clock in the morning), I joined the iRODS workshop, delivered by Reagan Moore of the University of North Carolina and DICE (Data Intensive Cyber Environments). iRODS, the Integrated Rule-Oriented Data System, is a policy-based data management system built on open source software. It helps people to manage large collections of data, which might be scattered across multiple sites.

What I found interesting about iRODS is that it scales from managing personal data collections, such as a few thousand digital photos, up to huge international projects such as the Square Kilometre Array. If they need to, users can move hundreds of thousands of files around at a time using an automated process. iRODS is driven by a rule engine, so each installation can be configured differently: you choose which policies and rules to include without editing the core code.

Some examples of the types of admin policies that can be applied are file migration, format migration, distribution and usage reports. In the area of validation, policies cover integrity, replication, required metadata, derived data products, audit trail compliance, distribution and retention. Processing of data can be done either at source, or by calling routines remotely.

To try it out, Moore led the group in installing iRODS and accessing a data set in California (located not just on another continent but, thanks to the time difference, still in the previous day). Having done this, Moore took a quick poll of the possible use cases in the room. Perhaps not surprisingly given the audience, the highest interest was in running regional, national or international data sets. However, there was also significant interest in using iRODS for personal digital libraries, archives and federated collections.

The iRODS User Meeting takes place later this week on 1-2 March, and I would quite like to be a fly on the wall, just to see the range of applications in action. With data management at the front of many researchers' minds in Europe at the moment, I’m expecting to hear lots more about iRODS in the future.
