Wednesday, March 27, 2013

A vision for metrics and citation

The first two speakers at ISGC2013 were Dr Steven Newhouse, Director of the European Grid Infrastructure (EGI), and Dr Daniel Katz, Program Director in the Division of Advanced Cyberinfrastructure at the National Science Foundation (NSF). Together, they provided perspectives from both sides of the Atlantic Ocean on ways to measure the success of e-infrastructures, including the people and services that contribute to their development. Steven Newhouse introduced some of the metrics currently being used to determine the impact of e-infrastructures in Europe, drawn from projects such as e-FISCAL, e.inventory and ERINA+ (read more about this in a recent iSGTW article).

Daniel Katz, NSF
So how do the developers and architects of software and e-infrastructures know when their work is successful? And, more importantly, how does the outside world (especially funders) recognise exactly who collaborated on and contributed to each scientific output? For the developer of an open-source physics simulation, possible metrics could include recording downloads and users, as well as tracking paper citations. However, Katz suggests that a more valuable, though harder to monitor, measure is the number of citations received by each paper that cites your software. He emphasized the need to develop a system where credit is given to those involved in building the software and data sets. The NSF has proposed a number of solutions: a vision for citations and metrics.

One recommendation is that all products, including software, data sets and workflows, be registered. Next, a credit map is developed: its input is a weighted list of contributors (both people and other products), and its output is a DOI. This could lead to transitive credit, enabling developers to gather evidence of their application's widespread use. Product usage could also be recorded automatically by incorporating a general usage-tracking code into each software package.
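The transitive-credit idea can be pictured with a short sketch: each registered product carries a weighted contributor list, and credit for a paper flows down through product dependencies until it reaches people. The product names, people and weights below are purely hypothetical illustrations, not anything from the proposal itself.

```python
# Sketch of transitive credit: each product maps contributors (people or
# other products) to weights that sum to 1. A paper's credit is pushed
# recursively through product dependencies to the people behind them.
# All names and weights are invented for illustration.

credit_maps = {
    "paper":    {"alice": 0.5, "sim-code": 0.5},
    "sim-code": {"bob": 0.6, "num-lib": 0.4},
    "num-lib":  {"carol": 1.0},
}

def transitive_credit(product, share=1.0, totals=None):
    """Recursively distribute a product's credit share down to people."""
    if totals is None:
        totals = {}
    for contributor, weight in credit_maps.get(product, {}).items():
        if contributor in credit_maps:      # another product: recurse
            transitive_credit(contributor, share * weight, totals)
        else:                               # a person: accumulate credit
            totals[contributor] = totals.get(contributor, 0.0) + share * weight
    return totals

print(transitive_credit("paper"))
# alice: 0.5, bob: 0.3 (0.5*0.6), carol: 0.2 (0.5*0.4*1.0)
```

The appeal of the scheme is visible even in this toy: carol, two levels removed from the paper, still receives a quantifiable share of its credit.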

Commercial tools are generally tracked by the money they generate, but this doesn't help academics understand which tools were used for which outcomes, says Katz. Tokens could instead be provided as part of science grants, with users distributing them to the tools they rely on, either during or after use.
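One way to picture what funders might do with such tokens is to aggregate the per-grant allocations and rank tools by the credit they earned. The grant reports and tool names here are entirely made up; this is only a sketch of the aggregation step, not anything specified in the talk.

```python
# Hypothetical token reports: each dict is one grant's allocation of
# credit tokens across the tools it used (names and numbers invented).
from collections import Counter

reports = [
    {"tool-a": 40, "tool-b": 60},
    {"tool-a": 70, "tool-c": 30},
    {"tool-b": 50, "tool-c": 50},
]

# Sum token allocations across all grants to rank tools by earned credit.
totals = Counter()
for report in reports:
    totals.update(report)

print(dict(totals))  # tool-a: 110, tool-b: 110, tool-c: 80
```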
At the moment, it is largely up to e-infrastructures not only to demonstrate an active user base, but also to monitor their quality of service, track citations and understand usage patterns. Professor Alexandre Bonvin described how WeNMR is keeping track of how its services have contributed to progress in structural biology. In 2012, the project produced a significant output, with 36 publications acknowledging WeNMR. Their analysis found use of the e-infrastructure to be application/portal and discipline dependent, with a large fraction of users seeming to use specific tools only for rather limited lifetimes (~0.5 to 1 year). Bonvin suggests that e-infrastructures should expect only a fraction of potential users to be active at any given time.

Unfortunately, researchers don't always acknowledge the full array of tools that helped them with their work, so developing an effective reward system for the developers and architects working behind the scenes will be vitally important going forward.
