“If you build it, they will come.” That’s a quote (or slight misquote actually)
from the film Field of Dreams. The premise is that if you build something
people really want (baseball field, website, supercomputer,
cyberinfrastructure), they will find their way to it, and all will live happily
ever after.
In the research and e-infrastructures field this is perhaps
not the full story. It's true that potential research users are out there in large numbers,
but matching them to the resources of their dreams often takes a lot of work,
and not a little persuasion, outreach and training.
Yesterday’s talk by XSEDE’s David Hart of the National
Center for Atmospheric Research on usage patterns for XSEDE resources was
particularly interesting for those reasons – after a year’s operation and the
transition from TeraGrid, who are XSEDE’s users and how do they use the
infrastructure?
From the official project rankings, it’s difficult to get a
clear picture. Projects move up and down in the rankings over time, and the
smeared-out data doesn’t give much indication of usage. Hart decided to derive
his own picture of the data, and looked at three factors: injection, continuation
and completion, i.e. how many projects were new in any one year, and how long did
they last?
Hart found that, in a given year, around 40% of projects were likely
to be new, 40% continued beyond that year and 30% completed during the year. Larger
projects were more likely to continue beyond the year; smaller projects often started
and ended in the same year. The most likely duration is less than 6 months, corresponding
to about 1000 projects. Around 63% lasted more than 18 months.
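For anyone wanting to try this kind of breakdown on their own accounting data, here is a minimal sketch in Python. The definitions of "new", "continued" and "completed" are my own assumptions, not necessarily the ones Hart used, and the project names and dates are made up for illustration. Under these definitions a project that starts and ends in the same year counts as both new and completed, so the shares overlap rather than partition the total.

```python
from datetime import date

def classify_projects(projects, year):
    """Return the share of projects active in `year` that were new,
    continued, or completed.

    `projects` maps a project name to (first_active, last_active) dates.
    Assumed definitions (hypothetical, for illustration only):
      new       -> first activity falls in `year`
      continued -> last activity falls after `year`
      completed -> last activity falls in `year`
    """
    # Keep only projects with some activity window overlapping the year.
    active = [(s, e) for s, e in projects.values() if s.year <= year <= e.year]
    total = len(active)
    if total == 0:
        return {"new": 0.0, "continued": 0.0, "completed": 0.0}
    new = sum(1 for s, _ in active if s.year == year)
    continued = sum(1 for _, e in active if e.year > year)
    completed = sum(1 for _, e in active if e.year == year)
    return {
        "new": new / total,
        "continued": continued / total,
        "completed": completed / total,
    }

# Made-up sample data: name -> (first activity, last activity).
sample_projects = {
    "a": (date(2011, 3, 1), date(2013, 5, 1)),   # continued
    "b": (date(2012, 2, 1), date(2012, 6, 1)),   # new and completed
    "c": (date(2012, 9, 1), date(2013, 1, 15)),  # new and continued
    "d": (date(2010, 1, 1), date(2012, 4, 1)),   # completed
    "e": (date(2013, 1, 1), date(2013, 2, 1)),   # not active in 2012
}

shares_2012 = classify_projects(sample_projects, 2012)
print(shares_2012)
```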
When Hart turned his attention to the users of XSEDE, it
turned out that projects are more stable than individual users – 48% of users in any
given year were new (meaning they had never submitted a job before) and about 30%
carried on using the infrastructure into the next year. About 40% of users are
graduate students, however, which may help explain the high turnover.
Out of the nearly 8400 users, 4000 users lasted less than 6
months. For about 2000 users, their first and last jobs were less than 15 days
apart. This figure does not include the number of users who went to the trouble
of logging in but never submitted a job at all.
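The "first job to last job" measure is easy to reproduce if you have a job log to hand. The sketch below assumes a simple list of (user, submission date) pairs; that layout, the function names and the sample users are all hypothetical, not XSEDE's actual accounting format. As in the figures above, users who never submitted a job do not appear at all.

```python
from datetime import date

def user_spans(jobs):
    """Map each user to the number of days between their first and last job.

    `jobs` is an iterable of (user, submit_date) pairs (an assumed layout).
    A user with a single job gets a span of 0 days.
    """
    first, last = {}, {}
    for user, day in jobs:
        if user not in first or day < first[user]:
            first[user] = day
        if user not in last or day > last[user]:
            last[user] = day
    return {user: (last[user] - first[user]).days for user in first}

def count_shorter_than(spans, days):
    """Count users whose first-to-last-job span is under `days` days."""
    return sum(1 for span in spans.values() if span < days)

# Made-up job log for illustration.
sample_jobs = [
    ("alice", date(2012, 1, 10)),
    ("alice", date(2012, 1, 20)),
    ("bob", date(2012, 2, 1)),
    ("bob", date(2012, 11, 1)),
    ("carol", date(2012, 5, 5)),
]

spans = user_spans(sample_jobs)
short_lived = count_shorter_than(spans, 15)
print(spans, short_lived)
```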
Hart was keen to point out that short-lived usage of the
infrastructure is not in itself something that you would want to avoid – this level
of usage could still be exactly what the researchers and grad students need for
their work, or to complete assignments and training. Comparing these patterns
with the closest European equivalents to XSEDE would be interesting – PRACE,
the supercomputing network, has supported 103 projects over 5 calls,
representing more than 2.7 billion core hours. The European Grid Infrastructure,
the pan-European grid computing infrastructure, has over 20,000 users across
Europe and beyond and runs 1.2 million jobs a day.
If you build it, and want 'them' to come, I guess you need to know your user.