In this special guest feature from Scientific Computing World, Dan Katz from the University of Chicago and the Argonne National Laboratory writes that the challenge to the scientific community is how to do the work needed to make software sustainable.
Software is omnipresent in science, as the scientific process becomes increasingly digital. About half the papers in recent issues of Science discuss findings that were made using software-intensive projects, and almost all the projects used software at some level, ranging from system software, middleware, and libraries, through science gateways and web portals, to computational modelling and data analysis.
In fact, scientific research is dependent on maintaining and advancing a wide variety of software. However, software development, production, and maintenance are people-intensive; software lifetimes are long compared to those of hardware; and the value of software is often underappreciated.
Because software is not a one-time effort, it must be sustained, meaning that it must be continually updated to work in environments that are changing and to solve changing problems. Software that is not maintained will either simply stop working, or will stop being useful.
Software is created for distinct purposes: research software is developed by a researcher for their own research, whereas infrastructure software is developed for use by people other than the developer. This might seem to imply that only infrastructure software needs to be sustained, but there are at least two arguments against this. Firstly, most infrastructure software starts as research software; secondly, for science to be reproducible, a particular project’s research software needs to be reusable over some period. So while research software doesn’t need to be as sustainable as infrastructure software, its sustainability still needs to be considered.
Open Source and good practice
The challenge to the scientific community is how to do the work needed to make software sustainable. Either reducing the amount of work, or bringing together new resources, can make this more successful. The first is often thought of as using good software engineering practices (which has the additional benefit of making the software more likely to be correct), and the second can potentially be satisfied through Open Source communities.
In some sense, these are both social issues, rather than technical ones. The goal is to encourage software developers, whatever the type of software they are developing, to do the extra work needed to make their own software sustainable and to build or join communities whose members work together on shared code. The task is how to achieve this.
Career paths in software engineering
In academia in particular, this is a big challenge. Faculty members, students, postdocs, and staff members may all be developing scientific software. But university employees at all levels who are interested in both science and software often feel compelled to leave academia, due to the lack of recognition of their software activities and the lack of career paths that let them pursue their interests in a manner that they feel is sustainable. Faculty members are measured primarily on their research outputs, in terms of peer-reviewed publications, and their grant funding. Students and postdocs are generally taught that success is obtaining a faculty position, which means that they also need to produce publications and, when possible, obtain grant funds. Staff members, sometimes employed in a series of postdoc positions and sometimes in more formal positions, often do not have defined positions and career tracks within universities. In the United States at least, some have the option of going to national laboratories and continuing in science, but many go to industry and discard their scientific interests.
Software development improves research
Can we improve this situation? For staff members in UK universities at least, the concept of research software engineers as a professional path is starting to be accepted both by the university system and by the funding agencies. But as Simon Hettrick of the Software Sustainability Institute explained in his article, 'Why we need to create careers for research software engineers', published on the Scientific Computing World website in November 2015, a lot of work had to be done to prepare the ground for acceptance of this idea.
In other places, such as the US, universities are trying different models to build up a critical mass of software peers, such as academic computing centres like NCSA at Illinois and TACC at Texas, or integrated data-science programmes such as those at Berkeley, NYU, and Washington supported by the Moore and Sloan Foundations. Both models recognise software development as a contribution that makes the university a stronger research institution.
Publishing software or papers about software?
A more general change is the growing recognition that developing software is a creative research process, and thus that software can be a research output, similar to a publication. This means that developers can get academic credit, and gain in academic reputation, for producing software. Software can be published through Zenodo, for example (see https://guides.github.com/activities/citable-code/). In addition, a number of journals now publish software papers, that is, traditional papers in the academic literature that discuss particular software (www.software.ac.uk/resources/guides/which-journals-should-i-publish-my-software).
Publishing software has the advantage that the software itself is recognised, rather than a paper about the software. Publishing papers about software, however, currently fits better with the established academic environment, including peer review and indexing by tools such as Web of Science and Google Scholar. Both models support digital object identifiers (DOIs) for the software, which can then be cited by others, leading to traditional academic credit and enabling software to be counted in assessment activities such as the UK’s REF.
The impact of software products can also be measured using alternative metrics developed over the last five or so years. These track readership, links, and discussion about publications, including software papers, as well as software-specific metrics such as downloads, builds, use, and reuse, for example, as measured by Depsy.
Changing the culture of science
Software is increasingly important to science, but the socio-cultural system under which science is performed has not been changing as quickly. However, led by highly motivated scientific software creators, new career paths and new models for software credit are developing, which have the potential to promote and reward academic scientific software development.
Daniel S. Katz (firstname.lastname@example.org) is at the University of Chicago and the Argonne National Laboratory. He would like to thank Rajiv Ramnath, C. Titus Brown, and Wolfgang Bangerth for useful feedback on this article. Some work by the author was supported by the US National Science Foundation (NSF), while he was working at the Foundation; however, any opinion, finding, and conclusions or recommendations expressed in this article are those of the author and do not necessarily reflect the views of the NSF.