Since 2016, the CodeRefinery project has been teaching students and researchers how to write better code and providing research groups with software development e-infrastructure tools to support this.
The project has not been holding workshops to teach people to program, but rather has been teaching research programmers the skills they are often missing. The project is funded by NeIC (the Nordic e-Infrastructure Collaboration) with co-funding from the Nordic national high performance computing (HPC) providers. After a successful first phase, the project is now entering phase two, with three more years, a larger team, new workshops, and a renewed focus on community building.
What explains the success of the first phase of the project? Research groups often struggle with collaborating on, and maintaining, scientific software: code projects developed by Ph.D. students and postdoctoral researchers die out after the main developer moves on to another job; the inevitable discovery of bugs in scientific code can cause unnecessary panic since researchers cannot easily determine which publications may have been affected; new Ph.D. students spend months rewriting code from scratch because code inherited from former group members is undocumented and difficult to understand. CodeRefinery workshops focus on state-of-the-art tools and best practices in sustainable software development which can alleviate many of these problems. Since 2016, the project has delivered 13 three-day workshops, as well as five shorter events, across the Nordic countries to over 400 students and researchers. These workshops focus on reproducibility and managing complexity in a collaborative environment by discussing version control, code reviews, automated testing, code documentation, and reproducible workflows and notebooks. According to the CodeRefinery project manager, Radovan Bast, “We consider the CodeRefinery workshops to be a logical next step for students and researchers who have participated in Software Carpentry workshops which focus on the basics. We employ a similar teaching style with interactive exercise-driven code-along sessions, but apply the concept to more advanced work.”
The next three years of the CodeRefinery project will build on the success of the previous years. More of the traditional three-day workshops will be delivered, with the hope that a self-sustaining community will start emerging with more locally organised workshops and volunteer instructors. However, the project will also – in a parallel track – develop new lesson material in order to reach new target audiences. In particular, researchers in humanities and social sciences face distinct technological challenges in making their research software and data reusable, accessible and findable. Topics such as data management, data scraping and data mining, framed around the FAIR principles for data and research output, will be taught in “data hackathons” organised around particular themes or problem types.
During the second phase of the project, universities and research institutes will be able to either request a CodeRefinery workshop or self-organise one. Requested workshops can be customised to meet the needs of the local staff by choosing which lessons will be taught. For a requested workshop, a local organiser is expected to assist with the administrative tasks: arranging a lecture room, handling registrations and advertising the workshop. It is also desirable that experienced local staff assist as workshop helpers (and even contribute to the teaching), as this will build local competence and pave the way for future self-organised workshops. In such self-organised workshops, the open-source CodeRefinery training material can be taught by anyone who has attended and/or contributed to previous workshops.
In addition to requesting workshops, universities will be invited to join CodeRefinery as partners in the future. A partner university will be able to host one or two workshops per year with experienced instructors from other locations, and in exchange the university will be expected to contribute instructors to workshops at other universities. Indeed, after phase two, CodeRefinery hopes to become a self-sustaining organisation with a lightweight coordination structure and in-kind contributions from its partners. In this way, CodeRefinery will create a lasting culture of knowledgeable researchers, peer teaching and mentoring.
The goal of the CodeRefinery project is more than just a teaching program – it is also about catalysing and building a community of Nordic Research Software Engineers (RSEs). RSEs are people who support research through their knowledge of software engineering practices. Similar communities have emerged in other countries, including the UK, the Netherlands, and Germany, and serve as an important pillar of modern computational and data-based research. CodeRefinery workshops, data hackathons and other events will provide a meeting point for local RSEs, as well as a hub for connecting with a network of RSEs across the Nordic research communities, and will invest in our most important resource – people.
To request a CodeRefinery workshop, ask questions about using CodeRefinery material in your own courses, get further information on becoming a CodeRefinery partner, or request access to the repository hosting platform source.coderefinery.org, please contact the CodeRefinery team through the official support line: firstname.lastname@example.org.
This article is dedicated to the public domain under the CC0 license.
The authors: Thor Wikfeldt (PDC), Radovan Bast (UiT), Richard Darst (Aalto Science-IT), Max R. Eckardt (Datakuben-USD), Anne Fouilloux (UiO), and Stefan Negru (CSC-IT Center for Science).