This document describes the best practices expected of software that is part of the CIG collection:
- Minimum Best Practices are the minimum that we expect all software to meet.
- Standard Best Practices are the suite of standards CIG software should be following. If the software falls short of these standards, developers should have a plan of active development to achieve this level.
- Target Best Practices should be considered in the development plan for software under active development.
A sample repository that demonstrates these best practices can be found in the CIG software template repository (https://github.com/geodynamics/software_template).
Minimum Best Practices: practices that software must follow in order to be accepted by CIG.
- Licensing
- Use an open source license approved by the Open Source Initiative (https://opensource.org/license).
  Examples: GPL, MIT, BSD
- Version control
- Use version control to manage code changes.
- Use a public repository that is accessible without registration.
  Examples: GitHub, GitLab
- Obtain a persistent identifier for each named version of the software, such as a release.
- Portability, configuration, and building
- Ensure that the code builds on Unix-like machines (Linux, macOS) with only free tools.
- Use a well-designed, portable build system.
  Examples: CMake, Make, Autotools (Unix only), setup.py
- Testing
- The software includes tests to verify that it runs properly.
- The software reports the results of accuracy and/or established community performance benchmarks.
- Documentation
- Describe the research problem the software is designed to address. Discuss significant limitations.
- Provide instructions for building and installing the software.
- Describe all parameters including units. If dimensionless, specify the scaling used.
- Explain the physics the software simulates.
- Illustrate how to use the software to solve scientific problems with a few cookbook examples that have sample, editable input files.
- Provide documentation online or offline.
- Include how to cite the software (see also "Citable publication" below).
- Citable publication
- Provide a citable publication.
- Support
- Clearly indicate whether the software is actively supported and, if so, how to report issues, contribute modifications, and get help.
  Example: provide a CONTRIBUTING.md document
Standard Best Practices: practices, in addition to the Minimum Best Practices above, that should be used by all software developed within the CIG community. Developers of software that does not meet all of these standards should be actively working to eliminate the deficiencies.
- Version control
- Limit the source tree to the files necessary to build the software and documentation and to run verification tests.
- In each release, include release notes that distinguish between significant changes, new features, and bug fixes.
  Example: use a changelog that follows Keep a Changelog (https://keepachangelog.com/en/1.1.0/)
- Coding
- Use user-friendly specification of parameters outside of source code. Parameters should be specified at runtime, not at compile time.
  Examples: graphical user interfaces, human-readable parameter files
- Provide a development plan, updated yearly, with prioritization of new features and an estimated timetable for their implementation.
- Use comments in the software that describe the following:
- Algorithms with appropriate references.
- Purpose of functions, objects, and groups of objects, with descriptions of arguments (inputs and outputs).
- Strive for a modular design:
- Balance the use of external libraries to maximize reuse while minimizing dependencies and maintenance.
  Examples: make use of PETSc, deal.II
- Allow users to extend the code with new features or alternative implementations without destroying original functionality or modifying the main branch.
- Use error trapping strategies:
- User errors should result in a message that helps the user correct the problem. User errors should not result in crashes without error messages.
- Internal errors are generally bugs or unintended uses. Use consistency checks to catch internal errors and generate error messages that help the developer fix the problem.
- Aim for scalability:
- Use distributed/parallel data structures.
- Use messages to transfer information between processes instead of the filesystem.
  Example: MPI
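The coding practices above on runtime parameters and error trapping can be sketched as follows. This is a minimal illustration in Python, not CIG code: JSON is assumed as the human-readable parameter format, and `SCHEMA`, `UserInputError`, `parse_parameters`, `cell_size`, and the parameter names are all hypothetical.

```python
import json


class UserInputError(Exception):
    """Raised for problems the user can fix, such as a bad parameter value."""


# Hypothetical schema: parameter name -> (type, unit, description).
SCHEMA = {
    "viscosity": (float, "Pa s", "reference viscosity"),
    "end_time": (float, "s", "simulation end time"),
    "n_cells": (int, "dimensionless", "number of grid cells"),
}


def parse_parameters(text):
    """Parse parameters given as JSON text and validate them against SCHEMA."""
    try:
        raw = json.loads(text)
    except json.JSONDecodeError as err:
        raise UserInputError(f"Parameter file is not valid JSON: {err}") from err
    if not isinstance(raw, dict):
        raise UserInputError("Parameter file must contain a JSON object.")

    unknown = sorted(set(raw) - set(SCHEMA))
    if unknown:
        raise UserInputError(f"Unknown parameter(s): {', '.join(unknown)}.")

    params = {}
    for name, (kind, unit, description) in SCHEMA.items():
        if name not in raw:
            raise UserInputError(
                f"Missing parameter '{name}' ({description}, unit: {unit})."
            )
        value = raw[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise UserInputError(f"Parameter '{name}' must be a number, got {value!r}.")
        if kind is int and not float(value).is_integer():
            raise UserInputError(f"Parameter '{name}' must be an integer, got {value!r}.")
        if value <= 0:
            raise UserInputError(f"Parameter '{name}' must be positive, got {value!r}.")
        params[name] = kind(value)
    return params


def cell_size(params, domain_length=1.0):
    """Return the grid spacing for a uniform grid."""
    # Internal consistency check: validation should already guarantee this,
    # so a failure here points to a bug rather than to a user mistake.
    if params["n_cells"] <= 0:
        raise AssertionError(
            f"internal error: n_cells={params['n_cells']} after validation"
        )
    return domain_length / params["n_cells"]
```

A user mistake (for example, a missing `end_time`) produces a message that names the parameter and its unit, while the check in `cell_size` is aimed at developers and should never fire for validated input.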
- Portability, configuration, and building
- Let the build system verify that dependencies are available and usable.
- Use an automated and portable configuration and build system.
- Report all configuration and build options at runtime to facilitate reproducibility.
  Examples: commit id, compiler options, checksum
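As a rough sketch of reporting configuration and build information at runtime, the Python snippet below collects a commit id, build options, and a checksum of the input parameters. The function names and the report keys are hypothetical. A compiled code would more typically record the commit id and compiler flags at build time; querying `git` at runtime, as done here, is an assumption made to keep the example self-contained.

```python
import hashlib
import platform
import subprocess


def git_commit_id():
    """Return the current commit id, or 'unknown' outside a git checkout."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or we are not inside a repository.
        return "unknown"
    return result.stdout.strip()


def provenance_report(parameter_bytes, build_options):
    """Collect the information needed to reproduce a run (hypothetical layout)."""
    return {
        "commit_id": git_commit_id(),
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "build_options": dict(build_options),
        "parameter_checksum_sha256": hashlib.sha256(parameter_bytes).hexdigest(),
    }


def format_report(report):
    """Render the report as 'key: value' lines, sorted by key."""
    return "\n".join(f"{key}: {report[key]}" for key in sorted(report))
```

Writing the formatted report to the start of the log or output file keeps the provenance together with the results it describes.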
- Testing
- Include pass/fail tests that verify that the software runs properly.
- Create a development pipeline that uses continuous integration (CI) to automate running tests.
  Examples: GitLab pipelines, GitHub workflows, Azure Pipelines, Jenkins
- Documentation
- Provide user documentation that describes workflows for research use.
- Provide developer documentation that explains how to extend the code in anticipated ways.
- Provide documentation both in dynamic form and in a form that is available offline.
  Example: Sphinx combined with a PDF file
- Illustrate how to use the software to solve major scientific use cases with cookbook examples that have sample, editable input files.
- List authors and contributors.
  Example: include a CITATION.cff file
- User workflow
- Ensure that running different simulations does not require rebuilding.
- Ensure that the code uses user-specified directories and filenames for input and output.
- Use standard binary file formats.
  Examples: NetCDF, HDF5, VTK
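One way to satisfy the user workflow items above is to take the input file and output location from the command line, so that a new simulation never requires a rebuild. The sketch below uses Python's standard `argparse` module. The option names, the default values, and the `.h5` filename pattern are hypothetical choices, not a CIG convention.

```python
import argparse
from pathlib import Path


def build_parser():
    """Command-line interface for a hypothetical simulation driver."""
    parser = argparse.ArgumentParser(description="Hypothetical simulation driver.")
    parser.add_argument(
        "parameter_file", type=Path, help="path to the input parameter file"
    )
    parser.add_argument(
        "--output-dir",
        type=Path,
        default=Path("output"),
        help="directory that receives all output files",
    )
    parser.add_argument(
        "--output-prefix", default="solution", help="filename prefix for output files"
    )
    return parser


def output_path(args, step):
    """Build the output filename for a given time step."""
    return args.output_dir / f"{args.output_prefix}_{step:05d}.h5"
```

For example, `build_parser().parse_args(["run1.json", "--output-dir", "results/run1"])` directs all output for that run to `results/run1` without touching the source tree.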
Target Best Practices: practices, beyond the Standard Best Practices above, that define long-term development priorities for software developed within the CIG community. They are especially important for long-term projects.
- Version control
- Add new features in separate branches.
- Use a stable development (or main) branch for rapid release of new features.
- Coding
- Implement functionality as a library rather than an application.
- Leverage alternative implementations via plugins.
- Construct higher-level applications using libraries as building blocks.
- Output provenance information (such as parameters used).
- Strive for scalability.
- Use parallel access to inputs and outputs.
  Example: HDF5
- Implement checkpointing and restart capability.
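Checkpointing and restart, listed above, can be illustrated with a toy time loop. The sketch below is only an assumption about how one might structure it: the state is a small dictionary stored as JSON, the function names are hypothetical, and a production code would more likely write large arrays to a format such as HDF5.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state atomically so an interrupted write cannot corrupt it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    # os.replace swaps the file in a single step on the same filesystem.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return json.load(handle)


def run(path, n_steps, stop_after=None):
    """Toy time loop: add up the step indices, checkpointing after every step."""
    state = load_checkpoint(path) or {"step": 0, "value": 0}
    while state["step"] < n_steps:
        if stop_after is not None and state["step"] >= stop_after:
            break  # simulate an interruption, e.g. a wall-clock limit
        state["step"] += 1
        state["value"] += state["step"]
        save_checkpoint(path, state)
    return state
```

Calling `run` a second time with the same checkpoint path resumes from the last completed step, so an interrupted run followed by a restart gives the same final state as an uninterrupted one.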
- Portability, configuration, and building
- Let users select compilers, optimization and additional build flags during configuration without modifying files under version control.
- Permit multiple builds using the same source tree.
- Ensure software can be installed to a central location.
- Make software available as a package and/or containerized application that does not require manual build steps.
- Provide executable software via an online portal.
  Examples: Jupyter servers, online software gateways
- Testing
- Provide pass/fail unit testing for software verification at a fine-grained level.
- Use the Method of Manufactured Solutions for software verification at a coarse-grained level.
- Use code coverage tools to assess gaps in test coverage.
  Examples: python-coverage and gcov
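To make the Method of Manufactured Solutions item above concrete, here is a small sketch on a model problem chosen purely for illustration: the 1D Poisson equation -u'' = f on (0, 1) with u(0) = u(1) = 0. The exact solution u(x) = sin(pi x) is picked first, the source term f = pi^2 sin(pi x) is derived from it, and a pass/fail test checks that the observed convergence order of a second-order finite-difference solver is close to 2. All function names are hypothetical.

```python
import math


def solve_poisson(f, n):
    """Solve -u'' = f on (0, 1), u(0) = u(1) = 0, with n equal intervals.

    Uses second-order central differences and the Thomas algorithm.
    Returns the solution at the n - 1 interior nodes (requires n >= 2).
    """
    h = 1.0 / n
    m = n - 1  # number of interior unknowns
    # Tridiagonal system: 2 on the diagonal, -1 on both off-diagonals.
    diag = [2.0] * m
    rhs = [h * h * f((i + 1) * h) for i in range(m)]
    # Forward elimination.
    for i in range(1, m):
        w = -1.0 / diag[i - 1]
        diag[i] -= w * (-1.0)
        rhs[i] -= w * rhs[i - 1]
    # Back substitution.
    u = [0.0] * m
    u[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        u[i] = (rhs[i] + u[i + 1]) / diag[i]
    return u


def exact(x):
    """Manufactured solution, chosen to satisfy the boundary conditions."""
    return math.sin(math.pi * x)


def source(x):
    """Source term obtained by inserting exact() into -u'' = f."""
    return math.pi ** 2 * math.sin(math.pi * x)


def max_error(n):
    """Maximum nodal error of the numerical solution on n intervals."""
    u = solve_poisson(source, n)
    h = 1.0 / n
    return max(abs(u[i] - exact((i + 1) * h)) for i in range(n - 1))


def observed_order(n):
    """Estimate the convergence order from grids with n and 2n intervals."""
    return math.log(max_error(n) / max_error(2 * n), 2)


def test_second_order_convergence():
    """Pass/fail check: the observed order should be close to 2."""
    assert abs(observed_order(16) - 2.0) < 0.1
```

The same pattern carries over to more complex solvers: only `exact`, `source`, and the solver call change, while the convergence check stays a simple pass/fail test that can run under continuous integration.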
- Documentation
- Include guidelines on the parameter scales and combinations for which the software is designed and tested.
- Provide a list of publications that cite or use the software.
  Example: link to the citations tracked by CIG and/or by the project
- List ORCID iDs for each author and contributor, and encourage all contributors to add their ORCID iD to their GitHub profile.
- Create a wiki, FAQ, or knowledge base that provides answers to common questions.
- Provide guidance on archiving model data for publishing.
- User workflow
- Allow for reproducibility via archiving of workflows.