I very much like the spirit of this manifesto! But as so often, the devil is in the details.
My main problem with the current version is that there are different types of code involved in doing science, and the principles stated in the manifesto cannot be applied straightforwardly to all of them.
Here is an illustration of how I see the typical scientific software stack:
software-stack.pdf
I'd say that the top three layers are in the domain of the manifesto, so let's go through the principles one by one and see how they fit:
Open over closed
Makes sense for all layers, but "released by the time of publication" makes sense only for the top one, assuming that "publication" refers to a scientific paper. The lower layers evolve independently of any specific paper.
Code for the future
As a principle this is fine for all layers, but "testing, writing documentation, instructions on how to run and maintain your code" is not always reasonable or practical for the top layer. Nobody maintains project-specific workflows or notebooks today, and it isn't clear that this is a practice one could reasonably move towards unless scientific papers become less numerous and more substantial. Testing is also of very limited interest for code that computes things that have never been computed before.
Incorrect code results in incorrect science
"Code published in journals should be peer reviewed." I'd say that all scientific code should be peer reviewed. For the top layer, this should be part of reviewing the scientific paper, because the review must also check if the code actually does what the paper says. But this requires changes in the review process that are not obvious to implement. For example, the experience of ReScience suggests that effective code review requires rapid interaction between authors and reviewers referring to a common codebase.
For the bottom two layers, code review needs to be continuous as the code evolves, meaning that it must happen separately from any journal publication. No infrastructure for this exists today. Is it reasonable for a manifesto to call for something that is impossible in the immediate future? Honest question; I don't know how pragmatic manifestos should be.
Availability over perfection
This mostly applies to the top layer. The further down the stack you move, the more professionalism can and should be expected.
Code deserves credit
Certainly, but how far down the stack should one cite? For the top two layers, the obligation seems obvious. But should you cite NumPy? BLAS? Python? zlib? gcc? Linux?
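To make the question concrete, here is a minimal sketch (my own, not part of the manifesto, assuming a typical Python-based analysis) of what acknowledging the lower layers could at least look like in practice: recording the versions of the interpreter, NumPy, the BLAS/LAPACK it was built against, and the operating system, so that the layers one does not formally cite are nevertheless documented.

```python
# Minimal sketch: record the lower layers of the stack that a result
# depends on, even if they are never formally cited in the paper.
import platform

import numpy as np

print("Python: ", platform.python_version())
print("NumPy:  ", np.__version__)
print("OS:     ", platform.platform())

# NumPy can report which BLAS/LAPACK implementation it was built against.
np.show_config()
```

This records the environment rather than citing it, of course; whether recording is enough to count as "credit" is exactly the open question above.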