-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updating BIDS validator and schema to contemporary upstream equivalent #1050
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #1050 +/- ##
==========================================
+ Coverage 88.41% 88.48% +0.07%
==========================================
Files 72 73 +1
Lines 9251 9291 +40
==========================================
+ Hits 8179 8221 +42
+ Misses 1072 1070 -2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Additionally, we might want to start using the reference bids-examples repository for validation (rather than our own fork). In any case, using vanilla upstream would add another ~10s to the BIDS validator tests; as our fork had stripped a lot of the datasets. While we will prospectively no longer run the validator tests themselves (these would be run by the bids-schemacode package), we still need BIDS example data to construct our derivative BIDS datasets for higher-level function testing... Any ideas if we can somehow check out only some of the files to reduce download time? @yarikoptic |
45b386f
to
036e41d
Compare
This pull request introduces 1 alert when merging 4257ef8 into 50ca7b0 - view on LGTM.com new alerts:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only started, need to switch, please attend to --report
option meanwhile
"-r", | ||
is_flag=True, | ||
help="Whether to write a report under a unique path in the current directory. " | ||
"Only usable if `--report` is not already used.", | ||
help="Whether to write a report under a unique path in the current directory. ", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we are breaking CLI ... why not just to keep --report
to be an option to provide the path and if not specified, assume that no report writing was requested - print to the screen?
Note: I might not be even able to write to current directory (dataset might be owned by someone else).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
functionality wasn't broken, help text was just incorrect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You removed --report-flag
and made --report
from taking a string (path) into being is_flag
-- so, it got broken that user no longer would be able to say --report myvalidation.log
. And I think it is worth removing somewhat odd --report-flag
so ok to break but we better "make it right" this time and avoid future breakage.
If you really want to be fancy and support also bool
like behavior, I guess you would need to make it nargs='?'
(if click supports it, @jwodder could help) and then if not report
- assign that default path to the log, so user could do both --report
and --report mylog.log
.
Alternative -- not bother with "bool" like behavior in CLI, and just make it demand path string.
NB: please do not mark such comments Resolved - let original Author decide if they were resolved or not since it makes it harder for a reviewer to locate prior comments while re-reviewing and see if prior concerns were addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--report
handling still needs to be fixed and tested better
"-r", | ||
is_flag=True, | ||
help="Whether to write a report under a unique path in the current directory. " | ||
"Only usable if `--report` is not already used.", | ||
help="Whether to write a report under a unique path in the current directory. ", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You removed --report-flag
and made --report
from taking a string (path) into being is_flag
-- so, it got broken that user no longer would be able to say --report myvalidation.log
. And I think it is worth removing somewhat odd --report-flag
so ok to break but we better "make it right" this time and avoid future breakage.
If you really want to be fancy and support also bool
like behavior, I guess you would need to make it nargs='?'
(if click supports it, @jwodder could help) and then if not report
- assign that default path to the log, so user could do both --report
and --report mylog.log
.
Alternative -- not bother with "bool" like behavior in CLI, and just make it demand path string.
NB: please do not mark such comments Resolved - let original Author decide if they were resolved or not since it makes it harder for a reviewer to locate prior comments while re-reviewing and see if prior concerns were addressed.
It didn't get broken, the feature was simply removed. I added it initially because I thought it would be helpful, and I decided that in fact it is not. The |
@@ -453,7 +467,7 @@ def validate_all( | |||
|
|||
def write_report( | |||
validation_result, | |||
report_path="{logdir}/bids-validator-report_{datetime}-{pid}.log", | |||
report_path="/var/tmp/bids-validator/report_{datetime}-{pid}.log", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest still using a variable for the log base directory, but set it to $TMPDIR
(falling back to /tmp
) by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for upstream usage, we wrap this command and pass the DANDI log directory via appdirs. /tmp/
is tricky as it's ephemeral on some systems, and users might not know how their system handles it. /var/tmp
is the safer bet.
Compare:
https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch05s15.html
to
https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s18.html
Particularly:
Programs must not assume that any files or directories in /tmp are preserved between invocations of the program.
is pretty much a non-starter for logging which might need to be consulted over a longer period.
def validate_bids( | ||
bids_paths, | ||
schema_reference_root="{module_path}/support/bids/schemadata/", | ||
schema_reference_root="/usr/share/bids-schema/", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under what conditions is this default used, and why would we expect anything to be in this directory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the DANDI use case this is never used, but this is bundled code from upstream, which we will soon no longer track but import from a python dependency. Prospectively, the dependency would find the BIDS schema index in that directory, as that is where architecture-independent read-only user data belongs as per the FHS.
Co-authored-by: John T. Wodder II <[email protected]>
Co-authored-by: John T. Wodder II <[email protected]>
@jwodder thanks for your input :3 good to go now? |
ok, I think that main discussions/questions were addressed/resolved. Let's proceed and see where it takes us. Thank you @TheChymera ! |
Brought our code back in sync with upstream BIDS, as we finally have upstream
ome.zarr
support and there is no longer any reason to maintain a fork of the specification.This also fixes: #1037
Ideally this will be the last update of bundled code as we are working on importing the built-in Python validator as a library.
Should the need for specification forking arise in the future, this should best be handled via the parameterized schema directory.
Draft for now as there might be inconsistencies I haven't spotted.