Compiled release checks: Display values/details in the quality failure samples table #49
Comments
The check performs four comparisons, so the table would have to be a fair bit more complicated if you want the publisher to be able to see what is incoherent about the dates. In your simpler table, since the error might be with either of the two dates in each comparison, it's not clear to me which should be reported. In the metadata of the Previews, it shows the pairs of date paths and values that fail. For Moldova, it seems like they frequently amend tenders before the tender period. We should clarify how they implemented the semantics.
I think the table can express each type of incoherence:
As an analyst I could then summarise and provide examples for each:
With the current set-up I can report to Moldova that there are definitely issues with tender amendments being before the tender period (subject to how they interpreted it), but I don't know whether there are other date coherence issues that don't appear in the previews I happened to click into, so I also have to say 'there might also be issues with...' and explain the rest of the checks.
Wouldn't it help to have the actual date values, not just the paths and IDs? I'm thinking we can perhaps add a tag to the reporting feature that reformats the metadata from multiple samples into a table. @hrubyjan
I think it's most useful to have the relevant IDs, e.g. to help track down an issue with the OCDS mapping/export for a particular procedure type or system, but I guess the actual dates could be useful to catch other types of issue, e.g. all amendments having the date 1970-01-01.
For me, the dates give a sense of whether it's e.g. a time zone issue (just a few hours off), a semantic issue (like here, where it's just a few days), or a clear data quality or mapping issue (months/years). Pulling up a compiled release based on the OCID (the publisher might not have compiled releases, so they'd have to figure out the specific releases) and then navigating to an entry in an array is a lot of effort, I think. The values can be sufficient to diagnose the issue.
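As a rough illustration of that triage, the size of the gap between the two dates in a failing pair could be bucketed as below. This is only a sketch: the function names and the thresholds (one day, 31 days) are assumptions for illustration, not part of the check, and both values are assumed to be ISO 8601 strings that either both carry a UTC offset or both omit it.

```python
from datetime import datetime, timedelta


def parse_date(value: str) -> datetime:
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def classify_gap(should_be_later: str, should_be_earlier: str) -> str:
    """Describe by how much a pair of dates is out of order.

    `should_be_earlier` is the date expected to come first. The bucket
    boundaries are hypothetical and would need tuning.
    """
    gap = parse_date(should_be_earlier) - parse_date(should_be_later)
    if gap <= timedelta(0):
        return "coherent"
    if gap < timedelta(days=1):
        return "hours (possible time zone issue)"
    if gap < timedelta(days=31):
        return "days (possible semantic issue)"
    return "months or years (possible data quality or mapping issue)"
```

For example, `classify_gap("2020-01-01T00:00:00Z", "2020-01-05T00:00:00Z")` lands in the "days" bucket, which is the pattern described for Moldova.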
Okay, makes sense. In any case, it would be useful to know which amendment date / other date field combos have issues.
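One way to get at those combos would be to count the failing path pairs across all sample metadata, ignoring array indices. The sketch below assumes each sample's metadata is a dict with a `failed_paths` list whose entries carry `path_1` and `path_2` keys; that shape is an assumption for illustration and may not match what the tool actually stores.

```python
import re
from collections import Counter


def normalise(path: str) -> str:
    # Drop array indices so that "tender.amendments[3].date" and
    # "tender.amendments[0].date" count as the same field.
    return re.sub(r"\[\d+\]", "", path)


def count_failing_pairs(samples):
    """Count how often each (path_1, path_2) combination fails.

    `samples` is an iterable of metadata dicts in the assumed shape
    described above.
    """
    counts = Counter()
    for sample in samples:
        for failure in sample.get("failed_paths", []):
            pair = (normalise(failure["path_1"]), normalise(failure["path_2"]))
            counts[pair] += 1
    return counts
```

Calling `count_failing_pairs(samples).most_common()` would then list the combos from most to least frequent, which is the summary an analyst could pass on to the publisher.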
Merging #48 into this issue:
Title: consistent.period_duration_in_days: Report which periods the check failed for
For example, from checking a few sample failures it looks like the issue is with
Merging #47 into this issue:
Title: Field paths that fail the check of "Contracting process timeline"
If we have many errors, it's not possible to know exactly which fields cause this failure. For example, in https://dqt.datlab.eu/resource/201/detail/coherent.dates we have 6,147 failed compiled releases, making it difficult to add the field paths in the Data Quality Feedback Report:
And even if we have, for example, only 10 errors, we have to open each release one by one and check which field the problem is in.
Original title: Amendment dates check: List of amendments with issues
Similar to open-contracting-archive/pelican#53, from the amendment dates check I can see that there are 4,820 amendments with incoherent dates, but I can't see which amendments/dates are incoherent.
For feedback to the publisher it would be better if I could report which amendments/dates have issues. Ideally, I would like to export a results table with the following columns:
ocid, amendment path, amendment/id, date path, date parent id
e.g. for a contract amendment which is before the contract's dateSigned:
ocds-xxxxx-123456, contracts/amendments, 123, contract/dateSigned, 456
Where '456' is the id of the contract.
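A minimal sketch of how such an export could be produced for the contract amendment case, working directly from compiled releases rather than from the tool's results. The function names are hypothetical, only the comparison of a contract amendment's date against the contract's dateSigned is covered, and the path labels are copied from the example row above.

```python
import csv
import io
from datetime import datetime

COLUMNS = ["ocid", "amendment path", "amendment/id", "date path", "date parent id"]


def parse_date(value):
    # Rewrite a trailing "Z" so fromisoformat() works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def contract_amendment_rows(release):
    """Yield one row per contract amendment dated before the contract's dateSigned."""
    for contract in release.get("contracts", []):
        signed = contract.get("dateSigned")
        if not signed:
            continue
        for amendment in contract.get("amendments", []):
            date = amendment.get("date")
            if date and parse_date(date) < parse_date(signed):
                yield [
                    release["ocid"],
                    "contracts/amendments",
                    amendment.get("id"),
                    "contract/dateSigned",
                    contract.get("id"),
                ]


def to_csv(releases):
    """Return the results table for all releases as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(COLUMNS)
    for release in releases:
        writer.writerows(contract_amendment_rows(release))
    return buffer.getvalue()
```

The other comparisons in the check could be handled by adding similar generators, each emitting rows with the same five columns.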