This repository has been archived by the owner on Sep 14, 2021. It is now read-only.

Share/export the results #6

Closed
jpmckinney opened this issue Dec 3, 2019 · 9 comments

jpmckinney commented Dec 3, 2019

Downloading data:

Matt: Would be useful to have a download of a CSV for each table as we sometimes include these for publishers during data feedback

Charlie: I want to be able to export a summary of the DQT results in a structured format, e.g. CSV

Copying data:

Matt: Even more useful would be having a “copy table” button which copies the table in CSV format so we can paste it into GDocs

Copying visualizations:

Charlie: Visual representations of the data are helpful; even better would be to be able to copy and paste them into reports to support publisher understanding where applicable.
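
A minimal sketch of the CSV export/copy requested above, assuming Pelican's backend is Python and the check results are available as rows (the function and data shapes are hypothetical); the same string could back both a file download and a "copy table" button:

```python
import csv
import io


def table_to_csv(headers, rows):
    """Serialize a results table to a CSV string.

    The string can be offered as a file download, or placed on the
    clipboard by a "copy table" button in the frontend.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(headers)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical usage with a per-check summary table:
print(table_to_csv(
    ["check", "passed", "failed"],
    [["tender.value.amount coverage", 9500, 500]],
))
```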


romifz commented Jan 31, 2020

I think it would be useful to be able to download a list of OCIDs/release IDs that fail a check, especially for checks that may require complex queries to extract a list of cases.

jpmckinney changed the title from "Export the results" to "Share/export the results" on Feb 13, 2020

jpmckinney commented Feb 13, 2020

Some notes from OCDS retreat plenary session:

  • Export the JSON for the sample of specific cases that are failing / download the list of errors
  • For each check, add boilerplate text that can be copy-pasted into feedback reports / Export check results in plain text, or other formats to include in feedback reports
    • A fancier version of this would be a button to export to Google Docs
  • Could Pelican prepare SQL queries to select the data in Kingfisher?
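
On the last point, a sketch of the kind of query Pelican could prepare for analysts to run against Kingfisher; the compiled_release and data table and column names are assumptions based on Kingfisher Process's schema and should be verified against the deployed database:

```python
# The query is meant to be copy-pasted by an analyst, so it is rendered as
# plain SQL rather than executed with bound parameters.
QUERY_TEMPLATE = """
SELECT compiled_release.ocid, data.data
FROM compiled_release
JOIN data ON data.id = compiled_release.data_id
WHERE compiled_release.collection_id = {collection_id}
  AND compiled_release.ocid IN ({ocids})
"""


def kingfisher_query(collection_id, failing_ocids):
    """Render a SQL query that selects the releases failing a check."""
    ocids = ", ".join("'{}'".format(ocid) for ocid in failing_ocids)
    return QUERY_TEMPLATE.format(collection_id=int(collection_id), ocids=ocids)


print(kingfisher_query(123, ["ocds-213czf-000-00001"]))
```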

@jpmckinney

See also #22


jpmckinney commented Mar 9, 2020

Additional notes from OCDS retreat co-working session:

Context

When implementers share their OCDS publication with us, a helpdesk analyst presently collects and processes the data with Kingfisher, then uses a Google Colaboratory notebook to run SQL queries, whose results are used to fill in a data feedback report template.

We are now looking into how Pelican can automate and replace parts of the Colaboratory notebook.

While a publication is in progress, an implementer might share a sample of data with us. In this case, we often give quicker feedback, based on the results of the Data Review Tool and a manual review of the data.

We are also considering ways in which Pelican can support this quick feedback.

Features

Download sample failures

It is useful for implementers to see the specific data that has quality issues. As such, it'd be good if Pelican offered the option to:

  1. Download the sample OCIDs that fail the check
  2. Download the sample data that fails the check

A list of OCIDs is useful, because it's less data to send to the implementer. The data itself is also useful, because sometimes the implementer will have already made changes to their system, and their version of the data might have changed or been deleted.
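
A minimal sketch of both downloads, assuming the sample failures for a check are available as (ocid, data) pairs (a hypothetical shape):

```python
import csv
import json


def export_sample_failures(check_name, failures, limit=20):
    """Write the failing OCIDs to a CSV file and the failing data to JSON Lines.

    `failures` is assumed to be a list of (ocid, data) pairs for one check.
    """
    sample = failures[:limit]

    with open(f"{check_name}_ocids.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["ocid"])
        writer.writerows([ocid] for ocid, _ in sample)

    with open(f"{check_name}_data.jsonl", "w") as f:
        for _, data in sample:
            f.write(json.dumps(data, ensure_ascii=False) + "\n")
```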

Support report writing

Analysts can already copy-paste results and take screenshots from Pelican to put into feedback reports, but this is time-intensive. @pindec and @romifz prepared the proposal below (please correct any misinterpretations on my part!).

The user flow would be:

  1. The analyst selects which checks to export (e.g. with an "export" checkbox for each check)
    1. The analyst can alternatively click "all failing checks" to auto-select those with failures
  2. The analyst indicates the number of sample failures to include per check
  3. The analyst clicks an "export" button to generate a report on the selected checks
  4. A Google Document is created, and the link is shown to the analyst
    1. I haven't looked into this in detail, but I assume the backend can generate a DOCX or ODT file (whether directly or by using an HTML to DOCX converter) and upload it to Google Docs (which will then convert it for online editing), or create a Google Document natively
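
For step 4, a minimal sketch assuming the backend renders a DOCX itself and relies on Google Drive's conversion on upload (not a confirmed design; credentials setup is omitted):

```python
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload


def upload_report_as_google_doc(creds, docx_path, title):
    """Upload a DOCX and let Google Drive convert it to a native Google Doc."""
    drive = build("drive", "v3", credentials=creds)
    metadata = {
        "name": title,
        # Storing the file as a Google Doc triggers conversion on upload.
        "mimeType": "application/vnd.google-apps.document",
    }
    media = MediaFileUpload(
        docx_path,
        mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    )
    created = drive.files().create(
        body=metadata, media_body=media, fields="id, webViewLink"
    ).execute()
    return created["webViewLink"]  # the link shown to the analyst
```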

The report would contain a summary, and then sections for:

  • field-level coverage
  • field-level quality
  • compiled release coherence
  • compiled release consistency
  • compiled release reference
  • dataset quality (do we want to split this up?)
  • time-based quality

The summary would simply be the list of sections, and a list of checks within each section, indicating the check's pass/fail rate (similar to the web page). In other words, the summary is a table of contents with pass/fail numbers.

Each section would start with boilerplate introductory text, authored by analysts (like in the current feedback report template). This can be managed either in the same way as the help text in the app, or by importing a Google Spreadsheet (easier for analysts to update; see the sketch after the list below). Then, there would be details for each check:

  • An image version of the check's visualization
  • Sample failures
    • This might be a two-column table with the OCID and the failing portion of the JSON data (interacts with #18).
  • Boilerplate explanatory text (managed in the same way as other boilerplate):
    • An explanation of the check, and why it's important
    • A description of what action the publisher should take:
      • If the check has failures, a description of how to fix the issue
      • If the check has no failures, some brief praise
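
If the boilerplate is managed in a Google Spreadsheet, one lightweight option (a sketch, assuming the sheet is shared by link and has hypothetical "check" and "text" columns) is to read it via the spreadsheet's CSV export URL:

```python
import csv
import io
import urllib.request


def load_boilerplate(spreadsheet_id, gid=0):
    """Read boilerplate text, keyed by check name, from a shared Google Spreadsheet."""
    url = (
        f"https://docs.google.com/spreadsheets/d/{spreadsheet_id}"
        f"/export?format=csv&gid={gid}"
    )
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    return {row["check"]: row["text"] for row in csv.DictReader(io.StringIO(text))}
```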

@hrubyjan

@jpmckinney @romifz @pindec

This is a concept of how the whole process could work, based on the requirements we discussed on a call (May 5th). Let me know whether we are on the right track 👍🏾 or completely wrong 👎🏼

Our idea is to set up a tagging system that will allow us to import data from Pelican into a Google document. There will be tags that can be used to import:

  • pictures
  • OCIDs
  • sample JSONs
  • links to archives with complete set of sample JSONs
  • whole google documents
  • and/or any other information that we'll find useful

Our proposed workflow is:

  • users will create sub-templates, one for each check, so that they are easily reusable by other users
    • eventually there will be a catalog of templates for each type of check (we obviously don't want to create a template for each field-level check separately 😉)
  • these sub-templates will be imported into a main feedback report using tags
    • the summary section can be created automatically, as we have a list of checks to be imported
  • alternatively or additionally, the user can use tags to directly modify the main feedback report template
  • the user will give Pelican the URL of the main feedback report template
  • Pelican will replace the tags with the real data (images, JSON, etc.); a sketch of the tag replacement follows below
  • Pelican will export archives with all sample JSONs and upload them to Google Drive (or somewhere else), one archive for each check. These can be linked directly from the feedback report
  • the user will do the final polishing of the text and formatting
  • the user will send the document to the publisher

This workflow allows the user to modify the text twice:

  • in the shared sub template
  • in the final feedback document
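
A sketch of the tag replacement step, assuming the template uses text placeholders such as {{check_name_ocids}} and is a native Google Doc edited via the Google Docs API (tag names are hypothetical; images would need a separate insertInlineImage request):

```python
from googleapiclient.discovery import build


def replace_tags(creds, document_id, replacements):
    """Replace {{tag}} placeholders in a Google Doc with values from Pelican.

    `replacements` maps tag names to replacement text, e.g.
    {"coverage_ocid_list": "ocds-213czf-000-00001, ..."}.
    """
    docs = build("docs", "v1", credentials=creds)
    requests = [
        {
            "replaceAllText": {
                "containsText": {"text": "{{%s}}" % tag, "matchCase": True},
                "replaceText": text,
            }
        }
        for tag, text in replacements.items()
    ]
    docs.documents().batchUpdate(
        documentId=document_id, body={"requests": requests}
    ).execute()
```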


romifz commented May 19, 2020

Hi @hrubyjan,

Thanks for this. For me the proposal seems right; however, I'm not sure I understand the sub-templates concept. Would these exist inside Pelican, and be editable by us? Once the system is in place, we will need to define an initial set of sub-templates, right?


jpmckinney commented May 19, 2020

In our call today, @hrubyjan and I agreed that Datlab will prepare a brief main report template and some sub-templates to demonstrate how the helpdesk will configure the templates – to make it easier to give feedback on the proposal.


pindec commented May 22, 2020

Thanks @jpmckinney, @hrubyjan, seeing an example would be very helpful.

The proposal sounds like it is on the right track. The sub-templates sound like they may be placeholders for data to import from Pelican, together with explanatory text to support interpretation of that data.

It would also be good for us to understand how the list of checks to be imported from Pelican is generated, given that the selection pool is large and there are multiple formats. We agreed we don't need a flashy UI, but making repetitive tasks simple (e.g. tickboxes rather than analysts having to add/delete tags themselves) would be helpful.

@jpmckinney

Closing since the feature is implemented, and follow-up issues have been created for changes.
