Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Initial data spec #39

Merged
merged 29 commits into from
Jul 12, 2024
Merged

Add Initial data spec #39

merged 29 commits into from
Jul 12, 2024

Conversation

aclark02-arcus
Copy link
Collaborator

@aclark02-arcus aclark02-arcus commented Jun 27, 2024

Makes progress on #21.

Please review and let me know if these column specs for the study_data object are inaccurate. I was just doing some light exploration in 02_dev.R in order to understand the data better. I can remove that code before we merge this PR.

This is a work in progress. I still need to define specs for the meta object, but happy to push that duty down the road or to someone else.

@aclark02-arcus aclark02-arcus changed the title Ac 21 data spec An Initial data spec Jun 27, 2024
@aclark02-arcus aclark02-arcus changed the title An Initial data spec Add Initial data spec Jun 27, 2024
@aclark02-arcus aclark02-arcus requested a review from LDSamson June 27, 2024 19:06
R/run_app.R Outdated Show resolved Hide resolved
R/run_app.R Outdated Show resolved Hide resolved
vignettes/data_spec.Rmd Outdated Show resolved Hide resolved
DESCRIPTION Outdated Show resolved Hide resolved
Copy link
Collaborator

@LDSamson LDSamson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aclark02-arcus thanks for this, this is a good start indeed.

I've added a view comments. It will need an additional review after you are done with creating your own datasets, but that can be done in a follow-up PR indeed.

Some general points/common thoughts (can also be for future PRs):

  1. Maybe it's good to put the descriptions in the descriptions of the internal data objects, and refer to them, stating that the columns should look like that.
  2. clinsightful_data is data that is already merged with metadata. The descriptions in the vignette should probably be focused on how the raw data looks like, so that the function merge_meta_with_data() (or an equivalent one) can shape it in the right format, given the right metadata object. For example: columns such as item_name, item_type and item_group should not be in the raw data but are created after merging with metadata.
  3. In addition to point 2.: we should see if it is possible to reduce the number of mandatory columns, to reduce complexity. For example, form_id and event_id; they are hardly used.
  4. We should evaluate if it is easier/better to define in the metadata which columns together can identify a unique event, and create key columns based on that for internal use.

@aclark02-arcus
Copy link
Collaborator Author

As discussed on 6/28, @aclark02-arcus will move all this documentation into a vignette instead of as documentation for the run_app() function, especially since these configs are no longer arguments to that function.

@aclark02-arcus
Copy link
Collaborator Author

2. clinsightful_data is data that is already merged with metadata. The descriptions in the vignette should probably be focused on how the raw data looks like, so that the function merge_meta_with_data() (or an equivalent one) can shape it in the right format, given the right metadata object. For example: columns such as item_name, item_type and item_group should not be in the raw data but are created after merging with metadata.

I'm going to focus on this to round out and complete this PR. All other requests above have been addressed.

@LDSamson
Copy link
Collaborator

@aclark02-arcus thanks for your work. If you are okay with it, I can finalize the PR (update a few sentences to reflect the current state, fix the R CMD check, add a few clarifications).

@aclark02-arcus
Copy link
Collaborator Author

@LDSamson, yes! Please do & thank you!

@LDSamson LDSamson merged commit 0b897f0 into dev Jul 12, 2024
2 checks passed
@LDSamson LDSamson deleted the ac-21-data-spec branch July 12, 2024 16:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants