Implementation of CohortIncidence module. #147
Fixed param typo in ResultDataModel.R
My feeling is that this should be in the package for several reasons:
- If a change to the model requires a release of both packages, updating just one will break interoperability
- Users will always need to work with their data. Even if they don't use Strategus, they may wish to use the shiny apps or other report generation tools. Now they will have a dependency they didn't previously need
- The current release of CohortIncidence just introduced a breaking change for shiny apps and any other apps by not implementing migrations. Existing projects will no longer be upgradable without custom hacks for what is a relatively small change to the data model
- It's inconsistent with other packages and utilities
- It places a burden on the maintainer of Strategus to maintain 100% version consistency with CohortIncidence
Thanks for the questions and comments, @azimov. I'll preface this by saying I gave a lot of consideration to what goes into the package and what is part of the module. The deciding factor: if it was a Strategus concern, it went into the module; if it was a package concern, it went into the package.
If a change to the model requires a release of both packages, updating just one will break interoperability
Not necessarily: if the new version of the package produces the same output from the same input, then I don't think you need a new version of Strategus (just reference the new version in renv.lock). If the analytics change, either in inputs or outputs, I think it would make sense that a different version of Strategus would be necessary to represent the changes in dependencies. I would suggest 2 things on this front: that a release of Strategus comes with a published renv.lock file listing the versions of package dependencies that have been tested with that version of Strategus, and that within a single release of Strategus you may have multiple updates to underlying packages.
Users will always need to work with their data. Even if they don't use Strategus, they may wish to use the shiny apps or other report generation tools. Now they will have a dependency they didn't previously need
While this is true, you can't use OhdsiShinyModules to view results from the underlying packages at the package level (i.e., take the output of the CI package and use OhdsiShinyModules to view results). The reason is that OhdsiShinyModules has adopted conventions that have been specified by Strategus. Two examples: if you want the list of databases, you go to the database meta table (which was created by Strategus); and if you want to get cohort definition names, you go to cg_table_prefixcohort_definition, a table created by CohortGenerator, which Strategus requires to be part of the Strategus analysis. See the following:
https://github.com/search?q=repo%3AOHDSI%2FOhdsiShinyModules%20prefixcohort_definition&type=code
Based on these decisions, I would suggest that OhdsiShinyModules becomes something like StrategusShinyModules, or we could embed the ShinyModules into the release of Strategus so that the execution of analysis, persistence of results, and report viewer are all bound to the same version of the software. I fully understand that the original intent of OhdsiShinyModules was to view results of the individual HADES packages, but, in my estimation, this principle has been abandoned in favor of Strategus-specific concerns.
The current release of CohortIncidence just introduced a breaking change for shiny apps and any other apps by not implementing migrations. Existing projects will no longer be upgradable without custom hacks for what is a relatively small change to the data model
I thought about this, but I decided on the approach for 2 reasons: 1) you won't be changing the version of the analysis packages within a single project, and 2) while I would have preferred a migration approach to this, the most expedient way was to use the functionality in RMM to create a schema based on the results model spec. I'd like to move to a pure-migration approach to managing database schemas (as we do in WebAPI), but given that the schema shouldn't change within a study, I decided the simplest approach would be to use RMM.
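To make "create a schema based on the results model spec" concrete, here is a minimal sketch of the general technique: group specification rows by table and emit one CREATE TABLE statement per table. It is not RMM's implementation. The spec column names (table_name, column_name, data_type, is_required, primary_key) are an assumption about the specification layout, and the table and its columns are hypothetical.

```python
import sqlite3

# Hypothetical spec rows; the table and column names are invented for illustration.
SPEC = [
    {"table_name": "example_result", "column_name": "database_id",
     "data_type": "varchar", "is_required": "Yes", "primary_key": "Yes"},
    {"table_name": "example_result", "column_name": "cohort_id",
     "data_type": "bigint", "is_required": "Yes", "primary_key": "Yes"},
    {"table_name": "example_result", "column_name": "incidence_rate",
     "data_type": "float", "is_required": "No", "primary_key": "No"},
]


def create_table_sql(table_name, columns, table_prefix=""):
    """Build one CREATE TABLE statement from the spec rows of a single table."""
    col_defs = []
    keys = []
    for col in columns:
        null_clause = " NOT NULL" if col["is_required"] == "Yes" else ""
        col_defs.append(f'{col["column_name"]} {col["data_type"]}{null_clause}')
        if col["primary_key"] == "Yes":
            keys.append(col["column_name"])
    if keys:
        col_defs.append(f"PRIMARY KEY ({', '.join(keys)})")
    body = ",\n  ".join(col_defs)
    return f"CREATE TABLE {table_prefix}{table_name} (\n  {body}\n);"


def ddl_from_spec(spec_rows, table_prefix=""):
    """Group spec rows by table (preserving order) and return one statement per table."""
    tables = {}
    for row in spec_rows:
        tables.setdefault(row["table_name"], []).append(row)
    return [create_table_sql(name, cols, table_prefix) for name, cols in tables.items()]


# Apply the generated DDL to an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
for statement in ddl_from_spec(SPEC):
    conn.execute(statement)

column_names = [row[1] for row in conn.execute("PRAGMA table_info(example_result)")]
print(column_names)
```

Generating the whole schema this way is simple when the schema is fixed for the life of a study, but it has no notion of moving an existing database from one spec version to the next, which is the gap a migration approach would fill.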
It's inconsistent with other packages and utilities
There isn't exactly a published standard on this, and the module layer provides that structure. However, it is consistent with the EvidenceSynthesis package, which is another RMM that is contained in Strategus.
This PR follows the Strategus conventions through the use of the common R6 classes for module implementation, and in this way, it is consistent with the other packages. This module layer allows the underlying packages to be independent in how they function most effectively (in their own way) while being consistent in execution across all modules.
There are a number of ways that packages are inconsistent: CI handles JSON serialization/deserialization differently, but that is because serialization via ParallelLogger is broken. Not every package defines inputs via R6 classes. I think there are going to be differences in approach across many of the HADES packages, and while I can understand the desire to keep that consistent, it may be more trouble than it's worth to get down to every possible detail.
It places a burden on the maintainer of Strategus to maintain 100% version consistency with CohortIncidence
I'm not sure this is the case, as @anthonysena wasn't involved in any of the implementation of this PR for incorporation of CI into Strategus. I do expect that we'll have different collaborators as the 'responsible party' for the maintenance of the underlying modules (which may or may not be the underlying package maintainer), and I think this will be coordinated through the Strategus team.
Thanks again for your thoughts and comments. Let me know if I misunderstood anything, and I will be glad to provide more detail.
@anthonysena, I've converted this to draft as there's one additional thing I'd like to add (and this is a good illustration of some of the comments I made in response to the feedback from @azimov): In OhdsiShinyModules, the Characterization/CohortIncidence view has some functionality that navigates to those outcomes that belong to specific targets of interest. In other words, the CI UX for reporting would like to be able to select the outcomes that exist for a given T. This isn't something that is captured in the analysis results, but it is something that is required by Strategus. So, I will add a new table to the results model specification in Strategus that doesn't exist in the underlying package, which will store the T-O-TAR pairs that were specified in the analysis. The Module implementation will add this result to the output as part of the
…nterface' into module-interface-cohortincidence # Conflicts: # R/Module-CohortIncidence.R
Hmm, after I merged the upstream branch to this branch, tests failed in something related to Characterization. @anthonysena, do you have any idea where that came from?
On the topic of module implementation philosophy: I spoke for an hour with @jreps and @pbr6cornell about the implementation details and what is a Strategus, OhdsiShinyModules, or package concern. While we do want to get to a consensus on philosophy and approach, we're not quite there yet. To that end, I've decided that the implementation for this PR will stand, but I have committed to revisiting the code, once this has been in place for a time, to discuss the merits and problems of the approach and address any issues that arise. Moving the implementation from the module into the package will not result in any API changes, so making that change in the future should not pose a backwards-compatibility challenge.
Yes, the most recent release of FeatureExtraction broke Characterization as described here: OHDSI/Characterization#54. Once we patch Characterization, the unit tests should run without issue. Given this, we can move forward with merging this work into the other feature branch if you feel it is ready? I did a quick review and it is fine for now so we can wrap up the initial module work.
@anthonysena, I added a new table to the results model for CI that isn't part of the output of the package, but is a requested table to support reporting. This is a good example of where the module comes in and bridges functionality between the package and the reporting. So this PR is ready to go.
@anthonysena, I added a commit to this PR that touches DatabaseMetaData.R: I had to set Ideally, we'd just want to convert all columns to the correct underlying type before upload (which is one of the 3 things that checkAndFixCommands does), so maybe we can have RMM change to allow just the column conversion to be done separately from the other 2 things (one of which checks for duplicates; I forget what the 3rd function is, but these 2 operations would be very expensive on very large data).
So |
It's all handled by I do not know why it started working differently, I'm not sure if it was based on a merge from upstream or a new version install of RMM, but the solution was to set that flag in uploadResults.
Let me push up a change where I don't use that flag, but append the databaseId as a Edit: I see the problem with this now: The value is written to the CSV correctly (there is no .0 in the CSV). However, we lose control over what happens in I think the long-term solution is to ask RMM to modify the readCSV to read the columns based on the data types that are specified in the results model. By default, the
I did a test to see if there was a difference in behavior when uploading results to SQLite vs. PostgreSQL based on the notes made by @chrisknoll above. The answer is that there is a difference in behavior. When I uploaded the For now, to keep things consistent, I'll remove the
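The suggestion above, reading each CSV column according to the data type declared in the results model instead of letting the reader guess, can be sketched as follows. This is a generic Python illustration of the technique, not RMM's reader. The type names, the column names, and the assumption that the database identifier is declared as varchar are all hypothetical.

```python
import csv
import io

# Map of declared data types to Python casts (type names are illustrative).
CASTS = {"varchar": str, "bigint": int, "int": int, "float": float}


def read_typed_csv(text, column_types):
    """Read CSV text, casting each column according to its declared type.

    Empty fields become None; nothing is inferred from the values themselves.
    """
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {}
        for name, value in raw.items():
            cast = CASTS[column_types[name]]
            row[name] = None if value == "" else cast(value)
        rows.append(row)
    return rows


def guess_numeric(value):
    """Naive inference: anything that parses as a number becomes a float."""
    try:
        return float(value)
    except ValueError:
        return value


CSV_TEXT = "database_id,person_count\n85642205,1000\n"

# Declared types keep the identifier as text, exactly as written in the CSV.
typed = read_typed_csv(CSV_TEXT, {"database_id": "varchar", "person_count": "bigint"})

# Guessing turns the same identifier into a number, which gains a ".0" when
# it is converted back to text.
guessed = str(guess_numeric("85642205"))

print(typed[0]["database_id"], guessed)
```

With declared types, the identifier survives the round trip unchanged regardless of which database the rows are later uploaded to; with inference, the result depends on how each downstream step renders a numeric value.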
00be102 into remove-deps-add-module-interface
* Implementation of CohortIncidence module. (#147) * Adjustments from testing --------- Co-authored-by: Chris Knoll <[email protected]>
This PR uses ResultsModelManager to create the results schema and upload results.