Unify data loading between narrative and non-narrative modes #1305

jameshadfield · 2021-03-16T23:13:17Z

This PR represents in-progress work to address issue #1283.

Dataset loading, as of v2.24.0, differs dramatically between narrative and non-narrative ("normal") mode. Specifically, narrative mode failed to fetch sidecar files (tip-frequencies and root-sequence), which resulted in #1283. To some extent the differences in logic between the modes are unavoidable; however, there are steps in common which this PR aims to isolate into functions, reducing duplicated logic which may diverge.

I may not have a chance to revisit this for a week or so, but there are a few places where 👀 would be welcome:

- Are there additions to narratives/test_multiple-datasets.md to capture examples not added via this PR?
- @eharkins this PR should correctly populate the cache with sidecar files (for datasets which differ from that of the starting page), however when changing pages those sidecar files aren't requested from the cache. Do you have ideas here?
- General comments on the abstractions started by 3c9d28c would be welcome.

eharkins · 2021-08-06T19:46:32Z

Rebased on master and merged #1312 to address:

> sidecar files aren't requested from the cache

Will continue reviewing and testing here.

eharkins

Hi @jameshadfield! Jumping back into the code, I left a few comments about specific lines. Here's my interpretation of what remains to be done:

- Use narrativeFetchingErrorNotification; https://github.com/nextstrain/auspice/pull/1305/files#diff-7e6b4cc45e1472da7b3f8f6e7438315586ecbd5450bc7db2a5263a6fe2063e16R290
- Handle hardcoded server calls in parseUrlIntoAPICalls; https://github.com/nextstrain/auspice/pull/1305/files#diff-7e6b4cc45e1472da7b3f8f6e7438315586ecbd5450bc7db2a5263a6fe2063e16R205
- It seems like you were intending to use Google Analytics exception tracking? Is this necessary for this PR or could it be in a separately-scoped one? https://github.com/nextstrain/auspice/pull/1305/files#diff-7e6b4cc45e1472da7b3f8f6e7438315586ecbd5450bc7db2a5263a6fe2063e16R14
- Prevent duplicate dataset requests when collecting narrative datasets; https://github.com/nextstrain/auspice/pull/1305/files#diff-7e6b4cc45e1472da7b3f8f6e7438315586ecbd5450bc7db2a5263a6fe2063e16R334
- More testing in general of different narrative cases, and specifically that "narratives load sidecar files from cache" (#1312) is working.
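
One possible approach to the duplicate-request item is to key in-flight fetches by dataset name, so that a dataset referenced by several slides is only requested once. A minimal sketch, assuming each slide exposes its dataset as a string (with `:` separating the two trees of a tanglegram); `buildDatasetCache` and `startFetch` are hypothetical names, not functions in this PR:

```javascript
// Hypothetical helper: build a cache of one fetch per unique dataset name.
// `slideDatasets` is an array of dataset strings, one per narrative slide.
// `startFetch` is any function that begins a request and returns a handle
// (typically a Promise) for that dataset.
function buildDatasetCache(slideDatasets, startFetch) {
  const cache = new Map();
  for (const datasetString of slideDatasets) {
    // A tanglegram slide names two datasets separated by ":".
    for (const name of datasetString.split(":")) {
      if (!cache.has(name)) {
        cache.set(name, startFetch(name)); // only the first mention fetches
      }
    }
  }
  return cache;
}
```

Slides which later revisit a dataset can then read from the returned `Map` rather than issuing a new request.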

Please add to the list here if I missed some TODO items you had in mind :)

eharkins · 2021-08-07T00:02:47Z

src/actions/loadData.js

- *                  [1] {string | undefined} secondTreeUrl
- *                  [2] {string | undefined} string of old syntax
- */
-const collectDatasetFetchUrlsDeprecatedSyntax = (url) => {


Does the existing collectDatasetFetchUrls handle the deprecated syntax or does this mean we are dropping the ability to parse the deprecated syntax here?

@jameshadfield it seems like in its current state this PR drops the ability to parse the deprecated second tree syntax, e.g. /flu/seasonal/h3n2/ha:na/2y, which works on the master branch. Is that intentional?
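
For reference, a minimal sketch of how the deprecated syntax could be split into one path per tree. This is an illustration based on the example URL above, not the removed `collectDatasetFetchUrlsDeprecatedSyntax`; the function name and return shape are assumptions:

```javascript
// Hypothetical helper: split "/flu/seasonal/h3n2/ha:na/2y" into the main
// tree path and the second tree path. Returns `second: undefined` when the
// path contains no ":" and therefore names a single tree.
function splitDeprecatedSecondTree(pathname) {
  // Drop leading/trailing slashes, then work on the path segments.
  const parts = pathname.replace(/^\/+|\/+$/g, "").split("/");
  const idx = parts.findIndex((part) => part.includes(":"));
  if (idx === -1) {
    return { main: parts.join("/"), second: undefined };
  }
  const [mainName, secondName] = parts[idx].split(":");
  // Rebuild the full path with the chosen tree name in the ":" segment.
  const withSegment = (segment) =>
    [...parts.slice(0, idx), segment, ...parts.slice(idx + 1)].join("/");
  return { main: withSegment(mainName), second: withSegment(secondName) };
}
```

Under this reading, `/flu/seasonal/h3n2/ha:na/2y` yields `flu/seasonal/h3n2/ha/2y` as the main tree and `flu/seasonal/h3n2/na/2y` as the second tree.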

src/actions/loadData.js

src/actions/navigation.js

This replaces the functionality of narratives/test_multiple-datasets.md which is removed.

These changes were motivated by #1283, which arose because we used different code paths for loading a dataset viz and a narrative. Here we represent each dataset by a `Dataset` object. This is used for standalone datasets, each dataset in a tangletree, and each dataset in a narrative. Each instance describes the API endpoints of its datafiles, manages fetching of these datafiles and, where appropriate, can dispatch data to update redux state. This has been tested on various single datasets, tangletrees, and narratives in this repo. Notably, this commit breaks narratives with multiple datasets; this will be fixed in a subsequent commit to reflect Eli's work in PR #1312.
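
To illustrate the idea (this is a rough sketch, not the class added in this commit), a `Dataset` object might look something like the following. Only the `Dataset` name and the `loadSidecars(dispatch)` call appear in this PR; the URL layout, the injected `fetchJson` function, the other method names and the action types are assumptions made for the example:

```javascript
// Illustrative sketch only. `fetchJson` is an injected function
// (url) => Promise<object>, so the class can be exercised without a server.
class Dataset {
  constructor(name, fetchJson) {
    this.name = name;
    this.fetchJson = fetchJson;
    const base = `/charon/getDataset?prefix=${name}`; // assumed URL layout
    this.apiCalls = {
      main: base,
      tipFrequencies: `${base}&type=tip-frequencies`,
      rootSequence: `${base}&type=root-sequence`
    };
    this.started = false;
  }

  // Begin every request exactly once. Sidecars are optional, so a failed
  // sidecar request resolves to undefined instead of rejecting.
  startFetching() {
    if (this.started) return;
    this.started = true;
    this.main = this.fetchJson(this.apiCalls.main);
    const optional = (url) => this.fetchJson(url).catch(() => undefined);
    this.tipFrequencies = optional(this.apiCalls.tipFrequencies);
    this.rootSequence = optional(this.apiCalls.rootSequence);
  }

  // Dispatch whichever sidecar files arrived (action types are hypothetical).
  async loadSidecars(dispatch) {
    this.startFetching();
    const [tipFrequencies, rootSequence] = await Promise.all([
      this.tipFrequencies,
      this.rootSequence
    ]);
    if (tipFrequencies) dispatch({ type: "TIP_FREQUENCIES_LOADED", data: tipFrequencies });
    if (rootSequence) dispatch({ type: "ROOT_SEQUENCE_LOADED", data: rootSequence });
  }
}
```

Because the same object is used in every mode, a narrative and a standalone dataset view would go through the same fetching logic and differ only in when they call into it.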

This complements the previous commit to allow narrative slide-changes to change datasets by querying the cache set up at narrative load. Appropriate sidecar files are also loaded, and we ensure that sidecar data from the previous dataset is not displayed. This work is based upon PR #1312 by eharkins. Co-authored-by: eharkins <[email protected]>
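
As a rough illustration of the slide-change behaviour described here (not the code in this commit), the sketch below assumes the cache is a `Map` from dataset name to already-fetched data. The function name, the cache shape and every action type are hypothetical:

```javascript
// Hypothetical sketch: switch to a dataset cached at narrative load.
// `cache` maps dataset name -> { main, tipFrequencies?, rootSequence? }.
function changeNarrativeDataset(cache, name, dispatch) {
  const cached = cache.get(name);
  if (!cached) {
    throw new Error(`Dataset "${name}" was not cached at narrative load`);
  }
  // Clear the previous dataset's sidecar data first, so stale frequencies or
  // root sequences are never displayed alongside the new tree.
  dispatch({ type: "CLEAR_SIDECARS" });
  dispatch({ type: "DATASET_CHANGED", name, main: cached.main });
  if (cached.tipFrequencies) {
    dispatch({ type: "TIP_FREQUENCIES_LOADED", data: cached.tipFrequencies });
  }
  if (cached.rootSequence) {
    dispatch({ type: "ROOT_SEQUENCE_LOADED", data: cached.rootSequence });
  }
}
```

Clearing before loading means a dataset without sidecar files ends up with empty sidecar state rather than inheriting the previous slide's.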

This fixes a longstanding (perhaps undiscovered) bug where narratives could not both define different datasets per slide and have one of those datasets be a tanglegram.

jameshadfield · 2021-08-11T05:58:45Z

Thanks for the detailed review @eharkins - this prompted me to rewrite my original commit to further simplify things. The concept of a "dataset" and its associated API calls, fetch results etc. is now an object, so dataset views and narratives can both simply pass around instances of Dataset. I think this simplifies things quite a bit and prevents somewhat duplicated code.

I leant heavily on http://localhost:4000/narratives/test/sidecar-files for testing, which identified numerous bugs (including stale sidecar data, broken tangletrees etc). I've also tested a large number of individual datasets, tangletrees and other narratives. The nextstrain.org review app should allow further testing still.

@eharkins would you mind re-reviewing?

eharkins · 2021-08-18T22:54:11Z

src/actions/navigation.js

+      pushState: true,
+      query
+    });
+    mainDataset.loadSidecars(dispatch);


Makes sense in terms of frequencies to only fetch for the main dataset and not the second one since it seems like only the main dataset's frequencies can be displayed in a tanglegram view. What about for root sequences? Sorry, I'm not as familiar with how that file works / is displayed.

eharkins

This looks good. I still would like to test it more and get others to test it out as well before we merge but the code is much easier to understand in terms of logic and abstraction. A couple of abstraction notes:

- I know we talked about whether it would make sense to refactor changePage at this point, but maybe that's best done in a separate PR. To maintain the best of both worlds, maybe we could make changePage only do case 3, where we actually change page. Then we could have a separate function called changeDatasetState or something like that, which has all three options with appropriate option names like queryUpdate, narrativeUpdate, changePage, to allow us some consistency if we do think these 3 cases nominally belong under the same heading.
- Since you might need to be able to dispatch a clean state with two datasets, I think the abstraction level of the dataset object makes sense as is, since dispatching inside a prototype attached to the dataset object would become complicated in that case.
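
To make the first note more concrete, here is a rough sketch of the kind of wrapper I have in mind. The three option names are the ones suggested above; the signature and the placeholder handler bodies are assumptions, not existing auspice code:

```javascript
// Hypothetical wrapper: one entry point which routes to one of three
// narrowly-scoped handlers, so changePage itself only changes page.
function changeDatasetState(kind, payload, handlers) {
  const known = Object.prototype.hasOwnProperty.call(handlers, kind);
  if (!known) {
    throw new Error(`Unknown dataset state change: ${kind}`);
  }
  return handlers[kind](payload);
}

// Placeholder handlers, standing in for the real logic of each case.
const exampleHandlers = {
  queryUpdate: ({ query }) => `update URL query to ${JSON.stringify(query)}`,
  narrativeUpdate: ({ slide }) => `apply narrative slide ${slide}`,
  changePage: ({ path }) => `load page ${path}`
};
```

A call such as `changeDatasetState("changePage", { path: "/zika" }, exampleHandlers)` would then reach only the page-changing handler.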

jameshadfield · 2021-09-07T08:42:12Z

Thanks Eli - I agree that changes to changePage would make for a nicer (internal) API. I've re-tested and think this is good to go, so I'm going to merge and release to dev.nextstrain.org for a final testing pass.

jameshadfield temporarily deployed to auspice-bug-narratives--14d4pl March 16, 2021 23:13 Inactive

jameshadfield changed the title ~~Bug narratives frequencies~~ Unify data loading between narrative and non-narrative modes Mar 16, 2021

eharkins force-pushed the bug-narratives-frequencies branch from 3c9d28c to 0db3a38 Compare August 6, 2021 19:15

jameshadfield mentioned this pull request Aug 6, 2021

Test auspice PR 1305 nextstrain/nextstrain.org#393

Closed

eharkins mentioned this pull request Aug 6, 2021

narratives load sidecar files from cache; #1312

Merged

eharkins reviewed Aug 7, 2021

jameshadfield force-pushed the bug-narratives-frequencies branch from d20e833 to 50fcb43 Compare August 10, 2021 05:34

jameshadfield added 2 commits August 11, 2021 09:29

Add narrative to test multiple datasets with sidecar files

9a9bf2d

This replaces the functionality of narratives/test_multiple-datasets.md which is removed.

jameshadfield force-pushed the bug-narratives-frequencies branch from 50fcb43 to de3dafd Compare August 11, 2021 05:47

eharkins and others added 2 commits August 11, 2021 17:51

Allow tanglegrams to be used in narratives

5441324

This fixes a longstanding (perhaps undiscovered) bug where narratives could not both define different datasets per slide and have one of those datasets be a tanglegram.

jameshadfield force-pushed the bug-narratives-frequencies branch from de3dafd to 5441324 Compare August 11, 2021 05:51

eharkins reviewed Aug 18, 2021

jameshadfield marked this pull request as ready for review September 7, 2021 08:28

jameshadfield merged commit 4aac28d into master Sep 7, 2021

jameshadfield deleted the bug-narratives-frequencies branch September 7, 2021 08:42
