Optimizations to support large tracklists #4499

Draft
wants to merge 1 commit into
base: main

@cmdcolin (Collaborator) commented Jul 26, 2024

This is a re-opening of the frozen tracks PR.

It will be a challenging PR to get fully merged, but interested users can try the branch out.

Two optimizations were added to frozen_tracks4 today: (a) removing the clone module, which was accidentally quadratic and introduced big slowdowns after around 64000 tracks, and (b) a change to the generateHierarchy function for the track selector to make it faster.
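
As an illustration of the kind of change that can speed up a hierarchy builder, here is a minimal sketch that groups a flat track list into a category tree using a Map lookup per level instead of scanning sibling arrays. This is not the actual generateHierarchy code from the branch: the TrackConf and TreeNode shapes and the buildHierarchy name are assumptions made for the example.

```typescript
// Hypothetical track config shape; real JBrowse configs have more fields.
interface TrackConf {
  trackId: string
  category?: string[]
}

// Hypothetical tree node for a track selector category.
interface TreeNode {
  name: string
  children: TreeNode[]
  trackIds: string[]
  // Child categories indexed by name, so lookups avoid scanning `children`.
  childIndex: Map<string, TreeNode>
}

function makeNode(name: string): TreeNode {
  return { name, children: [], trackIds: [], childIndex: new Map() }
}

// Builds a category tree in time proportional to the total number of
// category path segments, using one Map lookup per segment.
function buildHierarchy(tracks: TrackConf[]): TreeNode {
  const root = makeNode('root')
  for (const track of tracks) {
    let node = root
    for (const segment of track.category ?? []) {
      let child = node.childIndex.get(segment)
      if (!child) {
        child = makeNode(segment)
        node.childIndex.set(segment, child)
        node.children.push(child)
      }
      node = child
    }
    node.trackIds.push(track.trackId)
  }
  return root
}

const tree = buildHierarchy([
  { trackId: 'a', category: ['Alignments', 'CRAM'] },
  { trackId: 'b', category: ['Alignments', 'CRAM'] },
  { trackId: 'c', category: ['Variants'] },
  { trackId: 'd' },
])
console.log(tree.children.map(c => c.name))
```

Tracks without a category end up directly on the root node in this sketch; the real track selector may handle that case differently.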

cc @Maarten-vd-Sande

this branch (frozen_tracks4)

2.json  0.956s total
4.json  0.944s total
8.json  0.944s total
16.json  0.949s total
32.json  0.960s total
64.json  0.959s total
128.json  0.923s total
256.json  0.905s total
512.json  0.941s total
1024.json  0.934s total
2048.json  0.911s total
4096.json  0.995s total
8192.json  1.069s total
16384.json  1.097s total
32768.json  1.183s total
65536.json  1.463s total
131072.json  2.021s total
262144.json  3.212s total

before today on frozen_tracks4

2.json  4.234s total
4.json  0.935s total
8.json  0.917s total
16.json  0.931s total
32.json  0.987s total
64.json  0.918s total
128.json  0.943s total
256.json  0.904s total
512.json  0.948s total
1024.json  0.991s total
2048.json  0.960s total
4096.json  1.079s total
8192.json  1.369s total
16384.json  2.578s total
32768.json  7.298s total
65536.json  25.797s total
131072.json  1m34.86s total
...timeout at 262144...

main branch

2.json  0.935s total
4.json  1.044s total
8.json  1.021s total
16.json  1.006s total
32.json  1.086s total
64.json  1.216s total
128.json  1.491s total
256.json  1.981s total
512.json  2.828s total
1024.json  4.876s total
2048.json  8.627s total
4096.json  16.225s total
8192.json  42.861s total
...timeout at 16384...
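
One rough way to read these listings is to estimate an apparent scaling exponent from the last two completed runs of each branch, assuming time grows like n^k. The sketch below uses only timings copied from the listings above; the scalingExponent helper is a name made up for this example, and the fixed startup cost of roughly 0.9s is not subtracted, so the numbers are only indicative.

```typescript
// Apparent exponent k between two measurements, assuming time ~ n^k.
function scalingExponent(
  n1: number,
  t1: number,
  n2: number,
  t2: number,
): number {
  return Math.log(t2 / t1) / Math.log(n2 / n1)
}

// Timings in seconds, taken from the listings above.
const current = scalingExponent(131072, 2.021, 262144, 3.212)
const beforeToday = scalingExponent(32768, 7.298, 65536, 25.797)
const mainBranch = scalingExponent(4096, 16.225, 8192, 42.861)

console.log({ current, beforeToday, mainBranch })
```

With these inputs the exponents come out at roughly 0.67 for the current branch, 1.82 for frozen_tracks4 before today, and 1.40 for main, which is consistent with the earlier state of the branch being close to quadratic at large track counts.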

Code used for testing: https://github.com/cmdcolin/jb2-large-tracklist-profiling

@cmdcolin changed the title from "Optimizations to support for large tracklists" to "Optimizations to support large tracklists" Jul 26, 2024
@cmdcolin added the "scalability" label (related to speed and/or scalability) Jul 26, 2024
@Maarten-vd-Sande (Contributor) commented Jul 29, 2024

While the initial load is very fast 🥳, it introduced a bug when loading tracks:

Error: HTTP 404 fetching data/dm61_FS_ampliconset-dec-2023_800nt.ampliconref/alignments/24-FS-13/P001__WB09__24-FS-13__24MB03939-1034_64a6acfdfd.cram bytes 0-131071
../../../packages/core/util/io/RemoteFileWithRangeCache.ts:94:13 (fetchBinaryRange@)

JBrowse 2.13.0

It even happens with the test data:

Error: HTTP 404 fetching volvox.filtered.vcf.gz.tbi
../../../node_modules/generic-filehandle/src/remoteFile.ts:171:13 (readFile@)

JBrowse 2.13.0

@cmdcolin (Collaborator, Author) commented

Interesting. I'm not sure whether I fixed it, but I just pushed another change that could potentially help. You can try refetching the branch with jbrowse upgrade --branch frozen_tracks4 again.

@Maarten-vd-Sande (Contributor) commented

Yes that seems to work! 🙏

Remove clone. It seems to be unneeded: without it the original object gets mutated, but this doesn't seem like it should matter

Another optimization to the track selector

Use structuredClone

Misc lint fixes

Misc
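
The "Remove clone" and "Use structuredClone" commit messages above both concern whether a config object is copied before it is processed. Here is a minimal sketch of that trade-off, not the code from this branch: the DemoTrack shape and the fillDefaults function are hypothetical, and only structuredClone itself is a real built-in (available in modern browsers and Node 17+).

```typescript
// Hypothetical track config; the field names are illustrative only.
interface DemoTrack {
  trackId: string
  adapter: { type: string; uri?: string }
}

// Stand-in for a processing step that fills in defaults by mutating its input.
function fillDefaults(conf: DemoTrack): DemoTrack {
  conf.adapter.uri = conf.adapter.uri ?? `${conf.trackId}.dat`
  return conf
}

const original: DemoTrack = {
  trackId: 't1',
  adapter: { type: 'DemoAdapter' },
}

// With a deep copy, the caller's object is left untouched.
const copied = fillDefaults(structuredClone(original))
const untouchedAfterCopy = original.adapter.uri === undefined

// Without a copy, the same object comes back and the caller sees the change.
const inPlace = fillDefaults(original)
const sameObject = inPlace === original

console.log({ untouchedAfterCopy, sameObject, copiedUri: copied.adapter.uri })
```

Skipping the copy saves work per track, at the cost that callers holding the original object observe the mutation; whether that matters depends on how the config is used afterwards.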