Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store only the resolved reference in DocumentationContext.symbolIndex #543

Merged

Conversation

d-ronnqvist
Copy link
Contributor

@d-ronnqvist d-ronnqvist commented Apr 10, 2023

Bug/issue #, if applicable: rdar://106654403

Summary

This changes DocumentationContext.symbolIndex from a lookup that return DocumentationNode values to a lookup that return ResolvedTopicReference values.

Storing all nodes both in the documentation cache and in the symbol index makes it easy to accidentally hold on to old versions of nodes if the value in the documentation cache is updated without also updating the value in the symbol index. By instead only storing the node's resolved reference in the symbol cache and using it to find the node in the documentation cache, various places in the code that modify documentation nodes only need to update the value in one place.

This makes up for the previous peak memory usage increase from #498 (comment).

Dependencies

None.

Testing

None. This is a non-functional change. Everything should continue to work the same as before.

Checklist

Make sure you check off the following items. If they cannot be completed, provide a reason.

  • Added tests
  • Ran the ./bin/test script and it succeeded
  • Updated documentation if necessary

@d-ronnqvist d-ronnqvist requested a review from franklinsch April 10, 2023 22:03
@d-ronnqvist
Copy link
Contributor Author

d-ronnqvist commented Apr 10, 2023

In my local benchmarking across two different machines this lowers peak memory usage by ~1.5% while also making small improvements to the total covert time:

┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Metric                                   │ Change          │ main                 │ current              │
├──────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Duration for 'bundle-registration'       │ -1.184%¹        │ 8.723 sec            │ 8.619 sec            │
│ Duration for 'convert-total-time'        │ -0.6942%²       │ 14.529 sec           │ 14.428 sec           │
│ Duration for 'documentation-processing'  │ no change³      │ 5.376 sec            │ 5.377 sec            │
│ Duration for 'finalize-navigation-index' │ no change⁴      │ 0.228 sec            │ 0.229 sec            │
│ Peak memory footprint                    │ -1.412%⁵        │ 1,019.1 MB           │ 1,004.7 MB           │
│ Data subdirectory size                   │ no change       │ 280.8 MB             │ 280.8 MB             │
│ Index subdirectory size                  │ no change       │ 2.6 MB               │ 2.6 MB               │
│ Total DocC archive size                  │ no change       │ 401.3 MB             │ 401.3 MB             │
│ Topic Anchor Checksum                    │ no change       │ a8c40d797627683c5d7c │ a8c40d797627683c5d7c │
│ Topic Graph Checksum                     │ no change       │ 665713509084b5131a37 │ 665713509084b5131a37 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────┘

 1: There's a statistically significant difference between the two benchmark measurements.
    The values are different enough that the most probable explanation is that they're random samples from two different data sets.
    t-statistic : +3.770          degrees of freedom : 28       95% confidence critical value : +2.042

 2: There's a statistically significant difference between the two benchmark measurements.
    The values are different enough that the most probable explanation is that they're random samples from two different data sets.
    t-statistic : +2.143          degrees of freedom : 28       95% confidence critical value : +2.042

 3: No statistically significant difference.
    The values are similar enough that the most probable explanation is that they're random samples from the same data set.
    t-statistic : -0.026          degrees of freedom : 28       95% confidence critical value : +2.042

 4: No statistically significant difference.
    The values are similar enough that the most probable explanation is that they're random samples from the same data set.
    t-statistic : -0.443          degrees of freedom : 28       95% confidence critical value : +2.042

 5: There's a statistically significant difference between the two benchmark measurements.
    The values are different enough that the most probable explanation is that they're random samples from two different data sets.
    t-statistic : +9.061          degrees of freedom : 28       95% confidence critical value : +2.042

@d-ronnqvist
Copy link
Contributor Author

@swift-ci please test

Copy link
Contributor

@franklinsch franklinsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find! Is there a test that we can write that asserts that the references from the different stores are the consistent? Similar to https://github.com/apple/swift-docc/blob/c74237d83e542864033e0194b8f4bdf553c12485/Tests/SwiftDocCTests/Infrastructure/DocumentationContext/DocumentationContextTests.swift#L3282

@d-ronnqvist
Copy link
Contributor Author

Is there a test that we can write that asserts that the references from the different stores are the consistent?

Good idea. I think that'd be easy to test with one of the test bundles.

@d-ronnqvist
Copy link
Contributor Author

@swift-ci please test

@d-ronnqvist d-ronnqvist merged commit 1340fd2 into swiftlang:main Apr 11, 2023
d-ronnqvist added a commit to d-ronnqvist/swift-docc that referenced this pull request Apr 11, 2023
…swiftlang#543)

* Store only the resolved reference in DocumentationContext.symbolIndex

rdar://106654403

* Add test that references in symbol index have nodes in doc cache
@d-ronnqvist d-ronnqvist deleted the resolved-references-in-symbol-index branch April 11, 2023 22:02
d-ronnqvist added a commit that referenced this pull request Apr 12, 2023
…#543) (#545)

* Store only the resolved reference in DocumentationContext.symbolIndex

rdar://106654403

* Add test that references in symbol index have nodes in doc cache
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants