Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: use ArcStr for storing strings in Schema #1247

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

audunhalland
Copy link
Contributor

@audunhalland audunhalland commented Mar 5, 2024

Fixes #819

This is an experiment that uses arcstr::ArcStr for storing strings within the GraphQL schema. String literals that are constant/come from macros no longer allocate memory.

ArcStr appears to fit perfectly for juniper in my opinion. It delivers extremely cheap literal strings, which is the most common usecase, while also supporting dynamic schemas based on extensive use of custom TypeInfo (which is how we're using juniper at work), potentially without extra memory allocation. There's of course the "cost" that users have to see ArcStr in the dynamic schema building APIs, but I think that with these new APIs exposed it should be much clearer when juniper needs to allocate string memory and where it's "free" (using literal!). In short, this choice is now put in the hands of the user.

BREAKING CHANGE:

  • Schema types are not lifetime-generic anymore.
  • GraphQLValue::type_name and GraphQLType::name both return Option<ArcStr> instead of Option<&str>.

Benchmark results

master

Sync vs Async - Users Flat - Instant/Sync/1
                        time:   [24.076 µs 24.100 µs 24.124 µs]
                        change: [+0.9795% +1.1412% +1.3003%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild
  1 (1.00%) high severe
Sync vs Async - Users Flat - Instant/Async - Single Thread/1
                        time:   [26.135 µs 26.170 µs 26.218 µs]
                        change: [-0.9653% -0.7074% -0.4492%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild
Sync vs Async - Users Flat - Instant/Async - Threadpool/1
                        time:   [25.893 µs 26.171 µs 26.474 µs]
                        change: [+1.6342% +4.1267% +7.0669%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
Sync vs Async - Users Flat - Instant/Sync/10
                        time:   [25.227 µs 25.244 µs 25.265 µs]
                        change: [+0.1578% +0.3255% +0.4799%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe
Sync vs Async - Users Flat - Instant/Async - Single Thread/10
                        time:   [25.831 µs 25.871 µs 25.920 µs]
                        change: [+2.0800% +2.5706% +3.0084%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe
Sync vs Async - Users Flat - Instant/Async - Threadpool/10
                        time:   [25.397 µs 25.685 µs 25.936 µs]
                        change: [-1.7187% -0.4294% +0.8521%] (p = 0.50 > 0.05)
                        No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

arcstr

Sync vs Async - Users Flat - Instant/Sync/1
                        time:   [9.8646 µs 9.8708 µs 9.8771 µs]
                        change: [-59.007% -58.946% -58.885%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild
Sync vs Async - Users Flat - Instant/Async - Single Thread/1
                        time:   [11.486 µs 11.499 µs 11.512 µs]
                        change: [-56.273% -56.158% -56.047%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  6 (6.00%) high mild
Sync vs Async - Users Flat - Instant/Async - Threadpool/1
                        time:   [10.803 µs 10.851 µs 10.897 µs]
                        change: [-60.022% -58.793% -57.906%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  13 (13.00%) high mild
  1 (1.00%) high severe
Sync vs Async - Users Flat - Instant/Sync/10
                        time:   [11.158 µs 11.173 µs 11.191 µs]
                        change: [-55.883% -55.816% -55.743%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe
Sync vs Async - Users Flat - Instant/Async - Single Thread/10
                        time:   [10.419 µs 10.428 µs 10.436 µs]
                        change: [-59.952% -59.862% -59.782%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) low mild
Sync vs Async - Users Flat - Instant/Async - Threadpool/10
                        time:   [11.355 µs 11.487 µs 11.611 µs]
                        change: [-55.387% -54.856% -54.295%] (p = 0.00 < 0.05)
                        Performance has improved.

All of the improvements should be related to schema creation, not the query executor.

juniper/src/ast.rs Outdated Show resolved Hide resolved
@audunhalland audunhalland force-pushed the arcstr branch 7 times, most recently from 8b02019 to 7e3e6b4 Compare March 6, 2024 11:49
@audunhalland audunhalland marked this pull request as ready for review March 6, 2024 12:30
match *t {
Type::NonNullNamed(ref n) => TypeType::NonNull(Box::new(
self.type_by_name(n).expect("Type not found in schema"),
pub fn make_type(&self, t: DynType) -> TypeType<S> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is public, but DynType/AsDynType are not exported, so in the generated rustdocs this method is visible, but I'm unsure whether it's actually usable from outside juniper since you would have a hard time instantiating a DynType value?

Should functions that have private(ish) types in their signature be pub(crate) instead?

@audunhalland audunhalland changed the title refactor: use ArcStr for storing strings in Schema feat: use ArcStr for storing strings in Schema Mar 15, 2024
@LegNeato
Copy link
Member

Sweet, this looks great! I don't have time to look at this right now but perhaps @tyranron has opinions.

@tyranron
Copy link
Member

@LegNeato this is definitely on my agenda in near future. However, the diif is huge enough, which will take some time.

BREAKING CHANGE: Schema types are not lifetime-generic anymore
@tyranron tyranron added this to the 0.17.0 milestone Apr 8, 2024
@tyranron
Copy link
Member

tyranron commented Apr 8, 2024

@audunhalland I'm going to review this in next few days and release in 0.17.0.

@tyranron tyranron added enhancement Improvement of existing features or bugfix k::performance Related to performance k::refactor Refactoring, technical debt elimination and other improvements of existing code base labels Apr 8, 2024
@audunhalland
Copy link
Contributor Author

audunhalland commented Apr 8, 2024

@tyranron cool, I'm not 100% happy with all parts of this PR, there are surely things that can be iterated on later. Especially I see DynType and AsDynType as a kind of hacky way to limit the scope and get it over the line.

edit: also note for the review: I'm not sure I understand the reflect module fully, like e.g.:

impl<S> reflect::WrappedType<S> for ArcStr {
    const VALUE: reflect::WrappedValue = 1;
}

@tyranron
Copy link
Member

tyranron commented Apr 8, 2024

@audunhalland

edit: also note for the review: I'm not sure I understand the reflect module fully, like e.g.:

impl<S> reflect::WrappedType<S> for ArcStr {
    const VALUE: reflect::WrappedValue = 1;
}

That should be fine, yes.

reflect::WrappedType::VALUE is a tricky way to encode composed GraphQL types in prime numbers, since we're kinda limited with const evaluation abilities in Rust for now. 1 means a primitive, non-composite type, which ArcStr represents, the same way as String and similar.

@tyranron tyranron added semver::breaking Breaking change in terms of SemVer k::api Related to API (application interface) labels Apr 8, 2024
@audunhalland
Copy link
Contributor Author

audunhalland commented Apr 11, 2024

@tyranron an idea: Could we somehow make From<ArcStr> a supertrait of ScalarValue? Then one could make a custom scalar value that doesn't need to clone string buffers that are already ArcStr. Think introspection queries, etc.

(Ideally From<&ArcStr> (which doesn't require cloning up front), but that's not possible without making ScalarValue lifetime generic. Another idea is to add a method fn from_arcstr(&ArcStr) -> Self to trait ScalarValue itself.)

edit: fn from_arcstr(&ArcStr) could be a method with a default implementation that just delegates to From<String>

@audunhalland
Copy link
Contributor Author

@tyranron I see there are some compile errors on the bikeshedded changes now. Should I help out with trying to resolve those?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improvement of existing features or bugfix k::api Related to API (application interface) k::performance Related to performance k::refactor Refactoring, technical debt elimination and other improvements of existing code base semver::breaking Breaking change in terms of SemVer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use a small and/or read-optimized string for internal field representation
3 participants