Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move to our own Type type for everything that's exposed in trustfall_core #435

Merged
merged 37 commits into from
Oct 10, 2023

Conversation

u9g
Copy link
Contributor

@u9g u9g commented Aug 6, 2023

No longer re-export a 3rd party crate's Type type for Trustfall output or variable types. Instead of that, use our own.

@u9g u9g force-pushed the stop-reexporting-gql-type-called-type branch 2 times, most recently from 9eef82e to eef6fa7 Compare August 6, 2023 22:14
@u9g u9g force-pushed the stop-reexporting-gql-type-called-type branch from eef6fa7 to a663307 Compare August 6, 2023 22:17
trustfall_core/src/frontend/mod.rs Show resolved Hide resolved
trustfall_core/src/ir/indexed.rs Show resolved Hide resolved
trustfall_core/src/ir/indexed.rs Show resolved Hide resolved
/// let inner_ty_for_assert = Type::new("Int!").unwrap();
/// assert!(matches!(list_ty.value(), InnerType::ListInnerType(inner_ty) if inner_ty == inner_ty_for_assert));
/// ```
pub fn value(&self) -> InnerType<'_> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is value a bad name for this? Another candidate for this could be inner_value. Or, we could separate out what this enum gives us to two separate methods but I think it's much more useful and idiomatic this way, despite us doing more work in some cases to clone the List's inner type for the List() even if that branch is ignored. (although maybe that's optimized away? not sure)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's not a great name.

I'm also really not sure about InnerType as a whole. When you find it difficult to come up with a good name for a concept, oftentimes it's an indicator of a confusing or muddled design.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you have a better interface in mind for this? I think that this is the most ergonomic way to expose the data that the user wants to know of is this a NameOfType (Named Type) or a ListInnerType (List). For the most part this is what we already have with the BaseType enum, so it doesn’t require too much code changes, but if you have a better design you can see, please share.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would require a lot of testing, but if you're up for it, here's a fairly compact and efficient Type implementation with no pointer chasing in the common path, and with all the internals hidden:

pub struct Type {
    base: Arc<str>,
    modifiers: Modifiers,
}

// This type is meant to be tiny: discriminant + 1 pointer/u64 union
enum Modifiers {
    Inline(InlineModifiers),
    Extended(Box<ExtendedModifiers>),
}

struct InlineModifiers {
    mask: u64,  // space for ~30 levels of list nesting
}

impl InlineModifiers {
    const NON_NULLABLE_MASK: u64 = 1;
    const LIST_MASK: u64 = 2;

    fn is_nullable(&self) -> bool {
        (self.mask & Self::NON_NULLABLE_MASK) == 0
    }

    fn is_list(&self) -> bool {
        (self.mask & Self::LIST_MASK) != 0
    }

    fn as_list(&self) -> Option<InlineModifiers> {
        self.is_list().then_some(|| {
            InlineModifiers::new(self.mask >> 2)
        })
    }
}

// In case someone really wants more than ~30 levels of nested lists in their type...
struct ExtendedModifiers {
    // ... TODO ...
}

impl Type {
    pub fn nullable(&self) -> bool {
        self.modifiers.is_nullable()
    }

    pub fn is_list(&self) -> bool {
        self.modifiers.is_list()
    }

    pub fn list_of(&self) -> Option<Self> {
        self.modifiers.as_list().map(|modifiers| {
            Self {
                base: self.base.clone(),
                modifiers,
            }
        })
    }

    pub fn base_type(&self) -> &str {  // this ensures the Arc<str> is an implementation detail
        self.base.as_ref()
    }
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@obi1kenobi I think that initial definition of Type is a good start. If we want to optimize even further, there's some options to reduce Type from 4 words. These are rather tricky to get right, and we can optimize piecemeal, so I recommend against doing any of these in this commit.

To preface, we have an abundance of free real estate for tags:

  • The most significant bit of str length because it cannot exceed isize::MAX
  • The bottom bits of pointers because we can enforce the alignment. For example:
    • 2 byte align: 1 bit
    • 4 byte align: 2 bits
    • 8 byte align: 3 bits

Given the space for tags, we have various options for optimizing Type:

  • Make Type 3 words by making Modifiers a tagged pointer, leaving us 63 bits for InlineModifiers
  • Make Type 2 words by storing ExtendedModifiers as the header for the base name str
  • Make Type 1 word by moving str length into the allocation along with the optional ExtendedModifiers header
  • Compile-time types!
    • Tag Type to indicate &'static instead of Arc
    • Allows commonly-known types to have free Clone and no heap alloc

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to know we'll be set up for future optimizations, should they prove necessary.

Let's start with the initial Type layout (which should already eliminate a lot of pointer-chasing and be faster for it), and we can optimize further in the future — which won't even be a breaking change from here onward!

@u9g do you feel like you can take this initial sketch of a design for Type and run with it from here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bonus optimization: Type could also store tiny strings inline, up to 7 bytes for 1 word Type

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'll start working with the initial design and we can optimize it later :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial implementation has landed!

trustfall_core/src/ir/types.rs Outdated Show resolved Hide resolved
trustfall_core/src/ir/types.rs Outdated Show resolved Hide resolved
trustfall_core/src/schema/mod.rs Outdated Show resolved Hide resolved
Copy link
Owner

@obi1kenobi obi1kenobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I struggled with this review, because something doesn't feel right about the new design here, but I can't put my finger on what. I think I'm getting tripped up on the naming, and failing to see the forest for the trees...

trustfall_core/src/ir/ty.rs Outdated Show resolved Hide resolved
trustfall_core/src/ir/ty.rs Outdated Show resolved Hide resolved
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps make types a directory and then move this and types.rs inside there, then re-export their contents in types itself?

Having ty.rs and types.rs side by side is very confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't they be side-by-side in the types directory too though?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At that point we could come up with some better names for the files :)

Comment on lines 24 to 25
/// Creates a new [`Type`] from a string.
/// Returns `None` if the string is not a valid GraphQL type.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this return some sort of error that describes why the type isn't valid?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if we want to write our own graphql type parser. It wouldn't be too hard I just didn't want to do it in this pr. async-graphql-parser's Type::new() returns Option<Type> so we can't get better error messages yet.

trustfall_core/src/ir/ty.rs Outdated Show resolved Hide resolved
trustfall_core/src/ir/ty.rs Outdated Show resolved Hide resolved
trustfall_core/src/ir/ty.rs Outdated Show resolved Hide resolved
trustfall_core/src/ir/ty.rs Outdated Show resolved Hide resolved
trustfall_core/src/ir/ty.rs Outdated Show resolved Hide resolved
/// let inner_ty_for_assert = Type::new("Int!").unwrap();
/// assert!(matches!(list_ty.value(), InnerType::ListInnerType(inner_ty) if inner_ty == inner_ty_for_assert));
/// ```
pub fn value(&self) -> InnerType<'_> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's not a great name.

I'm also really not sure about InnerType as a whole. When you find it difficult to come up with a good name for a concept, oftentimes it's an indicator of a confusing or muddled design.

Copy link
Owner

@obi1kenobi obi1kenobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a step in the right direction, but there's lot of complex code with very insufficient testing, and at least a few serious bugs have snuck in as a result.

trustfall_core/src/ir/mod.rs Show resolved Hide resolved
trustfall_core/src/ir/mod.rs Outdated Show resolved Hide resolved
trustfall_core/src/ir/mod.rs Outdated Show resolved Hide resolved
trustfall_core/src/ir/ty.rs Outdated Show resolved Hide resolved
trustfall_core/src/ir/ty.rs Outdated Show resolved Hide resolved
trustfall_core/src/ir/ty.rs Outdated Show resolved Hide resolved
trustfall_core/src/ir/ty.rs Outdated Show resolved Hide resolved
trustfall_core/src/ir/ty.rs Outdated Show resolved Hide resolved
trustfall_core/src/ir/types.rs Outdated Show resolved Hide resolved
trustfall_core/src/ir/types.rs Outdated Show resolved Hide resolved
@obi1kenobi obi1kenobi added A-ir Area: compiler intermediate representation L-rust Language: affects use cases in the Rust programming language R-breaking-change Release: implementing or merging this will introduce a breaking change. R-relnotes Release: document this in the release notes of the next release labels Oct 10, 2023
@obi1kenobi obi1kenobi enabled auto-merge (squash) October 10, 2023 02:26
@obi1kenobi obi1kenobi merged commit 185deb3 into obi1kenobi:main Oct 10, 2023
12 checks passed
@u9g u9g deleted the stop-reexporting-gql-type-called-type branch October 10, 2023 03:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ir Area: compiler intermediate representation L-rust Language: affects use cases in the Rust programming language R-breaking-change Release: implementing or merging this will introduce a breaking change. R-relnotes Release: document this in the release notes of the next release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants