Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propose design for packed conditional fields #192

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jasongraffius
Copy link
Collaborator

See #191

@jasongraffius jasongraffius force-pushed the packed-conditional-fields branch from 91d2737 to 1ec8851 Compare September 24, 2024 19:41
- Otherwise:
- Copy `$present(f) ? next : previous` as an expression skeleton
- Replace
- `f` with a `FieldRerference` to `field`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Rerference => Reference

Comment on lines +166 to +168
The purpose of using a stack for `lookback` is twofold, firstly it ensures
correct operation with chained conditional blocks one-after-another, such as in
[Example 4](#example-4-consecutive-conditional-blocks), as well as future
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lookback stack and some of the condition checking are optimizations that, I think, make this section harder to understand.

I think the new (unoptimized) implementation of $next would just be:

  1. For each structure:
    1. Initialize next to 0.
    2. For each physical field:
      1. Replace $next in appropriate contexts with the current value of next.
      2. Update the definition of next to field_condition ? field_start + field_size : (previous_next).

It may even be better to use the unoptimized version in the desugaring stage, then let a later optimization pass strip out the unreachable branches -- I have some related code that I was working on recently that is intended to optimize the $size_in_bytes expression, which has many of the same problems.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair, you're right that is an optimization now that I think about it, and the next value can be propagated forward through non-present fields. I'll update this to make it simpler.

I still think there would be value to doing the optimization in the actual code though, I know that every C++ compiler out there would optimize away chained conditionals with constant values or repeated conditions (with only non-volatile and non-aliased components), but it wouldn't be optimized for something like a potential future Python backend, for instance, which would make accessing the nth field of a struct an O(n) operation, as it would process the conditionals for all n-1 fields before it to determine the start location, even if said location is statically known.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's probably best for some of the optimization to happen inside Emboss. I'm just thinking it would be better in a separate optimization stage, instead of trying to implement it in the desugaring stage -- separation of concerns, and it would let any similar constructs get optimized.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah got it, as in a separate optimization pass that resolves conditionals, then it's more general and applies to other designating like $size_in_bytes too?

You're right I like that idea a lot. I'll consider it out of scope of this proposal since it won't matter for any decent C++ compiler and we don't currently support any other backend, but I see now that we don't need to preemptively optimize in the desugaring step.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. The two specific optimizations here are:

  • If a CHOICE has a constant condition, replace it with the appropriate branch; i.e. true ? y : z => y and false ? : y : z => z.
  • For any expression x ? y : (x ? z : q), replace the inner (x ? z : q) with q. There is a larger possible optimization around propagating constraints into choice operator branches, but this would cover a lot of cases and would be fairly easy to implement.

I'll see about getting the prototype optimizer I hacked together onto GitHub for reference.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually now that I took a second to look through it I think the former is already handled by constant_value in ir_utils.

Copy link
Collaborator

@reventlov reventlov Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only partially handled by constant_value -- that works for true ? constant : variable and false ? variable : constant, but not for true ? variable : variable2 or false ? variable : variable2, so it would cover:

struct Foo:
  0     [+4]  UInt  x
  $next [+4]  UInt y
  $next [+4]  UInt z

But not:

struct Foo:
  0     [+4]    UInt      x
  $next [+4*x]  UInt:32[] y
  $next [+4]    UInt      z

[Example 1](#example-1-conditional-structure-with-next) that are covered in
this section.

### Backwards Compatibility
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two major thoughts on this:

  1. There probably are not any (correct) uses of $next after a conditional field in the wild (I did not turn up any in an internal audit).
  2. In principle, a breaking change should break noisily. In this case, that means turning any use of $next immediately after a conditional field into an error, and either using a new keyword ($pack is fine) or having users opt into the new behavior ([next_keyword = "Pack"]). Compilation errors are annoying, but not nearly as problematic as silently changing behavior.

@reventlov
Copy link
Collaborator

#206 has a real-world case of someone getting tripped up by this.

@reventlov
Copy link
Collaborator

@jasongraffius have you had any time to work on this? I think the design sketch is very close to being ready to commit, even if the actual implementation will take some time.

@jasongraffius
Copy link
Collaborator Author

Good point, I'll update land the sketch soon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants