Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor IR memory management and pointers. #2449

Closed
otrho opened this issue Aug 3, 2022 · 4 comments
Closed

Refactor IR memory management and pointers. #2449

otrho opened this issue Aug 3, 2022 · 4 comments
Labels
compiler: ir IRgen and sway-ir including optimization passes enhancement New feature or request

Comments

@otrho
Copy link
Contributor

otrho commented Aug 3, 2022

As of #2363 we have explicit reference semantics for mutable args including copy-type args.

Problem

This is a bit of a stop-gap measure though as memory management in the IR is inconsistent and sometimes ambiguous.

In particular:

  • There are no explicit memory manipulation instructions like mem_cpy or mem_set.
  • Pointers are only for locals and now some args.
  • Mutability is neither explicit nor enforced.
  • There's a lack of distinction between (Sway's) copy types and reference types.
  • Arrays and structs are both aggregates but are actually handled orthogonally.

There's probably other things I've forgotten right now.

Proposal

I think to address some of these issues we should change the following:

  • Use store and load only for integer 64-bit types and with explicit pointer src/dst.
  • Introduce memory instructions for copying larger values.
  • No such thing as Sway's implicit copy or ref types, instead pointers must be explicitly used when necessary.
  • All pointers are mutable. It's not up to the IR to worry about mutability, the front-end compiler can enforce those restrictions. No mut keyword needed.
  • Split arrays and structs. Abandon 'aggregates' as they're implemented now.
  • Use get_ptr as LLVM's GEP, update where needed in the land of explicit pointers. Use int_to_ptr and ptr_to_int instead of casting in get_ptr.
  • Introduce (or at least consider) 'opaque' pointers which are untyped. This is the approach LLVM has switched to.

I guess not all of these should be done in a single PR, but much of it will need to be done together to remain workable without intermediate hacks. This is another of those not-urgent-but-should-be-done-sooner-rather-than-later-to-avoid-pain.

@otrho otrho added enhancement New feature or request compiler: ir IRgen and sway-ir including optimization passes labels Aug 3, 2022
@otrho otrho changed the title Refactor IR memory management and pointer. Refactor IR memory management and pointers. Aug 3, 2022
@mohammadfawaz
Copy link
Contributor

Introduce (or at least consider) 'opaque' pointers which are untyped. This is the approach LLVM has switched to.

When I saw the keyword ptr in IR when I first started contributing in January, I thought you designed pointers to be opaque just like LLVM's future direction. The challenge with opaque pointers is losing some information related to size of the type so that it's harder to compute offsets (e.g. opaque pointer arguments etc.). LLVM was trying to fix that with some metadata and some other ways of carrying type information IIRC, but I haven't been keeping track of its recent progress. On the flip side, analyzing IR with opaque pointers is much easier and enables more optimization opportunities. I think we should seriously consider this option and it's clearly the better option long term (WWLLVMD?)

I also agree with the rest of the proposal. Thanks!

@otrho
Copy link
Contributor Author

otrho commented Aug 4, 2022

I haven't dug super deep into what LLVM has needed with opaque pointers, but I was (am?) making the assumption that it resembles their attitude with signedness in arithmetic. I.e., the value itself isn't signed but rather the operation. Same with pointers -- they just represent an address, but what you do with it is up to the operator. Hopefully in a statically typed language we wouldn't come across any ambiguities but there are always edge cases...

@mohammadfawaz
Copy link
Contributor

Same with pointers -- they just represent an address

Yeah exactly. LLVM analysis passes also tend to be easier to write without having to worry about needless pointer casts which would potentially improve the quality of the existing passes and enable more optimization opportunities.

Also looks like support for them is complete in both Clang and LLVM! So if we have any questions, we can always consult LLVM.

@vaivaswatha
Copy link
Contributor

It looks to me like most of the issues listed here were solved when #2819 was fixed with #4336.

Introduce (or at least consider) 'opaque' pointers which are untyped. This is the approach LLVM has switched to.

Except this. I don't see that as a priority at the moment ad we should close this issue. @IGI-111 .

@IGI-111 IGI-111 closed this as completed Jul 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler: ir IRgen and sway-ir including optimization passes enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants