New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Use Abstract Interpretation to search for overridden property names #49199

Closed

staticfloat wants to merge 2 commits into master from sf/repl_tab_typeinf

Member

staticfloat commented Mar 30, 2023

This attempts to use abstract interpretation to propagate type information through farther so that we can solve tricky tab-completion queries such as:

struct Foo
end

Base.propertynames(::Foo) = (:a, :b, :c)
function Base.getproperty(::Foo, s::Symbol)
    if s === :a
        return 1
    elseif s === :b
        return 2
    else
        return s === :c
    end
end

struct Wrap
    w
end

Wrap(Foo()).<TAB>


          Use Abstract Interpretation to search for overridden property names

9d53130

This attempts to use abstract interpretation to propagate type
information through farther so that we can solve tricky tab-completion
queries such as:

```julia
struct Foo
end

Base.propertynames(::Foo) = (:a, :b, :c)
function Base.getproperty(::Foo, s::Symbol)
    if s === :a
        return 1
    elseif s === :b
        return 2
    else
        return s === :c
    end
end

struct Wrap
    w
end

Wrap(Foo()).<TAB>
```

staticfloat mentioned this pull request

Use const propagating inference in REPL tab completions #36437

Closed

Member Author

staticfloat commented Mar 30, 2023

Of course this still fails if you write something degenerate such as:

struct UnstableFoo
end

function Base.propertynames(::UnstableFoo)
    if rand() > 0.5
        return (:a, :b)
    else
        return (:a, :b, :c)
    end
end

struct Wrap
    a
end

Although it is somewhat amusing to see that u.<TAB> returns different things on each <TAB> for u = UnstableFoo().

Member

Keno commented Mar 30, 2023

Although it is somewhat amusing to see that u.<TAB> returns different things on each <TAB> for u = UnstableFoo().

That's not going through this path though, so that's fine.

Keno reviewed

View reviewed changes

stdlib/REPL/src/REPLCompletions.jl Outdated

+                      # Also, try abstract-interpreting propertynames
+                      thunk = context_module.eval(:(() -> propertynames($(ex))))
+                      code_info, rett = Core.Compiler.typeinf_code(
+                          Core.Compiler.NativeInterpreter(),

Member

Keno Mar 30, 2023

For this to be meaningfully powerful, i think you probably need an interpreter with assume_bindings_static turned on.

Member

aviatesk Mar 31, 2023

We should separate the cache for this inference that is much aggressive than the ordinary inference.

Keno reviewed

View reviewed changes

stdlib/REPL/src/REPLCompletions.jl Outdated

+                          true,
+                      )
+                      if rett <: Tuple && hasproperty(rett, :parameters) && all(rett.parameters .<: Symbol)
+                          for field in code_info.code[end].val

Member

Keno Mar 30, 2023

This assumes that the code constant folds, which is not a valid assumption. You need to get the non-wiedened Const lattice element out of inference and look at that instead.

topolarity reviewed

View reviewed changes

stdlib/REPL/src/REPLCompletions.jl

                               end
                           end
                       end
+                      # Also, try abstract-interpreting propertynames

Member

topolarity Mar 30, 2023

Does this lead to duplicate suggestions for, e.g., concrete types?

Member

aviatesk Mar 31, 2023

We should probably do this inference within get_type above.

oscardssmith added the REPL label


          Fix issues

51e8376

aviatesk reviewed

View reviewed changes

stdlib/REPL/src/REPLCompletions.jl

                               end
                           end
                       end
+                      # Also, try abstract-interpreting propertynames

Member

aviatesk Mar 31, 2023

We should probably do this inference within get_type above.

stdlib/REPL/src/REPLCompletions.jl

Comment on lines +219 to +227

+                      thunk = context_module.eval(:(() -> propertynames($(ex))))
+                      inf_params = Core.Compiler.InferenceParams(; assume_bindings_static=true)
+                      frame = Core.Compiler.typeinf_frame(
+                          Core.Compiler.NativeInterpreter(;inf_params),
+                          first(methods(thunk)),
+                          Tuple{typeof(thunk)},
+                          Core.svec(),
+                          true,
+                      )

Member

aviatesk Mar 31, 2023

There are some ways available to perform inference on a toplevel expression directly rather than wrapping it in a temporary function and performing inference on it.
This piece of code from JET.jl might be useful:
https://github.com/aviatesk/JET.jl/blob/9f8add4ce3d8ba528249ec02023711759abb00d3/src/toplevel/virtualprocess.jl#L1395-L1415
(ignore the abstract_global_symbols! stuff)

Member

oscardssmith commented Mar 31, 2023

one improvement to this is that ideally we would do concrete evaluation through non-consistent code. otherwise a[1].<tab> won't work

aviatesk added a commit that referenced this pull request


          REPLCompletions: replace get_type by the proper inference

0da05f4

This PR generalizes the idea from #49199 and uses inference to analyze
the types of REPL expression. This approach offers several advantages
over the current `get_[value|type]`-based implementation:
- The need for various special cases is eliminated, as lowering normalizes
  expressions, and inference handles all language features.
- Constant propagation allows us to obtain accurate completions for complex
  expressions safely (see #36437).

Analysis on arbitrary REPL expressions can be done by the following steps:
- Lower a given expression
- Form a top-level `MethodInstance` from the lowered expression
- Run inference on the top-level `MethodInstance`

This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that:
- aggressively resolve global bindings to enable reasonable completions
  for lines like `Mod.a.|` (where `|` is the cursor position)
- does not optimize the inferred code, as `REPLInterpreter` is only used
  to obtain the type or constant information of given expressions

In particular, aggressive binding resolution presents challenges for
`REPLInterpreter`'s cache validation (since #40399 hasn't been resolved
yet). To avoid cache validation issue, `REPLInterpreter` only allows
aggressive binding resolution for top-level frame representing REPL
input code (`repl_frame`) and for child `getproperty` frames that are
constant propagated from the `repl_frame`. This works, since
1.) these frames are never cached, and
2.) their results are only observed by the non-cached `repl_frame`

Note that the code cache for `REPLInterpreter` is separated from the
native code cache, ensuring that code caches produced by `REPLInterpreter`,
where bindings are aggressively resolved and the code is not really
optimized, do not affect the native code execution. A hack has
also been added to avoid serializing `CodeInstances`s produced by
`REPLInterpreter` during precompilation to workaround #48453.

closes #36437
replaces #49199

aviatesk mentioned this pull request

REPLCompletions: replace get_type by the proper inference #49206

Merged

Member

aviatesk commented Mar 31, 2023

I made an alternative PR at #49206.

Member Author

staticfloat commented Mar 31, 2023

Closing in favor of #49206

staticfloat closed this

vchuravy deleted the sf/repl_tab_typeinf branch

March 31, 2023 18:32

aviatesk added a commit that referenced this pull request


          REPLCompletions: replace get_type by the proper inference

e2e238b

This PR generalizes the idea from #49199 and uses inference to analyze
the types of REPL expression. This approach offers several advantages
over the current `get_[value|type]`-based implementation:
- The need for various special cases is eliminated, as lowering normalizes
  expressions, and inference handles all language features.
- Constant propagation allows us to obtain accurate completions for complex
  expressions safely (see #36437).

Analysis on arbitrary REPL expressions can be done by the following steps:
- Lower a given expression
- Form a top-level `MethodInstance` from the lowered expression
- Run inference on the top-level `MethodInstance`

This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that:
- aggressively resolve global bindings to enable reasonable completions
  for lines like `Mod.a.|` (where `|` is the cursor position)
- aggressively concrete evaluates `:inconsistent` calls to provide
  reasonable completions for cases like `Ref(Some(42))[].|`
- does not optimize the inferred code, as `REPLInterpreter` is only used
  to obtain the type or constant information of given expressions

Aggressive binding resolution presents challenges for `REPLInterpreter`'s
cache validation (since #40399 hasn't been resolved yet). To avoid cache
validation issue, `REPLInterpreter` only allows aggressive binding
resolution for top-level frame representing REPL input code
(`repl_frame`) and for child `getproperty` frames that are
constant propagated from the `repl_frame`. This works, since
1.) these frames are never cached, and
2.) their results are only observed by the non-cached `repl_frame`

`REPLInterpreter` also aggressively concrete evaluate `:inconsistent`
calls within `repl_frame`, allowing it to get get accurate type
information about complex expressions that otherwise can not be constant
folded, in a safe way, i.e. it still doesn't evaluate effectful
expressions like `pop!(xs)`. Similarly to the aggressive binding
resolution, aggressive concrete evaluation doesn't present any cache
validation issues because `repl_frame` is never cached.

Also note that the code cache for `REPLInterpreter` is separated from the
native code cache, ensuring that code caches produced by `REPLInterpreter`,
where bindings are aggressively resolved and the code is not really
optimized, do not affect the native code execution. A hack has
also been added to avoid serializing `CodeInstances`s produced by
`REPLInterpreter` during precompilation to workaround #48453.

closes #36437
replaces #49199

aviatesk added a commit that referenced this pull request


          REPLCompletions: replace get_type by the proper inference

d4240bf

This PR generalizes the idea from #49199 and uses inference to analyze
the types of REPL expression. This approach offers several advantages
over the current `get_[value|type]`-based implementation:
- The need for various special cases is eliminated, as lowering normalizes
  expressions, and inference handles all language features.
- Constant propagation allows us to obtain accurate completions for complex
  expressions safely (see #36437).

Analysis on arbitrary REPL expressions can be done by the following steps:
- Lower a given expression
- Form a top-level `MethodInstance` from the lowered expression
- Run inference on the top-level `MethodInstance`

This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that:
- aggressively resolve global bindings to enable reasonable completions
  for lines like `Mod.a.|` (where `|` is the cursor position)
- aggressively concrete evaluates `:inconsistent` calls to provide
  reasonable completions for cases like `Ref(Some(42))[].|`
- does not optimize the inferred code, as `REPLInterpreter` is only used
  to obtain the type or constant information of given expressions

Aggressive binding resolution presents challenges for `REPLInterpreter`'s
cache validation (since #40399 hasn't been resolved yet). To avoid cache
validation issue, `REPLInterpreter` only allows aggressive binding
resolution for top-level frame representing REPL input code
(`repl_frame`) and for child `getproperty` frames that are
constant propagated from the `repl_frame`. This works, since
1.) these frames are never cached, and
2.) their results are only observed by the non-cached `repl_frame`

`REPLInterpreter` also aggressively concrete evaluate `:inconsistent`
calls within `repl_frame`, allowing it to get get accurate type
information about complex expressions that otherwise can not be constant
folded, in a safe way, i.e. it still doesn't evaluate effectful
expressions like `pop!(xs)`. Similarly to the aggressive binding
resolution, aggressive concrete evaluation doesn't present any cache
validation issues because `repl_frame` is never cached.

Also note that the code cache for `REPLInterpreter` is separated from the
native code cache, ensuring that code caches produced by `REPLInterpreter`,
where bindings are aggressively resolved and the code is not really
optimized, do not affect the native code execution. A hack has
also been added to avoid serializing `CodeInstances`s produced by
`REPLInterpreter` during precompilation to workaround #48453.

closes #36437
replaces #49199

aviatesk added a commit that referenced this pull request


          REPLCompletions: replace get_type by the proper inference

e2932cf

This PR generalizes the idea from #49199 and uses inference to analyze
the types of REPL expression. This approach offers several advantages
over the current `get_[value|type]`-based implementation:
- The need for various special cases is eliminated, as lowering normalizes
  expressions, and inference handles all language features.
- Constant propagation allows us to obtain accurate completions for complex
  expressions safely (see #36437).

Analysis on arbitrary REPL expressions can be done by the following steps:
- Lower a given expression
- Form a top-level `MethodInstance` from the lowered expression
- Run inference on the top-level `MethodInstance`

This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that:
- aggressively resolve global bindings to enable reasonable completions
  for lines like `Mod.a.|` (where `|` is the cursor position)
- aggressively concrete evaluates `:inconsistent` calls to provide
  reasonable completions for cases like `Ref(Some(42))[].|`
- does not optimize the inferred code, as `REPLInterpreter` is only used
  to obtain the type or constant information of given expressions

Aggressive binding resolution presents challenges for `REPLInterpreter`'s
cache validation (since #40399 hasn't been resolved yet). To avoid cache
validation issue, `REPLInterpreter` only allows aggressive binding
resolution for top-level frame representing REPL input code
(`repl_frame`) and for child `getproperty` frames that are
constant propagated from the `repl_frame`. This works, since
1.) these frames are never cached, and
2.) their results are only observed by the non-cached `repl_frame`

`REPLInterpreter` also aggressively concrete evaluate `:inconsistent`
calls within `repl_frame`, allowing it to get get accurate type
information about complex expressions that otherwise can not be constant
folded, in a safe way, i.e. it still doesn't evaluate effectful
expressions like `pop!(xs)`. Similarly to the aggressive binding
resolution, aggressive concrete evaluation doesn't present any cache
validation issues because `repl_frame` is never cached.

Also note that the code cache for `REPLInterpreter` is separated from the
native code cache, ensuring that code caches produced by `REPLInterpreter`,
where bindings are aggressively resolved and the code is not really
optimized, do not affect the native code execution. A hack has
also been added to avoid serializing `CodeInstances`s produced by
`REPLInterpreter` during precompilation to workaround #48453.

closes #36437
replaces #49199

oscardssmith pushed a commit that referenced this pull request


          REPLCompletions: replace get_type by the proper inference (#49206)

98988d8

This PR generalizes the idea from #49199 and uses inference to analyze
the types of REPL expression. This approach offers several advantages
over the current `get_[value|type]`-based implementation:
- The need for various special cases is eliminated, as lowering normalizes
  expressions, and inference handles all language features.
- Constant propagation allows us to obtain accurate completions for complex
  expressions safely (see #36437).

Analysis on arbitrary REPL expressions can be done by the following steps:
- Lower a given expression
- Form a top-level `MethodInstance` from the lowered expression
- Run inference on the top-level `MethodInstance`

This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that:
- aggressively resolve global bindings to enable reasonable completions
  for lines like `Mod.a.|` (where `|` is the cursor position)
- aggressively concrete evaluates `:inconsistent` calls to provide
  reasonable completions for cases like `Ref(Some(42))[].|`
- does not optimize the inferred code, as `REPLInterpreter` is only used
  to obtain the type or constant information of given expressions

Aggressive binding resolution presents challenges for `REPLInterpreter`'s
cache validation (since #40399 hasn't been resolved yet). To avoid cache
validation issue, `REPLInterpreter` only allows aggressive binding
resolution for top-level frame representing REPL input code
(`repl_frame`) and for child `getproperty` frames that are
constant propagated from the `repl_frame`. This works, since
1.) these frames are never cached, and
2.) their results are only observed by the non-cached `repl_frame`

`REPLInterpreter` also aggressively concrete evaluate `:inconsistent`
calls within `repl_frame`, allowing it to get get accurate type
information about complex expressions that otherwise can not be constant
folded, in a safe way, i.e. it still doesn't evaluate effectful
expressions like `pop!(xs)`. Similarly to the aggressive binding
resolution, aggressive concrete evaluation doesn't present any cache
validation issues because `repl_frame` is never cached.

Also note that the code cache for `REPLInterpreter` is separated from the
native code cache, ensuring that code caches produced by `REPLInterpreter`,
where bindings are aggressively resolved and the code is not really
optimized, do not affect the native code execution. A hack has
also been added to avoid serializing `CodeInstances`s produced by
`REPLInterpreter` during precompilation to workaround #48453.

closes #36437
replaces #49199

Xnartharax pushed a commit to Xnartharax/julia that referenced this pull request


          REPLCompletions: replace get_type by the proper inference (JuliaLan…

ea4b22e

…g#49206)

This PR generalizes the idea from JuliaLang#49199 and uses inference to analyze
the types of REPL expression. This approach offers several advantages
over the current `get_[value|type]`-based implementation:
- The need for various special cases is eliminated, as lowering normalizes
  expressions, and inference handles all language features.
- Constant propagation allows us to obtain accurate completions for complex
  expressions safely (see JuliaLang#36437).

Analysis on arbitrary REPL expressions can be done by the following steps:
- Lower a given expression
- Form a top-level `MethodInstance` from the lowered expression
- Run inference on the top-level `MethodInstance`

This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that:
- aggressively resolve global bindings to enable reasonable completions
  for lines like `Mod.a.|` (where `|` is the cursor position)
- aggressively concrete evaluates `:inconsistent` calls to provide
  reasonable completions for cases like `Ref(Some(42))[].|`
- does not optimize the inferred code, as `REPLInterpreter` is only used
  to obtain the type or constant information of given expressions

Aggressive binding resolution presents challenges for `REPLInterpreter`'s
cache validation (since JuliaLang#40399 hasn't been resolved yet). To avoid cache
validation issue, `REPLInterpreter` only allows aggressive binding
resolution for top-level frame representing REPL input code
(`repl_frame`) and for child `getproperty` frames that are
constant propagated from the `repl_frame`. This works, since
1.) these frames are never cached, and
2.) their results are only observed by the non-cached `repl_frame`

`REPLInterpreter` also aggressively concrete evaluate `:inconsistent`
calls within `repl_frame`, allowing it to get get accurate type
information about complex expressions that otherwise can not be constant
folded, in a safe way, i.e. it still doesn't evaluate effectful
expressions like `pop!(xs)`. Similarly to the aggressive binding
resolution, aggressive concrete evaluation doesn't present any cache
validation issues because `repl_frame` is never cached.

Also note that the code cache for `REPLInterpreter` is separated from the
native code cache, ensuring that code caches produced by `REPLInterpreter`,
where bindings are aggressively resolved and the code is not really
optimized, do not affect the native code execution. A hack has
also been added to avoid serializing `CodeInstances`s produced by
`REPLInterpreter` during precompilation to workaround JuliaLang#48453.

closes JuliaLang#36437
replaces JuliaLang#49199

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

REPL