Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Abstract Interpretation to search for overridden property names #49199

Closed
wants to merge 2 commits into from

Conversation

staticfloat
Copy link
Member

This attempts to use abstract interpretation to propagate type information through farther so that we can solve tricky tab-completion queries such as:

struct Foo
end

Base.propertynames(::Foo) = (:a, :b, :c)
function Base.getproperty(::Foo, s::Symbol)
    if s === :a
        return 1
    elseif s === :b
        return 2
    else
        return s === :c
    end
end

struct Wrap
    w
end

Wrap(Foo()).<TAB>

This attempts to use abstract interpretation to propagate type
information through farther so that we can solve tricky tab-completion
queries such as:

```julia
struct Foo
end

Base.propertynames(::Foo) = (:a, :b, :c)
function Base.getproperty(::Foo, s::Symbol)
    if s === :a
        return 1
    elseif s === :b
        return 2
    else
        return s === :c
    end
end

struct Wrap
    w
end

Wrap(Foo()).<TAB>
```
@staticfloat
Copy link
Member Author

Of course this still fails if you write something degenerate such as:

struct UnstableFoo
end

function Base.propertynames(::UnstableFoo)
    if rand() > 0.5
        return (:a, :b)
    else
        return (:a, :b, :c)
    end
end

struct Wrap
    a
end

Although it is somewhat amusing to see that u.<TAB> returns different things on each <TAB> for u = UnstableFoo().

@Keno
Copy link
Member

Keno commented Mar 30, 2023

Although it is somewhat amusing to see that u.<TAB> returns different things on each <TAB> for u = UnstableFoo().

That's not going through this path though, so that's fine.

# Also, try abstract-interpreting propertynames
thunk = context_module.eval(:(() -> propertynames($(ex))))
code_info, rett = Core.Compiler.typeinf_code(
Core.Compiler.NativeInterpreter(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this to be meaningfully powerful, i think you probably need an interpreter with assume_bindings_static turned on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should separate the cache for this inference that is much aggressive than the ordinary inference.

true,
)
if rett <: Tuple && hasproperty(rett, :parameters) && all(rett.parameters .<: Symbol)
for field in code_info.code[end].val
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes that the code constant folds, which is not a valid assumption. You need to get the non-wiedened Const lattice element out of inference and look at that instead.

@@ -213,8 +214,26 @@ function complete_symbol(sym::String, @nospecialize(ffunc), context_module::Modu
end
end
end

# Also, try abstract-interpreting propertynames
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this lead to duplicate suggestions for, e.g., concrete types?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably do this inference within get_type above.

@oscardssmith oscardssmith added the REPL Julia's REPL (Read Eval Print Loop) label Mar 30, 2023
@@ -213,8 +214,26 @@ function complete_symbol(sym::String, @nospecialize(ffunc), context_module::Modu
end
end
end

# Also, try abstract-interpreting propertynames
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably do this inference within get_type above.

Comment on lines +219 to +227
thunk = context_module.eval(:(() -> propertynames($(ex))))
inf_params = Core.Compiler.InferenceParams(; assume_bindings_static=true)
frame = Core.Compiler.typeinf_frame(
Core.Compiler.NativeInterpreter(;inf_params),
first(methods(thunk)),
Tuple{typeof(thunk)},
Core.svec(),
true,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some ways available to perform inference on a toplevel expression directly rather than wrapping it in a temporary function and performing inference on it.
This piece of code from JET.jl might be useful:
https://github.com/aviatesk/JET.jl/blob/9f8add4ce3d8ba528249ec02023711759abb00d3/src/toplevel/virtualprocess.jl#L1395-L1415
(ignore the abstract_global_symbols! stuff)

@oscardssmith
Copy link
Member

one improvement to this is that ideally we would do concrete evaluation through non-consistent code. otherwise a[1].<tab> won't work

aviatesk added a commit that referenced this pull request Mar 31, 2023
This PR generalizes the idea from #49199 and uses inference to analyze
the types of REPL expression. This approach offers several advantages
over the current `get_[value|type]`-based implementation:
- The need for various special cases is eliminated, as lowering normalizes
  expressions, and inference handles all language features.
- Constant propagation allows us to obtain accurate completions for complex
  expressions safely (see #36437).

Analysis on arbitrary REPL expressions can be done by the following steps:
- Lower a given expression
- Form a top-level `MethodInstance` from the lowered expression
- Run inference on the top-level `MethodInstance`

This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that:
- aggressively resolve global bindings to enable reasonable completions
  for lines like `Mod.a.|` (where `|` is the cursor position)
- does not optimize the inferred code, as `REPLInterpreter` is only used
  to obtain the type or constant information of given expressions

In particular, aggressive binding resolution presents challenges for
`REPLInterpreter`'s cache validation (since #40399 hasn't been resolved
yet). To avoid cache validation issue, `REPLInterpreter` only allows
aggressive binding resolution for top-level frame representing REPL
input code (`repl_frame`) and for child `getproperty` frames that are
constant propagated from the `repl_frame`. This works, since
1.) these frames are never cached, and
2.) their results are only observed by the non-cached `repl_frame`

Note that the code cache for `REPLInterpreter` is separated from the
native code cache, ensuring that code caches produced by `REPLInterpreter`,
where bindings are aggressively resolved and the code is not really
optimized, do not affect the native code execution. A hack has
also been added to avoid serializing `CodeInstances`s produced by
`REPLInterpreter` during precompilation to workaround #48453.

closes #36437
replaces #49199
@aviatesk
Copy link
Member

I made an alternative PR at #49206.

@staticfloat
Copy link
Member Author

Closing in favor of #49206

@vchuravy vchuravy deleted the sf/repl_tab_typeinf branch March 31, 2023 18:32
aviatesk added a commit that referenced this pull request Apr 1, 2023
This PR generalizes the idea from #49199 and uses inference to analyze
the types of REPL expression. This approach offers several advantages
over the current `get_[value|type]`-based implementation:
- The need for various special cases is eliminated, as lowering normalizes
  expressions, and inference handles all language features.
- Constant propagation allows us to obtain accurate completions for complex
  expressions safely (see #36437).

Analysis on arbitrary REPL expressions can be done by the following steps:
- Lower a given expression
- Form a top-level `MethodInstance` from the lowered expression
- Run inference on the top-level `MethodInstance`

This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that:
- aggressively resolve global bindings to enable reasonable completions
  for lines like `Mod.a.|` (where `|` is the cursor position)
- aggressively concrete evaluates `:inconsistent` calls to provide
  reasonable completions for cases like `Ref(Some(42))[].|`
- does not optimize the inferred code, as `REPLInterpreter` is only used
  to obtain the type or constant information of given expressions

Aggressive binding resolution presents challenges for `REPLInterpreter`'s
cache validation (since #40399 hasn't been resolved yet). To avoid cache
validation issue, `REPLInterpreter` only allows aggressive binding
resolution for top-level frame representing REPL input code
(`repl_frame`) and for child `getproperty` frames that are
constant propagated from the `repl_frame`. This works, since
1.) these frames are never cached, and
2.) their results are only observed by the non-cached `repl_frame`

`REPLInterpreter` also aggressively concrete evaluate `:inconsistent`
calls within `repl_frame`, allowing it to get get accurate type
information about complex expressions that otherwise can not be constant
folded, in a safe way, i.e. it still doesn't evaluate effectful
expressions like `pop!(xs)`. Similarly to the aggressive binding
resolution, aggressive concrete evaluation doesn't present any cache
validation issues because `repl_frame` is never cached.

Also note that the code cache for `REPLInterpreter` is separated from the
native code cache, ensuring that code caches produced by `REPLInterpreter`,
where bindings are aggressively resolved and the code is not really
optimized, do not affect the native code execution. A hack has
also been added to avoid serializing `CodeInstances`s produced by
`REPLInterpreter` during precompilation to workaround #48453.

closes #36437
replaces #49199
aviatesk added a commit that referenced this pull request Apr 2, 2023
This PR generalizes the idea from #49199 and uses inference to analyze
the types of REPL expression. This approach offers several advantages
over the current `get_[value|type]`-based implementation:
- The need for various special cases is eliminated, as lowering normalizes
  expressions, and inference handles all language features.
- Constant propagation allows us to obtain accurate completions for complex
  expressions safely (see #36437).

Analysis on arbitrary REPL expressions can be done by the following steps:
- Lower a given expression
- Form a top-level `MethodInstance` from the lowered expression
- Run inference on the top-level `MethodInstance`

This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that:
- aggressively resolve global bindings to enable reasonable completions
  for lines like `Mod.a.|` (where `|` is the cursor position)
- aggressively concrete evaluates `:inconsistent` calls to provide
  reasonable completions for cases like `Ref(Some(42))[].|`
- does not optimize the inferred code, as `REPLInterpreter` is only used
  to obtain the type or constant information of given expressions

Aggressive binding resolution presents challenges for `REPLInterpreter`'s
cache validation (since #40399 hasn't been resolved yet). To avoid cache
validation issue, `REPLInterpreter` only allows aggressive binding
resolution for top-level frame representing REPL input code
(`repl_frame`) and for child `getproperty` frames that are
constant propagated from the `repl_frame`. This works, since
1.) these frames are never cached, and
2.) their results are only observed by the non-cached `repl_frame`

`REPLInterpreter` also aggressively concrete evaluate `:inconsistent`
calls within `repl_frame`, allowing it to get get accurate type
information about complex expressions that otherwise can not be constant
folded, in a safe way, i.e. it still doesn't evaluate effectful
expressions like `pop!(xs)`. Similarly to the aggressive binding
resolution, aggressive concrete evaluation doesn't present any cache
validation issues because `repl_frame` is never cached.

Also note that the code cache for `REPLInterpreter` is separated from the
native code cache, ensuring that code caches produced by `REPLInterpreter`,
where bindings are aggressively resolved and the code is not really
optimized, do not affect the native code execution. A hack has
also been added to avoid serializing `CodeInstances`s produced by
`REPLInterpreter` during precompilation to workaround #48453.

closes #36437
replaces #49199
aviatesk added a commit that referenced this pull request Apr 2, 2023
This PR generalizes the idea from #49199 and uses inference to analyze
the types of REPL expression. This approach offers several advantages
over the current `get_[value|type]`-based implementation:
- The need for various special cases is eliminated, as lowering normalizes
  expressions, and inference handles all language features.
- Constant propagation allows us to obtain accurate completions for complex
  expressions safely (see #36437).

Analysis on arbitrary REPL expressions can be done by the following steps:
- Lower a given expression
- Form a top-level `MethodInstance` from the lowered expression
- Run inference on the top-level `MethodInstance`

This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that:
- aggressively resolve global bindings to enable reasonable completions
  for lines like `Mod.a.|` (where `|` is the cursor position)
- aggressively concrete evaluates `:inconsistent` calls to provide
  reasonable completions for cases like `Ref(Some(42))[].|`
- does not optimize the inferred code, as `REPLInterpreter` is only used
  to obtain the type or constant information of given expressions

Aggressive binding resolution presents challenges for `REPLInterpreter`'s
cache validation (since #40399 hasn't been resolved yet). To avoid cache
validation issue, `REPLInterpreter` only allows aggressive binding
resolution for top-level frame representing REPL input code
(`repl_frame`) and for child `getproperty` frames that are
constant propagated from the `repl_frame`. This works, since
1.) these frames are never cached, and
2.) their results are only observed by the non-cached `repl_frame`

`REPLInterpreter` also aggressively concrete evaluate `:inconsistent`
calls within `repl_frame`, allowing it to get get accurate type
information about complex expressions that otherwise can not be constant
folded, in a safe way, i.e. it still doesn't evaluate effectful
expressions like `pop!(xs)`. Similarly to the aggressive binding
resolution, aggressive concrete evaluation doesn't present any cache
validation issues because `repl_frame` is never cached.

Also note that the code cache for `REPLInterpreter` is separated from the
native code cache, ensuring that code caches produced by `REPLInterpreter`,
where bindings are aggressively resolved and the code is not really
optimized, do not affect the native code execution. A hack has
also been added to avoid serializing `CodeInstances`s produced by
`REPLInterpreter` during precompilation to workaround #48453.

closes #36437
replaces #49199
oscardssmith pushed a commit that referenced this pull request Apr 3, 2023
This PR generalizes the idea from #49199 and uses inference to analyze
the types of REPL expression. This approach offers several advantages
over the current `get_[value|type]`-based implementation:
- The need for various special cases is eliminated, as lowering normalizes
  expressions, and inference handles all language features.
- Constant propagation allows us to obtain accurate completions for complex
  expressions safely (see #36437).

Analysis on arbitrary REPL expressions can be done by the following steps:
- Lower a given expression
- Form a top-level `MethodInstance` from the lowered expression
- Run inference on the top-level `MethodInstance`

This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that:
- aggressively resolve global bindings to enable reasonable completions
  for lines like `Mod.a.|` (where `|` is the cursor position)
- aggressively concrete evaluates `:inconsistent` calls to provide
  reasonable completions for cases like `Ref(Some(42))[].|`
- does not optimize the inferred code, as `REPLInterpreter` is only used
  to obtain the type or constant information of given expressions

Aggressive binding resolution presents challenges for `REPLInterpreter`'s
cache validation (since #40399 hasn't been resolved yet). To avoid cache
validation issue, `REPLInterpreter` only allows aggressive binding
resolution for top-level frame representing REPL input code
(`repl_frame`) and for child `getproperty` frames that are
constant propagated from the `repl_frame`. This works, since
1.) these frames are never cached, and
2.) their results are only observed by the non-cached `repl_frame`

`REPLInterpreter` also aggressively concrete evaluate `:inconsistent`
calls within `repl_frame`, allowing it to get get accurate type
information about complex expressions that otherwise can not be constant
folded, in a safe way, i.e. it still doesn't evaluate effectful
expressions like `pop!(xs)`. Similarly to the aggressive binding
resolution, aggressive concrete evaluation doesn't present any cache
validation issues because `repl_frame` is never cached.

Also note that the code cache for `REPLInterpreter` is separated from the
native code cache, ensuring that code caches produced by `REPLInterpreter`,
where bindings are aggressively resolved and the code is not really
optimized, do not affect the native code execution. A hack has
also been added to avoid serializing `CodeInstances`s produced by
`REPLInterpreter` during precompilation to workaround #48453.

closes #36437
replaces #49199
Xnartharax pushed a commit to Xnartharax/julia that referenced this pull request Apr 19, 2023
…g#49206)

This PR generalizes the idea from JuliaLang#49199 and uses inference to analyze
the types of REPL expression. This approach offers several advantages
over the current `get_[value|type]`-based implementation:
- The need for various special cases is eliminated, as lowering normalizes
  expressions, and inference handles all language features.
- Constant propagation allows us to obtain accurate completions for complex
  expressions safely (see JuliaLang#36437).

Analysis on arbitrary REPL expressions can be done by the following steps:
- Lower a given expression
- Form a top-level `MethodInstance` from the lowered expression
- Run inference on the top-level `MethodInstance`

This PR implements `REPLInterpreter`, a custom `AbstractInterpreter` that:
- aggressively resolve global bindings to enable reasonable completions
  for lines like `Mod.a.|` (where `|` is the cursor position)
- aggressively concrete evaluates `:inconsistent` calls to provide
  reasonable completions for cases like `Ref(Some(42))[].|`
- does not optimize the inferred code, as `REPLInterpreter` is only used
  to obtain the type or constant information of given expressions

Aggressive binding resolution presents challenges for `REPLInterpreter`'s
cache validation (since JuliaLang#40399 hasn't been resolved yet). To avoid cache
validation issue, `REPLInterpreter` only allows aggressive binding
resolution for top-level frame representing REPL input code
(`repl_frame`) and for child `getproperty` frames that are
constant propagated from the `repl_frame`. This works, since
1.) these frames are never cached, and
2.) their results are only observed by the non-cached `repl_frame`

`REPLInterpreter` also aggressively concrete evaluate `:inconsistent`
calls within `repl_frame`, allowing it to get get accurate type
information about complex expressions that otherwise can not be constant
folded, in a safe way, i.e. it still doesn't evaluate effectful
expressions like `pop!(xs)`. Similarly to the aggressive binding
resolution, aggressive concrete evaluation doesn't present any cache
validation issues because `repl_frame` is never cached.

Also note that the code cache for `REPLInterpreter` is separated from the
native code cache, ensuring that code caches produced by `REPLInterpreter`,
where bindings are aggressively resolved and the code is not really
optimized, do not affect the native code execution. A hack has
also been added to avoid serializing `CodeInstances`s produced by
`REPLInterpreter` during precompilation to workaround JuliaLang#48453.

closes JuliaLang#36437
replaces JuliaLang#49199
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
REPL Julia's REPL (Read Eval Print Loop)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants