Conversation

Contributor

@sharkdp sharkdp commented Jul 1, 2025

Summary

closes astral-sh/ty#738

Test Plan

Added corpus test

@sharkdp sharkdp added the ty Multi-file analysis & type inference label Jul 1, 2025
Comment on lines +5678 to +5683
if expr_ref
.as_name_expr()
.is_some_and(|name| name.is_invalid())
{
return (Place::Unbound, None);
}
Contributor Author

@sharkdp sharkdp Jul 1, 2025


This is probably not the best (and not the broadest) fix here. @mtshiba, maybe you could have a look at this if you find the time?

Contributor

@mtshiba mtshiba Jul 7, 2025


Sorry for the late review. This fix looks good!

One thing I am wondering about is our current handling of (syntactically) invalid place expressions.
The parser seems to construct the following element for [.a., for example:

Attribute(ExprAttribute {
    node_index: AtomicNodeIndex(3),
    range: 1..3,
    value: Name(ExprName { node_index: AtomicNodeIndex(4), range: 1..1, id: Name(""), ctx: Invalid }),
    attr: Identifier { id: Name("a"), range: 2..3, node_index: AtomicNodeIndex(5) },
    ctx: Load
})

This is also not a valid expression, so we could immediately return Unbound. One might expect this to cause a panic as well, but it doesn't; instead, it is recorded as a place in the UseDefMap. The inner name is invalid, but the attribute itself is in the Load context. Not harmful, but a useless record.

let (is_use, is_definition) = match (ctx, self.current_assignment()) {
    (ast::ExprContext::Store, Some(CurrentAssignment::AugAssign(_))) => {
        // For augmented assignment, the target expression is also used.
        (true, true)
    }
    (ast::ExprContext::Load, _) => (true, false),
    (ast::ExprContext::Store, _) => (false, true),
    (ast::ExprContext::Del, _) => (true, true),
    (ast::ExprContext::Invalid, _) => (false, false),
};
let place_id = self.add_place(place_expr);
if is_use {
    self.mark_place_used(place_id);
    let use_id = self.current_ast_ids().record_use(expr);
    self.current_use_def_map_mut()
        .record_use(place_id, use_id, node_key);
}
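As a minimal, self-contained sketch (hypothetical type and function names, not ruff's actual API), the context-to-(is_use, is_definition) mapping in the match above behaves like this:

```rust
// Hypothetical, simplified mirror of the match above.
// `ExprContext` and `classify` are illustrative names, not ruff's real API.
#[derive(Clone, Copy)]
enum ExprContext {
    Load,
    Store,
    Del,
    Invalid,
}

/// Returns (is_use, is_definition) for a place expression, given its
/// context and whether it is the target of an augmented assignment
/// (e.g. `x += 1`).
fn classify(ctx: ExprContext, in_aug_assign: bool) -> (bool, bool) {
    match (ctx, in_aug_assign) {
        // An augmented-assignment target is both read and written.
        (ExprContext::Store, true) => (true, true),
        (ExprContext::Load, _) => (true, false),
        (ExprContext::Store, _) => (false, true),
        (ExprContext::Del, _) => (true, true),
        // Invalid expressions count as neither a use nor a definition,
        // so no use is recorded for them.
        (ExprContext::Invalid, _) => (false, false),
    }
}

fn main() {
    assert_eq!(classify(ExprContext::Load, false), (true, false));
    assert_eq!(classify(ExprContext::Store, true), (true, true));
    assert_eq!(classify(ExprContext::Del, false), (true, true));
    assert_eq!(classify(ExprContext::Invalid, false), (false, false));
    println!("ok");
}
```

Note how an invalid context short-circuits everything downstream: because is_use is false, record_use is never called for it.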

It might be better to validate that the inner expressions are also valid when building the SemanticIndex, or to let the parser propagate the invalid context to the outer expressions.

Contributor Author


@dhruvmanila Would you mind also having a look at this, in particular the suggestion here to do part of the work in the parser?

Member


It might be better to validate that the inner expressions are also valid when building the SemanticIndex, or to let the parser propagate the invalid context to the outer expressions.

I think it makes sense that if an inner expression is invalid, then the entire attribute expression is marked as invalid by the parser. It would then be automatically excluded by the semantic index builder. We make similar changes using the helpers::set_expr_context function.

Do we only need to account for a top-level invalid expression, or for nested invalid expressions as well? For example, in .a it's the top-level value expression of the attribute expression that's invalid, but in foo(1+).bar it's the inner expression (1+) that's invalid rather than the outer expression foo(...), and that outer expression corresponds to the value field of an attribute expression. I'm assuming it's the former?

Contributor


Do we only need to account for a top-level invalid expression, or for nested invalid expressions as well? For example, in .a it's the top-level value expression of the attribute expression that's invalid, but in foo(1+).bar it's the inner expression (1+) that's invalid rather than the outer expression foo(...), and that outer expression corresponds to the value field of an attribute expression. I'm assuming it's the former?

The only expressions that should have invalid context propagated are those that are simple enough that PlaceExpr::try_from succeeds.
A complex expression like f(1+).bar is not recorded as a place in the UseDefMap, so the outer attribute f(...).bar can safely be used in the Load context.
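To illustrate the distinction with a toy AST (hypothetical types; ruff's real Expr and PlaceExpr types differ), a "simple" place expression is a name or a chain of attribute accesses bottoming out in a name, and only such chains can be invalid places:

```rust
// Toy AST sketch; ruff's real `Expr`/`PlaceExpr` types differ.
enum Expr {
    // `invalid` stands in for `ctx: Invalid` on a name node.
    Name { invalid: bool },
    Attribute { value: Box<Expr> },
    Call, // stands in for any complex sub-expression (calls, binary ops, ...)
}

/// A "simple" place expression is a name or a chain of attribute
/// accesses ending in a name (roughly what a PlaceExpr conversion
/// would accept). It is invalid iff that innermost name is.
fn is_invalid_place(expr: &Expr) -> bool {
    match expr {
        Expr::Name { invalid } => *invalid,
        Expr::Attribute { value } => is_invalid_place(value),
        // Complex expressions are never recorded as places, so an
        // invalid sub-expression inside them is irrelevant here.
        Expr::Call => false,
    }
}

fn main() {
    // `.a` parses as Attribute(value = invalid empty Name):
    // a place expression, and an invalid one.
    let dot_a = Expr::Attribute {
        value: Box::new(Expr::Name { invalid: true }),
    };
    assert!(is_invalid_place(&dot_a));

    // `f(1+).bar`: the value is a call, so the whole thing is
    // never recorded as a place to begin with.
    let call_bar = Expr::Attribute {
        value: Box::new(Expr::Call),
    };
    assert!(!is_invalid_place(&call_bar));
    println!("ok");
}
```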

Member


That seems a bit inconsistent because the parser shouldn't care whether the expression can be constructed by PlaceExpr or not. And, I don't think it really matters, right? Like, if there are any invalid expressions nested in the ExprAttribute, then the entire ExprAttribute is invalid and the parser shouldn't special case only specific expressions. But, if special casing is important, then I'd prefer to have this logic in ty and not in the parser.

Member


Is changing the context in the parser significantly simpler than doing it in ty? If not, I'd suggest doing it in ty.

Member


I think it won't be that simple a change in the parser, because we'd need to propagate this information (ExprContext::Invalid) up the AST to the ExprAttribute, which could possibly be achieved by storing it in ParsedExpr (an is_invalid: bool field) and using it in parse_attribute_expression. But, yeah, it might be simpler to do it in ty instead.

Contributor Author


Thanks everyone. If this really only causes unnecessary work, I'm not sure it's worth doing something here at all. I'll move this PR to in-review to fix the original crash. It comes up easily in the LSP/playground when editing code.

Contributor

github-actions bot commented Jul 1, 2025

mypy_primer results

Changes were detected when running on open source projects
rich (https://github.com/Textualize/rich)
- TOTAL MEMORY USAGE: ~142MB
+ TOTAL MEMORY USAGE: ~129MB

alerta (https://github.com/alerta/alerta)
- TOTAL MEMORY USAGE: ~117MB
+ TOTAL MEMORY USAGE: ~106MB

discord.py (https://github.com/Rapptz/discord.py)
-     memo fields = ~207MB
+     memo fields = ~189MB

vision (https://github.com/pytorch/vision)
-     memo fields = ~304MB
+     memo fields = ~276MB

paasta (https://github.com/yelp/paasta)
-     memo fields = ~171MB
+     memo fields = ~156MB

meson (https://github.com/mesonbuild/meson)
-     memo fields = ~334MB
+     memo fields = ~304MB

scikit-learn (https://github.com/scikit-learn/scikit-learn)
- TOTAL MEMORY USAGE: ~717MB
+ TOTAL MEMORY USAGE: ~652MB

sympy (https://github.com/sympy/sympy)
-     memo fields = ~1399MB
+     memo fields = ~1538MB

@sharkdp sharkdp marked this pull request as ready for review July 9, 2025 06:22
@sharkdp sharkdp merged commit ab3af92 into main Jul 9, 2025
37 checks passed
@sharkdp sharkdp deleted the david/fix-738 branch July 9, 2025 06:46
UnboundVariable pushed a commit to UnboundVariable/ruff that referenced this pull request Jul 10, 2025
…re_help

* 'main' of https://github.com/astral-sh/ruff: (34 commits)
  [docs] add capital one to who's using ruff (astral-sh#19248)
  [`pyupgrade`] Keyword arguments in `super` should suppress the `UP008` fix (astral-sh#19131)
  [`flake8-use-pathlib`] Add autofixes for `PTH100`, `PTH106`, `PTH107`, `PTH108`, `PTH110`, `PTH111`, `PTH112`, `PTH113`, `PTH114`, `PTH115`, `PTH117`, `PTH119`, `PTH120` (astral-sh#19213)
  [ty] Do not run `mypy_primer.yaml` when all changed files are Markdown files (astral-sh#19244)
  [`flake8-bandit`] Make example error out-of-the-box (`S412`) (astral-sh#19241)
  [`pydoclint`] Make example error out-of-the-box (`DOC501`) (astral-sh#19218)
  [ty] Add "kind" to completion suggestions
  [ty] Add type information to `all_members` API
  [ty] Expand API of `all_members` to return a struct
  [ty] Ecosystem analyzer PR comment workflow (astral-sh#19237)
  [ty] Merge `ty_macros` into `ruff_macros` (astral-sh#19229)
  [ty] Fix `ClassLiteral.into_callable` for dataclasses (astral-sh#19192)
  [ty] `dataclasses.field` support (astral-sh#19140)
  [ty] Fix panic for attribute expressions with empty value (astral-sh#19069)
  [ty] Return `CallableType` from `BoundMethodType.into_callable_type` (astral-sh#19193)
  [`flake8-bugbear`] Support non-context-manager calls in `B017` (astral-sh#19063)
  [ty] Improved diagnostic for reassignments of `Final` symbols (astral-sh#19214)
  [ty] Use full range for assignment definitions (astral-sh#19211)
  [`pylint`] Update `missing-maxsplit-arg` docs and error to suggest proper usage (`PLC0207`) (astral-sh#18949)
  [ty] Add `set -eu` to mypy-primer script (astral-sh#19212)
  ...

# Conflicts:
#	crates/ty_python_semantic/src/types/class.rs


Development

Successfully merging this pull request may close these issues.

ty panics on input [.

6 participants