@ibraheemdev
Member

Summary

As discussed in #19059.

@ibraheemdev added the ci (Related to internal CI tooling) and ty (Multi-file analysis & type inference) labels Jul 3, 2025
ExitStatus::Failure
} else {
ExitStatus::Success
if std::env::var("TY_MEMORY_REPORT").as_deref() == Ok("mypy_primer") {
Member Author

This is a hack for now; we should have a way to silence diagnostics completely (but keep the memory usage report).
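
To make the workaround concrete, here is a minimal sketch of the idea: when `TY_MEMORY_REPORT` is set to `mypy_primer`, diagnostics are dropped and only the memory report is emitted. The helpers `is_primer_memory_report` and `render_output`, and the assumption that any other value of the variable prints both diagnostics and the report, are hypothetical and only for illustration; they are not ty's actual code.

```rust
use std::env;

/// Hypothetical helper: true when the memory report was requested in the
/// mode used for mypy_primer runs.
fn is_primer_memory_report(value: Option<&str>) -> bool {
    value == Some("mypy_primer")
}

/// Hypothetical stand-in for the CLI's reporting step. Diagnostics are
/// skipped in the mypy_primer mode; the memory report is kept whenever the
/// environment variable is set at all (an assumption of this sketch).
fn render_output(
    diagnostics: &[&str],
    memory_report: &str,
    env_value: Option<&str>,
) -> Vec<String> {
    let mut lines = Vec::new();
    if !is_primer_memory_report(env_value) {
        lines.extend(diagnostics.iter().map(|d| d.to_string()));
    }
    if env_value.is_some() {
        lines.push(memory_report.to_string());
    }
    lines
}

fn main() {
    // Read the variable once and pass it down as a plain Option<&str>.
    let value = env::var("TY_MEMORY_REPORT").ok();
    let output = render_output(
        &["error: example diagnostic"],
        "total memory usage: (illustrative)",
        value.as_deref(),
    );
    for line in output {
        println!("{line}");
    }
}
```

Passing the variable's value as an argument, rather than reading it inside the reporting function, would also make it easy to swap in a real command-line flag later.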

flake8
sphinx
prefect
trio
Member Author

Somewhat arbitrarily chosen, open to suggestions.

Member

sympy might be a good addition here, I think -- we have very high memory usage on it already

// the exact numbers across runs and compute the difference, but we don't have
// the infrastructure for that currently.
- const BASE: f64 = 1.1;
+ const BASE: f64 = 1.05;
Member Author

I lowered the threshold because single-threaded runs should be completely deterministic, and the jump between buckets is currently quite high for large numbers.
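
As a rough illustration of why the base matters, the sketch below assumes memory figures are rounded down to the nearest power of `BASE` (the surrounding diff context suggests logarithmic bucketing, but the exact scheme here is an assumption). The functions `round_to_bucket` and `bucket_gap` and the 500 MB figure are hypothetical.

```rust
/// Rounds `bytes` down to the nearest power of `base`.
/// This bucketing scheme is assumed for illustration only.
fn round_to_bucket(bytes: u64, base: f64) -> u64 {
    if bytes == 0 {
        return 0;
    }
    let exponent = (bytes as f64).log(base).floor();
    base.powf(exponent) as u64
}

/// Distance between the bucket that contains `bytes` and the next bucket up.
fn bucket_gap(bytes: u64, base: f64) -> u64 {
    let exponent = (bytes as f64).log(base).floor();
    (base.powf(exponent + 1.0) - base.powf(exponent)) as u64
}

fn main() {
    let usage = 500_000_000; // illustrative figure, roughly 500 MB
    for base in [1.1, 1.05] {
        println!(
            "base {base}: bucket = {} bytes, gap to next bucket = {} bytes",
            round_to_bucket(usage, base),
            bucket_gap(usage, base)
        );
    }
}
```

Under this assumption the gap between adjacent buckets is a fixed fraction of the bucket value (about 10% for 1.1, about 5% for 1.05), so it grows in absolute terms as memory usage grows.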

@ibraheemdev force-pushed the ibraheem/mypy-primer-determinism branch from a6cc106 to 81a95cc on July 3, 2025 23:55
@github-actions
Contributor

github-actions bot commented Jul 3, 2025

mypy_primer results

No ecosystem changes detected ✅

@ibraheemdev force-pushed the ibraheem/mypy-primer-determinism branch 2 times, most recently from a153f65 to eba3b35 on July 4, 2025 00:14
@ibraheemdev
Member Author

ibraheemdev commented Jul 4, 2025

We could also just run the memory usage report in the same job? If we choose smaller projects, that shouldn't add significant time, and we would avoid using another Depot runner and duplicating the setup work. We could also avoid using mypy-primer for that run and get exact numbers.

@ibraheemdev force-pushed the ibraheem/mypy-primer-determinism branch from eba3b35 to bf4df9e on July 4, 2025 00:27
@ibraheemdev marked this pull request as ready for review July 4, 2025 00:32
Member

@AlexWaygood left a comment

thank you!

@ibraheemdev force-pushed the ibraheem/mypy-primer-determinism branch 7 times, most recently from 950612b to a941e5e on July 4, 2025 21:14
@github-actions
Contributor

github-actions bot commented Jul 4, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@ibraheemdev force-pushed the ibraheem/mypy-primer-determinism branch from a941e5e to 6f85184 on July 4, 2025 21:26
Member

@AlexWaygood left a comment

Nice!

@MichaReiser
Member

I'm a bit surprised that the memory usage results are non-deterministic when running multi-threaded (to such large degrees). The only two reasons I can think of are:

  • AST GC
  • Intermediate results from fixpoint cycles, with the non-determinism coming from entering the cycles from different entry queries.

Do we know which of the two is causing the flakes?

tracing::warn!("No python files found under the given path(s)");
}

// TODO: We should have an official flag to silence workspace diagnostics.
Member

At first, I didn't understand what feature is needed to replace this environment variable usage, but I think I understand now. What we need is a --quiet or similar flag. Would you mind opening an issue in the ty repository for adding a flag to suppress diagnostic printing (and mention Ruff's --quiet and --silent flags)?

@ibraheemdev merged commit cd84898 into main Jul 7, 2025
37 checks passed
@ibraheemdev deleted the ibraheem/mypy-primer-determinism branch July 7, 2025 16:17
@ibraheemdev
Member Author

> I'm a bit surprised that the memory usage results are non-deterministic when running multi-threaded (to such large degrees)

This is partly due to our rounding strategy being affected by very small changes if we get unlucky with the numbers. Now that we have a separate job, we could just get the exact numbers manually. That said, on my machine I was seeing variance of ~10 MB on projects with total memory usage in the hundreds of MB. It's possible that AST garbage collection is the culprit.
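
The "unlucky" case can be sketched as follows, again assuming values are rounded down to the nearest power of `BASE` (an assumption, not necessarily the exact implementation). Two measurements that differ by about 0.2% but sit on opposite sides of a bucket boundary are reported as a jump of a full bucket. The names `round_down_to_power` and `relative_change` and the chosen exponent are hypothetical.

```rust
const BASE: f64 = 1.05;

/// Rounds `bytes` down to the nearest power of BASE (assumed scheme).
fn round_down_to_power(bytes: f64) -> f64 {
    BASE.powf(bytes.log(BASE).floor())
}

/// Relative change from `old` to `new`, as a fraction of `old`.
fn relative_change(old: f64, new: f64) -> f64 {
    (new - old) / old
}

fn main() {
    // Pick a bucket boundary and two measurements just either side of it.
    let boundary = BASE.powi(400);
    let before = boundary * 0.999;
    let after = boundary * 1.001;

    let exact = relative_change(before, after);
    let rounded = relative_change(round_down_to_power(before), round_down_to_power(after));

    println!("exact change:   {:.2}%", exact * 100.0);
    println!("rounded change: {:.2}%", rounded * 100.0);
}
```

Comparing exact byte counts between runs avoids this amplification, at the cost of also surfacing any genuine run-to-run variance.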

AlexWaygood pushed a commit that referenced this pull request Jul 7, 2025
UnboundVariable pushed a commit to UnboundVariable/ruff that referenced this pull request Jul 7, 2025
…c_tokens

* 'main' of https://github.com/astral-sh/ruff: (27 commits)
  [ty] First cut at semantic token provider (astral-sh#19108)
  [`flake8-simplify`] Make example error out-of-the-box (`SIM116`) (astral-sh#19111)
  [`flake8-use-pathlib`] Make example error out-of-the-box (`PTH210`) (astral-sh#19189)
  [`flake8-use-pathlib`] Add autofixes for `PTH203`, `PTH204`, `PTH205` (astral-sh#18922)
  [`flake8-type-checking`] Fix syntax error introduced by fix (`TC008`) (astral-sh#19150)
  [`flake8-pyi`] Make example error out-of-the-box (`PYI007`, `PYI008`) (astral-sh#19103)
  Update Rust crate indicatif to 0.18.0 (astral-sh#19165)
  [ty] Add separate CI job for memory usage stats (astral-sh#19134)
  [ty] Add documentation for server traits (astral-sh#19137)
  Rename to `SessionSnapshot`, move unwind assertion closer (astral-sh#19177)
  [`flake8-type-checking`] Make example error out-of-the-box (`TC001`) (astral-sh#19151)
  [ty] Bare `ClassVar` annotations (astral-sh#15768)
  [ty] Re-enable multithreaded pydantic benchmark (astral-sh#19176)
  [ty] Implement equivalence for protocols with method members (astral-sh#18659)
  [ty] Use RHS inferred type for bare `Final` symbols (astral-sh#19142)
  [ty] Support declaration-only attributes (astral-sh#19048)
  [ty] Sync vendored typeshed stubs (astral-sh#19174)
  Update dependency pyodide to ^0.28.0 (astral-sh#19164)
  Update NPM Development dependencies (astral-sh#19170)
  Update taiki-e/install-action action to v2.56.7 (astral-sh#19169)
  ...
Labels

ci (Related to internal CI tooling)
ty (Multi-file analysis & type inference)

4 participants