@ChayimFriedman2
Contributor

Via the observation that:

  • The IngredientIndex can be retrieved from the page, except for tracked fns
  • The generation is useless, except for interneds

And since they don't overlap, we can store only what is needed for each.

For rust-analyzer, this saves 55 MB when analyzing rust-analyzer itself and 200 MB on a large codebase (omicron).
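
The idea can be illustrated with a minimal sketch. All type and field names below are hypothetical and do not reflect salsa's actual layout; the sketch only assumes that a key needs either an ingredient index or a generation, never both, so a single word can hold whichever one applies.

```rust
use std::mem::size_of;

/// Hypothetical "before" layout: every key carries all three fields.
#[allow(dead_code)]
struct WideKey {
    slot: u32,
    generation: u32,
    ingredient: u32,
}

/// Hypothetical "after" layout: the second word holds whichever field
/// this kind of key actually needs.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct CompactKey {
    slot: u32,
    extra: u32,
}

#[allow(dead_code)]
impl CompactKey {
    /// Interned values keep the generation; the ingredient comes from the page.
    fn interned(slot: u32, generation: u32) -> Self {
        CompactKey { slot, extra: generation }
    }

    /// Tracked functions keep the ingredient and store no generation.
    fn tracked_fn(slot: u32, ingredient: u32) -> Self {
        CompactKey { slot, extra: ingredient }
    }

    /// Resolve the ingredient: use the page's record when there is one,
    /// otherwise the key itself carries it.
    fn ingredient(self, page_ingredient: Option<u32>) -> u32 {
        page_ingredient.unwrap_or(self.extra)
    }
}

fn main() {
    println!(
        "wide: {} bytes, compact: {} bytes",
        size_of::<WideKey>(),
        size_of::<CompactKey>()
    );
}
```

In this toy layout the key shrinks from three 32-bit words to two, which is the kind of per-key saving that adds up across a large dependency graph.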

@netlify

netlify bot commented Dec 25, 2025

Deploy Preview for salsa-rs canceled.

Name Link
🔨 Latest commit 635405d
🔍 Latest deploy log https://app.netlify.com/projects/salsa-rs/deploys/6950d63613d585000858acff

@ChayimFriedman2 ChayimFriedman2 marked this pull request as ready for review December 25, 2025 11:31
@codspeed-hq

codspeed-hq bot commented Dec 25, 2025

CodSpeed Performance Report

Merging #1045 will improve performance by 8.51%

Comparing ChayimFriedman2:shrink-dki (635405d) with master (309c249)

Summary

⚡ 3 improvements
✅ 10 untouched

Benchmarks breakdown

| Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|
| amortized[InternedInput] | 2.2 µs | 2.1 µs | +5.74% |
| accumulator | 3.3 ms | 3.2 ms | +5.43% |
| amortized[SupertypeInput] | 3 µs | 2.8 µs | +8.51% |

@ChayimFriedman2 ChayimFriedman2 force-pushed the shrink-dki branch 3 times, most recently from 005814e to 35a1b50 Compare December 25, 2025 11:52
@ChayimFriedman2
Contributor Author

GitHub actions out of storage...

@MichaReiser
Contributor

Nice, there are some test failures (re-running fixed the temporary out-of-disk-space error)

@ChayimFriedman2
Contributor Author

Something weird happens with CI...

@Veykril
Member

Veykril commented Dec 26, 2025

Okay, I deleted the CI caches, let's see if that helps

@Veykril
Member

Veykril commented Dec 26, 2025

Yeah, looks like llvm/ld just likes to freak out when the disk space runs out lol

@MichaReiser
Contributor

This is cool. Will try to review after the holidays. Happy holidays to all of you

@MichaReiser
Contributor

Unfortunately, this breaks our test infrastructure:

    let event = events.iter().find(|event| {
        if let salsa::EventKind::WillExecute { database_key } = event.kind {
            db.ingredient_debug_name(database_key.ingredient_index()) == query_name
        } else {
            false
        }
    });

Because calling `ingredient_index` now requires a `Zalsa` (which isn't really considered public API?)

@MichaReiser
Contributor

The perf and memory improvements on ty are very impressive. But we now have some tests that time out. Not sure why: astral-sh/ruff#22203

@Veykril
Member

Veykril commented Dec 26, 2025

> Unfortunately, this breaks our test infrastructure:

We can probably change the fields of that event and just deconstruct the key eagerly. Since the event infra is callback-based, one only pays when a callback is registered, after all.
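
A rough sketch of what eager deconstruction could look like, using hypothetical stand-in types rather than salsa's real event or key types: the event carries the already-resolved ingredient index, and the lookup runs only when a callback is registered.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical compact key that cannot resolve its ingredient on its own.
#[derive(Clone, Copy)]
struct CompactKey {
    slot: u32,
}

/// Hypothetical event whose fields are deconstructed eagerly.
#[derive(Debug, PartialEq)]
struct WillExecute {
    ingredient_index: u32,
    slot: u32,
}

struct Runtime {
    /// Maps slot -> ingredient; stands in for the page lookup.
    page_ingredients: Vec<u32>,
    callback: Option<Box<dyn Fn(WillExecute)>>,
}

impl Runtime {
    /// Returns whether an event was delivered. The key is resolved only
    /// after we know a callback exists, so the no-callback path does no lookup.
    fn emit_will_execute(&self, key: CompactKey) -> bool {
        let Some(callback) = &self.callback else {
            return false;
        };
        let ingredient_index = self.page_ingredients[key.slot as usize];
        callback(WillExecute { ingredient_index, slot: key.slot });
        true
    }
}

fn main() {
    let seen = Rc::new(RefCell::new(Vec::<WillExecute>::new()));
    let sink = Rc::clone(&seen);
    let runtime = Runtime {
        page_ingredients: vec![4, 9],
        callback: Some(Box::new(move |event: WillExecute| sink.borrow_mut().push(event))),
    };
    runtime.emit_will_execute(CompactKey { slot: 1 });
    println!("{:?}", seen.borrow());
}
```

With this shape, a test that filters events by ingredient can read the field directly from the event instead of calling back into the key.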

@Veykril
Member

Veykril commented Dec 26, 2025

Or alternatively expose a public getter that takes a &dyn Database instead

@ChayimFriedman2
Contributor Author

> Or alternatively expose a public getter that takes a &dyn Database instead

Edited to do this.

@AlexWaygood

> The perf and memory improvements on ty are very impressive. But we now have some tests that time out. Not sure why: astral-sh/ruff#22203

As well as the timeout on the corpus test, there are also some ty mdtests on that PR that are panicking with "too many cycle iterations"

@ChayimFriedman2
Contributor Author

I tested this with rust-analyzer and it worked. However, we have no fixpoint iteration currently, so there might be bugs there. I'll try to understand where the problem is.

Because we sometimes get an id from a `DatabaseKeyIndex` that doesn't have the generation, this prevents conflicts. And we don't need it either, as explained in a comment.
@ChayimFriedman2
Contributor Author

I tried to debug the issue on the ty repo. I managed to reproduce it reliably (`cargo nextest run --package ty_python_semantic --test mdtest -- "mdtest::del.md"`) and fixed one bug (see the second commit), but did not get any further, as I don't know the codebase.

@MichaReiser
Contributor

Yeah, this might require staring at logs for a very long time to see where things go wrong (I suspect fixpoint).

    /// Returns the database key index for a tracked struct with the given id.
    pub fn database_key_index(&self, id: Id) -> DatabaseKeyIndex {
    -    DatabaseKeyIndex::new(self.ingredient_index, id)
    +    DatabaseKeyIndex::new_non_interned(self.ingredient_index, id)
Member

@ibraheemdev ibraheemdev Jan 7, 2026

We use generations for tracked structs that are reused to avoid an extra read dependency, not just interned values. This might be the cause of the test failures?
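
For readers unfamiliar with the technique, here is a generic sketch of generational ids. It is not salsa's implementation, and every name in it is hypothetical. The point is only that when a slot is reused its generation is bumped, so an id issued for the previous occupant no longer matches and can be told apart without any extra bookkeeping.

```rust
/// Hypothetical generational id: a slot index plus the generation it was issued for.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct GenId {
    slot: usize,
    generation: u32,
}

struct Slot {
    generation: u32,
    value: Option<String>,
}

#[derive(Default)]
struct Table {
    slots: Vec<Slot>,
}

impl Table {
    /// Reuse the first free slot (bumping its generation), or append a new one.
    fn insert(&mut self, value: String) -> GenId {
        if let Some(slot) = self.slots.iter().position(|s| s.value.is_none()) {
            let entry = &mut self.slots[slot];
            entry.generation += 1;
            entry.value = Some(value);
            GenId { slot, generation: entry.generation }
        } else {
            self.slots.push(Slot { generation: 0, value: Some(value) });
            GenId { slot: self.slots.len() - 1, generation: 0 }
        }
    }

    /// Free the slot, but only if the id still refers to its current occupant.
    fn remove(&mut self, id: GenId) {
        if let Some(entry) = self.slots.get_mut(id.slot) {
            if entry.generation == id.generation {
                entry.value = None;
            }
        }
    }

    /// A stale id (older generation) resolves to `None` even though the slot is occupied.
    fn get(&self, id: GenId) -> Option<&str> {
        let entry = self.slots.get(id.slot)?;
        if entry.generation == id.generation {
            entry.value.as_deref()
        } else {
            None
        }
    }
}

fn main() {
    let mut table = Table::default();
    let old = table.insert("first".to_string());
    table.remove(old);
    let new = table.insert("second".to_string());
    println!("old = {:?}, new = {:?}", old, new);
    println!("old resolves to {:?}, new resolves to {:?}", table.get(old), table.get(new));
}
```

Dropping the generation from such an id makes the old and new occupants of a slot indistinguishable, which is the kind of conflict the comment above is pointing at.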

Contributor Author

If we use the generation for tracked structs, the whole idea falls apart: tracked structs would need the generation, but they also need the ingredient index, since we sometimes want to refer to one of the fields (a different ingredient) while the page is the same.

Are we sure we need the generation for tracked structs? What did we do before the introduction of generational IDs?

Contributor

It probably required a read dependency when reading an untracked field, see #864

I don't fully remember why we changed TrackedStruct as well. Was it just because we had generational ids, or was it because it was required for interned struct LRU?

I'm also inclined to encode the iteration into the Salsa id... but that's not definite yet.

Member

@ibraheemdev ibraheemdev Jan 9, 2026

Yeah, it's not strictly necessary, but it was meant to be a performance improvement (though I'm not sure it ended up being a significant one).
