Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract compilation-tracker/SG-production code out of SolutionState to a new owner. #71257

Merged
merged 234 commits into from
Dec 21, 2023

Conversation

CyrusNajmabadi
Copy link
Member

@CyrusNajmabadi CyrusNajmabadi commented Dec 14, 2023

Implements initial work around cleaning up how SG production works. As part of the effort outlined here: #67441 (comment)

Major design changes in this PR:

  1. SolutionState is broken up into two types: SolutionState and SolutionCompilationState
  2. SolutionState is now a very simple green tree storing the basic information about the solution, its projects, and the documents therein (including the texts/loaders for those documents). Only when information about those values changes would you expect to get a new SolutionState instance.
  3. SolutionCompilationState has a pointer to a SolutionState. So any time a new SolutionState is produced, a new SolutionCompilationState must be produced.
  4. SolutionCompilationState then has all the information used for producing compilations (the 'compilation trackers'), generated docs (part of the compilation tracking system) and the ability to do things like get 'frozen partial semantics' (also related to compilations).
  5. Solution now points at SolutionCompilationState, though it still exposes as a convenience the SolutionState value as many consumers still read that.

Initially, i tried to not do '3'. I wanted to keep these entities entirely independent. That way you could even update a SolutionState and not update a SolutionCompilationState. However, in the process of doing this, i ended up finding this very arduous. Lots of codepaths now needed to pass both entities around, and there was a very real concern about doing this incorrectly and ending up with inconsistencies. There are so many commits here because i went down that route, ran into enough subtle cases, and eventually felt it was not worth it.

Several nice things fell out of this PR imo. First, there were downstream impacts in that a bunch of places that were accessing Solution.State moved to accessing Solution.CompilationState. This is nice as it strongly calls out the dependency that code actually has on compilatino-info, and not just solution-structure. This also revealed that "oop checksum computation" is dependent on SG production today. Something we may want to see about resolving. Specifically, now that generators run in OOP anyways, it would be nice to just have all OOP syncing jsut be based on solution-structure.

--

Remainder items:

  • Remove Solution._state.
  • Rename Solution.State to Solution.SolutionState.
  • Audit SolutionCompilationState methods to see if they actually have to fork projects.
  • Some methods throw after seeing hte state didn't change. Can they just do a more up front data check?
  • rename files.
  • rename SolutionCompilationState Solution to COmpilationState. same with SolutionCompilationState solution
  • update ComputeNewGeneratorInfoInCurrentProcessAsync to only take solutionstate
  • try to make GetState(ProjectId/DocumentId) not allow nulls.

@CyrusNajmabadi
Copy link
Member Author

Considering adding a TODO here, even if it's going away in a follow-up.

Done.

@CyrusNajmabadi
Copy link
Member Author

Why are we requesting to fork trackers though in these situations?

So, a core goal here was to preserve semantics with the existing code. The existing code does this:

public SolutionState WithProjectOutputFilePath(ProjectId projectId, string? outputFilePath)
{
var oldProject = GetRequiredProjectState(projectId);
var newProject = oldProject.WithOutputFilePath(outputFilePath);
if (oldProject == newProject)
{
return this;
}
return ForkProject(newProject);
}

As can be seen, if the project doesn't actually change, then the same solution is returned. If it does change, we do fork the project.

--

We can always cahnge this. And i'm happy to do an audit with you on this after the PR is done. But it's def safest to keep the semnatics the same.

@CyrusNajmabadi
Copy link
Member Author

Should this just check the project reference existing more direclty rather than having the different pattern here?

Happy to do that. Marked for followup.

@CyrusNajmabadi
Copy link
Member Author

Do you have a follow-up list actually going somewhere for these?

Yup. DOing it in the first post of this PR.

@CyrusNajmabadi
Copy link
Member Author

Frankly we should just inline it...there's more lines of parameters than there are actual code

Done!

Copy link
Member

@jasonmalinowski jasonmalinowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Questions in this batch are mostly around "do you imagine this changing once we do further work?" and don't really indicate problems here, I'm just trying to mentally match our conversations with this code.

@@ -85,12 +85,13 @@ private partial class CompilationTracker : ICompilationTracker
}

private async Task<(Compilation compilationWithGeneratedFiles, CompilationTrackerGeneratorInfo generatorInfo)?> TryComputeNewGeneratorInfoInRemoteProcessAsync(
SolutionState solution,
SolutionCompilationState compilationState,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this seems a bit strange: if we're computing new generator results, we don't actually need to sync over the existing CompilationState, right? Is this just because right now we don't have a sync method that takes only the "new" SolutionState? If I recall our conversation we figured this was potentially the one "special" place that can safely ignore the generated state since that's the whole point of this method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I of course am worried if it means we have to fork the sync system to support either kind of state, unless we just can easily introduce some interface for either kind of state if all it needs is a checksum in the end or something.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just because right now we don't have a sync method that takes only the "new" SolutionState? If I recall our conversation we figured this was potentially the one "special" place that can safely ignore the generated state since that's the whole point of this method.

That's exactly right. And this is something i want to do in a followup. It's unclear though how best to layer that and use the type system properly. Especially since everyone else will not want this path.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I of course am worried if it means we have to fork the sync system to support either kind of state, unless we just can easily introduce some interface for either kind of state if all it needs is a checksum in the end or something.

Yup. I don't have a great answer yet. What i'm tentatively thinking is:

have all the existing 'Invoke' methods (which take the full compilation state). And then expose the literal: "get me the generator results" APIs which will take only the solution-state. So basically, no one else would call these. And this code here would only call that API.

@@ -70,7 +70,7 @@ public static SolutionStateChecksums Deserialize(ObjectReader reader)
}

public async Task FindAsync(
SolutionState state,
SolutionCompilationState compilationState,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point do you imagine we might have two FindAsync() methods, one that handles just the SolutionState for the run-generator case (since we're not going to sync generators over) and one that takes the SolutionCompiltionState that delegates to the former?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. That's exactly how i see it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. That's precisely what i want to do.

Comment on lines +72 to +73
public TState? GetState(DocumentId? documentId)
=> documentId != null && _map.TryGetValue(documentId, out var state) ? state : null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh seeing the place we did this for ProjectId terrified me; can we do something else instead? What needed this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i will add this to the fallout list.

@@ -100,7 +109,7 @@ void AddReferencedProjects(HashSet<ProjectId> result, ProjectId projectId)
// host to remove the reference. We do not expose this through the full Solution/Project which
// filters out this case already (in Project.ProjectReferences). However, becausde we're at the
// ProjectState level it cannot do that filtering unless examined through us (the SolutionState).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// ProjectState level it cannot do that filtering unless examined through us (the SolutionState).
// ProjectState level it cannot do that filtering unless examined through us (the SolutionCompilationState).

Although I'm not really sure the difference matters here.

Comment on lines +344 to +345
// don't fork tracker with queued action since access via partial semantics can become inconsistent (throw).
// Since changing options is rare event, it is okay to start compilation building from scratch.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is something to defer for later no matter what.

/// Effectively, everything specified in a project file. Does not contain anything related to <see
/// cref="Compilation"/>s or semantics.
/// </summary>
public SolutionState Solution { get; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a post-this-PR rename: it'd be great if this was renamed to SolutionState, since at so many of the consumption sites I was thinking this was a back-pointer to the Solution somehow.

using Microsoft.CodeAnalysis.PooledObjects;
using Microsoft.CodeAnalysis.Text;
using Roslyn.Utilities;
using ReferenceEqualityComparer = Roslyn.Utilities.ReferenceEqualityComparer;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We imported Roslyn.Utilities on the prior line, so I'm guessing this isn't necessary...

Comment on lines +224 to +225
/// <inheritdoc cref="SolutionState.AddProject(ProjectInfo)"/>
public SolutionCompilationState AddProject(ProjectInfo projectInfo)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll make a new tracker map, but the project won't have had an existing tracker and we create them on demand anyways...so why?

Oh I might have an answer: projects at the workspace layer are potentially allowed to have a dangling project reference, so I guess if B depends on A (but A isn't added), and you add B and then A, once you add A you need to go fix up B so it'll consume that.

That really needs a comment, assuming that's the reason. :-)

return ForkProject(
tuple,
translate: null,
forkTracker: false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This threw me a bit, but not beacuse of anything you did -- it seems old code was quite strange here. forkTracker = true says "keep the tracker around and fork it with the optional translation", and forkTracker = false means "just throw it away". Sure, this one place needs a throwaway (but as mentioned, this seems realy suspicious), but oddly there's a lot of places that are creating a new SolutionCompilationState for things that don't impact a compilation at all like the WithProjectName or WithProjectPath, etc and we're just going to be doing extra work finalizing those compilations.

As a tracking for a follow up, it seems either we should make this have a third state: "KeepCompilation", "ForkToUpdateCompilation", or "ThrowAwayCompilation" (the last one being unique here). Or if we can get rid of this per the comment applying a few lines earlier, then that last one goes away and we can have a better use here.

But you're 100% keeping the behavior the same -- the old code is just confusing me now. 😄

Comment on lines +380 to +388
/// <inheritdoc cref="SolutionState.WithProjectDocumentsOrder"/>
public SolutionCompilationState WithProjectDocumentsOrder(
ProjectId projectId, ImmutableList<DocumentId> documentIds)
{
return ForkProject(
this.Solution.WithProjectDocumentsOrder(projectId, documentIds),
static stateChange => new CompilationAndGeneratorDriverTranslationAction.ReplaceAllSyntaxTreesAction(stateChange.NewProjectState, isParseOptionChange: false),
forkTracker: true);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big deal but somehow this is in a different order than all the methods in SolutionState.cs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wdym "a different order"?


return this.Branch(
// TODO(cyrusn): Is it ok to just pass this.Solution along here?
this.Solution,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, same thing. As we go to updating these as generation runs asynchronously, we'll see those new "updates" landing in the same way as these: we're updating the source generated documents and compilations, but the underlying base state is the same.


return this.Branch(
// TODO(cyrusn): Is it ok to just pass this.Solution along here?
this.Solution,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also imagine this implementation changes drastically, since "here's just a set of documents that are frozen and are synced back to OOP" is a common thing rather than a special case.

oldProjectState,
newProjectState,
// intentionally accessing this.Solution here not newSolutionState
newFilePathToDocumentIdsMap: this.Solution.CreateFilePathToDocumentIdsMapWithAddedDocuments(newDocumentStatesForProject));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree you have maintained the behavior, but I strongly suspect that original behavior is a bug, and in VS we never notice since we'll only ever see all the documents applying to a single project. I'm quite surprised we didn't have a test for that, so if you don't want to investigate as a part of this PR please do file a tracking bug for this.

Comment on lines 883 to 890
var oldDocument = GetRequiredDocumentState(documentId);
if (oldDocument.Attributes.Name == name)
{
return this;
var oldProject = GetRequiredProjectState(documentId.ProjectId);
return new(this, oldProject, oldProject);
}

return UpdateDocumentState(oldDocument.UpdateName(name), contentChanged: false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We absolutely should add more helpers for this, but I fully agree with your decision to leave those for follow-ups. 😄

Copy link
Member

@jasonmalinowski jasonmalinowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks like a great start. Obviously there's some follow up renames, further helpers to add/simplify, and a few bugs to fix (that you uncovered while doing this), but I don't see any reason this isn't a huge step in the right direction. Getting this in means we can do the follow ups in parallel (or just for ease of review.)

The only "bug" I see is what you called out in the add/remove documents where they're creating a new file name -> ID map from the original state, rather than the state it's built up so far while looping for each set of documents per project. That indeed looks like a bug, so do file a bug for that (or poke me to do it.)

Comment on lines +689 to +691
// If the project didn't change itself, there's no need to change the compilation state.
var stateChange = _state.RemoveMetadataReference(projectId, metadataReference);
if (stateChange.NewSolutionState == _state)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This did jump out to me when looking at SolutionCompilationState as to why RemoveMetadataReferences took the state change unlike everything else -- it might be a bit less confusing to just have the Remove method here directly validate the reference exists, and then make everything else consistent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in followup.

@CyrusNajmabadi
Copy link
Member Author

Merging in. Going to do a followup PR immediately.

@CyrusNajmabadi CyrusNajmabadi merged commit 17a7dcc into dotnet:main Dec 21, 2023
28 checks passed
@CyrusNajmabadi CyrusNajmabadi deleted the solutionCompilationState branch December 21, 2023 18:45
@ghost ghost added this to the Next milestone Dec 21, 2023
@Cosifne Cosifne modified the milestones: Next, 17.9 P3 Jan 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants