Use the compiler encoding for baseline source files recovered from file system #81933
Merged
tmat merged 4 commits into dotnet:main on Jan 9, 2026
@DustinCampbell ptal
noiseonwires reviewed Jan 9, 2026
DustinCampbell approved these changes Jan 9, 2026
Hot Reload compares the current document snapshot with a baseline document snapshot to find out what the semantic difference is between them.
The baseline solution is captured when the debugging session starts, but the document content at that point doesn't necessarily match the source code that the compiler used to compile the baseline assembly. We compare the checksum of the binary content of the document against the checksum that the compiler stored in the PDB. If the baseline document checksum doesn't match, we read the source file from disk, in case it hasn't been overwritten yet and still contains the content used by the compiler for the baseline compilation. If the checksum matches the one in the PDB, we know that the decoded text of the document can be used as a baseline for change detection.
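The checksum check described above can be sketched roughly as follows. This is a minimal illustration in Python, not the Roslyn implementation: the names `checksum` and `pick_baseline_bytes` are hypothetical, and SHA-256 is assumed here only for the example.

```python
import hashlib
from typing import Callable, Optional


def checksum(data: bytes) -> bytes:
    # Assumption: SHA-256 is used purely for illustration.
    return hashlib.sha256(data).digest()


def pick_baseline_bytes(
    document_bytes: bytes,
    pdb_checksum: bytes,
    read_file: Callable[[], Optional[bytes]],
) -> Optional[bytes]:
    """Return the bytes that match what the compiler saw, or None.

    `read_file` is a hypothetical callback that returns the current
    content of the source file on disk, or None if it cannot be read.
    """
    # 1. The captured document already matches the compiled source.
    if checksum(document_bytes) == pdb_checksum:
        return document_bytes

    # 2. Otherwise try the file on disk; it may not have been overwritten yet.
    disk_bytes = read_file()
    if disk_bytes is not None and checksum(disk_bytes) == pdb_checksum:
        return disk_bytes

    # 3. No matching content is available to serve as a baseline.
    return None
```

Note that the comparison is over raw bytes, so a match says nothing yet about how those bytes should be decoded into text.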
When reading the file content, we need to use the exact encoding that the compiler used; otherwise we might interpret the binary content differently than the compiler did. Previously we used the IDE encoding, but it turns out the IDE doesn't necessarily know what encoding was used by the compiler. For example, LSP doesn't have a concept of encoding, and thus the LSP server always uses UTF-8 when creating SourceText.
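To illustrate why the encoding matters even when the checksum matches, here is a small Python sketch. The sample comment and the choice of Windows-1252 are assumptions made for the example; they are not taken from the PR.

```python
# Bytes of a source line as they might be saved on disk in Windows-1252.
raw = "// café".encode("cp1252")

# Decoded with the encoding the file was written in.
as_cp1252 = raw.decode("cp1252")

# Decoded as UTF-8: the lone 0xE9 byte is not valid UTF-8, so it is
# replaced with U+FFFD (the replacement character).
as_utf8 = raw.decode("utf-8", errors="replace")

# Same bytes (and therefore the same checksum), different text.
texts_differ = as_cp1252 != as_utf8
```

Both decodings start from identical bytes, so a checksum comparison would succeed in either case, yet the resulting text differs and would show up as a spurious change.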
We could try to make sure the encoding is always correctly set in the IDE. The LSP server could detect the encoding. However, Hot Reload already has all the information that the compiler had when compiling the assembly. The compiler auto-detects the encoding from the file content unless it's specified via the CodePage project property. If the property is specified, the value is stored in the compiler options record in the PDB, which we can read.
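The "auto-detect unless a code page is specified" behavior could look roughly like the sketch below. This is only an approximation for illustration and not the compiler's actual detection algorithm: the function name `decode_source`, the set of byte order marks checked, and the Windows-1252 fallback are all assumptions.

```python
import codecs
from typing import Optional

# Byte order marks checked by this sketch (UTF-32 is deliberately omitted).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_source(
    data: bytes,
    code_page: Optional[str] = None,
    fallback: str = "cp1252",  # assumed fallback, for illustration only
) -> str:
    # An explicitly specified code page (e.g. one read from the PDB) wins.
    if code_page is not None:
        return data.decode(code_page)

    # Otherwise detect from content: first a byte order mark, if present...
    for bom, name in _BOMS:
        if data.startswith(bom):
            return data[len(bom):].decode(name)

    # ...then strict UTF-8, falling back when the bytes are not valid UTF-8.
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode(fallback, errors="replace")
```

For instance, `decode_source(b"caf\xe9")` takes the fallback path because `0xE9` alone is not valid UTF-8, while passing `code_page="cp1251"` decodes the same byte as a Cyrillic letter instead.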
This PR changes the code to always auto-detect the encoding from file content. Implementing support for the CodePage property is tracked by a follow-up issue: #81930
Partially fixes https://devdiv.visualstudio.com/DevDiv/_workitems/edit/2067885/. If the file is not saved when Hot Reload is applied, it works. If the file is saved, though, it still doesn't work: #82434