Update file encoding to UTF-8 with BOM #8099
For reference, the script I used to convert all files:
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Text;

var baseDir = "/home/fred/git/razor";
var utf8Preamble = Encoding.UTF8.GetPreamble();
// `git ls-files --eol` prints index eol info, worktree eol info, attributes, then the path.
var lsFilesOutput = Process.Start(new ProcessStartInfo("git", "ls-files --eol") { WorkingDirectory = baseDir, RedirectStandardOutput = true })!.StandardOutput.ReadToEnd();
var eolLines = lsFilesOutput.Split('\n', StringSplitOptions.RemoveEmptyEntries).Select(line => line.Split(' ', StringSplitOptions.RemoveEmptyEntries)).ToArray();
foreach (var eolLine in eolLines)
{
    // Column 3 is taken as the path; this assumes a single-token attr column and no spaces in the path.
    var file = Path.Combine(baseDir, eolLine[3].Trim());
    // Skip shell scripts, node_modules, JSON, snapshots, and anything git reports as binary ("-text").
    if (file.EndsWith("sh") || file.Contains("node_modules") || file.EndsWith("json") || file.EndsWith("snap") || eolLine[0].Contains("-text"))
        continue;
    var bytes = File.ReadAllBytes(file);
    // A file shorter than the preamble cannot already start with it.
    if (bytes.Length < utf8Preamble.Length)
        goto update;
    for (int i = 0; i < utf8Preamble.Length; i++)
    {
        if (bytes[i] != utf8Preamble[i])
            goto update;
    }
    // The file already starts with the BOM; leave it alone.
    continue;
update:
    // Prepend the UTF-8 preamble to the existing bytes.
    var updatedBytes = new byte[bytes.Length + utf8Preamble.Length];
    utf8Preamble.CopyTo(updatedBytes, 0);
    bytes.CopyTo(updatedBytes, utf8Preamble.Length);
    File.WriteAllBytes(file, updatedBytes);
}
GitHub is struggling to let me review this, but I approve!
Looks good, except it seems like your script might have modified some binary files (a picture in this case) and made them invalid: https://dev.azure.com/dnceng-public/public/_build/results?buildId=131223&view=logs&j=1f132584-ab08-5fbf-e5d0-5685de454342&t=76c675cd-a096-5dd5-8431-7aa49ff610be&l=345
I updated the script to be more intelligent about not updating binary files, using git's interpretation of what encoding a file has.
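As a rough illustration of what "git's interpretation" can look like in code, here is a minimal sketch that parses one line of `git ls-files --eol` output and treats an index eol info of `-text` as binary. The helper names `ParseEolLine` and `LooksBinary` are hypothetical and are not part of the script above. The sketch splits on the tab that precedes the path rather than on spaces, which is an assumption about how one might make the parsing tolerant of paths containing spaces.

```csharp
using System;

// Hypothetical helper: parse one line of `git ls-files --eol` output.
// Expected shape: "i/<eolinfo> w/<eolinfo> attr/<attributes>\t<path>"
static (string IndexEol, string Path)? ParseEolLine(string line)
{
    // The path follows the first tab, so spaces in the path or in the
    // attr column do not shift any field positions.
    var tab = line.IndexOf('\t');
    if (tab < 0)
        return null;

    var info = line.Substring(0, tab).Split(' ', StringSplitOptions.RemoveEmptyEntries);
    if (info.Length == 0 || !info[0].StartsWith("i/", StringComparison.Ordinal))
        return null;

    // Strip the "i/" prefix to get the index eol info ("lf", "crlf", "-text", ...).
    return (info[0].Substring(2), line.Substring(tab + 1));
}

// Hypothetical helper: git reports "-text" for content it considers binary.
static bool LooksBinary(string indexEol) => indexEol == "-text";

// Example input is made up for illustration.
var parsed = ParseEolLine("i/-text w/-text attr/                 \tdocs/images/some picture.png");
Console.WriteLine(parsed is { } entry && LooksBinary(entry.IndexEol) ? $"skip {entry.Path}" : "convert");
```

Checking the index column mirrors the `eolLine[0].Contains("-text")` test in the script; whether the worktree column should also be consulted is a judgment call that this sketch leaves open.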
Removed JSON files from the conversion, as a leading BOM makes the JSON invalid.
Since integrating our repos, there have been a number of changes in PRs that touch the encoding of the file, adding or removing a BOM as the editor in question decides. I've standardized on UTF-8 with BOM, as we do in roslyn, and put it in the .gitattributes so it should hopefully stay consistent.
It seems that … Shouldn't we create …
Note: This PR breaks CMD files.
Yeah, looks like I forgot to reapply it in one of my force pushes.
I don't know that this would actually do anything.
Sure.
* upstream/main:
  Fix generic tuple type name rewriting (dotnet#8085)
  Update src/Razor/benchmarks/Microsoft.AspNetCore.Razor.Microbenchmarks/Program.cs
  Add instructions on inserting Razor into O# (dotnet#8107)
  Fix comment grammar and remove empty file
  Update tooling micro benchmark runner
  Remove BOM from batch files
  Update BuildeFromSource.md to include StartVS script changes
  Update StartVS script
  Conditionally deploy Roslyn dependencies
  Try ignoring file header warnings
  Enforce code style on build, for Razor tooling, and treat warnings as errors from command line
  Update file encoding to UTF-8 with BOM (dotnet#8099)
  A bit of code clean up and clarification
  Warnings as errors only in CI
  Reduce allocations and O(n) lookup in completion
  Show position owner info in Syntax Visualizer
  correct docs file