Skip to content

Conversation

@333fred
Copy link
Member

@333fred 333fred commented Jan 9, 2023

Since integrating our repos, there have been a number of changes in PRs that touch the encoding of the file, adding or removing a BOM as the editor in question decides. I've standardized on UTF-8 with BOM, as we do in roslyn, and put it in the .gitattributes so it should hopefully stay consistent.

@333fred 333fred requested review from a team as code owners January 9, 2023 23:04
@333fred
Copy link
Member Author

333fred commented Jan 9, 2023

For reference, the script I used to convert all files:

using System.Diagnostics;
using System.Text;

var baseDir = "/home/fred/git/razor";

var utf8Preamble = Encoding.UTF8.GetPreamble();

var lsFilesOutput = Process.Start(new ProcessStartInfo("git", "ls-files --eol") { WorkingDirectory = baseDir, RedirectStandardOutput = true })!.StandardOutput.ReadToEnd();

var eolLines = lsFilesOutput.Split('\n', StringSplitOptions.RemoveEmptyEntries).Select(line => line.Split(' ', StringSplitOptions.RemoveEmptyEntries)).ToArray();

foreach (var eolLine in eolLines)
{
    var file = Path.Combine(baseDir, eolLine[3].Trim());

    if (file.EndsWith("sh") || file.Contains("node_modules") || file.EndsWith("json") || file.EndsWith("snap") || eolLine[0].Contains("-text"))
        continue;

    var bytes = File.ReadAllBytes(file);
    if (bytes.Length < utf8Preamble.Length)
        goto update;

    for (int i = 0; i < utf8Preamble.Length; i++)
    {
        if (bytes[i] != utf8Preamble[i])
            goto update;
    }

    continue;

update:
    var updatedBytes = new byte[bytes.Length + utf8Preamble.Length];
    utf8Preamble.CopyTo(updatedBytes, 0);
    bytes.CopyTo(updatedBytes, utf8Preamble.Length);
    File.WriteAllBytes(file, updatedBytes);
}

@davidwengier
Copy link
Member

GitHub is struggling to let me review this, but I approve!

@ryanbrandenburg
Copy link
Contributor

Looks good except it seems like your script might have modified some binary files (a picture in this case) and made them invalid: https://dev.azure.com/dnceng-public/public/_build/results?buildId=131223&view=logs&j=1f132584-ab08-5fbf-e5d0-5685de454342&t=76c675cd-a096-5dd5-8431-7aa49ff610be&l=345.

@333fred
Copy link
Member Author

333fred commented Jan 10, 2023

I updated the script to be more intelligent about not updating binary files, using git's interpretation of what encoding a file has.

@333fred
Copy link
Member Author

333fred commented Jan 10, 2023

Removed json files from the conversion, as the BOM is invalid JSON.

Since integrating our repos, there have been a number of changes in PRs that touch the encoding of the file, adding or removing a BOM as the editor in question decides. I've standardized on UTF-8 with BOM, as we do in roslyn, and put it in the .gitattributes so it should hopefully stay consistent.
@333fred 333fred enabled auto-merge (squash) January 10, 2023 01:11
@333fred 333fred merged commit fcae931 into dotnet:main Jan 10, 2023
@jjonescz
Copy link
Member

jjonescz commented Jan 10, 2023

put it in the .gitattributes so it should hopefully stay consistent.

It seems that encoding=UTF-8 disappeared from .gitattributes in the first force push.

Shouldn't .editorconfig's encoding = utf-8-bom be applied to more files than just *.{cs,csx,vb,vbx} to be consistent with this change?

Shouldn't we create .git-blame-ignore-revs file for this commit?

@DustinCampbell
Copy link
Member

Note: This PR breaks CMD files.

@333fred
Copy link
Member Author

333fred commented Jan 10, 2023

It seems that encoding=UTF-8 disappeared from .gitattributes in the first force push.

Yeah, looks like I forgot to reapply it in one of my force pushes.

Shouldn't .editorconfig's encoding = utf-8-bom be applied to more files than just *.{cs,csx,vb,vbx} to be consistent with this change?

I don't know that this would actually do anything.

Shouldn't we create .git-blame-ignore-revs file for this commit?

Sure.

333fred added a commit to 333fred/razor that referenced this pull request Jan 13, 2023
* upstream/main:
  Fix generic tuple type name rewriting (dotnet#8085)
  Update src/Razor/benchmarks/Microsoft.AspNetCore.Razor.Microbenchmarks/Program.cs
  Add instructions on inserting Razor into O# (dotnet#8107)
  Fix comment grammar and remove empty file
  Update tooling micro benchmark runner
  Remove BOM from batch files
  Update BuildeFromSource.md to include StartVS script changes
  Update StartVS script
  Conditionally deploy Roslyn dependencies
  Try ignoring file header warnings
  Enforce code style on build, for Razor tooling, and treat warnings as errors from command line
  Update file encoding to UTF-8 with BOM (dotnet#8099)
  A bit of code clean up and clarification
  Warnings as errors only in CI
  Reduce allocations and O(n) lookup in completion
  Show position owner info in Syntax Visualizer
  correct docs file
@jjonescz jjonescz mentioned this pull request Jan 17, 2023
@jjonescz jjonescz mentioned this pull request Feb 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants