feat: Saving the nogo fixes #4102

peng3141 · 2024-09-12T02:07:04Z

What type of PR is this?

Feature

What does this PR do? Why is it needed?

nogo analyzers may attach suggested fixes to the analysis.Diagnostic values they report. This PR allows rules_go to save those fixes as patches, one per target. The patch and the command to apply it are printed to the terminal so that users can apply it manually. The patch is also available in the "nogo_fix" output group, which lets people get patches for all targets without failing the build by passing --norun_validations --output_groups nogo_fix.

Example output:

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging

nogo: errors found by nogo during build-time code analysis:
src/example.com/send_request.go:133:4: Add and Done should not both exist inside the same goroutine block (example_nogo_analyzer)

-------------------Suggested Fix-------------------
--- src/example.com/send_request.go
+++ src/example.com/send_request.go
@@ -123,6 +123,7 @@
 
 	for !isComplete {
 		<-ticker.C
+		concurrentJobs.Add(1)
 		go func() {
 			text, hasMore := <-line
 			if !hasMore {
@@ -130,7 +131,6 @@
 				return
 			}
 
-			concurrentJobs.Add(1)
 			defer concurrentJobs.Done()
 
 			atomic.AddInt32(&totalSend, 1)

-----------------------------------------------------

To apply the suggested fix, run the following command:
$ patch -p1 < bazel-out/k8-fastbuild/bin/src/example.com/send_request.nogo.patch

Target //src/example.com:send_request failed to build
Use --verbose_failures to see the command lines of failed build steps.

Other notes for review
An analyzer may suggest multiple alternative fixes for one issue. By default only the first one is selected; if it conflicts with other fixes, the next alternative is tried. If every alternative conflicts, the fix for that issue is skipped. In that case, the user has to apply the patch first and then run nogo again to get the fix for the remaining issue.
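The selection rule described above can be sketched roughly as follows. This is only an illustration under assumptions: the `Edit` and `Fix` types, `SelectFixes`, and the overlap rule are hypothetical stand-ins, not the code in this PR.

```go
package main

import "fmt"

// Edit is a hypothetical byte-range replacement: [Start, End) is replaced by New.
type Edit struct {
	Start, End int
	New        string
}

// Fix is one alternative suggested for a diagnostic (hypothetical shape).
type Fix struct {
	Message string
	Edits   []Edit
}

// overlaps reports whether two edits touch the same bytes.
// Two pure insertions at the same offset are also treated as a conflict.
func overlaps(a, b Edit) bool {
	if a.Start == a.End && b.Start == b.End {
		return a.Start == b.Start
	}
	return a.Start < b.End && b.Start < a.End
}

// conflictsWith reports whether any edit of fix overlaps an accepted edit.
func conflictsWith(fix Fix, accepted []Edit) bool {
	for _, e := range fix.Edits {
		for _, a := range accepted {
			if overlaps(e, a) {
				return true
			}
		}
	}
	return false
}

// SelectFixes walks the diagnostics in order; each diagnostic is a list of
// alternative fixes. It accepts the first alternative that does not conflict
// with edits accepted so far and records the index of every diagnostic whose
// alternatives all conflict.
func SelectFixes(diagnostics [][]Fix) (accepted []Edit, skipped []int) {
	for i, alternatives := range diagnostics {
		chosen := false
		for _, fix := range alternatives {
			if conflictsWith(fix, accepted) {
				continue
			}
			accepted = append(accepted, fix.Edits...)
			chosen = true
			break
		}
		if !chosen && len(alternatives) > 0 {
			skipped = append(skipped, i)
		}
	}
	return accepted, skipped
}

func main() {
	diags := [][]Fix{
		{{Message: "A", Edits: []Edit{{Start: 0, End: 5, New: "x"}}}},
		{
			{Message: "B1", Edits: []Edit{{Start: 3, End: 8, New: "y"}}},
			{Message: "B2", Edits: []Edit{{Start: 10, End: 12, New: "z"}}},
		},
		{{Message: "C", Edits: []Edit{{Start: 4, End: 6, New: "w"}}}},
	}
	accepted, skipped := SelectFixes(diags)
	fmt.Println("accepted edits:", accepted)
	fmt.Println("skipped diagnostics:", skipped)
}
```

A skipped diagnostic is the case where the user applies the patch and reruns nogo to pick up the remaining fix.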

google-cla · 2024-09-12T02:07:08Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

go/private/actions/compilepkg.bzl

go/tools/builders/nogo.go

go/private/rules/test.bzl

go/tools/builders/nogo_change.go

peng3141 · 2024-10-01T14:44:35Z

thanks for the comments, i am in the process of addressing them.

peng3141 · 2024-10-01T22:02:46Z

this shows the log of the latest version:
https://gist.github.com/peng3141/bdaefac333434cf2ecbef4edfd8d0200

peng3141 · 2024-10-03T18:39:38Z

@fmeum @linzhp the PR is ready for review. Could you take a look? thanks!

Could you focus on the high-level design first? Once it is agreed upon, we can do one pass to address the readability nits.

linzhp

Only half way through, posting some comments so far. I will continue later this week

go/tools/builders/nogo_validation.go

linzhp · 2024-10-04T04:22:52Z

go/tools/builders/nogo_main.go

+		// Otherwise, bazel will complain "not all outputs were created or valid"
+		change, err := NewChangeFromDiagnostics(diagnostics, pkg.fset)
+		if err != nil {
+			errs = append(errs, fmt.Errorf("errors in dumping nogo fix, specifically in converting diagnostics to change %v", err))


If there are errors here, does it make sense to call ToPatches and SavePatchesToFile below?

yes, that was by design, although we may need to discuss about design choices here.

Btw, we need to write an empty string to nogoFixPath when there are errors. See nogo.go (line 100) for similar logic. Otherwise, bazel complains that the declared output file was not created.

Coming back to the design choices, here is another design in comparison:

if nogoFixPath != "" {
	// If nogo fixes are requested, save the fixes to the file even if they are empty.
	// This prevents Bazel from complaining about missing or invalid outputs.
	change, err := NewChangeFromDiagnostics(diagnostics, pkg.fset)
	if err != nil {
		// Ensure an empty patch file is saved when there's an error in generating the change.
		errs = append(errs, fmt.Errorf("error converting diagnostics to change: %v", err))
		if saveErr := SavePatchesToFile(nogoFixPath, nil); saveErr != nil {
			errs = append(errs, fmt.Errorf("error saving empty patches file: %v", saveErr))
		}
	} else {
		fileToPatch, err := ToPatches(Flatten(*change))
		if err != nil {
			errs = append(errs, fmt.Errorf("error generating patches: %v", err))
			if saveErr := SavePatchesToFile(nogoFixPath, nil); saveErr != nil {
				errs = append(errs, fmt.Errorf("error saving empty patches file: %v", saveErr))
			}
		} else {
			if err := SavePatchesToFile(nogoFixPath, fileToPatch); err != nil {
				errs = append(errs, fmt.Errorf("error saving patches to file: %v", err))
			}
		}
	}
}

In my current design, when NewChangeFromDiagnostics returns an error, the change still holds a partial set of fixes that can be applied. Let us consider this case:
There are high-quality analyzers that produce great fixes, and there are some poorly written analyzers that produce wrong fixes, e.g., they produced corrupted offsets. Current design still allows the fixes from the well-written analyzers to be applied.

There is one caveat case: the change may already include some edits from the bad analyzer, which have valid offsets but are of poor quality.

In my opinion, the nogo framework should faithfully apply the fixes that are applicable (i.e., those with valid offsets). It should not ban fixes from all analyzers in the case that one analyzer is bad. Also it is the responsibility of the monorepo owners to remove/fix bad analyzers.

Let me know your thoughts,

if we adopt my current design, I have also updated NewChangeFromDiagnostics to be more permissive.

I like the ability to let user choose analyzers to trust

done, this support of letting user choose has to happen on the patching side, where we show preview of diff and users answer "yes" or "no".

this is now supported.

discussed offline, users already select which linters to run in the bazel.
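To make the "keep what is applicable, report the rest" design from this thread concrete, here is a minimal sketch. It assumes a hypothetical `Edit` type and a hypothetical `validEdits` helper; it is not the PR's NewChangeFromDiagnostics. It returns the usable edits together with an aggregated error, so the caller can still write a patch while surfacing the bad analyzer.

```go
package main

import (
	"errors"
	"fmt"
)

// Edit is a hypothetical byte-range replacement: [Start, End) is replaced by New.
type Edit struct {
	Start, End int
	New        string
}

// validEdits keeps the edits whose offsets fit inside a file of fileSize bytes.
// Rejected edits are reported through the returned error, but the partial
// result stays usable, so one bad analyzer does not block the others.
func validEdits(analyzer string, edits []Edit, fileSize int) ([]Edit, error) {
	var kept []Edit
	var errs []error
	for _, e := range edits {
		if e.Start < 0 || e.End < e.Start || e.End > fileSize {
			errs = append(errs, fmt.Errorf(
				"analyzer %s: invalid edit range [%d, %d) for a file of %d bytes",
				analyzer, e.Start, e.End, fileSize))
			continue
		}
		kept = append(kept, e)
	}
	// errors.Join (Go 1.20+) returns nil when there is nothing to report.
	return kept, errors.Join(errs...)
}

func main() {
	edits := []Edit{
		{Start: 0, End: 3, New: "a"},
		{Start: 5, End: 100, New: "b"}, // past the end of the file
		{Start: 4, End: 2, New: "c"},   // end before start
	}
	kept, err := validEdits("example_analyzer", edits, 10)
	fmt.Println("kept:", kept)
	fmt.Println("error:", err)
}
```

As noted in the thread, this does not catch edits that have valid offsets but poor quality.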

linzhp · 2024-10-04T04:32:06Z

go/tools/builders/nogo_change.go

+		panic("wrong size")
+	}
+
+	return string(out), nil


can we return bytes instead? The out is converted to string here and immediately converted back to []byte by its only caller

linzhp · 2024-10-04T04:34:01Z

go/tools/builders/nogo_change.go

+// The following is about the `Change`, a high-level abstraction of edits.
+// Change represents a set of edits to be applied to a set of files.
+type Change struct {
+	AnalyzerToFileToEdits map[string]map[string][]Edit `json:"analyzer_file_to_edits"`


Do we need 3 levels of nesting, only to be flattened later?

yes, we need this.

the two levels file:edits is required, since we track and apply patch per file.

besides, see the Flatten() function, it considers the cases that different analyzers produce conflicting edits, i.e., edits that overlap with each other. In this case, it is impossible to apply both edits.

we will ignore the latter analyzer (sorted already for determinism) but still allow the former analyzer to proceed.
This is why we add the extra indirection of indexing by analyzer.

Let's avoid overusing maps and create more informative data structure: https://abhinav.github.io/future-proof-packages-2023/#/%EF%B8%8F-map-overuse

good point, especially given this is across package boundary.
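One possible shape for a "more informative data structure" is sketched below. All names (`FileEdits`, `Change.Entries`, `Add`, `ForFile`) are hypothetical and chosen for illustration; the PR may have ended up with something different.

```go
package main

import "fmt"

// Edit is a hypothetical byte-range replacement: [Start, End) is replaced by New.
type Edit struct {
	Start, End int
	New        string
}

// FileEdits groups the edits that one analyzer proposes for one file.
type FileEdits struct {
	Analyzer string
	File     string
	Edits    []Edit
}

// Change holds an ordered list of named entries instead of a
// map[string]map[string][]Edit, so fields can be added later without
// changing the shape seen by callers.
type Change struct {
	Entries []FileEdits
}

// Add appends edits to the entry for (analyzer, file), creating it if needed.
func (c *Change) Add(analyzer, file string, edits ...Edit) {
	for i := range c.Entries {
		if c.Entries[i].Analyzer == analyzer && c.Entries[i].File == file {
			c.Entries[i].Edits = append(c.Entries[i].Edits, edits...)
			return
		}
	}
	c.Entries = append(c.Entries, FileEdits{Analyzer: analyzer, File: file, Edits: edits})
}

// ForFile returns the entries that touch file, in insertion order.
func (c *Change) ForFile(file string) []FileEdits {
	var out []FileEdits
	for _, e := range c.Entries {
		if e.File == file {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	var c Change
	c.Add("assign", "a.go", Edit{Start: 0, End: 1, New: "x"})
	c.Add("composite", "a.go", Edit{Start: 5, End: 6, New: "y"})
	for _, fe := range c.ForFile("a.go") {
		fmt.Printf("%s proposes %d edit(s) for %s\n", fe.Analyzer, len(fe.Edits), fe.File)
	}
}
```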

linzhp · 2024-10-05T00:30:35Z

go/tools/builders/nogo_change.go

+
+	// Trim left
+	for i := 0; i < len(lines); i++ {
+		if hasNonWhitespaceCharacter(lines[i]) {


Suggested change

if hasNonWhitespaceCharacter(lines[i]) {

if strings.TrimSpace(lines[i]) == "" {

this is the same, right?

no longer relevant, moved out of rules_go

done, simplified.
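On the "this is the same, right?" question: a helper named hasNonWhitespaceCharacter corresponds to `strings.TrimSpace(s) != ""`, whereas `== ""` as quoted in the suggestion tests for the opposite (a blank line). The sketch below checks that equivalence; the body of `hasNonWhitespaceCharacter` is a hypothetical reconstruction, since the original helper is not shown in this thread.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// hasNonWhitespaceCharacter is a hypothetical reconstruction of the helper
// under review: it reports whether s contains any non-space rune.
func hasNonWhitespaceCharacter(s string) bool {
	for _, r := range s {
		if !unicode.IsSpace(r) {
			return true
		}
	}
	return false
}

// hasContent expresses the same check with strings.TrimSpace.
// Note the != "": using == "" would detect blank lines instead.
func hasContent(s string) bool {
	return strings.TrimSpace(s) != ""
}

func main() {
	for _, s := range []string{"", "   ", "\t\n", " x ", "func main() {"} {
		fmt.Printf("%q: loop=%v trimspace=%v\n", s, hasNonWhitespaceCharacter(s), hasContent(s))
	}
}
```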

linzhp · 2024-10-05T00:48:59Z

go/tools/builders/nogo_change_serialization.go

+}
+
+// LoadPatchesFromFile loads the map[string]string (file paths to patch content) from a JSON file.
+// Note LoadPatchesFromFile is used for testing only.


Test utilities should be in _test.go. Putting here and export it means it's part of public API

no longer relevant, moved out of rules_go

linzhp · 2024-10-05T00:52:13Z

go/tools/builders/nogo_change.go

+		}
+
+		diff := UnifiedDiff{
+			// difflib.SplitLines does not handle well the whitespace at the beginning or the end.


if difflib.SplitLines doesn't work well, can we not use it? It's also inefficient: you first read the whole file into the memory and then split it, which doubles the memory usage. You could just read the file line by line with bufio.Scanner

no longer relevant, moved out of rules_go

SplitLines is inevitable, see another patching lib also does it: https://github.com/sergi/go-diff/blob/master/diffmatchpatch/patch.go#L483.

the problem here is the whitespace prefix or suffix, which is handled by the Trim
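For reference, the line-by-line reading the reviewer suggests could look roughly like this. `readLines` and `linesOf` are hypothetical names. One caveat that matters for diffing: bufio.Scanner strips line terminators, so the presence or absence of a final newline would have to be tracked separately.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readLines reads r line by line with bufio.Scanner instead of loading the
// whole input and splitting it afterwards. The scanner strips the line
// terminator, so "a\nb" and "a\nb\n" both yield ["a", "b"]. With the default
// buffer, a single line longer than bufio.MaxScanTokenSize makes Scan stop
// with an error.
func readLines(r io.Reader) ([]string, error) {
	var lines []string
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	return lines, sc.Err()
}

// linesOf is a small convenience wrapper for in-memory text.
func linesOf(text string) ([]string, error) {
	return readLines(strings.NewReader(text))
}

func main() {
	lines, err := linesOf("package main\n\nfunc main() {}\n")
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	for i, l := range lines {
		fmt.Printf("%d: %q\n", i+1, l)
	}
}
```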

linzhp · 2024-10-05T01:02:58Z

go/tools/builders/nogo_main.go

+		// Otherwise, bazel will complain "not all outputs were created or valid"
+		change, err := NewChangeFromDiagnostics(diagnostics, pkg.fset)
+		if err != nil {
+			errs = append(errs, fmt.Errorf("errors in dumping nogo fix, specifically in converting diagnostics to change %v", err))


I like the ability to let user choose analyzers to trust

linzhp · 2024-10-05T01:09:11Z

go/tools/builders/nogo_change.go

+}
+
+// Flatten takes a Change and returns a map of FileToEdits, merging edits from all analyzers.
+func Flatten(change Change) map[string][]Edit {


Can we produce one patch file per analyzer? That way, we don't need to merge edits to the same file from different analyzers. Like you said, some analyzers may produce bad edits. When users apply patches one by one, they can ignore those patches produced by bad analyzers.

there will be a single file per target, as created at the .bzl files (the go files only fill in contents).
but still, in that single file, we can index patches with analyzer.

asking users to pick which analyzer to patch may be too much for developers. Ideally we want them simply say "yes/no". Also the conflicts may not happen often, we may ask only in the case of conflicts.

will think further.

done, the patching is moved out of rules_go,

fyi, it has the support of letting users preview the diff and pick. At the end, it applies the selected edits. In case of overlapping, we skip the first overlapping one for max progress, the next build can fix that one too.

asking users to pick which analyzer to patch may be too much for developers.

I don't think so. The output from rules_go can be simply:

nogo: errors found by nogo during build-time code analysis:
src/example.com/foo/compute.go:28:2: self-assignment of x to x (assign)

To apply the fixes run:
$ patch -p1 < bazel-out/k8-fastbuild/bin/src/example.com/foo/go_default_library.nogo.assign.patch

When there are multiple analyzers failing, we can list multiple patch command for users to copy:

To apply the fixes run:
$ patch -p1 < bazel-out/k8-fastbuild/bin/src/example.com/foo/go_default_library.nogo.assign.patch
$ patch -p1 < bazel-out/k8-fastbuild/bin/src/example.com/foo/go_default_library.nogo.composite.patch

I agree with what @fmeum said here: #4102 (comment)

I don't want rules_go to depend on a custom patch tool, while we can use a standard Unix utility.

Ideally we want them simply say "yes/no".

I don't think users' input matters there. When two fixes conflict, there is nothing the tool can do even if the user replies "yes", right? The only thing we can do is reject the patch and ask the user to apply the fix manually, which is exactly what the standard patch command does.

discussed offline.

agreed that we should use standard tool like 'patch -p1'.

we cannot have multiple patch files, each for one analyzer, since the bazel creates a single file for each target. Also it is not easy to tell the analyzers at bazel resolving time.

what we currently do is:

for each file, we apply the (offset-based) edits and drop any analyzer whose edits overlap with edits that were already added. Analyzers are sorted so that the order is deterministic. After this, we can compute the new content after fixing, and then compute the patch from the original and the new content.

we merge patches for all files into one, so that "patch -p1 < file" works.

users' inputs do not matter, for the analyzers they pick, they may still have overlapping edits that we need to resolve.
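The per-file procedure described above (sort analyzers, drop an analyzer whose edits overlap already accepted edits, then apply by offset) might be sketched like this. The types and function names are hypothetical, and the sketch assumes that edits within a single analyzer do not overlap each other.

```go
package main

import (
	"fmt"
	"sort"
)

// Edit is a hypothetical byte-range replacement: [Start, End) is replaced by New.
type Edit struct {
	Start, End int
	New        string
}

// AnalyzerEdits holds the edits one analyzer proposes for a single file.
type AnalyzerEdits struct {
	Analyzer string
	Edits    []Edit
}

// overlaps reports whether two edits touch the same bytes.
func overlaps(a, b Edit) bool {
	if a.Start == a.End && b.Start == b.End {
		return a.Start == b.Start
	}
	return a.Start < b.End && b.Start < a.End
}

// mergeEdits visits analyzers in name order (for determinism) and keeps an
// analyzer's edits only if none of them overlaps an edit accepted earlier.
func mergeEdits(perAnalyzer []AnalyzerEdits) (accepted []Edit, dropped []string) {
	sorted := append([]AnalyzerEdits(nil), perAnalyzer...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Analyzer < sorted[j].Analyzer })
	for _, ae := range sorted {
		conflict := false
		for _, e := range ae.Edits {
			for _, a := range accepted {
				if overlaps(e, a) {
					conflict = true
				}
			}
		}
		if conflict {
			dropped = append(dropped, ae.Analyzer)
			continue
		}
		accepted = append(accepted, ae.Edits...)
	}
	return accepted, dropped
}

// applyEdits applies non-overlapping edits from the highest offset down, so
// the offsets of edits that are still pending remain valid.
func applyEdits(src string, edits []Edit) string {
	sorted := append([]Edit(nil), edits...)
	sort.SliceStable(sorted, func(i, j int) bool {
		if sorted[i].Start != sorted[j].Start {
			return sorted[i].Start > sorted[j].Start
		}
		return sorted[i].End > sorted[j].End
	})
	out := src
	for _, e := range sorted {
		out = out[:e.Start] + e.New + out[e.End:]
	}
	return out
}

func main() {
	accepted, dropped := mergeEdits([]AnalyzerEdits{
		{Analyzer: "c", Edits: []Edit{{Start: 6, End: 11, New: "there"}}},
		{Analyzer: "a", Edits: []Edit{{Start: 0, End: 5, New: "HELLO"}}},
		{Analyzer: "b", Edits: []Edit{{Start: 3, End: 7, New: "XX"}}},
	})
	fmt.Printf("%q, dropped analyzers: %v\n", applyEdits("hello world", accepted), dropped)
}
```

The original and the fixed content can then be diffed to produce the single per-target patch that `patch -p1` consumes.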

lpxz · 2024-12-16T05:45:39Z

addressed comments from @linzhp.

made the following design-level change:

move most logic out of rules_go into a third_party patcher tool (@linzhp let us discuss where we can host this, initially we can put it in my personal github repo, but in the long run, we may want to host it more properly).
rules_go does the minimal work now to export the []edits for each file and analyzer.

This way, we make sure we do not impact rules_go's performance.

Accordingly, we shall change the expectation of this PR as follows:
this PR does not provide e2e patching support. It only exports the fixes.

ready for review, @linzhp and @fmeum , thanks!

linzhp · 2024-12-16T20:16:04Z

go/private/actions/archive.bzl

+        # --run_validations (default=True) ensures nogo validation is applied to not only the input targets but also their dependent targets,
+        # thereby producing available fixes for all targets.
+        # Otherwise, if we externalize out_nogo_fix_tmp (not going through the ValidateNogo action) by putting it into a field (e.g., `nogo_fix`) in the OutputGroupInfo section of the input targets,
+        # we can see the fix for the input targets, but will miss the fixes for the dependent targets.


It's just how Bazel prints out the output artifacts, but out_nogo_fix_tmp is generated for dependent targets as long as you pass --output_groups=nogo_fix to bazel build.

question about output_group vs nogo validation action:
what is the benefit of this output_group option over validation?

also, I feel validation is more flexible, e.g., we can dump fixes immediately after error message there.

linzhp · 2024-12-16T21:23:54Z

go/tools/builders/.nogo_change.go.swp

what's this?

vim swap file... removed.

linzhp · 2024-12-17T00:38:24Z

go/tools/builders/nogo_change.go

+
+// DiagnosticEntry represents a diagnostic entry with the corresponding analyzer.
+type DiagnosticEntry struct {
+	analysis.Diagnostic


Avoid Embedding Types in Public Structs

Oh, I see, you just moved an existing struct here. That's fine then, but you should still make this struct private, otherwise people may start using it and we will run into issues mentioned in the link above
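A small sketch of the alternative the reviewer is pointing at: an unexported struct with a named field rather than an embedded type. `Diagnostic` here is a local stand-in for analysis.Diagnostic so the example stays self-contained, and the field names are hypothetical.

```go
package main

import "fmt"

// Diagnostic is a local stand-in for analysis.Diagnostic
// (golang.org/x/tools/go/analysis), reduced to one field for illustration.
type Diagnostic struct {
	Message string
}

// diagnosticEntry is unexported and keeps the diagnostic in a named field.
// Nothing from Diagnostic is promoted into diagnosticEntry's own API, so
// later changes to Diagnostic cannot silently change what this type exposes.
type diagnosticEntry struct {
	diagnostic   Diagnostic
	analyzerName string
}

// String renders the entry in the "message (analyzer)" form used in nogo output.
func (e diagnosticEntry) String() string {
	return fmt.Sprintf("%s (%s)", e.diagnostic.Message, e.analyzerName)
}

func main() {
	entry := diagnosticEntry{
		diagnostic:   Diagnostic{Message: "self-assignment of x to x"},
		analyzerName: "assign",
	}
	fmt.Println(entry)
}
```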

linzhp · 2024-12-17T01:27:58Z

go/tools/builders/nogo_change.go

+}
+
+// Flatten takes a Change and returns a map of FileToEdits, merging edits from all analyzers.
+func Flatten(change Change) map[string][]Edit {


lpxz · 2024-12-17T16:40:39Z

@linzhp
let me address your high-level feedback first.

your proposal does not really work:
$ patch -p1 < bazel-out/k8-fastbuild/bin/src/example.com/foo/go_default_library.nogo.assign.patch
$ patch -p1 < bazel-out/k8-fastbuild/bin/src/example.com/foo/go_default_library.nogo.composite.patch

patch is a highly unreliable tool because it relies on the surrounding context. After you apply the first patch, the context expected by the second patch may no longer match, and the second patch cannot proceed.

I do not like such unreliable patching. The only reliable way is to perform the precise offset-based editing. This imposes a strict requirement though: the edits users want to apply should not overlap. This means we have to have some tool that is better than patch.

Also, I think this PR is only for exporting edits, right? That is already an improvement over the status quo, where no fixes can be exported.

peng3141 · 2024-12-18T05:43:41Z

addressed the comments, tested locally with multiple files that have errors from different analyzers.

one outstanding question is about output_group vs nogo_validation, I am not sure about advantage of output_group.
also i think nogo_validation offers more flexibility, e.g., it can print patch message immediately after error message.

peng3141 · 2024-12-18T16:41:22Z

discussed offline and made these improvements:

also will add the printing of the patch in the validation action, immediately after the errors.
Also added nogo_fix to the output groups to support the case of not running validations. Tested via:
```
$ bazel build --output_groups=nogo_fix --norun_validations 
```
there is no need to have two fix files, one for run_nogo and one for validation action. We just need one file. Actually validation does not need to write to it at all.

go/private/actions/archive.bzl

go/tools/builders/difflib.go

go/tools/builders/nogo_change.go

go/private/rules/library.bzl

go/tools/builders/nogo_change.go

go/private/rules/binary.bzl

go/private/rules/library.bzl

go/tools/builders/nogo_change.go

linzhp · 2024-12-20T01:14:26Z

go/tools/builders/nogo_change.go

+	return result, nil
+}
+
+func trimWhitespaceHeadAndTail(lines []string) []string {


Why do we need this? The line numbers in the unified diff will be wrong if we remove the empty lines

it is because difflib.SplitLines blindly adds an extra "\n" at the end: https://github.com/pmezard/go-difflib/blob/master/difflib/difflib.go#L766C1-L772C2, I am not sure why it does so (does not look like a bug).

some examples:
"line1\nline2\nline3" -> ["line1\n", "line2\n", "line3\n"].
"line1\nline2\nline3\n" -> ["line1\n", "line2\n", "line3\n", "\n"].

This unintended extra trailing "\n" element can lead to incorrect diff outputs or unexpected behavior in downstream processes that consume these lines.

there may be a better workaround for this.
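The behavior can be reproduced with the standard library alone. `splitLines` below restates the logic of difflib.SplitLines as I understand it from the linked lines (an assumption worth checking against the pinned version), and `splitKeepingNewlines` is a hypothetical alternative that avoids the extra element without trimming whitespace.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLines restates the logic of difflib.SplitLines as read from the linked
// source: split after every "\n", then append "\n" to the last element.
func splitLines(s string) []string {
	lines := strings.SplitAfter(s, "\n")
	lines[len(lines)-1] += "\n"
	return lines
}

// splitKeepingNewlines is a hypothetical alternative: it drops the empty
// element that SplitAfter yields when the input already ends with "\n",
// and otherwise leaves the lines untouched.
func splitKeepingNewlines(s string) []string {
	lines := strings.SplitAfter(s, "\n")
	if last := len(lines) - 1; lines[last] == "" {
		lines = lines[:last]
	}
	return lines
}

func main() {
	fmt.Printf("%q\n", splitLines("line1\nline2"))
	fmt.Printf("%q\n", splitLines("line1\nline2\n"))
	fmt.Printf("%q\n", splitKeepingNewlines("line1\nline2\n"))
}
```

Unlike trimming empty lines at the head and tail, the alternative keeps every real line, so line numbers in a resulting unified diff are not shifted.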

linzhp · 2024-12-20T01:16:03Z

go/tools/builders/nogo_change.go

+		combinedPatch.WriteString("\n") // Ensure separation between file patches
+	}
+
+	// Remove trailing newline


Why do we need this? The new lines at the end of the patch don't seem harmful. If we don't need to delete the new line, we can pass an io.Writer into this function and call difflib.WriteUnifiedDiff instead, so we don't need to hold the patch in memory.

good point.
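A rough sketch of the io.Writer shape suggested above. To stay self-contained it streams already rendered per-file patch text rather than calling difflib.WriteUnifiedDiff; `writeCombinedPatch` and `combinedPatchString` are hypothetical names.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"sort"
	"strings"
)

// writeCombinedPatch streams each file's patch text to w in file-name order,
// followed by a newline separator, instead of accumulating the combined patch
// in memory and trimming it afterwards.
func writeCombinedPatch(w io.Writer, fileToPatch map[string]string) error {
	files := make([]string, 0, len(fileToPatch))
	for f := range fileToPatch {
		files = append(files, f)
	}
	sort.Strings(files) // deterministic output

	for _, f := range files {
		if _, err := io.WriteString(w, fileToPatch[f]); err != nil {
			return fmt.Errorf("writing patch for %s: %w", f, err)
		}
		if _, err := io.WriteString(w, "\n"); err != nil {
			return fmt.Errorf("writing separator after %s: %w", f, err)
		}
	}
	return nil
}

// combinedPatchString renders into a string; handy for tests only.
func combinedPatchString(fileToPatch map[string]string) (string, error) {
	var sb strings.Builder
	err := writeCombinedPatch(&sb, fileToPatch)
	return sb.String(), err
}

func main() {
	patches := map[string]string{
		"b.go": "--- b.go\n+++ b.go\n",
		"a.go": "--- a.go\n+++ a.go\n",
	}
	// In a builder this would be the file declared as the nogo_fix output.
	if err := writeCombinedPatch(os.Stdout, patches); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```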

go/tools/builders/nogo_change_test.go

peng3141 · 2024-12-20T04:38:26Z

half way, will handle the rest tmr.

peng3141 marked this pull request as draft September 12, 2024 02:07

peng3141 force-pushed the rules_go_hack_for_dumping_fix branch from 7b03e4c to 29f650b Compare September 24, 2024 03:03

peng3141 changed the title ~~[draft][do not review] hack to get nogo fix out of bazel sandbox~~ rules_go improvement to externalize the nogo fix Sep 24, 2024

peng3141 force-pushed the rules_go_hack_for_dumping_fix branch 9 times, most recently from f506b28 to c079a7f Compare September 25, 2024 15:09

peng3141 marked this pull request as ready for review September 25, 2024 15:33

peng3141 force-pushed the rules_go_hack_for_dumping_fix branch 2 times, most recently from 3b49ecb to b57424d Compare September 25, 2024 16:54

linzhp requested a review from fmeum September 29, 2024 04:51

linzhp reviewed Sep 29, 2024

fmeum reviewed Sep 29, 2024

peng3141 force-pushed the rules_go_hack_for_dumping_fix branch 2 times, most recently from 4f41cce to f6bab4d Compare October 1, 2024 21:54

peng3141 force-pushed the rules_go_hack_for_dumping_fix branch 3 times, most recently from 47d7bf7 to ee5bf11 Compare October 2, 2024 15:23

linzhp reviewed Oct 4, 2024

peng3141 force-pushed the rules_go_hack_for_dumping_fix branch from ee5bf11 to 23277ea Compare October 4, 2024 14:57

linzhp reviewed Oct 5, 2024

linzhp marked this pull request as draft December 14, 2024 16:28

linzhp self-assigned this Dec 16, 2024

linzhp marked this pull request as ready for review December 16, 2024 05:47

linzhp reviewed Dec 17, 2024

peng3141 force-pushed the rules_go_hack_for_dumping_fix branch 4 times, most recently from b4420cf to f4775d4 Compare December 18, 2024 05:41

12/17: switch back to the linux patch solution, all patches combiend into one

fdb823a

peng3141 force-pushed the rules_go_hack_for_dumping_fix branch from f4775d4 to fdb823a Compare December 18, 2024 16:38

linzhp reviewed Dec 19, 2024

12/18: import https://github.com/pmezard/go-difflib rather than copying

35a9492

peng3141 force-pushed the rules_go_hack_for_dumping_fix branch from 894b2b4 to fd673f5 Compare December 19, 2024 18:17

12/18: address comments

8f805be

peng3141 force-pushed the rules_go_hack_for_dumping_fix branch from fd673f5 to 8f805be Compare December 19, 2024 18:26

add com_github_pmezard_go_difflib to workspace

ef0986d

linzhp reviewed Dec 20, 2024

12/18: address comments batch 2

d792536

peng3141 force-pushed the rules_go_hack_for_dumping_fix branch from b00b27a to d792536 Compare December 20, 2024 18:07

linzhp added 5 commits December 22, 2024 00:10

Applying SugsestedFixes atomically

f45ee22

stylish changes

d003969

more stylish changes

091a525

Merge branch 'master' into rules_go_hack_for_dumping_fix

4dab25d

fixing test on Windows

e987c18

linzhp changed the title ~~rules_go improvement to externalize the nogo fix~~ feat: Saving the nogo fixes Dec 22, 2024
