Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/internal/lsp: SuggestedFix tests failing on arm64 builders starting between 2022-08-19T17:47:49 and 2022-08-23T21:04:06 #54655

Closed
bcmills opened this issue Aug 24, 2022 · 14 comments
Labels
arch-arm64 FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Aug 24, 2022

greplogs -l -e FAIL\\s+golang.org/x/tools/internal/lsp --since=2022-08-20 --omit=newcc --details

2022-08-24T16:06:46-587a153-2fc21b5/darwin-arm64-11
2022-08-24T16:06:46-587a153-2fc21b5/darwin-arm64-12
2022-08-24T16:06:46-587a153-2fc21b5/linux-arm64-aws
2022-08-24T16:00:07-587a153-3083529/darwin-arm64-11
2022-08-24T16:00:07-587a153-3083529/darwin-arm64-12
2022-08-24T16:00:07-587a153-3083529/linux-arm64-aws
2022-08-24T15:37:38-587a153-d5aa088/darwin-arm64-11
2022-08-24T15:37:38-587a153-d5aa088/darwin-arm64-12
2022-08-24T15:37:38-587a153-d5aa088/linux-arm64-aws
2022-08-24T14:31:08-587a153-f983a93/darwin-arm64-11
2022-08-24T14:31:08-587a153-f983a93/darwin-arm64-12
2022-08-24T14:31:08-587a153-f983a93/linux-arm64-aws
2022-08-24T12:12:12-587a153-b5a9459/darwin-arm64-11
2022-08-24T12:12:12-587a153-b5a9459/darwin-arm64-12
2022-08-24T12:12:12-587a153-b5a9459/linux-arm64-aws
2022-08-24T08:17:52-587a153-1ab6b79/darwin-arm64-11
2022-08-24T08:17:52-587a153-1ab6b79/darwin-arm64-12
2022-08-24T08:17:52-587a153-1ab6b79/freebsd-amd64-race
2022-08-24T08:17:52-587a153-1ab6b79/linux-arm64-aws
2022-08-24T05:40:28-587a153-1a8dfad/darwin-arm64-11
2022-08-24T05:40:28-587a153-1a8dfad/darwin-arm64-12
2022-08-24T05:40:28-587a153-1a8dfad/linux-arm64-aws
2022-08-24T02:54:32-587a153-75cdd2c/darwin-arm64-11
2022-08-24T02:54:32-587a153-75cdd2c/darwin-arm64-12
2022-08-24T02:54:32-587a153-75cdd2c/linux-arm64-aws
2022-08-24T02:23:58-587a153-7ee220c/darwin-arm64-11
2022-08-24T02:23:58-587a153-7ee220c/darwin-arm64-12
2022-08-24T02:23:58-587a153-7ee220c/linux-arm64-aws
2022-08-23T23:17:55-587a153-1dcef7b/darwin-arm64-11
2022-08-23T23:17:55-587a153-1dcef7b/darwin-arm64-12
2022-08-23T23:17:55-587a153-1dcef7b/linux-arm64-aws
2022-08-23T23:11:53-587a153-aa42997/darwin-arm64-11
2022-08-23T23:11:53-587a153-aa42997/darwin-arm64-12
2022-08-23T23:11:53-587a153-aa42997/linux-arm64-aws
2022-08-23T22:44:18-587a153-ab8a2c5/darwin-arm64-11
2022-08-23T22:44:18-587a153-ab8a2c5/darwin-arm64-12
2022-08-23T22:44:18-587a153-ab8a2c5/linux-arm64-aws
2022-08-23T22:01:05-587a153-0765da5/darwin-arm64-11
2022-08-23T22:01:05-587a153-0765da5/darwin-arm64-12
2022-08-23T22:01:05-587a153-0765da5/linux-arm64-aws
2022-08-23T21:24:14-587a153-60ad3c4/darwin-arm64-11
2022-08-23T21:24:14-587a153-60ad3c4/darwin-arm64-12
2022-08-23T21:24:14-587a153-60ad3c4/linux-arm64-aws
2022-08-23T21:22:15-587a153-a36a0c4/darwin-arm64-11
2022-08-23T21:22:15-587a153-a36a0c4/darwin-arm64-12
2022-08-23T21:04:06-587a153-790d605/darwin-arm64-12

(46 matching logs)

@gopherbot gopherbot added Tools This label describes issues relating to any tools in the x/tools repository. gopls Issues related to the Go language server, gopls. labels Aug 24, 2022
@bcmills bcmills added Soon This needs action soon. (recent regressions, service outages, unusual time-sensitive situations) and removed gopls Issues related to the Go language server, gopls. Tools This label describes issues relating to any tools in the x/tools repository. labels Aug 24, 2022
@gopherbot gopherbot added this to the Unreleased milestone Aug 24, 2022
@bcmills
Copy link
Contributor Author

bcmills commented Aug 24, 2022

The first failure in the logs was at CL 421502 (attn @pjweinb), but It's not clear to me whether this is due to a regression in x/tools proper or in the main Go repo. The dashboard appears to be missing test runs for these builders for a run of several CLs due to the high CL churn rate with the tree reopened.

@bcmills bcmills modified the milestones: Unreleased, Go1.20 Aug 24, 2022
@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 24, 2022
@bcmills
Copy link
Contributor Author

bcmills commented Aug 24, 2022

I'm seeing a lot of these messages in the logs:

 analysis unusedvariable for package golang.org/x/tools/internal/lsp/snippets panicked: invalid start/end token.Pos

CL 424920 would fit that timeframe (CC @mvdan @adonovan), though it's not at all clear to me how that would result in arm64-specific breakage. 🤔

@bcmills bcmills changed the title x/tools/internal/lsp: tests failing on arm64 builders since 2022-08-23 x/tools/internal/lsp: SuggestedFix tests failing on arm64 builders since 2022-08-23 Aug 24, 2022
@adonovan
Copy link
Member

adonovan commented Aug 24, 2022

That change looks completely safe, and it shouldn't even affect timing significantly. I think the scope of the recover is too large---it should include only the analysers, not the setup and postprocessing code, which is where I think this problem is occurring. I'll take a look.

@bcmills
Copy link
Contributor Author

bcmills commented Aug 24, 2022

The last successful run on these builders was at CL 424902 (2022-08-19T17:47:49Z).

@bcmills bcmills changed the title x/tools/internal/lsp: SuggestedFix tests failing on arm64 builders since 2022-08-23 x/tools/internal/lsp: SuggestedFix tests failing on arm64 builders between 2022-08-19T17:47:49 and 2022-08-23T21:04:06 Aug 24, 2022
@bcmills bcmills changed the title x/tools/internal/lsp: SuggestedFix tests failing on arm64 builders between 2022-08-19T17:47:49 and 2022-08-23T21:04:06 x/tools/internal/lsp: SuggestedFix tests failing on arm64 builders starting between 2022-08-19T17:47:49 and 2022-08-23T21:04:06 Aug 24, 2022
@pjweinb
Copy link

pjweinb commented Aug 24, 2022 via email

@adonovan
Copy link
Member

adonovan commented Aug 24, 2022

I've been debugging this for a while (on darwin/arm64), but I still can't reproduce it as shown in the logs. I agree the hover code is not to blame. But there's a lot going on in the log: We have a number of abuses of token.Pos, an overly broad recover intended to cover just the analysis plugins but that masks other bugs, and various nested "return err" statements that make it hard to trace where the errors originate. I'll be sending https://go.dev/cl/425356 to beef up logging momentarily.

@bcmills
Copy link
Contributor Author

bcmills commented Aug 25, 2022

I've been debugging this for a while (on darwin/arm64), but I still can't reproduce it as shown in the logs.

Can you reproduce it on the actual builders using a gomote? (There may be hardware or environment differences relevant to the test. Some tests unfortunately even vary their behavior based on the GO_BUILDER_NAME environment variable. 😩)

gopherbot pushed a commit to golang/tools that referenced this issue Aug 25, 2022
This is a stopgap until I can diagnost the problem, but in the meantime
we need to fix the builders.

Updates golang/go#54655

Change-Id: I6260828e8c07e3121c45f99166a26d51aa9805a4
Reviewed-on: https://go-review.googlesource.com/c/tools/+/425575
Run-TryBot: Alan Donovan <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
gopls-CI: kokoro <[email protected]>
Reviewed-by: Tim King <[email protected]>
gopherbot pushed a commit to golang/tools that referenced this issue Aug 26, 2022
This change ensures that the End position provided to span.NewRange
in suggestedAnalysisFixes is valid even when the diagnostic has
only a valid start position. This seems to be the cause of
some panics observed in the ARM builders in the attached issue.

Also, reduce the scope of the recover operation to just the analyzer's
run method: we don't want to hide further bugs (or discard stack traces)
in the setup or postprocessing logic.

Also:
- split a single assertion in span.NewRange into two.
- Add information to various error messages to help identify causes.
- Add TODO comments about inconsistent treatment of token.File
  in span.FileSpan, and temporarily remove bug.Errorf that
  is obviously reachable from valid inputs.
- Add TODO to fix another panic in an analyzer that is covered
  by our tests but was hitherto suppressed.
- Add TODO to use bug.Errorf after recover to prevent recurrences.
  We can't do that until the previous panic is fixed.

Updates golang/go#54655

Change-Id: I0576d03fcfffe0c8df157cf6c6520c9d402f8803
Reviewed-on: https://go-review.googlesource.com/c/tools/+/425356
Run-TryBot: Alan Donovan <[email protected]>
gopls-CI: kokoro <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Suzy Mueller <[email protected]>
@adonovan
Copy link
Member

adonovan commented Aug 26, 2022

Update:

  • I temporarily disabled two test cases that were failing only on ARM. The fillstruct analyzer's quickfix is inserting zero values (0, "") instead of names of local variables, which means that this call to FindBestMatch returned nil. That could be because its behavior has changed, or because its list of candidates from here is empty. I still have no hypothesis as to why its behavior is CPU-dependent (yet not reproducible on my local ARM machine).
  • I narrowed the scope of a recover in the analysis driver to include only Analyzer.Run. This revealed panics in the analysisDiagnosticDiagnostics post-processing logic that are routinely produced during tests but were being swept under the rug. I fixed one of these panics (invalid end pos passed to span.Range).
  • Another panic was caused by a bug.Reportf in span logic that was clearly reachable given valid inputs, so I suppressed it temporarily. We need to decide what the correct assertion is, if any.
  • The fillreturns analyzer is routinely panicking due to unsafe assumptions about ill-typed code. This is suppressed by the recover. We need to fix it then insert a bug.Reportf after recover so that we always notice panics in our tests.

So, TODO:

@bcmills
Copy link
Contributor Author

bcmills commented Aug 29, 2022

This might be fixed by CL 425188 once that is merged.

@bcmills bcmills removed the Soon This needs action soon. (recent regressions, service outages, unusual time-sensitive situations) label Aug 29, 2022
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/425188 mentions this issue: cmd/compile/obj/arm64: fix encoding error of FMOVD/FMOVS $0|ZR

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/426017 mentions this issue: internal/lsp/tests: re-enable ARM tests

gopherbot pushed a commit to golang/tools that referenced this issue Aug 29, 2022
This is a simple revert of CL 425497, which disabled the tests.

The problem was due to a misencoded FMOVD instruction causing
some 0.0 constants to compile to nonzero; fixed by CL 425188.

Updates golang/go#54655
Updates golang/go#425188

Change-Id: I2231b57b7b78ac1ae2e8b60cda62898ea7122cda
Reviewed-on: https://go-review.googlesource.com/c/tools/+/426017
gopls-CI: kokoro <[email protected]>
Reviewed-by: Robert Findley <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Run-TryBot: Alan Donovan <[email protected]>
Auto-Submit: Alan Donovan <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/426018 mentions this issue: internal/lsp/cache: report analysis panics during testing

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/426019 mentions this issue: internal/lsp/analysis/fillreturns: be defensive w.r.t. type errors

gopherbot pushed a commit to golang/tools that referenced this issue Aug 31, 2022
In the presence of type errors, TypeOf may return nil, and this appears
to have contributed to a crash in the fillreturns analysis. I haven't
been able to find or deduce a test case, but this change makes the logic
more defensive.

Also remove a stale pre-go1.17 test that used to trigger a panic
(the one fixed here? unclear) to assert that panics were recoverable.

Updates golang/go#54655

Change-Id: Ic9ca9a307eede50a2d4db6424822a155dd43e635
Reviewed-on: https://go-review.googlesource.com/c/tools/+/426019
Auto-Submit: Alan Donovan <[email protected]>
Reviewed-by: Robert Findley <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
gopls-CI: kokoro <[email protected]>
Run-TryBot: Alan Donovan <[email protected]>
@adonovan
Copy link
Member

adonovan commented Sep 1, 2022

Update:

  • the ARM problems were fixed by a patch to the assembler; the tests were reenabled.
  • fillreturns was made robust.
  • bug.Reportf is used after recover (pending, hopefully today).
  • analyses with RunDespiteErrors=false are no longer run after a type error (pending, today).
  • analyses are no longer run after failure of a prerequisite. I suspect this was responsible for a number of nondeterminstic failures after context timeouts caused analysis units to fail.
  • the span invariants question remains as a TODO in the code.

gopherbot pushed a commit to golang/tools that referenced this issue Sep 29, 2022
Report a bug when an analysis dependency is found to be
neither "vertical" (for facts) or "horizontal" (for results).
This has been observed in practise: the lookup from a package
ID to a package object is not consistent.

The bug report is suppressed for command-line-arguments
packages to see whether it occurs on ordinary packages too.

Also, add disabled logic to disable recovery and dump all
goroutines, for temporary use when debugging.

Unrelated things found during debugging:
- simplify package key used in analysis cache: the mode is
  always the same constant.
- move TypeErrorAnalyzers logic closer to where needed.

Updates golang/go#54655
Updates golang/go#54762

Change-Id: I0c2afd9ddd4fcc21748a03f8fa0769a2de09a56f
Reviewed-on: https://go-review.googlesource.com/c/tools/+/426018
Reviewed-by: Robert Findley <[email protected]>
@golang golang locked and limited conversation to collaborators Sep 8, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-arm64 FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker
Projects
None yet
Development

No branches or pull requests

4 participants