Skip to content
This repository has been archived by the owner on Jan 2, 2021. It is now read-only.

ShakeSession and shakeEnqueue #554

Merged
merged 20 commits into from
Jun 8, 2020
Merged

Conversation

pepeiborra
Copy link
Collaborator

@pepeiborra pepeiborra commented May 8, 2020

Currently we start a new Shake session for every interaction with the Shake
database, including type checking, hovers, code actions, completions, etc.
Since only one Shake session can ever exist, we abort the active session if any
in order to execute the new command in a responsive manner.

This is suboptimal in many, many ways:

  • A hover in module M aborts the typechecking of module M, only to start over!
  • Read-only commands (hover, code action, completion) need to typecheck all the
    modules! (or rather, ask Shake to check that the typechecks are current)
  • There is no way to run non-interfering commands concurrently

This is an experiment inspired by @mpickering ShakeQueue and
the follow-up discussion in mpickering#7

We introduce the concept of the ShakeSession as part of the IDE state.
The ShakeSession is initialized by a call to shakeRun, and survives until
the next call to shakeRun. It is important that the session is restarted as
soon as the filesystem changes, to ensure that the database is current.

The ShakeSession enables a new command shakeEnqueue, which appends work to
the existing ShakeSession. This command can be called in parallel without any
restriction.

shakeEnqueue then replaces shakeRun for all user actions that do not involve a change to the filesystem. It's very neat that this requires a single change in runAction.

The result is lightning fast performance for hover and code actions. Not so for completions, which usually arise while typing, and do not find an existing, warm ShakeSession. More thought is needed for these.

The hack used to inject more work in the ShakeSession quite possibly breaks internal Shake invariants, which is why this is a Request For Comments. I have tested it with GHC and ghcide itself, and haven't observed any issues.

TODO

  • rework progress reporting mechanism

@mpickering
Copy link
Contributor

Am I correct that this approach doesn't guarantee that all actions are completed? How would you implement the initial loading action or the propagating to parents behaviour of the shake queue?

It also seems hard to make completions fast using this approach without implementing something quite similar to what is implemented in the shake queue approach.

@pepeiborra
Copy link
Collaborator Author

Am I correct that this approach doesn't guarantee that all actions are completed? How would you implement the initial loading action or the propagating to parents behaviour of the shake queue?

It also seems hard to make completions fast using this approach without implementing something quite similar to what is implemented in the shake queue approach.

Correct, this is a minimal change to address only the issue of suboptimal cancellations. The idea is to address the performance issues one by one, step by step.

How does the queue make completions fast?

@mpickering
Copy link
Contributor

mpickering commented May 8, 2020

The queue does not help directly but because completions uses stale information, the request to recompute the completions is placed into the queue after the request is dealt with to ensure that the completions information is refreshed.

The main benefit of the queue in my opinion is that every action is guaranteed to be completed so you can remove the very suboptimal kick.

@pepeiborra
Copy link
Collaborator Author

pepeiborra commented May 9, 2020

What is the actual impact of kick?

It causes a typecheck of all the files of interest, so if I only have one file open, then kick is actually optimal. But even in that scenario completions take >1s to appear, and that would not get better with improvements to kick as it is already optimal in this situation.

In another experiment (#555) I have confirmed that completions are blazingly fast with no kick at all, at the cost of soundness of course :)

@mpickering
Copy link
Contributor

Is your question rhetorical or would you like me to give you my opinion?

@ndmitchell
Copy link
Collaborator

As to the Shake parts, that seems fine. Nothing breaks any internal invariants. In fact, I think you can go even further, and have pumpAction both continue pumping and fork off a new action in parallel. I'm not totally sure you can do that without consuming stack space though.

Something along these lines (either this approach or the one in @mpickering 's branch) seems like a good thing to do. I think we are quite quickly approaching the trade-off between correctness and performance, so maybe it's worth discussing that in words first? I believe the position of @cocreature is (was?) that we should never give incorrect information to an LSP query? It seems from this code and the branch that people (@pepeiborra and @mpickering) want to weaken that. How far? What invariants are required? What can we relax? How far can we relax it?

@pepeiborra
Copy link
Collaborator Author

Thanks Neil for confirming that this trick won't break any Shake invariants. Hopefully it doesn't break any ghcide invariants either.

While there is indeed a tension between accuracy and responsiveness, and I think accuracy can be relaxed in some cases (completions), I would not expect this PR to alter the current balance in ghcide. It increases the responsiveness of hovers, code actions and go-to definition, for free!

@pepeiborra
Copy link
Collaborator Author

Promoting this from RFC to Merge Request

@pepeiborra pepeiborra changed the title RFC - ShakeSession and shakeRunGently ShakeSession and shakeRunGently May 13, 2020
@pepeiborra pepeiborra changed the title ShakeSession and shakeRunGently ShakeSession and shakeEnqueue May 13, 2020
@pepeiborra pepeiborra force-pushed the shakeRunGently branch 3 times, most recently from f6ce146 to 75675f3 Compare May 16, 2020 13:03
@pepeiborra
Copy link
Collaborator Author

One issue with this approach is that ghcide doesn't send the WorkDoneProgressEnd notification timely, as the runShake thread stays up to support the ShakeSession.

I plan to fix this by attaching the progress reporting directly to kick

@pepeiborra pepeiborra force-pushed the shakeRunGently branch 3 times, most recently from 00e32fc to 55b5d25 Compare May 22, 2020 10:33
@pepeiborra
Copy link
Collaborator Author

pepeiborra commented May 22, 2020

Hmm, progress reporting still not working as expected.

UPDATE: pushed a change to fix progress reporting

@pepeiborra
Copy link
Collaborator Author

pepeiborra commented May 23, 2020

This is ready for final review. We have been using this branch at Strats for the past week, without issues, and:

Copy link
Collaborator

@ndmitchell ndmitchell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Shake bits look fine to me, and if it gives a measurable performance boost, that's all win. I appreciate @mpickering has gone off in a slightly different direction, so it might be worth thinking if that's the ultimate state, and if this is a good step in the direction, but in isolation it looks good and better to me.

The ghcide kick stuff is wrong, since if you have a file that used to be open (e.g. Temp.hs), and then shut it, and Temp.hs is no longer relevant, you should get rid of the warnings/errors about Temp.hs. In DAML that happens with garbage collect and a carefully written kick that knows about project roots etc. In ghcide that doesn't happen at all, since we wrote a crappy kick to get something released. Moving the kick logic into ghcide and doing it properly between DAML and Ghcide would be a good thing, but I think we should keep garbageCollect etc alive until we've done that.

src/Development/IDE/Core/OfInterest.hs Show resolved Hide resolved
src/Development/IDE/Core/Shake.hs Show resolved Hide resolved
src/Development/IDE/Core/Shake.hs Outdated Show resolved Hide resolved
src/Development/IDE/Core/Shake.hs Outdated Show resolved Hide resolved
src/Development/IDE/Core/Shake.hs Outdated Show resolved Hide resolved
Copy link
Collaborator

@cocreature cocreature left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thanks a lot! Having to kill the current shake session for actions that don’t change anything was by far my biggest gripe with the current architecture.

It would be great if @mpickering could weigh in here. I know he has a different plan here but I’m not too familiar with the details. If they don’t conflict, let’s merge this first. If they do conflict, I’d like to get some idea of the end state before we go off in one direction.

src/Development/IDE/Core/OfInterest.hs Show resolved Hide resolved
src/Development/IDE/Core/Shake.hs Show resolved Hide resolved
src/Development/IDE/Core/Shake.hs Outdated Show resolved Hide resolved
src/Development/IDE/Core/Shake.hs Outdated Show resolved Hide resolved
src/Development/IDE/Core/FileStore.hs Outdated Show resolved Hide resolved
A Barrier is a smaller abstraction than an MVar, and the next version of the extra package will come with a suitably small implementation:

ndmitchell/extra@98c2a83
The action returned by shakeRun now blocks until another call to shakeRun is made, which is a change in behaviour,. but all the current uses of shakeRun ignore this action.

Since the new behaviour is not useful, this change simplifies and updates the docs and name accordingly
@pepeiborra
Copy link
Collaborator Author

I have pushed changes to restart the Shake session when a new component added. For this to work well (and pass the test suite), I had to implement re-enqueueing of cancelled user actions too.

@pepeiborra
Copy link
Collaborator Author

Nope, it's still failing occasionally, although I can no longer repro in VS Code

@pepeiborra
Copy link
Collaborator Author

Bumping the delay from 5 to 6 seconds in the multi2 test makes it always pass locally.

,shakeClose :: IO ()
,shakeExtras :: ShakeExtras
{shakeDb :: ShakeDatabase
,shakeSession :: MVar ShakeSession
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is shakeSession wrapped in an MVar if you are already using a TQueue?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TQueue cannot be closed, so the MVar is needed to ensure that no more actions are added while the session is being restarted. It's likely that this can be simplified further, this is mostly an iteration on top of the previous design.

cancelShakeSession = do
cancel workThread
atomically $ do
q <- flushTQueue actionQueue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain what is going on here? How are things requeued?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cancelShakeSession returns a list of the pending tasks, which are then fed back into the new session in line 438

-- Set up a new 'ShakeSession' with a set of initial system and user actions
-- Will crash if there is an existing 'ShakeSession' running.
-- Progress is reported only on the system actions.
-- Only user actions will get re-enqueued
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this is the right way around. It is really the system actions we want to requeue in the end I think. This distinction seems premature anyway.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'system' and 'user' actions are not the best names, but are the best I could come up late at night, sorry about that.

  • system actions: typechecks. We don't want to reenqueue typechecks since a fresh set of them are always provided when restarting the session.

  • user actions: everything else. They must always be completed (unless the lsp-client cancels them) or the clientMsgChan thread will deadlock.

@pepeiborra pepeiborra force-pushed the shakeRunGently branch 2 times, most recently from be9a027 to 5b1647c Compare June 6, 2020 20:25
Copy link
Collaborator

@cocreature cocreature left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks a lot!

exe/Main.hs Outdated Show resolved Hide resolved
@cocreature cocreature merged commit 0ff88c6 into haskell:master Jun 8, 2020
pepeiborra added a commit to pepeiborra/ide that referenced this pull request Dec 29, 2020
* ShakeSession and shakeRunGently

Currently we start a new Shake session for every interaction with the Shake
database, including type checking, hovers, code actions, completions, etc.
Since only one Shake session can ever exist, we abort the active session if any
in order to execute the new command in a responsive manner.

This is suboptimal in many, many ways:

- A hover in module M aborts the typechecking of module M, only to start over!
- Read-only commands (hover, code action, completion) need to typecheck all the
  modules! (or rather, ask Shake to check that the typechecks are current)
- There is no way to run non-interfering commands concurrently

This is an experiment inspired by the 'ShakeQueue' of @mpickering, and
the follow-up discussion in mpickering/ghcide#7

We introduce the concept of the 'ShakeSession' as part of the IDE state.
The 'ShakeSession' is initialized by a call to 'shakeRun', and survives until
the next call to 'shakeRun'. It is important that the session is restarted as
soon as the filesystem changes, to ensure that the database is current.

The 'ShakeSession' enables a new command 'shakeRunGently', which appends work to
the existing 'ShakeSession'. This command can be called in parallel without any
restriction.

* Simplify by assuming there is always a ShakeSession

* Improved naming and docs

* Define runActionSync on top of shakeEnqueue

shakeRun is not correct as it never returns anymore

* Drive progress reporting from newSession

The previous approach reused the shakeProgress thread,  which doesn't work anymore as ShakeSession keeps the ShakeDatabase open until the next edit

* Deterministic progress messages in tests

Dropping the 0.1s sleep to ensure that progress messages during tests are
deterministic

* Make kick explicit

This is required for progress reporting to work, see notes in shakeRun

As to whether this is the right thing to do:

1. Less magic, more explicit
2. There's only 2 places where kick is actually used

* apply Neil's feedback

* avoid a deadlock when the enqueued action throws

* Simplify runAction + comments

* use a Barrier for clarity

A Barrier is a smaller abstraction than an MVar, and the next version of the extra package will come with a suitably small implementation:

ndmitchell/extra@98c2a83

* Log timings for code actions, hovers and completions

* Rename shakeRun to shakeRestart

The action returned by shakeRun now blocks until another call to shakeRun is made, which is a change in behaviour,. but all the current uses of shakeRun ignore this action.

Since the new behaviour is not useful, this change simplifies and updates the docs and name accordingly

* delete runActionSync as it's just runAction

* restart shake session on new component created

* requeue pending actions on session restart

* hlint

* Bumped the delay from 5 to 6

* Add a test for the non-lsp command line

* Update exe/Main.hs

Co-authored-by: Moritz Kiefer <[email protected]>
pepeiborra added a commit to pepeiborra/ide that referenced this pull request Dec 29, 2020
* ShakeSession and shakeRunGently

Currently we start a new Shake session for every interaction with the Shake
database, including type checking, hovers, code actions, completions, etc.
Since only one Shake session can ever exist, we abort the active session if any
in order to execute the new command in a responsive manner.

This is suboptimal in many, many ways:

- A hover in module M aborts the typechecking of module M, only to start over!
- Read-only commands (hover, code action, completion) need to typecheck all the
  modules! (or rather, ask Shake to check that the typechecks are current)
- There is no way to run non-interfering commands concurrently

This is an experiment inspired by the 'ShakeQueue' of @mpickering, and
the follow-up discussion in mpickering/ghcide#7

We introduce the concept of the 'ShakeSession' as part of the IDE state.
The 'ShakeSession' is initialized by a call to 'shakeRun', and survives until
the next call to 'shakeRun'. It is important that the session is restarted as
soon as the filesystem changes, to ensure that the database is current.

The 'ShakeSession' enables a new command 'shakeRunGently', which appends work to
the existing 'ShakeSession'. This command can be called in parallel without any
restriction.

* Simplify by assuming there is always a ShakeSession

* Improved naming and docs

* Define runActionSync on top of shakeEnqueue

shakeRun is not correct as it never returns anymore

* Drive progress reporting from newSession

The previous approach reused the shakeProgress thread,  which doesn't work anymore as ShakeSession keeps the ShakeDatabase open until the next edit

* Deterministic progress messages in tests

Dropping the 0.1s sleep to ensure that progress messages during tests are
deterministic

* Make kick explicit

This is required for progress reporting to work, see notes in shakeRun

As to whether this is the right thing to do:

1. Less magic, more explicit
2. There's only 2 places where kick is actually used

* apply Neil's feedback

* avoid a deadlock when the enqueued action throws

* Simplify runAction + comments

* use a Barrier for clarity

A Barrier is a smaller abstraction than an MVar, and the next version of the extra package will come with a suitably small implementation:

ndmitchell/extra@98c2a83

* Log timings for code actions, hovers and completions

* Rename shakeRun to shakeRestart

The action returned by shakeRun now blocks until another call to shakeRun is made, which is a change in behaviour,. but all the current uses of shakeRun ignore this action.

Since the new behaviour is not useful, this change simplifies and updates the docs and name accordingly

* delete runActionSync as it's just runAction

* restart shake session on new component created

* requeue pending actions on session restart

* hlint

* Bumped the delay from 5 to 6

* Add a test for the non-lsp command line

* Update exe/Main.hs

Co-authored-by: Moritz Kiefer <[email protected]>
pepeiborra added a commit to pepeiborra/ide that referenced this pull request Dec 29, 2020
* ShakeSession and shakeRunGently

Currently we start a new Shake session for every interaction with the Shake
database, including type checking, hovers, code actions, completions, etc.
Since only one Shake session can ever exist, we abort the active session if any
in order to execute the new command in a responsive manner.

This is suboptimal in many, many ways:

- A hover in module M aborts the typechecking of module M, only to start over!
- Read-only commands (hover, code action, completion) need to typecheck all the
  modules! (or rather, ask Shake to check that the typechecks are current)
- There is no way to run non-interfering commands concurrently

This is an experiment inspired by the 'ShakeQueue' of @mpickering, and
the follow-up discussion in mpickering/ghcide#7

We introduce the concept of the 'ShakeSession' as part of the IDE state.
The 'ShakeSession' is initialized by a call to 'shakeRun', and survives until
the next call to 'shakeRun'. It is important that the session is restarted as
soon as the filesystem changes, to ensure that the database is current.

The 'ShakeSession' enables a new command 'shakeRunGently', which appends work to
the existing 'ShakeSession'. This command can be called in parallel without any
restriction.

* Simplify by assuming there is always a ShakeSession

* Improved naming and docs

* Define runActionSync on top of shakeEnqueue

shakeRun is not correct as it never returns anymore

* Drive progress reporting from newSession

The previous approach reused the shakeProgress thread,  which doesn't work anymore as ShakeSession keeps the ShakeDatabase open until the next edit

* Deterministic progress messages in tests

Dropping the 0.1s sleep to ensure that progress messages during tests are
deterministic

* Make kick explicit

This is required for progress reporting to work, see notes in shakeRun

As to whether this is the right thing to do:

1. Less magic, more explicit
2. There's only 2 places where kick is actually used

* apply Neil's feedback

* avoid a deadlock when the enqueued action throws

* Simplify runAction + comments

* use a Barrier for clarity

A Barrier is a smaller abstraction than an MVar, and the next version of the extra package will come with a suitably small implementation:

ndmitchell/extra@98c2a83

* Log timings for code actions, hovers and completions

* Rename shakeRun to shakeRestart

The action returned by shakeRun now blocks until another call to shakeRun is made, which is a change in behaviour,. but all the current uses of shakeRun ignore this action.

Since the new behaviour is not useful, this change simplifies and updates the docs and name accordingly

* delete runActionSync as it's just runAction

* restart shake session on new component created

* requeue pending actions on session restart

* hlint

* Bumped the delay from 5 to 6

* Add a test for the non-lsp command line

* Update exe/Main.hs

Co-authored-by: Moritz Kiefer <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants