Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor job scheduler to make it easier to test #719

Closed
radeksimko opened this issue Nov 18, 2021 · 4 comments · Fixed by #782
Closed

refactor job scheduler to make it easier to test #719

radeksimko opened this issue Nov 18, 2021 · 4 comments · Fixed by #782
Assignees
Milestone

Comments

@radeksimko
Copy link
Member

radeksimko commented Nov 18, 2021

Problem Statement

Currently we have at least one test which has to have time.Sleep due to the fact that we run certain operations in Defer which is in a separate goroutine and we don't have any synchronization to allow the test to wait for the completion.

Ideally tests should not have to implement time.Sleep to test regular scenarios with multiple modules.

// module manifest-dependent tasks are scheduled & executed
// asynchronously and we currently have no way of waiting
// for them to complete.
// TODO remove once we support synchronous dependent tasks
time.Sleep(1500 * time.Millisecond)

Requirements Summary

The scheduler has a number of requirements:

  • testability (as mentioned above)
  • progress reporting - to be able to inform user of the progress in the UI
  • dependent tasks - e.g. ensure that decoding for a particular module runs after parsing
  • priorities - ensure that jobs scheduled from didOpen or didChange events are not affected by background indexing workload of a large workspace
  • deduplication - to avoid indexing the same document/module version twice (from workspace indexing vs didOpen/didChange handlers)
  • querying task status - e.g. be able to block in didOpen/didChange/... handlers until essential jobs finish, to avoid sync issues in other handlers (e.g. completion)
  • ability to add or remove jobs affecting a particular path from queue before processing - e.g. when workspace/didChangeWorkspaceFolders is received

Proposal

In order to be able to effectively sort tasks (based on dependencies or document state open/closed), dequeue tasks by path and block in handlers, it would be reasonable to store certain data in memdb. Two new tables are proposed for creation.

documents

Currently documents metadata is stored as a map and is tightly coupled with the filesystem package:

docMeta map[string]*documentMetadata
docMetaMu *sync.RWMutex

A new memdb table can be created within state package to hold the following structure:

type Document struct {
	Path       string
	LanguageID string
	Version    uint
	IsOpen     bool
	Content    source.Lines // or []byte or both
}

jobs

Currently job data is maintained as a queue (internally a slice) within the terraform/module package:
https://github.com/hashicorp/terraform-ls/blob/main/internal/terraform/module/module_ops_queue.go

A new memdb table can be created within state package to hold the following structure:

type Job struct {
	ID      uint
	DirPath string
	State   uint // Queued / Running / Done
	Func    func()
	Type    string // e.g. GetTerraformVersion; mainly for stats collection
}

The same "jobs" table can hold jobs submitted "on the background" from the watcher/walker and from didOpen/didChange handlers, to enable deduplication. Two different "indexer" implementations can be exposed with different criteria of how to pull the next (if any) job from the table.

Notes / Open Questions

  • Could memdb "watchers" be used in tests and in handlers where we need to block?
  • Main exec loop can use txn.FirstWatch method to block until next job is available
@radeksimko radeksimko self-assigned this Jan 12, 2022
@radeksimko radeksimko changed the title refactor task scheduler to make it easier to test refactor job scheduler to make it easier to test Jan 17, 2022
@dbanck
Copy link
Member

dbanck commented Jan 18, 2022

Just a quick thought: I've used a couple of queuing/scheduling libraries with other programming languages. Most of them are well-architected and support all listed requirements. It might be worth checking out if there is something for Go, too, which we can leverage?

@radeksimko
Copy link
Member Author

radeksimko commented Jan 18, 2022

It might be worth checking out if there is something for Go, too, which we can leverage?

I did actually use one library early on, but it didn't support priority queues. ef82870 but there isn't that many out there that I'm aware of. This is probably because Go actually offers decent primitives in the standard library (go routine + synchronization via sync package).

While the requirements apply to the whole solution, lots of them are in fact satisfied by memdb, which means that the implementation of the scheduler can be relatively simple - all it needs to know is what the "next" job is and run that job in a go routine.

Much of the existing complexity IMO comes from wanting to satisfy all the requirements within the scheduler and I'm hoping to get rid of that complexity by decoupling the code, as proposed.

@github-actions
Copy link

This functionality has been released in v0.26.0 of the language server.
If you use the official Terraform VS Code extension, it will prompt you to upgrade to this version automatically upon next launch or within the next 24 hours.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@github-actions
Copy link

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 21, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants