Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return list of ongoing task in /health endpoint #4961

Merged
merged 3 commits into from
Mar 24, 2020
Merged

Conversation

mangalaman93
Copy link
Member

@mangalaman93 mangalaman93 commented Mar 18, 2020

We return the list of ongoing tasks out of snapshot, indexing or/and rollup in the health endpoint. This endpoint can now be used to check whether indexing is still going on after an Alter operation completes.

TODO

  • add docs

This change is Reviewable

Docs Preview: Dgraph Preview

@mangalaman93 mangalaman93 requested review from manishrjain and a team as code owners March 18, 2020 06:48
Copy link
Contributor

@pawanrawal pawanrawal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a test case for this?

Also please verify that this info should also be returned from the GraphQL endpoint. You'd just need to add the appropriate type for the new field inside GraphQL schema at graphql/admin/admin.go. Happy to help with it.

Reviewed 6 of 6 files at r1.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @mangalaman93, @manishrjain, and @parasssh)


worker/draft.go, line 154 at r1 (raw file):

			return "snapshot"
		case opIndexing:
			return "indexing"

Would also be nice to get info about the predicate for which indexing is going on. Can indexing for multiple predicates go on at the same time?

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get info from all the other servers via Heartbeat.

Reviewed 6 of 6 files at r1.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @mangalaman93 and @parasssh)

Copy link
Contributor

@parasssh parasssh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @mangalaman93)


worker/draft.go, line 166 at r1 (raw file):

		tasks = append(tasks, toStr(id))
	}
	return tasks

Since we are only trying to indicate that some "indexing, rollup and/or snapshot" is ongoing, it may be good to remove dupes before returning.

Alternatively, we add more info per task such as predicate being indexed, snapshot ts etc.

Copy link
Member Author

@mangalaman93 mangalaman93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a cyclic dependency in providing tasks information in heartbeats.
worker -> conn -> worker (for tasks information)

Reviewable status: 0 of 9 files reviewed, 2 unresolved discussions (waiting on @manishrjain, @parasssh, and @pawanrawal)


worker/draft.go, line 154 at r1 (raw file):

Previously, pawanrawal (Pawan Rawal) wrote…

Would also be nice to get info about the predicate for which indexing is going on. Can indexing for multiple predicates go on at the same time?

Done.


worker/draft.go, line 166 at r1 (raw file):

Previously, parasssh wrote…

Since we are only trying to indicate that some "indexing, rollup and/or snapshot" is ongoing, it may be good to remove dupes before returning.

Alternatively, we add more info per task such as predicate being indexed, snapshot ts etc.

A map cannot have duplicates to begin with. Added more info

Copy link
Contributor

@pawanrawal pawanrawal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: is this something that we could have an automated test case for?

Reviewed 9 of 9 files at r2.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @parasssh)

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 9 of 9 files at r2.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @mangalaman93 and @parasssh)


edgraph/server.go, line 725 at r2 (raw file):

		Uptime:   int64(time.Since(x.WorkerConfig.StartTime) / time.Second),
		LastEcho: time.Now().Unix(),
		Ongoing:  worker.GetOngoingTasks(),

Add a TODO that we need to find a way to get the health from others.

Copy link
Member Author

@mangalaman93 mangalaman93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pawanrawal the test could become flaky because indexing won't take too long if the data size is small.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @manishrjain and @parasssh)


edgraph/server.go, line 725 at r2 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Add a TODO that we need to find a way to get the health from others.

Done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants