Add tags for datasets #5832

Merged
merged 15 commits into from
Dec 6, 2021

fm3
Member

@fm3 fm3 commented Nov 5, 2021

Adds a tag system to datasets, analogous to the one annotations already have.

URL of deployed dev instance (used for testing):

Backend Changes

  • The dataset JSON now contains an additional field tags (list of strings).
  • The dataset update route now accepts tags as well (a missing tags field is interpreted as an empty list) and writes them to the database.
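
The "missing tags means empty list" rule can be illustrated with a small sketch. The actual route lives in the Scala backend; the helper name `normalizeDatasetUpdate` and the payload shape below are assumptions made purely for illustration.

```javascript
// Hypothetical helper, not taken from the PR.
// Mirrors the described rule: a missing `tags` field is treated as an empty list.
function normalizeDatasetUpdate(payload) {
  const tags = Array.isArray(payload.tags) ? payload.tags : [];
  return { ...payload, tags };
}

// A payload without tags ends up with an empty tag list.
console.log(normalizeDatasetUpdate({ description: "demo" }).tags);
// A payload with tags keeps them unchanged.
console.log(normalizeDatasetUpdate({ tags: ["cortex"] }).tags);
```

One consequence of this rule is that a client which omits the field on update would clear any existing tags, so callers that want to keep tags presumably need to send them back.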

Steps to test:

  • Open the dataset table. There should be a new column for tags.
  • Try adding some tags for a few datasets.
  • Clicking on one of the tags should add the tag to the search and hide all datasets that do not have this tag.
  • The clicked tag should appear on the top left of the dataset table.
  • Clicking the close icon of a tag in the list of active filter tags should remove that tag from the filter.
  • Reloading the page should preserve the filtered tags.
  • Adding or removing a tag on a dataset, waiting a second for the backend call, and then reloading the whole page should show the dataset with the tag added or removed.
  • Reloading a single dataset should not change its displayed tags either.
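
The filtering behaviour exercised by these steps could look roughly like the sketch below. All function names and the dataset shape are hypothetical, and the sketch assumes that several active filter tags are combined with AND, which the description does not state explicitly.

```javascript
// Hypothetical helpers; names and data shapes are assumptions, not PR code.

// Keep only datasets that carry every active filter tag (AND semantics assumed).
function filterDatasetsByTags(datasets, activeTags) {
  return datasets.filter((dataset) =>
    activeTags.every((tag) => (dataset.tags || []).includes(tag))
  );
}

// Clicking a tag adds it to the filter without creating duplicates.
function addFilterTag(activeTags, tag) {
  return activeTags.includes(tag) ? activeTags : [...activeTags, tag];
}

// Closing a filter tag removes it from the filter again.
function removeFilterTag(activeTags, tag) {
  return activeTags.filter((t) => t !== tag);
}

// Serialize and restore the filter (e.g. via localStorage) so a reload preserves it.
const serializeFilterTags = (activeTags) => JSON.stringify(activeTags);
const restoreFilterTags = (stored) => (stored ? JSON.parse(stored) : []);

const datasets = [
  { name: "ds1", tags: ["cortex", "em"] },
  { name: "ds2", tags: ["em"] },
  { name: "ds3", tags: [] },
];
const active = addFilterTag([], "cortex");
console.log(filterDatasetsByTags(datasets, active).map((d) => d.name));
```

Keeping the filter state as a plain array of strings makes it trivial to persist and restore, which is what the "reloading the page should preserve the filtered tags" step relies on.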

Issues:


@fm3
Member Author

fm3 commented Nov 5, 2021

@MichaelBuessemeyer The backend part should be done. I could only test it in a limited way so far, as there is no frontend yet. If you encounter problems, feel free to ping me :)

@fm3 fm3 requested a review from jstriebel November 5, 2021 08:22
Contributor

@jstriebel jstriebel left a comment

Backend mostly LGTM, please see my two comments @fm3:

app/models/binary/DataSetService.scala (resolved)
@@ -98,6 +98,7 @@ CREATE TABLE webknossos.dataSets(
logoUrl VARCHAR(2048),
sortingKey TIMESTAMPTZ NOT NULL,
details JSONB,
tags VARCHAR(256)[] NOT NULL DEFAULT '{}',
Contributor

We could also use the json or jsonb type (jsonb has been available since pg 9.4). It's also possible to create indices on jsonb columns and query them efficiently (e.g. when searching/selecting by tag):

https://www.postgresql.org/docs/9.4/datatype-json.html#JSON-INDEXING

Contributor

But also totally fine as-is for now I guess.

Member

@philippotto philippotto left a comment

The front-end code looks great 👍 However, I think that I've found a potential race condition (see my comment).

Other than that, I have some UI suggestions:

  • turn the cursor into a pointer when hovering the tags (via CSS)
  • show a tooltip which explains the tag interactions (e.g., "Click to only show datasets with this tag", "Add a new tag", "Remove this tag from this dataset")

It would be great if you could generalize these UI improvements so that the tags for the explorative annotations also benefit from them :)

Comment on lines 172 to 173
if (isLoading) return;
setIsLoading(true);
Member

If I change the tags for two datasets without waiting for the first update to finish, the second update will get lost, right? I think a clean solution would be some sort of "mutex promise" which is either null (then the update can run immediately) or a promise that needs to be awaited. Also, maybe not the entire function needs to block on that promise, but only the await updateDataset part.
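
The lost update described in this comment can be reproduced with a simplified model of the guard. This is only a sketch: `fakeUpdateDataset` stands in for the real backend call and `guardedUpdate` for the surrounding function, both of which are hypothetical names.

```javascript
// Simplified model of the guarded update; all names here are hypothetical.
const sent = [];
const fakeUpdateDataset = (name, tags) =>
  new Promise((resolve) =>
    setTimeout(() => {
      sent.push({ name, tags });
      resolve();
    }, 20)
  );

let isLoading = false;
async function guardedUpdate(name, tags) {
  if (isLoading) return; // a concurrent call is silently dropped here
  isLoading = true;
  try {
    await fakeUpdateDataset(name, tags);
  } finally {
    isLoading = false;
  }
}

// Two updates fired back to back: only the first one reaches the "backend".
const guardedDemo = Promise.all([
  guardedUpdate("ds1", ["a"]),
  guardedUpdate("ds2", ["b"]),
]).then(() => {
  console.log(sent.length); // 1
  return sent;
});
```

The second call returns early because `isLoading` is still true while the first request is in flight, so its tag change never gets sent.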

Contributor

Thanks for spotting this. I implemented queue-like behaviour for all updates on the same dataset. Each request only starts after the previous one has finished, so the backend always receives the updates in the correct order. This way, an update that happens to reach the server faster can no longer be lost because a slower, earlier request overwrites its changes afterwards.
Here is a screenshot of several changes made in a row to the tags of a single dataset. The connection is throttled heavily to test this corner case. The graph on the right illustrates that each call only starts after the previous update has finished.
image
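
A per-dataset queue of this kind can be sketched roughly as follows. This is not the code from the PR; `updateQueues`, `enqueueDatasetUpdate` and `fakeRequest` are hypothetical names, and the real implementation may differ in how it handles errors.

```javascript
// Hypothetical sketch of a per-dataset update queue; not the PR's actual code.
const updateQueues = new Map(); // dataset name -> promise of the last queued update

function enqueueDatasetUpdate(datasetName, sendUpdate) {
  const previous = updateQueues.get(datasetName) || Promise.resolve();
  // Start only after the previous update has settled, even if it failed.
  const current = previous.catch(() => {}).then(() => sendUpdate());
  updateQueues.set(datasetName, current);
  return current;
}

// Demo: the first request is slower, yet arrival order equals call order.
const arrived = [];
const fakeRequest = (label, delayMs) => () =>
  new Promise((resolve) =>
    setTimeout(() => {
      arrived.push(label);
      resolve(label);
    }, delayMs)
  );

const queueDemo = Promise.all([
  enqueueDatasetUpdate("ds1", fakeRequest("tags=[a]", 30)),
  enqueueDatasetUpdate("ds1", fakeRequest("tags=[a,b]", 5)),
]).then(() => {
  console.log(arrived);
  return arrived;
});
```

Chaining each update onto the previous promise keeps updates for one dataset strictly ordered, while updates for different datasets use separate map entries and can still run in parallel.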

@MichaelBuessemeyer MichaelBuessemeyer marked this pull request as ready for review November 29, 2021 09:53
CHANGELOG.unreleased.md (outdated, resolved)
Member

@philippotto philippotto left a comment

Great, looks merge-ready to me :)

@MichaelBuessemeyer
Contributor

MichaelBuessemeyer commented Dec 2, 2021

Is it correct that I had to adjust the version in tools/postgres/schema.sql manually?

If yes, hopefully the CI works now 😅

@MichaelBuessemeyer MichaelBuessemeyer merged commit 23c40fa into master Dec 6, 2021
@MichaelBuessemeyer MichaelBuessemeyer deleted the dataset-tags branch December 6, 2021 09:29
MichaelBuessemeyer added a commit that referenced this pull request Dec 22, 2021
* Add tags for datasets

* update schema version

* add tag column to dataset table

* add filtering and persistence for dataset tags

* unify tag handling for datasets and explorative annotations view

* adjust version of migration

* adjust revision of dataset migration

* ensure dataset updates to be in fixed order

* Add tags for datasets

* undo accidental changes due to merging

* Update frontend/javascripts/dashboard/dataset/dataset_cache_provider.js

Co-authored-by: Philipp Otto <[email protected]>

* update version in schema.sql manually

* fix flow

Co-authored-by: Michael Büßemeyer <[email protected]>
Co-authored-by: MichaelBuessemeyer <[email protected]>
Co-authored-by: Michael Büßemeyer <[email protected]>
Co-authored-by: Philipp Otto <[email protected]>
Development

Successfully merging this pull request may close these issues.

Organization of datasets with tags
4 participants