Allow users to tag documents #997

Closed
10 tasks done
lukavdplas opened this issue Nov 25, 2022 · 1 comment
Labels
enhancement improvements to user functionality major major changes to functionality and/or the code base

Comments

@lukavdplas
Contributor

lukavdplas commented Nov 25, 2022

Allow users to assign tags to documents.

The document view should include a component to edit the document's tags. Also, there should be some kind of interface where users can edit their own tags and see their tagged documents.

Since tags are personal, it probably makes the most sense to store them in the SQL database, together with the document ID and corpus name. However, this makes it difficult to combine tags with Elasticsearch functionality, such as using a tag as a search filter.
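To make the SQL option concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `document_tag`, its columns, and the helper functions are all hypothetical, chosen only for illustration; the real application would presumably go through its own database layer.

```python
import sqlite3


def create_tag_table(conn):
    # One row per (user, corpus, document, tag). The composite primary
    # key prevents the same tag being stored twice on one document.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS document_tag (
            user_id INTEGER NOT NULL,
            corpus  TEXT NOT NULL,
            doc_id  TEXT NOT NULL,
            tag     TEXT NOT NULL,
            PRIMARY KEY (user_id, corpus, doc_id, tag)
        )
        """
    )


def add_tag(conn, user_id, corpus, doc_id, tag):
    # INSERT OR IGNORE makes tagging idempotent.
    conn.execute(
        "INSERT OR IGNORE INTO document_tag VALUES (?, ?, ?, ?)",
        (user_id, corpus, doc_id, tag),
    )


def tagged_documents(conn, user_id, tag):
    # All (corpus, doc_id) pairs this user has given this tag.
    rows = conn.execute(
        "SELECT corpus, doc_id FROM document_tag "
        "WHERE user_id = ? AND tag = ? ORDER BY corpus, doc_id",
        (user_id, tag),
    )
    return rows.fetchall()
```

Note that the tags live entirely outside the search index here, which is exactly why filtering on them in Elasticsearch is awkward.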

@lukavdplas lukavdplas added the enhancement improvements to user functionality label Nov 25, 2022
@lukavdplas
Contributor Author

lukavdplas commented Jan 27, 2023

We've been discussing potential uses for tags. Perhaps a few common uses would be the following:

  • Bookmark a small number of relevant documents for close reading or downloading.
  • Tag everything matching a query as, for instance, 'mentions democracy'. Then, when you browse other queries in the future, you can immediately see this tag on some documents, giving you a quick indication of relevant properties or topics.
  • Tag a complex query, then use the tag as a filter in future queries. For instance, filter all social-democratic parties in the Dutch parliamentary debates and tag them 'social democrat'. Now future searches for social democrats are a lot quicker.

The latter two are a bit tricky if we implement tags by storing document IDs in the SQL database. Tagging the results of just a few queries can produce millions of (user, document, tag) rows. Using a tag as a filter is also tricky, since Elasticsearch doesn't have access to the tags.

It may be better to store tags not as document IDs, but as queries. If the user tags a single document, this is just a query for that document ID. To get all documents with a tag, you just have to take the union of all the queries assigned to it.
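A rough sketch of the tags-as-queries idea, assuming each stored query is an Elasticsearch query DSL dictionary. The function names are hypothetical; only the `ids`, `bool`, and `match_none` query types are standard Elasticsearch query DSL.

```python
def document_query(doc_id):
    # Tagging a single document is stored as a query for that ID.
    return {"ids": {"values": [doc_id]}}


def tag_query(stored_queries):
    # The documents carrying a tag are the union of all queries
    # assigned to it: a bool query where at least one clause matches.
    if not stored_queries:
        return {"match_none": {}}
    return {
        "bool": {
            "should": list(stored_queries),
            "minimum_should_match": 1,
        }
    }


def filter_by_tag(search_query, stored_queries):
    # Use the tag as a filter on top of a new search query.
    return {
        "bool": {
            "must": [search_query],
            "filter": [tag_query(stored_queries)],
        }
    }
```

For example, a 'social democrat' tag could hold one query on a party field plus a few `document_query(...)` entries for individually tagged speeches. The SQL database would then only need one row per stored query rather than one row per tagged document.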

@lukavdplas lukavdplas added the major major changes to functionality and/or the code base label May 24, 2023