Suggesting lookup list entries #255

Open
jhoetter opened this issue May 16, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

@jhoetter
Member

Is your feature request related to a problem? Please describe.
I want to quickly extend my lookup lists with additional values, including values that occur in records I haven't labeled yet.

Describe the solution you'd like
With a token-based embedding, we should be able to compute n-grams (see below for more context) and run a similarity search against the entries we already have. That way, we could find synonyms and related terms in the corpus we have at hand, which could be very helpful.

Again, this could be triggered on demand by pressing a button in the lookup list, which then runs the similarity search and creates suggestions.
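To make the idea a bit more concrete, here is a minimal sketch of the suggestion step. It is only an illustration under assumptions, not an existing API of this project: `suggest_entries`, `embed`, the mean-pooling of token vectors, the `threshold` value, and the toy 3-dimensional vectors are all hypothetical. Real token embeddings would come from whatever token-level embedding is attached to the project, and corpus tokens are assumed to be lowercased already.

```python
import math


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def embed(phrase, token_vectors):
    """Mean-pool the token vectors of a phrase; None if any token has no vector."""
    vecs = [token_vectors[t] for t in phrase if t in token_vectors]
    if not vecs or len(vecs) != len(phrase):
        return None
    dim = len(vecs[0])
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]


def suggest_entries(lookup_list, corpus_tokens, token_vectors, max_n=2, threshold=0.9):
    """Rank corpus n-grams that are similar to existing lookup list entries."""
    known = {tuple(entry.lower().split()) for entry in lookup_list}
    known_vecs = [v for v in (embed(k, token_vectors) for k in known) if v is not None]
    scores = {}
    for tokens in corpus_tokens:
        for n in range(1, max_n + 1):
            for cand in ngrams(tokens, n):
                if cand in known:
                    continue  # already in the lookup list
                vec = embed(cand, token_vectors)
                if vec is None:
                    continue
                best = max((cosine(vec, k) for k in known_vecs), default=0.0)
                if best >= threshold:
                    scores[cand] = max(best, scores.get(cand, 0.0))
    ranked = sorted(scores.items(), key=lambda item: -item[1])
    return [(" ".join(cand), score) for cand, score in ranked]


# Toy vectors, made up purely for illustration.
token_vectors = {
    "berlin": [0.9, 0.1, 0.0],
    "munich": [0.8, 0.2, 0.0],
    "hamburg": [0.85, 0.15, 0.05],
    "visited": [0.0, 0.1, 0.9],
    "we": [0.1, 0.9, 0.1],
}
lookup_list = ["Berlin", "Munich"]
corpus_tokens = [["we", "visited", "hamburg"]]

suggestions = suggest_entries(lookup_list, corpus_tokens, token_vectors)
print(suggestions)
```

With these toy vectors, only "hamburg" clears the threshold. A user could then accept or reject each suggestion before it is added to the list.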

Describe alternatives you've considered
-

Additional context
Google search for n-grams

An n-gram is a sequence of n words: a 2-gram (which we'll call a bigram) is a two-word sequence of words like “please turn”, “turn your”, or “your homework”, and a 3-gram (a trigram) is a three-word sequence of words like “please turn your” or “turn your homework”.
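As a small illustration of that definition, n-grams can be extracted with a sliding window over a token list. The helper name `ngrams` and the whitespace tokenization are assumptions made for this sketch.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "please turn your homework".split()
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

print(bigrams)   # [('please', 'turn'), ('turn', 'your'), ('your', 'homework')]
print(trigrams)  # [('please', 'turn', 'your'), ('turn', 'your', 'homework')]
```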

@jhoetter jhoetter added the enhancement New feature or request label May 16, 2023