-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Escape quotes for with_bm25/with_hybrid/with_generate #422
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR on this one!
Could you add another option to the parametrization of test_bm25
to demonstrate what happens when the quotes convention is reversed? I.e. "what is an 'airport'?"
. Cheers 😁
To avoid any confusion in the future about your contribution to Weaviate, we work with a Contributor License Agreement. If you agree, you can simply add a comment to this PR that you agree with the CLA so that we can merge. |
Thanks for your contribution, for merging we need you to agree to the CLA. ! I also think that your escaping might be useful for a few hybrid and generative search. Could you check where we strip newlines in the same file and add your "dump" there, too and add tests for these? SHould be pretty straight forward :) |
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #422 +/- ##
==========================================
- Coverage 95.43% 95.11% -0.32%
==========================================
Files 64 64
Lines 8165 8129 -36
==========================================
- Hits 7792 7732 -60
- Misses 373 397 +24
☔ View full report in Codecov by Sentry. |
I agree with the CLA |
8831fad
to
d7b20b7
Compare
This PR fixes a bug in BM25 retrieval by using json.dumps method on the query for escaping quotes.
I kept
strip_newlines
to maintain previous behavior on the "\n" character (instead of escaping it,strip_newlines
replace it with a space).Problem
Similar to the issue#121,
I am getting an error in with_bm25() method if query text contains double quotes, e.g.
throws this message
...'message': 'Syntax Error GraphQL request (1:61) Expected :, found String\n...... bm25:{query: "what is an "airport""