Add Cohere Reranker #269

izellevy · 2024-01-30T11:18:53Z

Problem

We currently do not rerank query results. Reranking them can improve response quality.

Solution

Added a Cohere reranker.

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update
Infrastructure change (CI configs, etc)
Non-code change (docs, etc)
None of the above: (explain here)

Test Plan

Added the relevant tests.

acatav

LGTM; none of my comments are critical.

tests/system/reranker/test_cohere_reranker.py

src/canopy/knowledge_base/reranker/cohere.py

igiloh-pinecone · 2024-01-30T13:36:05Z

src/canopy/knowledge_base/reranker/cohere.py

+        reranked_query_results: List[KBQueryResult] = []
+        for result in results:
+            texts = [doc.text for doc in result.documents]
+            response = self._client.rerank(query=result.query,


This call also needs to be wrapped in a try block.
Transient errors such as rate limits should be retried (unless the Cohere client already does that for us).

Errors caused by wrong configuration (such as a wrong model name or a bad API key) need to be re-raised with an actionable error message.

izellevy · 2024-01-30

The Cohere client retries internally. Since it does not raise distinct error types, it is hard to tell from the error alone what went wrong. For now I am raising a RuntimeError from the underlying error.
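A minimal sketch of the approach described in this thread: catch the client's single error type and re-raise it as a RuntimeError chained to the original. `CohereAPIError` here is a local stand-in for the class the PR imports from `cohere`, and `FakeClient`, `rerank_texts`, and the message wording are hypothetical names used only for illustration, not the code that was merged.

```python
class CohereAPIError(Exception):
    """Stand-in for the client's error type (assumption: the real one comes from cohere)."""


class FakeClient:
    """Hypothetical client whose rerank() always fails, to exercise the error path."""

    def rerank(self, query, documents):
        raise CohereAPIError("invalid api token")


def rerank_texts(client, query, texts):
    """Call client.rerank and convert its error into a RuntimeError."""
    try:
        return client.rerank(query=query, documents=texts)
    except CohereAPIError as e:
        # `from e` keeps the original error available as __cause__,
        # so the underlying message is not lost when the CLI prints it.
        raise RuntimeError(
            f"Failed to rerank using the Cohere reranker. Underlying error: {e}"
        ) from e
```

Chaining with `from e` means a caller that catches RuntimeError can still inspect the original exception, which leaves room for more actionable messages later if the client starts raising distinct error types.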

igiloh-pinecone · 2024-01-30T13:39:03Z

tests/system/reranker/test_cohere_reranker.py

+
+def test_bad_api_key(should_run_test, query_result):
+    from cohere import CohereAPIError
+    with pytest.raises(CohereAPIError, match="invalid api token"):


We try to eliminate the underlying service's errors, like CohereAPIError or OpenAIError, and replace them with an actionable error message (something the user needs to change in the Canopy config, or the explicit env var to set).

In the future we will have our own error types like EncoderError, AuthenticationError, etc. In the meantime, simply re-raise a RuntimeError for all of these cases (the CLI catches RuntimeError and prints it nicely).

izellevy · 2024-01-30

I checked the client. It does not raise a specific error for each failure; we always get a CohereAPIError. For now I am raising a RuntimeError from that error. If they improve the client, we can write actionable messages.

igiloh-pinecone · 2024-01-30T13:40:07Z

tests/system/reranker/test_cohere_reranker.py

+    from cohere import CohereAPIError
+    with pytest.raises(CohereAPIError, match="invalid api token"):
+        CohereReranker(api_key="bad key").rerank([query_result])
+


More negative tests are missing: wrong model name, bad input (e.g. non-string values), etc.

izellevy · 2024-01-30

Added a test for a wrong model name. Bad input is not possible, since we validate our data with pydantic.

acatav

LGTM. See the small error below.

src/canopy/knowledge_base/knowledge_base.py

acatav · 2024-01-31T14:20:40Z

src/canopy/knowledge_base/knowledge_base.py

-            ) for r in results
+                    for d in rr.documents
+                ],
+                debug_info={"db_result": QueryResult(


I just realised that we want debug info to contain only dicts with literals like str or int. This makes the object easier to serialise.
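To illustrate the point about serialisation, here is a small sketch that converts a result object into a plain dict before storing it under `debug_info`. The `DocSketch` and `QueryResultSketch` dataclasses and the `to_debug_info` helper are hypothetical stand-ins, not Canopy's actual models; only the `db_result` key comes from the diff above.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class DocSketch:
    """Hypothetical stand-in for a single retrieved document."""
    id: str
    score: float


@dataclass
class QueryResultSketch:
    """Hypothetical stand-in for a database query result."""
    query: str
    documents: List[DocSketch]


def to_debug_info(result: QueryResultSketch) -> dict:
    # asdict() recursively converts dataclasses into dicts and lists,
    # so the value holds only plain literals such as str and float.
    return {"db_result": asdict(result)}


result = QueryResultSketch("what is canopy?", [DocSketch("doc1", 0.9)])
info = to_debug_info(result)
serialized = json.dumps(info)  # works because info holds only literals
```

Passing the object itself instead of its dict form would make `json.dumps` raise a TypeError, which is the kind of friction the comment is trying to avoid.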

izellevy added 2 commits January 29, 2024 18:30

Add cohere reranker

9674051

Add tests

4cd8142

izellevy requested review from igiloh-pinecone and acatav January 30, 2024 11:19

Fix static

837fa0b

acatav approved these changes Jan 30, 2024

igiloh-pinecone reviewed Jan 30, 2024

igiloh-pinecone reviewed Jan 30, 2024

Fix comments

14e4328

acatav approved these changes Jan 30, 2024

Add assert

c583a71

izellevy added this pull request to the merge queue Jan 31, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 31, 2024

acatav reviewed Jan 31, 2024

izellevy added 2 commits January 31, 2024 17:53

Fix dict

04b6d42

Change name

11060a7

izellevy added this pull request to the merge queue Jan 31, 2024

Merged via the queue into pinecone-io:main with commit 95c7b24 Jan 31, 2024
7 checks passed

izellevy deleted the feature/cohere_reranker branch January 31, 2024 16:45
