Azure OpenAI LLM implementation #188

MichaelAnckaert · 2023-11-27T10:34:24Z

Problem

Canopy can't be used with the Azure OpenAI offering.

Solution

This PR creates a new AzureOpenAILLM class (inherits from OpenAILLM) to be used with the Azure OpenAI offering.

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update
Infrastructure change (CI configs, etc)
Non-code change (docs, etc)
None of the above: (explain here)

Test Plan

No tests added yet.

miararoy

Overall direction seems good. The main changes are:

* env loading should be handled by the client or by explicitly passing parameters
* LLM object creation in app.py
* adding proper docstrings and a README

src/canopy/knowledge_base/record_encoder/openai.py

src/canopy/llm/openai.py

src/canopy_server/app.py

src/canopy/llm/openai.py

src/canopy_server/app.py

MichaelAnckaert · 2023-11-30T08:53:46Z

Overall direction seems good. The main changes are:

* env loading should be handled by the client or by explicitly passing parameters
* LLM object creation in app.py
* adding proper docstrings and a README

Thanks for your review, @miararoy. Happy to know the direction is the correct one.
I'll address your comments and update the PR.

MichaelAnckaert · 2023-11-30T09:15:59Z

Addressed review comments:

* Environment variable loading removed from the implementation
* LLM creation removed from app.py, relying on config loading
* Updated docstrings
* Updated README and added sample config

miararoy · 2023-11-30T09:45:17Z

Thanks!
Just letting you know that we have started working on this internally as well. We will make sure to merge the work so this can be shipped ASAP.

igiloh-pinecone

@MichaelAnckaert Thank you very much for your contribution!!

Please see a few required changes.
Also, please note that tests are required - a unit test for OpenAIRecordEncoder and a system test for OpenAILLM (see comments inline).

config/azure.yaml

src/canopy/knowledge_base/record_encoder/openai.py

src/canopy/llm/openai.py

src/canopy_server/app.py

src/canopy/llm/openai.py

src/canopy/knowledge_base/record_encoder/openai.py

src/canopy/llm/openai.py

igiloh-pinecone · 2023-12-18T11:42:18Z

@MichaelAnckaert there hasn't been any activity in this PR since the review two weeks ago.
Please advise if you are planning to finalize this PR so it can be merged.

gitguardian · 2024-01-15T08:57:42Z

⚠️ GitGuardian has uncovered 2 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secrets in your pull request

| GitGuardian id | Secret | Commit | Filename |
|---|---|---|---|
| - | Generic High Entropy Secret | `1f3add7` | tests/system/llm/test_azure_openai.py |
| - | Generic High Entropy Secret | `2c80fbd` | tests/system/llm/test_azure_openai.py |

🛠 Guidelines to remediate hardcoded secrets

* Understand the implications of revoking this secret by investigating where it is used in your code.
* Replace and store your secrets safely.
* Revoke and rotate these secrets.
* If possible, rewrite git history. Rewriting git history is not a trivial act: you might completely break other contributing developers' workflow, and you risk accidentally deleting legitimate data.

To avoid such incidents in the future, consider:

* following best practices for managing and storing secrets, including API keys and other credentials
* installing secret detection on pre-commit to catch secrets before they leave your machine and to ease remediation

igiloh-pinecone · 2024-01-15T08:58:43Z

@MichaelAnckaert Thank you very much for your contribution!!

Since this PR has been inactive, we have finalized it with all missing tests and prepared it for merging.
All of your original commits are included of course, and will be merged to Canopy.

acatav

LGTM overall; please look at the minor suggestions. Mainly, I think we should decide whether the API version goes in env vars or in the config, and we can consider enabling the model name in the config in some form.

README.md

config/azure.yaml

acatav · 2024-01-15T09:53:56Z

config/azure.yaml

+    type: AzureOpenAILLM                # Options: [OpenAILLM, AzureOpenAILLM]
+    params:
+      model_name: your-deployment-name  # Specify the name of the LLM deployment to use.
+      api_version: 2023-07-01


I wonder if this should come from an env var; we shouldn't mix env vars and config.
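One way to avoid mixing the two sources is to define a single precedence rule: an explicit config value wins, and the environment is only a fallback. The following is a minimal sketch, not the PR's code; the helper name `resolve_api_version` and the variable name `OPENAI_API_VERSION` are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional

# Assumed variable name, used here for illustration only.
API_VERSION_ENV_VAR = "OPENAI_API_VERSION"


def resolve_api_version(
    configured: Optional[object] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Prefer the value from the config file; fall back to the environment."""
    env = os.environ if env is None else env
    if configured is not None:
        # str() also covers YAML loaders that parse an unquoted
        # 2023-07-01 as a date object rather than a string.
        return str(configured)
    from_env = env.get(API_VERSION_ENV_VAR)
    if from_env:
        return from_env
    raise ValueError(
        f"API version not set: pass it in the config or set {API_VERSION_ENV_VAR}"
    )
```

Passing `env` explicitly keeps the helper easy to test; in normal use it would be called with only the configured value and read from `os.environ`.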

acatav · 2024-01-15T09:57:07Z

src/canopy/knowledge_base/record_encoder/azure_openai.py

+    def __init__(
+        self,
+        *,
+        model_name: str = "text-embedding-ada-002",


We shouldn't have this default; the user is more likely to have a different name. I'm not sure whether removing it breaks our config-with-defaults pattern, since the Azure encoder is not a default encoder. If it does break it and complicates things, we can leave it as is for now.
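To illustrate the suggestion: a keyword-only parameter without a default forces the caller to name their own deployment. This is a minimal sketch using a hypothetical `AzureEncoderSketch` class, not the encoder from this PR.

```python
class AzureEncoderSketch:
    """Hypothetical stand-in for an Azure encoder; illustration only."""

    def __init__(self, *, model_name: str) -> None:
        # No default: Azure deployment names are chosen by the user,
        # so there is no value that is likely to be right.
        self.model_name = model_name


encoder = AzureEncoderSketch(model_name="my-embedding-deployment")
```

Calling `AzureEncoderSketch()` without `model_name` raises a `TypeError` at construction time, so a missing deployment name is caught before any request is made.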

src/canopy/llm/azure_openai_llm.py

igiloh-pinecone

LGTM!

Code changed and approved

aulorbe and others added 7 commits November 16, 2023 14:24

azure openai integration -- save work

fb2137a

Ignore empty chunks

fc38a6a

Use AzureOpenAIEncoder

c4f98fa

Implement AzureOpenAI

15477b5

Updates

9b68c2f

Work in progress

ccde9e9

Inherit AzureOpenAILLM from OpenAILLM

bbfc8c4

miararoy previously requested changes Nov 30, 2023

Merge branch 'main' into azure-openai

ab4aefe

MichaelAnckaert requested a review from miararoy November 30, 2023 09:16

Address comments from PR review

b347120

MichaelAnckaert force-pushed the azure-openai branch from 17b1174 to b347120 Compare November 30, 2023 09:19

Update readme and add Azure OpenAI example config

27dd272

MichaelAnckaert changed the title ~~Azure OpenAI LLM (PoC)~~ Azure OpenAI LLM implementation Nov 30, 2023

igiloh-pinecone suggested changes Dec 3, 2023

aulorbe added 8 commits December 4, 2023 16:19

Add AzureOpenAILLM class

042fcad

Add better handling of env vars

9377f12

Add better handling of env vars, 2

46e5958

Add WIP tests

2c80fbd

Update class in docstring

fee93b8

Add NotImplementedError() for available_models() method for Azure class

6bc541f

Shoot, remove secret

1f3add7

Shoot, remove secret

0780cc8

igiloh-pinecone added 3 commits January 2, 2024 14:56

Undo redundant lint changes

1d2c263

[llm] Move OpenAILLm to its own file

164fabe

This better conforms with our code base styling

Merge remote-tracking branch 'audrey/audrey-azure-chat' into azure-openai

946ac2c

igiloh-pinecone added 11 commits January 3, 2024 12:30

[kb] Simplify AzureRecordEncoder

c1cc5a3

Inheriting from OpenAI allows simplifying the implementation

[kb] Move AzureOpenAIEncoder to its own file

40fbf97

To conform with our coding style

[llm] AzureOpenAI - support function calling

5b7b410

For very specific model versions and API version - AzureOpenAI's function calling capability simply works like regular OpenAI. For all other deployments - we will simply error out

Merge remote-tracking branch 'upstream/main' into azure-openai

0f9784e

[LLM] Added handle_error() for OpenAILLM

98840f9

This way derived classes like AzureOpenAILLM can handle the errors more explicitly
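The idea behind this commit can be sketched as an overridable error hook: the base class funnels every failure through one method, and a subclass replaces that method to give a more specific message. The class and method names below (`BaseLLMSketch`, `AzureLLMSketch`, `_handle_error`, `_call_backend`) are hypothetical and do not reproduce Canopy's actual code.

```python
class BaseLLMSketch:
    """Hypothetical base class; every failure goes through _handle_error."""

    def _handle_error(self, error: Exception) -> Exception:
        # Generic message; subclasses override this to be more specific.
        return RuntimeError(f"LLM call failed: {error}")

    def _call_backend(self, prompt: str) -> str:
        raise NotImplementedError

    def complete(self, prompt: str) -> str:
        try:
            return self._call_backend(prompt)
        except Exception as e:
            # Chain the original exception so the root cause stays visible.
            raise self._handle_error(e) from e


class AzureLLMSketch(BaseLLMSketch):
    """Hypothetical subclass that adds deployment-specific context."""

    def __init__(self, deployment: str) -> None:
        self.deployment = deployment

    def _call_backend(self, prompt: str) -> str:
        # Simulated backend failure, for illustration only.
        raise ConnectionError("deployment not found")

    def _handle_error(self, error: Exception) -> Exception:
        return RuntimeError(
            f"Azure deployment '{self.deployment}' failed: {error}. "
            "Check the deployment name and API version."
        )
```

With this shape the calling code in the base class stays unchanged, and only the wording of the error differs per provider.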

[test] Refactor AzureOpenAI tests

2b5cf78

Instead of a dedicated, copy-pasted test file, I parameterized the existing test_openai file to test both the OpenAI and AzureOpenAI classes.
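As a rough illustration of running one test body against several classes, here is a sketch using the standard library's `unittest.subTest`. The actual test file may well use a different mechanism (for example pytest parametrization), and the fake classes below are placeholders that make no network calls.

```python
import unittest


class FakeOpenAILLM:
    """Placeholder for an LLM class; illustration only."""

    def chat(self, message: str) -> str:
        return f"echo: {message}"


class FakeAzureOpenAILLM(FakeOpenAILLM):
    """Placeholder subclass that inherits the same behaviour."""


class TestChatAcrossBackends(unittest.TestCase):
    # Every class listed here runs through the same assertions.
    LLM_CLASSES = (FakeOpenAILLM, FakeAzureOpenAILLM)

    def test_chat_returns_text(self) -> None:
        for llm_cls in self.LLM_CLASSES:
            # subTest reports a failure per class instead of stopping early.
            with self.subTest(llm=llm_cls.__name__):
                self.assertEqual(llm_cls().chat("hi"), "echo: hi")
```

Adding a further backend then means extending `LLM_CLASSES` rather than copying the test file.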

[LLM] Further improve error handing

bcb6b02

Improve OpenAI and AzureOpenAI's error handling messages

[chat] Explicit error in FunctionCallingQG

56d6e5f

Give a more explicit error in FunctionCallingQueryGenerator if we suspect that function calling isn't supported.

[llm] Fix typo in error message

02f01e2

make linters happy

9d804fe

Finalize azure.config

b3aad4d

Changed to new format + minimal changes required for using AzureOpenAI

acatav previously requested changes Jan 15, 2024

igiloh-pinecone added 4 commits January 15, 2024 17:25

[kb] Finalize Azure RecordEncoder

bbe34a8

- Bump pinecone-text version - Add error handling - Add system tests - Finalize the config

make linters happy

b7ca6e6

[test] Fix AzureOpenAI tests

92ca6c2

[test] Fix OpenAI tests

cda4fc1

The fixture is shared across multiple test files

igiloh-pinecone enabled auto-merge January 15, 2024 19:27

igiloh-pinecone approved these changes Jan 15, 2024

igiloh-pinecone added this pull request to the merge queue Jan 15, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 15, 2024

[CI] Added Azure env var

c8a9a31

The CI workflow needs these secrets to run the AzureOpenAI tests

igiloh-pinecone enabled auto-merge January 15, 2024 21:13

igiloh-pinecone added this pull request to the merge queue Jan 15, 2024

Merged via the queue into pinecone-io:main with commit a52a022 Jan 15, 2024
9 of 10 checks passed

igiloh-pinecone mentioned this pull request Jan 16, 2024

[draft] Azure OpenAI chat integration #197

Closed

7 tasks
