Skip to content
This repository has been archived by the owner on Nov 13, 2024. It is now read-only.

Azure OpenAI LLM implementation #188

Merged
merged 38 commits into from
Jan 15, 2024

Conversation

MichaelAnckaert
Copy link
Contributor

@MichaelAnckaert MichaelAnckaert commented Nov 27, 2023

Problem

Canopy can't be used with the Azure OpenAI offering.

Solution

This PR creates a new AzureOpenAILLM class (inherits from OpenAILLM) to be used with the Azure OpenAI offering.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
  • Infrastructure change (CI configs, etc)
  • Non-code change (docs, etc)
  • None of the above: (explain here)

Test Plan

No tests added yet.

Copy link
Contributor

@miararoy miararoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall direction seems good, main changes is:

  • env loading should be handled by client or by explicit passing parameters
  • llm object creation in app.py
  • adding proper docstrings and adding readme

src/canopy/knowledge_base/record_encoder/openai.py Outdated Show resolved Hide resolved
src/canopy/knowledge_base/record_encoder/openai.py Outdated Show resolved Hide resolved
src/canopy/llm/openai.py Outdated Show resolved Hide resolved
src/canopy/llm/openai.py Outdated Show resolved Hide resolved
src/canopy/llm/openai.py Outdated Show resolved Hide resolved
src/canopy_server/app.py Outdated Show resolved Hide resolved
src/canopy_server/app.py Outdated Show resolved Hide resolved
src/canopy/llm/openai.py Outdated Show resolved Hide resolved
src/canopy_server/app.py Outdated Show resolved Hide resolved
@MichaelAnckaert
Copy link
Contributor Author

Overall direction seems good, main changes is:

* env loading should be handled by client or by explicit passing parameters

* llm object creation in app.py

* adding proper docstrings and adding readme

Thanks for your review @miararoy Happy to know the direction is the correct one.
I'll address your comments and update the PR.

@MichaelAnckaert
Copy link
Contributor Author

MichaelAnckaert commented Nov 30, 2023

Adressed review comments:

  • Environment variable loading removed from implementation
  • Llm creation removed from app.py, relying on config loading
  • Updated docstrings
  • Updated README and added sample config

@MichaelAnckaert MichaelAnckaert changed the title Azure OpenAI LLM (PoC) Azure OpenAI LLM implementation Nov 30, 2023
@miararoy
Copy link
Contributor

Thanks!
Just letting you know we have internally started to work on this as well, we will make sure to merge the work so this could be shipped asap

Copy link
Contributor

@igiloh-pinecone igiloh-pinecone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MichaelAnckaert Thank you very much for your contribution!!

Please see a few required changes.
Also, please note that tests are required - a unit test for OpenAIRecordEncoder and a system test for OpenAILLM (see comments inline).

config/azure.yaml Show resolved Hide resolved
src/canopy/knowledge_base/record_encoder/openai.py Outdated Show resolved Hide resolved
src/canopy/knowledge_base/record_encoder/openai.py Outdated Show resolved Hide resolved
src/canopy/llm/openai.py Outdated Show resolved Hide resolved
src/canopy/llm/openai.py Outdated Show resolved Hide resolved
src/canopy/llm/openai.py Outdated Show resolved Hide resolved
src/canopy_server/app.py Outdated Show resolved Hide resolved
src/canopy/llm/openai.py Outdated Show resolved Hide resolved
src/canopy/knowledge_base/record_encoder/openai.py Outdated Show resolved Hide resolved
src/canopy/llm/openai.py Outdated Show resolved Hide resolved
@igiloh-pinecone
Copy link
Contributor

@MichaelAnckaert there hasn't been any activity in this PR since the review two weeks ago.
Please advise if you are planning to finalize this PR so it can be merged.

Inheriting from OpenAI allows simplifying the implementation
To conform with our coding style
For very specific model versions and API version - AzureOpenAI's function calling capability simply works like regular OpenAI.
For all other deployments - we will simply error out
This way derived classes like AzureOpenAILLM can handle the errors more explicitly
Instead of a dedicated, copy-pasted test file, I paramerteized the existing test_openai file to test both OpenAI and AzureOpenAI classes.
Improve OpenAI and AzureOpenAI's error handling messages
Give a more explicit error in FunctionCallingQueryGenerator if we suspect that function calling isn't supported.
Changed to new format + minimal changes required for using AzureOpenAI
Copy link

gitguardian bot commented Jan 15, 2024

⚠️ GitGuardian has uncovered 2 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secrets in your pull request
GitGuardian id Secret Commit Filename
- Generic High Entropy Secret 1f3add7 tests/system/llm/test_azure_openai.py View secret
- Generic High Entropy Secret 2c80fbd tests/system/llm/test_azure_openai.py View secret
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secrets safely. Learn here the best practices.
  3. Revoke and rotate these secrets.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

Our GitHub checks need improvements? Share your feedbacks!

@igiloh-pinecone
Copy link
Contributor

@MichaelAnckaert Thank you very much for your contribution!!

Since this PR has been inactive, we have finalized it with all missing tests and prepared it for merging.
All of your original commits are included of course, and will be merged to Canopy.

acatav
acatav previously requested changes Jan 15, 2024
Copy link
Contributor

@acatav acatav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, look at the minor suggestions. I mainly think we should decide if api version goes to env vars or config, and can consider enable model name in config at some sense

README.md Show resolved Hide resolved
config/azure.yaml Outdated Show resolved Hide resolved
config/azure.yaml Show resolved Hide resolved
type: AzureOpenAILLM # Options: [OpenAILLM, AzureOpenAILLM]
params:
model_name: your-deployment-name # Specify the name of the LLM deployment to use.
api_version: 2023-07-01
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this should come from env var, we shouldn't mix env vars and config

def __init__(
self,
*,
model_name: str = "text-embedding-ada-002",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we shouldn't have this default, the user is more likely to have a different name. I'm not sure if it breaks our config with defaults pattern since azure encoder is not a default encoder. If it breaks and complicate stuff we can leave it as is for now

src/canopy/llm/azure_openai_llm.py Show resolved Hide resolved
- Bump pinecone-text version
- Add error handling
- Add system tests
- Finalize the config
The fixture is shared across multiple test files
Copy link
Contributor

@igiloh-pinecone igiloh-pinecone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@igiloh-pinecone igiloh-pinecone dismissed stale reviews from acatav and miararoy January 15, 2024 19:29

Code changed and approved

@igiloh-pinecone igiloh-pinecone added this pull request to the merge queue Jan 15, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 15, 2024
The CI workflow needs these secrets to run the AzureOpenAI tests
@igiloh-pinecone igiloh-pinecone added this pull request to the merge queue Jan 15, 2024
Merged via the queue into pinecone-io:main with commit a52a022 Jan 15, 2024
9 of 10 checks passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants