Feat/add google models #50

Merged
merged 18 commits into from
Jun 27, 2024

LucBERTON
Collaborator

@LucBERTON LucBERTON commented Jun 15, 2024

Implemented basic content generation for the Google GenAI provider.

Several points of attention:

  • I could not record the VCR cassette.
    I believe I ran the correct command, but it does not seem to generate the cassette with the Google provider:
    pytest --record-mode=once tests/test_google.py

  • Data models
    I only added a record for the gemini-1.5-flash model.
    The token count in that record is a purely fictional placeholder.

  • Impacts data in the Google tracer
    With other providers, we call the model_dump() method to get every field from the standard response, then add the computed impacts.
    For instance, with the OpenAI provider:

ChatCompletion(**response.model_dump(), impacts=impacts)

In our case, with Google GenAI, the model_dump() method is not available on the GenerateContentResponse object.

The alternative I found is to use the __dict__ attribute.
It seems to work properly, but the code is less clean than for the other providers.
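
As a minimal sketch of the __dict__ workaround described above: the classes below (FakeResponse, ResponseWithImpacts) and the wrap() helper are hypothetical stand-ins, not the real GenerateContentResponse or the code in this PR. They only illustrate copying instance attributes onto a subclass instance and attaching impacts.

```python
class FakeResponse:
    """Hypothetical stand-in for a provider response without model_dump()."""

    def __init__(self, text, done):
        self.text = text
        self.done = done


class ResponseWithImpacts(FakeResponse):
    """Same instance attributes as the parent, plus the computed impacts."""

    def __init__(self, impacts, **attrs):
        # Bypass the parent constructor and copy the attributes verbatim.
        self.__dict__.update(attrs)
        self.impacts = impacts


def wrap(response, impacts):
    # vars(obj) returns obj.__dict__, i.e. instance attributes only.
    return ResponseWithImpacts(impacts=impacts, **vars(response))


original = FakeResponse(text="hello", done=True)
wrapped = wrap(original, impacts={"energy_kwh": 0.001})
print(wrapped.text, wrapped.impacts)
```

Unlike model_dump(), __dict__ also exposes underscore-prefixed attributes and skips anything defined as a property on the class, which is one reason this approach can feel less clean.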

@LucBERTON LucBERTON linked an issue Jun 15, 2024 that may be closed by this pull request
@adrienbanse adrienbanse marked this pull request as draft June 18, 2024 07:54
@adrienbanse
Member

@LucBERTON It might be that the HTTP client used by google.generativeai is not supported by VCR. The list of supported clients is here: https://vcrpy.readthedocs.io/en/latest/installation.html#compatibility.

It's probably another problem because I see that googleapiclient.http uses httplib2, which is normally supported
(see https://github.com/googleapis/google-api-python-client/blob/73015a64302d4560fd4783b260526729f47c2d9c/googleapiclient/http.py#L38C8-L38C16)

@adrienbanse
Member

Yes it seems that vcrpy is not able to capture requests made by google.generativeai:

import google.generativeai as genai
import vcr
import httplib2

with vcr.use_cassette('test.yaml'):
    http = httplib2.Http()
    content = http.request("http://www.something.com") # Recorded in `test.yaml`

    model = genai.GenerativeModel('gemini-1.5-flash')
    response = model.generate_content("Write a story about a magic backpack.") # Not recorded in `test.yaml`

The test.yaml file:

interactions:
- request:
    body: null
    headers:
      accept-encoding:
      - gzip, deflate
      user-agent:
      - Python-httplib2/0.22.0 (gzip)
    method: GET
    uri: http://www.something.com/
  response:
    body:
      string: '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">

        <html><head>

        <title>301 Moved Permanently</title>

        </head><body>

        <h1>Moved Permanently</h1>

        <p>The document has moved <a href="https://www.something.com/">here</a>.</p>

        </body></html>

        '
    headers:
      Age:
      - '994'
      CF-Cache-Status:
      - HIT
      CF-RAY:
      - 895c4a224b2d0e80-AMS
      Cache-Control:
      - max-age=14400
      Connection:
      - keep-alive
      Content-Type:
      - text/html; charset=iso-8859-1
      Date:
      - Tue, 18 Jun 2024 15:25:01 GMT
      Location:
      - https://www.something.com/
      NEL:
      - '{"success_fraction":0,"report_to":"cf-nel","max_age":604800}'
      Report-To:
      - '{"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v4?s=gtSza1HvQTn%2FGYwoyP6CTfu7sdMIUEHsO7ga71jzl%2B%2FxsM%2F%2FZs6QiDa2qRViIt1lWVn0kv4RA1E4PBZIFUn1Y2seQJY3mPUY%2B4HfRWPy4IXjk9CExcLLVoswY2C4XxJphXiE1g%3D%3D"}],"group":"cf-nel","max_age":604800}'
      Server:
      - cloudflare
      Transfer-Encoding:
      - chunked
      Vary:
      - Accept-Encoding
      alt-svc:
      - h3=":443"; ma=86400
    status:
      code: 301
      message: Moved Permanently
- request:
    body: null
    headers:
      accept-encoding:
      - gzip, deflate
      user-agent:
      - Python-httplib2/0.22.0 (gzip)
    method: GET 
    uri: https://www.something.com/
  response:
    body:
      string: !!binary |
        H4sIAAAAAAAAA7PJKMnNsbPJSE1MsbMpySzJSbULzs9NLcnIzEvXs9GHiNjog+W5bJLyUypR5MEC
        XDb6YFO4AApUZMBNAAAA
    headers:
      Age:
      - '4584'
      CF-Cache-Status:
      - HIT
      CF-RAY:
      - 895c4a22a8326716-AMS
      Cache-Control:
      - max-age=14400
      Connection:
      - keep-alive
      Content-Encoding:
      - gzip
      Content-Type:
      - text/html; charset=UTF-8
      Date:
      - Tue, 18 Jun 2024 15:25:01 GMT
      Last-Modified:
      - Mon, 07 Mar 2022 03:36:52 GMT
      NEL:
      - '{"success_fraction":0,"report_to":"cf-nel","max_age":604800}'
      Report-To:
      - '{"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v4?s=BjMiba83pwa5KgRcdFhG8914GfGmEJ6rPWNODrzbEyvnytqtwNDYSgll9d3WyBJjphZ9kUb8QZmOgfNc6lyfCdbbFdp%2FX96q9YNNQEUUGm%2B7acKxFCrCrtYrE7x%2BLiFQQFP58w%3D%3D"}],"group":"cf-nel","max_age":604800}'
      Server:
      - cloudflare
      Transfer-Encoding:
      - chunked
      Vary:
      - Accept-Encoding
      alt-svc:
      - h3=":443"; ma=86400
    status:
      code: 200
      message: OK
version: 1

@samuelrince
Member

Works if you change the test as follows:

@pytest.mark.vcr
def test_google_chat(tracer_init):
+   genai.configure(transport='rest')
    model = genai.GenerativeModel("gemini-1.5-flash")
    response = model.generate_content("Write a story about a magic backpack.")
    assert len(response.text) > 0
    assert response.impacts.energy.value > 0

The API key must also be set in the GOOGLE_API_KEY environment variable.

You also need to add the following in conftest.py:

@pytest.fixture(scope="session")
def vcr_config():
    return {"filter_headers": [
        "authorization",
        "api-key",
        "x-api-key",
+       "x-goog-api-key"
    ]}

I am pushing the updates in a minute.
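
To illustrate why x-goog-api-key belongs in the filter list above, here is a conceptual sketch of header scrubbing. It is not vcrpy's implementation: scrub_headers() is a hypothetical helper, and the case-insensitive match is an assumption made for the example.

```python
# Header names taken from the vcr_config fixture above.
FILTERED = {"authorization", "api-key", "x-api-key", "x-goog-api-key"}


def scrub_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the headers without the secret-bearing entries."""
    return {k: v for k, v in headers.items() if k.lower() not in FILTERED}


recorded = scrub_headers({
    "x-goog-api-key": "not-a-real-key",
    "Content-Type": "application/json",
})
print(recorded)
```

Without the extra entry, the key sent with each request could end up in the committed cassette.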

@adrienbanse
Member

Many thanks @samuelrince, let's implement async and stream before merging it

@adrienbanse adrienbanse self-assigned this Jun 19, 2024
@adrienbanse
Member

The stream implementation I just pushed is not optimal. Consider the following piece of code:

import google.generativeai as genai

model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content('Tell me a very short joke', stream=True)

print(response) # Print 1

for chunk in response:
    print(chunk) # Print 2.i

print(response) # Print 3

Each print shows a different GenerateContentResponse. The first one basically only serves to store the iterator. Each 2.i is a truncated GenerateContentResponse that corresponds only to that chunk, without any iterator. The last one is a GenerateContentResponse that contains both the iterator and the joined chunks.

With the current implementation, the following piece of code:

import google.generativeai as genai
from ecologits import EcoLogits
EcoLogits.init()

model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content('Tell me a very short joke', stream=True)

print(response) # Print 1 -> Not a GenerateContentResponse, so no impact

for chunk in response:
    print(chunk.impacts) # Print 2.i

print(response) # Print 3 -> Not a GenerateContentResponse, so no impact

will only work for the 2.i prints. Although the first GenerateContentResponse seems useless to me in most applications, it would be nice for the last one to summarize the impacts of the chunks.
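
One possible way to get such a final summary, sketched under assumptions: wrap the chunk iterator in a small class that accumulates per-chunk impacts as it is consumed. The names Chunk and ImpactsStream and the single energy float are hypothetical; this is not the EcoLogits implementation or the google.generativeai API. It also assumes each chunk carries only its own incremental impact; if chunks carried cumulative values, the wrapper would keep the last value instead of summing.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Chunk:
    """Hypothetical streamed chunk carrying its own incremental impact."""
    text: str
    energy: float


class ImpactsStream:
    """Iterates over chunks and keeps a running total of their impacts."""

    def __init__(self, chunks: Iterable[Chunk]):
        self._chunks = iter(chunks)
        self.text = ""
        self.energy = 0.0

    def __iter__(self) -> Iterator[Chunk]:
        for chunk in self._chunks:
            # Update the totals before handing the chunk to the caller.
            self.text += chunk.text
            self.energy += chunk.energy
            yield chunk


stream = ImpactsStream([Chunk("Why did ", 0.5), Chunk("the chicken...", 0.25)])
for chunk in stream:
    print(chunk.energy)  # per-chunk impact
print(stream.energy)     # total once the stream has been consumed
```

The totals are only complete after the loop finishes, which mirrors the "Print 3" case above.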

@adrienbanse
Member

The combination of transport=rest and async is broken in google-generativeai; see google-gemini/generative-ai-python#203.
This prevents us from testing async with cassettes, so I removed the async tests for now. Let's wait for this to be fixed.

@adrienbanse adrienbanse marked this pull request as ready for review June 21, 2024 16:54
@samuelrince
Member

Is it possible to also include the documentation in this PR? 😇

You can use the other providers as examples. To run the website locally, you can do:

mkdocs serve

@adrienbanse adrienbanse requested review from samuelrince and removed request for samuelrince June 26, 2024 14:38
gemini-1.5-flash is equivalent to gpt-3.5-turbo

gemini-1.5-pro is equivalent to gpt-4o

gemini-1.0-pro is equivalent to gpt-3.5-turbo
Member

@samuelrince samuelrince left a comment

Good to merge!

I've changed the gemini models following this convention:

gemini-1.5-flash ~ gpt-3.5-turbo
gemini-1.5-pro ~ gpt-4o
gemini-1.0-pro ~ gpt-3.5-turbo (old model and very cheap compared to 1.5-pro)

@adrienbanse adrienbanse merged commit 808f314 into main Jun 27, 2024
2 checks passed
@adrienbanse adrienbanse deleted the feat/add-google-models branch June 27, 2024 10:37
Successfully merging this pull request may close these issues.

Add Google generative AI models
3 participants