Conversation

@kauabh (Contributor) commented Jul 5, 2025

Gemini provides an endpoint to count tokens: https://ai.google.dev/api/tokens#method:-models.counttokens. I think it would be useful and would address some of the concerns in issue #1794 (at least for Gemini).

@DouweM I wanted to check whether this would be helpful. If so, and if the approach is right, could you share some pointers on adding it to usage_limits for Gemini? Happy to work on other models too, if this one makes it through.

kauabh added 9 commits July 6, 2025 04:27
Gemini provides an endpoint to count tokens before sending a response
https://ai.google.dev/api/tokens#method:-models.counttokens
Added type adapter
Removed extra assignment
Removed whitespace
@DouweM (Collaborator) commented Jul 7, 2025

@kauabh I agree that if a model API has a method to count tokens, it would be nice to expose that on the Model class.

But I don't think we should automatically use it when UsageLimits(request_tokens_limit=...) is used, as it adds an extra request and the overhead and latency that come with it, unlike OpenAI's tiktoken, which was mentioned in #1794 and can be run locally. So if we'd like to give users the option to better enforce request_tokens_limit by doing a separate count-tokens request ahead of the actual LLM request, that should be opt-in via some flag on UsageLimits, with appropriate warnings in the docs about the extra overhead.

That check would need to be implemented here, just before we call model.request, once we have the messages, model settings, and model request params ready:

async def _make_request(
    self, ctx: GraphRunContext[GraphAgentState, GraphAgentDeps[DepsT, NodeRunEndT]]
) -> CallToolsNode[DepsT, NodeRunEndT]:
    if self._result is not None:
        return self._result  # pragma: no cover

    model_settings, model_request_parameters = await self._prepare_request(ctx)
    model_request_parameters = ctx.deps.model.customize_request_parameters(model_request_parameters)
    message_history = await _process_message_history(
        ctx.state.message_history, ctx.deps.history_processors, build_run_context(ctx)
    )
    model_response = await ctx.deps.model.request(message_history, model_settings, model_request_parameters)
    ctx.state.usage.incr(_usage.Usage())

    return self._finish_handling(ctx, model_response)
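The opt-in check described above can be sketched in isolation. This is a hypothetical simplification, not the merged implementation: `StubModel`, `make_request`, and the whitespace-based token count are stand-ins, while the `count_tokens_before_request` flag and the `UsageLimitExceeded` message mirror the discussion.

```python
from dataclasses import dataclass
from typing import Optional


class UsageLimitExceeded(Exception):
    pass


@dataclass
class UsageLimits:
    request_tokens_limit: Optional[int] = None
    count_tokens_before_request: bool = False


class StubModel:
    """Stands in for a Model whose API offers a count-tokens endpoint."""

    def count_tokens(self, messages: list[str]) -> int:
        # A real implementation would call e.g. Gemini's countTokens endpoint;
        # here we approximate with a whitespace split.
        return sum(len(m.split()) for m in messages)

    def request(self, messages: list[str]) -> str:
        return 'response'


def make_request(model: StubModel, messages: list[str], limits: UsageLimits) -> str:
    # The opt-in check runs just before the actual LLM request.
    if limits.count_tokens_before_request and limits.request_tokens_limit is not None:
        tokens = model.count_tokens(messages)
        if tokens > limits.request_tokens_limit:
            raise UsageLimitExceeded(
                f'Exceeded the request_tokens_limit of {limits.request_tokens_limit} '
                f'(request_tokens={tokens})'
            )
    return model.request(messages)
```

Because the flag defaults to False, existing runs pay no extra request; only users who opt in take on the extra latency.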

This would require a method that exists on every model, so it'd be implemented as an abstract method on the base Model class with a default implementation of raise NotImplementedError(...), and only models that have a count-tokens method would override it with a concrete implementation.
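The base-class pattern described above can be sketched as follows. The class and method names are illustrative, not the library's actual API: the default implementation raises NotImplementedError, and only subclasses whose provider has a count-tokens endpoint override it.

```python
from abc import ABC


class Model(ABC):
    """Illustrative base class; the default count_tokens is unsupported."""

    def count_tokens(self, messages: list[str]) -> int:
        raise NotImplementedError(
            f'{type(self).__name__} does not support counting tokens ahead of the request'
        )


class ModelWithTokenCounting(Model):
    """A subclass whose provider exposes a count-tokens API."""

    def count_tokens(self, messages: list[str]) -> int:
        # A real subclass would call the provider's count-tokens endpoint;
        # a whitespace split stands in for that call here.
        return sum(len(m.split()) for m in messages)
```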

As for that concrete implementation, I recommend adding it to GoogleModel instead of GeminiModel, as you can directly use the google-genai library there, and reducing the duplication with the request-preparation logic in _generate_content as much as possible.
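For a rough sense of shape: the google-genai library exposes client.models.count_tokens(...), which returns a response with a total_tokens field. The sketch below is hypothetical (GoogleModelSketch and the stub client are not real pydantic-ai classes); the stub mimics that shape so the example runs without credentials.

```python
from dataclasses import dataclass


@dataclass
class CountTokensResponse:
    """Mimics the shape of google-genai's count-tokens response."""

    total_tokens: int


class StubModels:
    def count_tokens(self, *, model: str, contents: str) -> CountTokensResponse:
        # The real SDK would send `contents` to the countTokens endpoint;
        # here we approximate the count with a whitespace split.
        return CountTokensResponse(total_tokens=len(contents.split()))


class StubClient:
    """Stands in for google.genai.Client."""

    models = StubModels()


class GoogleModelSketch:
    def __init__(self, model_name: str, client=None):
        self.model_name = model_name
        self.client = client or StubClient()

    def count_tokens(self, contents: str) -> int:
        response = self.client.models.count_tokens(
            model=self.model_name, contents=contents
        )
        return response.total_tokens
```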

@kauabh (Contributor, author) commented Jul 8, 2025

@DouweM makes sense, let me rework this. Thanks for the detailed input; I appreciate your time.

@kauabh (Contributor, author) commented Jul 25, 2025

Hey @DouweM, I have made the changes per your comments. Quite a few files were touched, so it would be great if you could provide some feedback on the changes so far. Also, could you share some thoughts on changing "instrumented.py" to use count_tokens?

@DouweM (Collaborator) left a review comment

@kauabh Thanks! We're almost there :)

@DouweM (Collaborator) commented Aug 1, 2025

@kauabh Let me know when this is ready for another round of review! I'm seeing some test failures.

@kauabh (Contributor, author) commented Aug 1, 2025

Hey @DouweM, just building out the test cases, will share soon.

@kauabh (Contributor, author) commented Aug 1, 2025

Hey @DouweM, could you please have a look at the changes so far? I made updates per your comments. I also need your thoughts on the Gemini count-tokens endpoint not allowing system instructions and tools; I added details after your PR comment.

@kauabh (Contributor, author) commented Aug 6, 2025

Hey @DouweM, please share your thoughts.

@kauabh (Contributor, author) commented Aug 7, 2025

Hey @DouweM, I have made the changes per the comments. The remaining issue is with the Vertex AI test case: it works with the cassette locally, but I have tried both adding and removing the cassette and get the same error (CI failing):

async def test_google_vertexai_model_usage_limit_exceeded(allow_model_requests: None, google_provider: GoogleProvider):
    model = GoogleModel('gemini-2.5-flash', provider='google-vertex')

    agent = Agent(model, system_prompt='You are a chatbot.')

    @agent.tool_plain
    async def get_user_country() -> str:
        return 'Mexico'

    with pytest.raises(UsageLimitExceeded, match='Exceeded the request_tokens_limit of 9 \\(request_tokens=36\\)'):
        await agent.run(
            'What is the largest city in the user country? Use the get_user_country tool and then your own world knowledge.',
            usage_limits=UsageLimits(request_tokens_limit=50, count_tokens_before_request=True),
        )
        

Cassette


interactions:
- request:
    body: '[REDACTED]'
    headers:
      Accept:
      - '*/*'
      Accept-Encoding:
      - gzip, deflate
      Connection:
      - keep-alive
      Content-Length:
      - '268'
      Content-Type:
      - application/x-www-form-urlencoded
      User-Agent:
      - python-requests/2.32.3
      x-goog-api-client:
      - gl-python/3.12.7 auth/2.38.0 cred-type/u
    method: POST
    uri: https://oauth2.googleapis.com/token
  response:
    body:
      string: "{\n  \"access_token\": \"[REDACTED]\",\n  \"expires_in\": 3599,\n  \"scope\":
        \"openid https://www.googleapis.com/auth/sqlservice.login https://www.googleapis.com/auth/cloud-platform
        https://www.googleapis.com/auth/userinfo.email\",\n  \"token_type\": \"Bearer\",\n
        \ \"id_token\": \"[REDACTED]\"\n}"
    headers:
      Alt-Svc:
      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
      Cache-Control:
      - no-cache, no-store, max-age=0, must-revalidate
      Content-Type:
      - application/json; charset=utf-8
      Date:
      - Thu, 07 Aug 2025 16:48:01 GMT
      Expires:
      - Mon, 01 Jan 1990 00:00:00 GMT
      Pragma:
      - no-cache
      Server:
      - scaffolding on HTTPServer2
      Transfer-Encoding:
      - chunked
      Vary:
      - Origin
      - X-Origin
      - Referer
      X-Content-Type-Options:
      - nosniff
      X-Frame-Options:
      - SAMEORIGIN
      X-XSS-Protection:
      - '0'
      content-length:
      - '1467'
    status:
      code: 200
      message: OK
- request:
    body: '{"contents": [{"parts": [{"text": "What is the largest city in the user
      country? Use the get_user_country tool and then your own world knowledge."}],
      "role": "user"}], "systemInstruction": {"parts": [{"text": "You are a chatbot."}],
      "role": "user"}, "tools": [{"functionDeclarations": [{"description": "", "name":
      "get_user_country", "parameters": {"additional_properties": false, "properties":
      {}, "type": "OBJECT"}}]}]}'
    headers:
      Content-Type:
      - application/json
      user-agent:
      - google-genai-sdk/1.25.0 gl-python/3.12.7
      x-goog-api-client:
      - google-genai-sdk/1.25.0 gl-python/3.12.7
    method: post
    uri: https://us-central1-aiplatform.googleapis.com/v1beta1/projects/[REDACTED]-[REDACTED]/locations/us-central1/publishers/google/models/gemini-2.5-flash:countTokens
  response:
    body:
      string: "{\n  \"totalTokens\": 36,\n  \"totalBillableCharacters\": 123,\n  \"promptTokensDetails\":
        [\n    {\n      \"modality\": \"TEXT\",\n      \"tokenCount\": 36\n    }\n
        \ ]\n}\n"
    headers:
      Alt-Svc:
      - h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
      Content-Type:
      - application/json; charset=UTF-8
      Date:
      - Thu, 07 Aug 2025 16:48:02 GMT
      Server:
      - scaffolding on HTTPServer2
      Transfer-Encoding:
      - chunked
      Vary:
      - Origin
      - X-Origin
      - Referer
      X-Content-Type-Options:
      - nosniff
      X-Frame-Options:
      - SAMEORIGIN
      X-XSS-Protection:
      - '0'
      content-length:
      - '151'
    status:
      code: 200
      message: OK
version: 1

@kauabh (Contributor, author) commented Aug 13, 2025

Hey @DouweM, please share your thoughts on the changes and also on the VCR setup for this Vertex AI limit test; I'm not able to run it in CI.

DouweM changed the title from "Adding CountToken to Gemini" to "Add UsageLimits.count_tokens_before_request and using Gemini count_tokens API" Aug 13, 2025
DouweM changed the title to "Add UsageLimits.count_tokens_before_request using Gemini count_tokens API" Aug 13, 2025
DouweM merged commit 2293595 into pydantic:main Aug 13, 2025
33 checks passed
@DouweM (Collaborator) commented Aug 13, 2025

@kauabh I was able to fix it up, thanks a lot!

kauabh deleted the patch-2 branch August 14, 2025 10:22
@kauabh (Contributor, author) commented Aug 15, 2025

Hey @DouweM, thank you so much! Just wanted to ask what mistake I was making in creating that Vertex AI cassette.

@DouweM (Collaborator) commented Aug 15, 2025

@kauabh Vertex is a bit tricky; it took me some time to make it work as well.

I had to comment out these lines:

    if not os.getenv('CI', False):
        pytest.skip('Requires properly configured local google vertex config to pass')

and make sure I authenticated with a service account just like happens in CI, so that the VCR requests are exactly the same.

ethanabrooks added a commit to reflectionai/pydantic-ai that referenced this pull request Aug 20, 2025
* Add `priority` `service_tier` to `OpenAIModelSettings` and respect it in `OpenAIResponsesModel` (pydantic#2368)

* Add an example of using RunContext to pass data among tools (pydantic#2316)

Co-authored-by: Douwe Maan <[email protected]>

* Rename gemini-2.5-flash-lite-preview-06-17 to gemini-2.5-flash-lite as it's out of preview (pydantic#2387)

* Fix toggleable toolset example so toolset state is not shared across agent runs (pydantic#2396)

* Support custom thinking tags specified on the model profile (pydantic#2364)

Co-authored-by: jescudero <[email protected]>
Co-authored-by: Douwe Maan <[email protected]>

* Add convenience functions to handle AG-UI requests with request-specific deps (pydantic#2397)

* docs: add missing optional packages in `install.md` (pydantic#2412)

* Include default values in tool arguments JSON schema (pydantic#2418)

* Fix "test_download_item_no_content_type test fails on macOS" (pydantic#2404)

* Allow string format, pattern and others in OpenAI strict JSON mode (pydantic#2420)

* Let more `BaseModel`s use OpenAI strict JSON mode by defaulting to `additionalProperties=False` (pydantic#2419)

* BREAKING CHANGE: Change type of 'source' field on EvaluationResult (pydantic#2388)

Co-authored-by: Douwe Maan <[email protected]>

* Fix ImageUrl, VideoUrl, AudioUrl and DocumentUrl not being serializable (pydantic#2422)

* BREAKING CHANGE: Support printing reasons in the console output for pydantic-evals (pydantic#2163)

* Document performance implications of async vs sync tools (pydantic#2298)

Co-authored-by: Douwe Maan <[email protected]>

* Mention that tools become toolset internally (pydantic#2395)

Co-authored-by: Douwe Maan <[email protected]>

* Fix tests for Logfire>=3.22.0 (pydantic#2346)

* tests: speed up the test suite (pydantic#2414)

* google: add more information about schema on union (pydantic#2426)

* typo in output docs (pydantic#2427)

* Deprecate `GeminiModel` in favor of `GoogleModel` (pydantic#2416)

* Use `httpx` on `GoogleProvider` (pydantic#2438)

* Remove older deprecated models and add new model of Anthropic (pydantic#2435)

* Remove `next()` method from `Graph` (pydantic#2440)

* BREAKING CHANGE: Remove `data` from `FinalResult` (pydantic#2443)

* BREAKING CHANGE: Remove `get_data` and `validate_structured_result` from `StreamedRunResult` (pydantic#2445)

* docs: add `griffe_warnings_deprecated` (pydantic#2444)

* BREAKING CHANGE: Remove `format_as_xml` module (pydantic#2446)

* BREAKING CHANGE: Remove `result_type` parameter and similar from `Agent` (pydantic#2441)

* Deprecate `GoogleGLAProvider` and `GoogleVertexProvider` (pydantic#2450)

* BREAKING CHANGE: drop 4 months old deprecation warnings (pydantic#2451)

* Automatically use OpenAI strict mode for strict-compatible native output types (pydantic#2447)

* Make `InlineDefsJsonSchemaTransformer` public (pydantic#2455)

* Send `ThinkingPart`s back to Anthropic used through Bedrock (pydantic#2454)

* Bump boto3 to support `AWS_BEARER_TOKEN_BEDROCK` API key env var (pydantic#2456)

* Add new Heroku models (pydantic#2459)

* Add `builtin_tools` to `Agent` (pydantic#2102)

Co-authored-by: Marcelo Trylesinski <[email protected]>
Co-authored-by: Douwe Maan <[email protected]>

* Bump mcp-run-python (pydantic#2470)

* Remove fail_under from top-level coverage config so <100% html-coverage step doesn't end CI run (pydantic#2475)

* Add AbstractAgent, WrapperAgent, Agent.event_stream_handler, Toolset.id, Agent.override(tools=...) in preparation for Temporal (pydantic#2458)

* Let toolsets be built dynamically based on run context (pydantic#2366)

Co-authored-by: Douwe Maan <[email protected]>

* Add ToolsetFunc to API docs (fix CI) (pydantic#2486)

* tests: change time of evals example (pydantic#2501)

* ci: remove html and xml reports (pydantic#2491)

* fix: Add gpt-5 models to reasoning model detection for temperature parameter handling (pydantic#2483)

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: Douwe Maan <[email protected]>
Co-authored-by: Marcelo Trylesinski <[email protected]>

* History processor replaces message history (pydantic#2324)

Co-authored-by: Marcelo Trylesinski <[email protected]>

* ci: split test suite (pydantic#2436)

Co-authored-by: Douwe Maan <[email protected]>

* ci: use the right install command (pydantic#2506)

* Update config.yaml (pydantic#2514)

* Skip testing flaky evals example (pydantic#2518)

* Fix error when parsing usage details for video without audio track in Google models (pydantic#2507)

* Make OpenAIResponsesModelSettings.openai_builtin_tools work again (pydantic#2520)

* Let Agent be run in a Temporal workflow by moving model requests, tool calls, and MCP to Temporal activities (pydantic#2225)

* Install only dev in CI (pydantic#2523)

* Improve CLAUDE.md (pydantic#2524)

* Add best practices regarding to coverage to CLAUDE.md (pydantic#2527)

* Add support for `"openai-responses"` model inference string (pydantic#2528)

Co-authored-by: Claude <[email protected]>

* docs: Confident AI (pydantic#2529)

* chore: mention what to do with the documentation when deprecating a class (pydantic#2530)

* chore: drop hyperlint (pydantic#2531)

* ci: improve matrix readability (pydantic#2532)

* Add pip to dev deps for PyCharm (pydantic#2533)

Co-authored-by: Marcelo Trylesinski <[email protected]>

* Add genai-prices to dev deps and a basic test (pydantic#2537)

* Add `--durations=100` to all pytest calls in CI (pydantic#2534)

* Cleanup snapshot in test_evaluate_async_logfire (pydantic#2538)

* Make some minor tweaks to the temporal docs (pydantic#2522)

Co-authored-by: Douwe Maan <[email protected]>

* Add new OpenAI GPT-5 models (pydantic#2503)

* Fix `FallbackModel` to respect each model's model settings (pydantic#2540)

* Add support for OpenAI verbosity parameter in Responses API (pydantic#2493)

Co-authored-by: Claude <[email protected]>
Co-authored-by: Douwe Maan <[email protected]>

* Add `UsageLimits.count_tokens_before_request` using Gemini `count_tokens` API (pydantic#2137)

Co-authored-by: Douwe Maan <[email protected]>

* chore: Fix uv.lock (pydantic#2546)

* Stop calling MCP server `get_tools` ahead of `agent run` span (pydantic#2545)

* Disable instrumentation by default in tests (pydantic#2535)

Co-authored-by: Marcelo Trylesinski <[email protected]>

* Only wrap necessary parts of type aliases in forward annotations (pydantic#2548)

* Remove anthropic-beta default header set in `AnthropicModel` (pydantic#2544)

Co-authored-by: Marcelo Trylesinski <[email protected]>

* docs: Clarify why AG-UI example links are on localhost (pydantic#2549)

* chore: Fix path to agent class in CLAUDE.md (pydantic#2550)

* Ignore leading whitespace when streaming from Qwen or DeepSeek (pydantic#2554)

* Ask model to try again if it produced a response without text or tool calls, only thinking (pydantic#2556)

Co-authored-by: Douwe Maan <[email protected]>

* chore: Improve Temporal test to check trace as tree instead of list (pydantic#2559)

* Fix: Forward max_uses parameter to Anthropic WebSearchTool (pydantic#2561)

* Let message history end on ModelResponse and execute pending tool calls (pydantic#2562)

* Fix type issues

* skip tests requiring API keys

* add `google-genai` dependency

* add other provider deps

* add pragma: no cover for untested logic

---------

Co-authored-by: akenar <[email protected]>
Co-authored-by: Tony Woland <[email protected]>
Co-authored-by: Douwe Maan <[email protected]>
Co-authored-by: Yi-Chen Lin <[email protected]>
Co-authored-by: José I. Escudero <[email protected]>
Co-authored-by: jescudero <[email protected]>
Co-authored-by: Marcelo Trylesinski <[email protected]>
Co-authored-by: William Easton <[email protected]>
Co-authored-by: David Montague <[email protected]>
Co-authored-by: Guillermo <[email protected]>
Co-authored-by: Hamza Farhan <[email protected]>
Co-authored-by: Mohamed Amine Zghal <[email protected]>
Co-authored-by: Yinon Ehrlich <[email protected]>
Co-authored-by: Matthew Brandman <[email protected]>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: Douwe Maan <[email protected]>
Co-authored-by: Alex Enrique <[email protected]>
Co-authored-by: Jerry Yan <[email protected]>
Co-authored-by: Claude <[email protected]>
Co-authored-by: Mayank <[email protected]>
Co-authored-by: Alex Hall <[email protected]>
Co-authored-by: Jerry Lin <[email protected]>
Co-authored-by: Raymond Xu <[email protected]>
Co-authored-by: kauabh <[email protected]>
Co-authored-by: Victorien <[email protected]>
Co-authored-by: Ethan Brooks <[email protected]>
Co-authored-by: eballesteros <[email protected]>