feat: add dashscope/qwen3-max model with tiered pricing #20919
Merged

ishaan-jaff merged 1 commit into BerriAI:main on Feb 11, 2026.
Conversation
Add support for Alibaba Cloud's Qwen3-Max model with:

- 258K input tokens, 65K output tokens
- Tiered pricing based on context window usage (0-32K, 32K-128K, 128K-252K)
- Function calling and tool choice support
- Reasoning capabilities enabled

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
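To make the "tiered pricing based on context window usage" idea concrete, here is a minimal sketch of how such pricing is typically computed: the per-token input rate depends on which tier the total prompt size falls into, and that tier's rate is applied to the whole prompt. Only the tier boundaries (0-32K, 32K-128K, 128K-252K) come from this PR; the rates below are hypothetical placeholders, not the real dashscope/qwen3-max prices, and LiteLLM's actual cost calculation may differ in detail.

```python
# Tier boundaries are from the PR description; the USD rates are
# HYPOTHETICAL placeholders for illustration only.
TIERS = [
    (32_000, 1.2e-6),   # prompts up to 32K input tokens
    (128_000, 2.4e-6),  # prompts of 32K-128K input tokens
    (252_000, 4.8e-6),  # prompts of 128K-252K input tokens
]


def input_rate_for(prompt_tokens: int) -> float:
    """Return the per-token input rate for the tier the prompt falls into."""
    for upper_bound, rate in TIERS:
        if prompt_tokens <= upper_bound:
            return rate
    raise ValueError(f"{prompt_tokens} input tokens exceeds the largest tier")


def input_cost(prompt_tokens: int) -> float:
    """Cost of the prompt, assuming the tier's rate applies to every
    input token (tier selected by total prompt size, not graduated)."""
    return prompt_tokens * input_rate_for(prompt_tokens)
```

For example, a 50K-token prompt falls in the middle tier, so all 50K tokens are billed at that tier's rate rather than splitting the prompt across tiers.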
Greptile Overview

Greptile Summary: Adds the `dashscope/qwen3-max` model entry with tiered pricing, mirroring the existing `dashscope/qwen3-max-preview` configuration.
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| model_prices_and_context_window.json | Adds dashscope/qwen3-max model entry with tiered pricing, identical configuration to existing dashscope/qwen3-max-preview. Structure follows established patterns correctly. |
| litellm/model_prices_and_context_window_backup.json | Backup file mirrors the same dashscope/qwen3-max entry added in the primary JSON file. Both files are in sync. |
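The review table notes that the primary pricing file and its backup carry identical `dashscope/qwen3-max` entries. A small sketch of how one might verify that two pricing JSON files stay in sync for a given model key is shown below; this helper is illustrative only and is not part of LiteLLM's test suite.

```python
import json


def entries_in_sync(primary_path: str, backup_path: str, model_key: str) -> bool:
    """Return True if both pricing files contain an identical entry
    for model_key (e.g. "dashscope/qwen3-max")."""
    with open(primary_path) as f:
        primary = json.load(f)
    with open(backup_path) as f:
        backup = json.load(f)
    # dict equality is deep, so nested pricing fields are compared too
    return primary.get(model_key) == backup.get(model_key)
```

In this PR the two paths would be `model_prices_and_context_window.json` and `litellm/model_prices_and_context_window_backup.json`.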
krrishdholakia pushed a commit that referenced this pull request on Feb 11, 2026:
…e OpenAI (#20883)

* feat: add opus 4.5 and 4.6 to use outout_format param
* generate poetry lock with 2.3.2 poetry
* restore poetry lock
* e2e tests, key delete, update tpm rpm, and regenerate
* Split e2e ui testing for browser
* new login with sso button in login page
* option to hide usage indicator
* fix(cloudzero): update CBF field mappings per LIT-1907 (#20906)
  * fix(cloudzero): update CBF field mappings per LIT-1907

    Phase 1 field updates for CloudZero integration:

    ADD/UPDATE:
    - resource/account: Send concat(api_key_alias, '|', api_key_prefix)
    - resource/service: Send model_group instead of service_type
    - resource/usage_family: Send provider instead of hardcoded 'llm-usage'
    - action/operation: NEW - Send team_id
    - resource/id: Send model name instead of CZRN
    - resource/tag:organization_alias: Add if exists
    - resource/tag:project_alias: Add if exists
    - resource/tag:user_alias: Add if exists

    REMOVE:
    - resource/tag:total_tokens: Removed
    - resource/tag:team_id: Removed (team_id now in action/operation)

    Fixes LIT-1907
  * Update litellm/integrations/cloudzero/transform.py
    Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
  * fix: define api_key_alias variable, update CBFRecord docstring
    - Fix F821 lint error: api_key_alias was used but not defined
    - Update CBFRecord docstring to reflect LIT-1907 field mappings
    - Remove unused Optional import
  ---------
  Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
  Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Add banner notifying of breaking change
* Add semgrep & Fix OOMs (#20912)
* [Feat] Policies - Allow connecting Policies to Tags, Simulating Policies, Viewing how many keys, teams it applies on (#20904)
  * init schema with TAGS
  * ui: add policy test
  * resolvePoliciesCall
  * add_policy_sources_to_metadata + headers
  * types Policy
  * preview Impact
  * def _describe_match_reason(
  * match based on TAGs
  * TestTagBasedAttachments
  * test fixes
  * add policy_resolve_router
  * add_guardrails_from_policy_engine
  * TestMatchAttribution
  * refactor
  * fix
  * fix: address Greptile review feedback on policy resolve endpoints
    - Track unnamed keys/teams as separate counts instead of inflating affected_keys_count with duplicate "(unnamed key)" placeholders. Added unnamed_keys_count and unnamed_teams_count to response.
    - Push alias pattern matching to DB via _build_alias_where(), which converts exact patterns to Prisma "in" and suffix wildcards to "startsWith" filters.
    - Gate sync_policies_from_db/sync_attachments_from_db behind a force_sync query param (default false) to avoid 2 DB round-trips on every /policies/resolve request.
    - Remove worktree-only conftest.py that cleared sys.modules at import time; no longer needed since code moved to main repo.
    - Rename MAX_ESTIMATE_IMPACT_ROWS to MAX_POLICY_ESTIMATE_IMPACT_ROWS.
    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  * fix: eliminate duplicate DB queries and fix header delimiter ambiguity
    - Fetch the teams table once in estimate_attachment_impact and reuse it for both tag-based and alias-based lookups (was querying teams twice when both tag_patterns and team_patterns were provided).
    - Convert tag/team filter functions from async DB queries to sync filters that operate on pre-fetched data (_filter_keys_by_tags, _filter_teams_by_tags).
    - Fix comma ambiguity in the x-litellm-policy-sources header: use '; ' as the entry delimiter, since matched_via values can contain commas.
    - Use '+' as the within-value separator in matched_via reason strings (e.g. "tag:healthcare+team:health-team") to avoid conflict with header delimiters.
    Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  * Update litellm/proxy/policy_engine/policy_resolve_endpoints.py
    Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
  ---------
  Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
  Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* fix: type error & better error handling (#20689)
* [Docs] Add docs guide for using policies (#20914)
  (repeats the same policy-engine sub-commits and Greptile-feedback fixes listed under #20904 above)
  * docs v1 guide with UI imgs
  * docs fix
  ---------
  Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add dashscope/qwen3-max model with tiered pricing (#20919)
  Add support for Alibaba Cloud's Qwen3-Max model with:
  - 258K input tokens, 65K output tokens
  - Tiered pricing based on context window usage (0-32K, 32K-128K, 128K-252K)
  - Function calling and tool choice support
  - Reasoning capabilities enabled
  Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix linting
* docs: add Greptile review requirement to PR template (#20762)
* fix(azure): preserve content_policy_violation error details from Azure OpenAI
  Closes #20811

  Azure OpenAI returns rich error payloads for content policy violations (inner_error with ResponsibleAIPolicyViolation, content_filter_results, revised_prompt). Previously these details were lost when:
  1. The top-level error code was not "content_policy_violation" but inner_error.code was "ResponsibleAIPolicyViolation" -- the structured check only examined the top-level code.
  2. The DALL-E image generation polling path stringified the error JSON into the message field instead of setting the structured body, making it impossible for exception_type() to extract error details.
  3. The string-based fallback detector used "invalid_request_error" as a content-policy indicator, which is too broad and could misclassify regular bad-request errors.

  Changes:
  - exception_mapping_utils.py: Check inner_error.code for ResponsibleAIPolicyViolation when the top-level code is not content_policy_violation. Replace the overly broad "invalid_request_error" string match with specific Azure safety-system messages.
  - azure.py: Set a structured body on AzureOpenAIError in both the async and sync DALL-E polling paths so exception_type() can inspect error details.
  - test_azure_exception_mapping.py: Add regression tests covering the exact error payloads from issue #20811.
  - Fix pre-existing lint: duplicate PerplexityResponsesConfig dict key, unused RouteChecks top-level import.

---------

Co-authored-by: Kelvin Tran <kelvin-tran@users.noreply.github.com>
Co-authored-by: yuneng-jiang <yuneng.jiang@gmail.com>
Co-authored-by: shin-bot-litellm <shin-bot-litellm@berri.ai>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Alexsander Hamir <alexsanderhamirgomesbaptista@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Harshit Jain <48647625+Harshit28j@users.noreply.github.com>
Co-authored-by: ken <122603020@qq.com>
Co-authored-by: Sameer Kankute <sameer@berri.ai>
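The policy-engine sub-commits above describe a deliberate delimiter design for the `x-litellm-policy-sources` header: `'; '` separates entries (because `matched_via` values can contain commas) and `'+'` separates parts within a `matched_via` value. A sketch of a parser under those assumptions follows; the overall header shape beyond the two delimiters is a guess for illustration, not LiteLLM's documented format.

```python
def parse_policy_sources(header: str) -> list[list[str]]:
    """Split an x-litellm-policy-sources header value into entries and
    their matched_via parts, using the delimiters described in the
    commit message: '; ' between entries, '+' within a value.
    The exact header shape is an assumption for illustration."""
    entries = [entry for entry in header.split("; ") if entry]
    return [entry.split("+") for entry in entries]
```

For example, `"tag:healthcare+team:health-team; tag:eu"` parses into two entries, the first matched via both a tag and a team, without the ambiguity a comma delimiter would cause.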
Add support for Alibaba Cloud's Qwen3-Max model with:

- 258K input tokens, 65K output tokens
- Tiered pricing based on context window usage (0-32K, 32K-128K, 128K-252K)
- Function calling and tool choice support
- Reasoning capabilities enabled
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- Added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement - see details)
- Unit tests pass via make test-unit

CI (LiteLLM team)
- Branch creation CI run - Link:
- CI run for the last commit - Link:
- Merge / cherry-pick CI run - Links:
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Changes