Z.ai: add GLM-4.5-X, AirX, Flash (expand model coverage) (#8745)
* feat(zai): add GLM-4.5-X, AirX, Flash; sync with Z.ai docs; keep canonical api line keys
* feat(zai): add GLM-4.5V vision model (supportsImages, pricing, 16K max output); add tests
* feat(types,zai): sync Z.AI international model map and tests
  - Update pricing, context window, and capabilities for:
    glm-4.5-x, glm-4.5-airx, glm-4.5-flash, glm-4.5v, glm-4.6
  - Add glm-4-32b-0414-128k
  - Align tests with new model specs
* fix(zai): align handler generics with expanded model ids to satisfy CI compile step
* chore(zai): remove tier pricing blocks for Z.ai models
* fix(zai): simplify names in zaiApiLineConfigs for clarity
* chore(zai): set default temperature to 0.6
---------
Co-authored-by: Roo Code <[email protected]>
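
Every entry added in this commit uses the same set of fields. As a rough illustration of how those fields might be consumed, here is a minimal sketch that estimates the cost of one request. `ModelInfoSketch` and `estimateCost` are hypothetical names, not taken from the repository. The sketch assumes prices are quoted per one million tokens (the diff does not state the unit) and that `inputTokens` excludes tokens served from the prompt cache.

```typescript
// Hypothetical shape inferred from the fields visible in this diff;
// the real type in the repository may carry more fields.
interface ModelInfoSketch {
    maxTokens: number
    contextWindow: number
    supportsImages: boolean
    supportsPromptCache: boolean
    inputPrice: number
    outputPrice: number
    cacheWritesPrice: number
    cacheReadsPrice: number
    description: string
}

// Values copied from the first "glm-4.5-x" entry in this diff.
const glm45x: ModelInfoSketch = {
    maxTokens: 98_304,
    contextWindow: 131_072,
    supportsImages: false,
    supportsPromptCache: true,
    inputPrice: 2.2,
    outputPrice: 8.9,
    cacheWritesPrice: 0,
    cacheReadsPrice: 0.45,
    description: "GLM-4.5-X is a high-performance variant optimized for strong reasoning with ultra-fast responses.",
}

interface UsageSketch {
    inputTokens: number // assumption: uncached input tokens only
    outputTokens: number
    cacheReadTokens: number
}

// Assumption: every price is "per one million tokens".
function estimateCost(info: ModelInfoSketch, usage: UsageSketch): number {
    const million = 1_000_000
    return (
        (usage.inputTokens * info.inputPrice) / million +
        (usage.outputTokens * info.outputPrice) / million +
        (usage.cacheReadTokens * info.cacheReadsPrice) / million
    )
}

const exampleCost = estimateCost(glm45x, {
    inputTokens: 1_000_000,
    outputTokens: 500_000,
    cacheReadTokens: 200_000,
})
console.log(exampleCost.toFixed(2))
```

Applied to the `glm-4.5-flash` entries, where every price is 0, the same function returns 0 for any usage.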
"GLM-4.5-Air is the lightweight version of GLM-4.5. It balances performance and cost-effectiveness, and can flexibly switch to hybrid thinking models.",
     },
+    "glm-4.5-x": {
+        maxTokens: 98_304,
+        contextWindow: 131_072,
+        supportsImages: false,
+        supportsPromptCache: true,
+        inputPrice: 2.2,
+        outputPrice: 8.9,
+        cacheWritesPrice: 0,
+        cacheReadsPrice: 0.45,
+        description:
+            "GLM-4.5-X is a high-performance variant optimized for strong reasoning with ultra-fast responses.",
+    },
+    "glm-4.5-airx": {
+        maxTokens: 98_304,
+        contextWindow: 131_072,
+        supportsImages: false,
+        supportsPromptCache: true,
+        inputPrice: 1.1,
+        outputPrice: 4.5,
+        cacheWritesPrice: 0,
+        cacheReadsPrice: 0.22,
+        description: "GLM-4.5-AirX is a lightweight, ultra-fast variant delivering strong performance with lower cost.",
+    },
+    "glm-4.5-flash": {
+        maxTokens: 98_304,
+        contextWindow: 131_072,
+        supportsImages: false,
+        supportsPromptCache: true,
+        inputPrice: 0,
+        outputPrice: 0,
+        cacheWritesPrice: 0,
+        cacheReadsPrice: 0,
+        description: "GLM-4.5-Flash is a free, high-speed model excellent for reasoning, coding, and agentic tasks.",
+    },
+    "glm-4.5v": {
+        maxTokens: 16_384,
+        contextWindow: 131_072,
+        supportsImages: true,
+        supportsPromptCache: true,
+        inputPrice: 0.6,
+        outputPrice: 1.8,
+        cacheWritesPrice: 0,
+        cacheReadsPrice: 0.11,
+        description:
+            "GLM-4.5V is Z.AI's multimodal visual reasoning model (image/video/text/file input), optimized for GUI tasks, grounding, and document/video understanding.",
"GLM-4.6 is Zhipu's newest model with an extended context window of up to 200k tokens, providing enhanced capabilities for processing longer documents and conversations.",
     },
+    "glm-4-32b-0414-128k": {
+        maxTokens: 98_304,
+        contextWindow: 131_072,
+        supportsImages: false,
+        supportsPromptCache: false,
+        inputPrice: 0.1,
+        outputPrice: 0.1,
+        cacheWritesPrice: 0,
+        cacheReadsPrice: 0,
+        description: "GLM-4-32B is a 32 billion parameter model with 128k context length, optimized for efficiency.",
"GLM-4.5 is Zhipu's latest featured model. Its comprehensive capabilities in reasoning, coding, and agent reach the state-of-the-art (SOTA) level among open-source models, with a context length of up to 128k.",
"GLM-4.5-Air is the lightweight version of GLM-4.5. It balances performance and cost-effectiveness, and can flexibly switch to hybrid thinking models.",
-        tiers: [
-            {
-                contextWindow: 32_000,
-                inputPrice: 0.07,
-                outputPrice: 0.4,
-                cacheReadsPrice: 0.014,
-            },
-            {
-                contextWindow: 128_000,
-                inputPrice: 0.1,
-                outputPrice: 0.6,
-                cacheReadsPrice: 0.02,
-            },
-            {
-                contextWindow: Infinity,
-                inputPrice: 0.1,
-                outputPrice: 0.6,
-                cacheReadsPrice: 0.02,
-            },
-        ],
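
For context on what the removed `tiers` block expressed: each tier pairs a context-window ceiling with its own prices. A minimal sketch of how such a table could be resolved follows. `PriceTier` and `pickTier` are hypothetical names, and the rule used here (take the first tier whose `contextWindow` is at least the prompt size) is an assumption, not necessarily how the repository resolved tiers.

```typescript
// Hypothetical tier shape mirroring the fields in the removed block.
interface PriceTier {
    contextWindow: number
    inputPrice: number
    outputPrice: number
    cacheReadsPrice: number
}

// Values copied from the removed lines of this diff.
const removedTiers: PriceTier[] = [
    { contextWindow: 32_000, inputPrice: 0.07, outputPrice: 0.4, cacheReadsPrice: 0.014 },
    { contextWindow: 128_000, inputPrice: 0.1, outputPrice: 0.6, cacheReadsPrice: 0.02 },
    { contextWindow: Infinity, inputPrice: 0.1, outputPrice: 0.6, cacheReadsPrice: 0.02 },
]

// Assumed rule: tiers are sorted ascending and the first tier that can
// hold the prompt applies.
function pickTier(tiers: PriceTier[], promptTokens: number): PriceTier {
    for (const tier of tiers) {
        if (promptTokens <= tier.contextWindow) {
            return tier
        }
    }
    throw new Error("no tier covers this prompt size")
}

const smallPromptTier = pickTier(removedTiers, 10_000)
const largePromptTier = pickTier(removedTiers, 100_000)
console.log(smallPromptTier.inputPrice, largePromptTier.inputPrice)
```

With the block removed, only the entry's top-level prices remain, so no lookup of this kind is needed.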
+    },
+    "glm-4.5-x": {
+        maxTokens: 98_304,
+        contextWindow: 131_072,
+        supportsImages: false,
+        supportsPromptCache: true,
+        inputPrice: 0.29,
+        outputPrice: 1.14,
+        cacheWritesPrice: 0,
+        cacheReadsPrice: 0.057,
+        description:
+            "GLM-4.5-X is a high-performance variant optimized for strong reasoning with ultra-fast responses.",
+    },
+    "glm-4.5-airx": {
+        maxTokens: 98_304,
+        contextWindow: 131_072,
+        supportsImages: false,
+        supportsPromptCache: true,
+        inputPrice: 0.1,
+        outputPrice: 0.6,
+        cacheWritesPrice: 0,
+        cacheReadsPrice: 0.02,
+        description: "GLM-4.5-AirX is a lightweight, ultra-fast variant delivering strong performance with lower cost.",
+    },
+    "glm-4.5-flash": {
+        maxTokens: 98_304,
+        contextWindow: 131_072,
+        supportsImages: false,
+        supportsPromptCache: true,
+        inputPrice: 0,
+        outputPrice: 0,
+        cacheWritesPrice: 0,
+        cacheReadsPrice: 0,
+        description: "GLM-4.5-Flash is a free, high-speed model excellent for reasoning, coding, and agentic tasks.",
+    },
+    "glm-4.5v": {
+        maxTokens: 16_384,
+        contextWindow: 131_072,
+        supportsImages: true,
+        supportsPromptCache: true,
+        inputPrice: 0.29,
+        outputPrice: 0.93,
+        cacheWritesPrice: 0,
+        cacheReadsPrice: 0.057,
+        description:
+            "GLM-4.5V is Z.AI's multimodal visual reasoning model (image/video/text/file input), optimized for GUI tasks, grounding, and document/video understanding.",
"GLM-4.6 is Zhipu's newest model with an extended context window of up to 200k tokens, providing enhanced capabilities for processing longer documents and conversations.",
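
The commit message mentions aligning tests with the new model specs, but the tests themselves are not part of this excerpt. As a rough sketch of the kind of sanity check such tests might include, the hypothetical `findSpecProblems` helper below flags entries whose `maxTokens` exceeds `contextWindow` or whose prices are negative. Both the helper and the `ModelSpec` type are assumptions made for illustration.

```typescript
// Hypothetical subset of the per-model fields shown in this diff.
interface ModelSpec {
    maxTokens: number
    contextWindow: number
    supportsImages: boolean
    inputPrice: number
    outputPrice: number
}

// Values copied from the second group of entries in this diff.
const sampleModels: Record<string, ModelSpec> = {
    "glm-4.5-flash": {
        maxTokens: 98_304,
        contextWindow: 131_072,
        supportsImages: false,
        inputPrice: 0,
        outputPrice: 0,
    },
    "glm-4.5v": {
        maxTokens: 16_384,
        contextWindow: 131_072,
        supportsImages: true,
        inputPrice: 0.29,
        outputPrice: 0.93,
    },
}

// Returns one human-readable message per violated rule.
function findSpecProblems(models: Record<string, ModelSpec>): string[] {
    const problems: string[] = []
    for (const id of Object.keys(models)) {
        const spec = models[id]
        if (spec.maxTokens > spec.contextWindow) {
            problems.push(`${id}: maxTokens exceeds contextWindow`)
        }
        if (spec.inputPrice < 0 || spec.outputPrice < 0) {
            problems.push(`${id}: negative price`)
        }
    }
    return problems
}

console.log(findSpecProblems(sampleModels))
```

Both sample entries pass these two rules, so the returned list is empty for them.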