Commit 6886416

feat: new multi service router
1 parent 76f1e53 commit 6886416

255 files changed (+5232, -3764 lines)

.cspell.json (+2, -3)
@@ -9,8 +9,7 @@
   "dictionaries": ["project-words"],
   "ignorePaths": [
     "node_modules",
-    "./src/docs/public/apidocs",
-    "./src/ax/dsp/stopwords.ts",
-    "./src/examples/qna-tune.ts"
+    "./src/docs/src/content/docs/03-apidocs",
+    "./src/ax/dsp/stopwords.ts"
   ]
 }

.cspell/project-words.txt (+1)
@@ -45,6 +45,7 @@ Macbook
 minilm
 Mixtral
 modelinfo
+multiservice
 nanos
 nemo
 Nemo

README.md (+113, -42)
@@ -26,9 +26,9 @@ Use Ax and get an end-to-end streaming, multi-modal DSPy framework with agents a

 <img width="860" alt="shapes at 24-03-31 00 05 55" src="https://github.com/dosco/llm-client/assets/832235/0f0306ea-1812-4a0a-9ed5-76cd908cd26b">

-Efficient type-safe prompts are auto-generated from a simple signature. A prompt signature is made up of a `"task description" inputField:type "field description" -> "outputField:type`. The idea behind prompt signatures is based on work done in the "Demonstrate-Search-Predict" paper.
+Efficient type-safe prompts are auto-generated from a simple signature. A prompt signature is made up of a `"task description" inputField:type "field description" -> "outputField:type`. The idea behind prompt signatures is based on work done in the "Demonstrate-Search-Predict" paper.

-You can have multiple input and output fields, and each field can be of the types `string`, `number`, `boolean`, `date`, `datetime`, `class "class1, class2"`, `JSON`, or an array of any of these, e.g., `string[]`. When a type is not defined, it defaults to `string`. The underlying AI is encouraged to generate the correct JSON when the `JSON` type is used.
+You can have multiple input and output fields, and each field can be of the types `string`, `number`, `boolean`, `date`, `datetime`, `class "class1, class2"`, `JSON`, or an array of any of these, e.g., `string[]`. When a type is not defined, it defaults to `string`. The suffix `?` makes the field optional (required by default) and `!` makes the field internal which is good for things like reasoning.
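
For instance, a signature combining these field types with the new `?` and `!` suffixes might look like the sketch below (field names are illustrative; it assumes the suffixes are written directly on the output field names, as described above):

```typescript
import { AxAI, AxGen } from '@ax-llm/ax'

// Sketch only: `reasoning!` is an internal field, `followUp?` is optional.
const gen = new AxGen(
  `"answer customer questions" question:string -> reasoning!:string, answer:string, followUp?:string`
)

const ai = new AxAI({ name: 'openai', apiKey: process.env.OPENAI_APIKEY as string })
const res = await gen.forward(ai, { question: 'How do I return a product?' })
console.log(res.answer)
```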

 ## Output Field Types

@@ -46,6 +46,7 @@ You can have multiple input and output fields, and each field can be of the type
 | `date[]` | An array of dates. | `holidayDates:date[]` | `["2023-10-01", "2023-10-02"]` |
 | `datetime[]` | An array of date and time values. | `logTimestamps:datetime[]` | `["2023-10-01T12:00:00Z", "2023-10-02T12:00:00Z"]` |
 | `class[] "class1,class2"` | Multiple classes | `categories:class[]` | `["class1", "class2", "class3"]` |
+| `code "language"` | A code block in a specific language | `code:code "python"` | `print('Hello, world!')` |
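
The new `code` output type can be sketched like this (assuming the `fieldName:code "language"` form shown in the table row above; names are illustrative):

```typescript
import { AxAI, AxGen } from '@ax-llm/ax'

// Sketch only: ask for Python source in a `code "python"` output field.
const gen = new AxGen(`"solve the task with a short program" task:string -> solution:code "python"`)

const ai = new AxAI({ name: 'openai', apiKey: process.env.OPENAI_APIKEY as string })
const res = await gen.forward(ai, { task: 'Print hello world' })
console.log(res.solution) // e.g. print('Hello, world!')
```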
@@ -167,7 +168,7 @@ Launch Apache Tika
 docker run -p 9998:9998 apache/tika
 ```

-Convert documents to text and embed them for retrieval using the `AxDBManager`, which also supports a reranker and query rewriter. Two default implementations, `AxDefaultResultReranker` and `AxDefaultQueryRewriter`, are available.
+Convert documents to text and embed them for retrieval using the `AxDBManager`, which also supports a reranker and query rewriter. Two default implementations, `AxDefaultResultReranker` and `AxDefaultQueryRewriter`, are available.

 ```typescript
 const tika = new AxApacheTika();
@@ -182,7 +183,7 @@ console.log(matches);

 ## Multi-modal DSPy

-When using models like `GPT-4o` and `Gemini` that support multi-modal prompts, we support using image fields, and this works with the whole DSP pipeline.
+When using models like `GPT-4o` and `Gemini` that support multi-modal prompts, we support using image fields, and this works with the whole DSP pipeline.

 ```typescript
 const image = fs
@@ -197,7 +198,7 @@ const res = await gen.forward(ai, {
 });
 ```

-When using models like `gpt-4o-audio-preview` that support multi-modal prompts with audio support, we support using audio fields, and this works with the whole DSP pipeline.
+When using models like `gpt-4o-audio-preview` that support multi-modal prompts with audio support, we support using audio fields, and this works with the whole DSP pipeline.

 ```typescript
 const audio = fs
@@ -289,49 +290,119 @@ const processor = new AxFieldProcessor(gen, 'next10Numbers', processorFunction,
 const res = await gen.forward({ startNumber: 1 });
 ```

+## AI Routing and Load Balancing

-<!-- ## Fast LLM Router
+Ax provides two powerful ways to work with multiple AI services: a load balancer for high availability and a router for model-specific routing.

-A special router that uses no LLM calls, only embeddings, to route user requests smartly.
+### Load Balancer

-Use the Router to efficiently route user queries to specific routes designed to handle certain questions or tasks. Each route is tailored to a particular domain or service area. Instead of using a slow or expensive LLM to decide how user input should be handled, use our fast "Semantic Router," which uses inexpensive and fast embedding queries.
+The load balancer automatically distributes requests across multiple AI services based on performance and availability. If one service fails, it automatically fails over to the next available service.

 ```typescript
-# npm run tsx ./src/examples/routing.ts
-
-const customerSupport = new AxRoute('customerSupport', [
-  'how can I return a product?',
-  'where is my order?',
-  'can you help me with a refund?',
-  'I need to update my shipping address',
-  'my product arrived damaged, what should I do?'
-]);
+import { AxAI, AxBalancer } from '@ax-llm/ax'

-const technicalSupport = new AxRoute('technicalSupport', [
-  'how do I install your software?',
-  'I’m having trouble logging in',
-  'can you help me configure my settings?',
-  'my application keeps crashing',
-  'how do I update to the latest version?'
-]);
+// Setup multiple AI services
+const openai = new AxAI({
+  name: 'openai',
+  apiKey: process.env.OPENAI_APIKEY,
+})

-const ai = new AxAI({ name: 'openai', apiKey: process.env.OPENAI_APIKEY as string });
+const ollama = new AxAI({
+  name: 'ollama',
+  config: { model: "nous-hermes2" }
+})

-const router = new AxRouter(ai);
-await router.setRoutes(
-  [customerSupport, technicalSupport],
-  { filename: 'router.json' }
-);
+const gemini = new AxAI({
+  name: 'google-gemini',
+  apiKey: process.env.GOOGLE_APIKEY
+})

-const tag = await router.forward('I need help with my order');
+// Create a load balancer with all services
+const balancer = new AxBalancer([openai, ollama, gemini])

-if (tag === "customerSupport") {
-  ...
-}
-if (tag === "technicalSupport") {
-  ...
+// Use like a regular AI service - automatically uses the best available service
+const response = await balancer.chat({
+  chatPrompt: [{ role: 'user', content: 'Hello!' }],
+})
+
+// Or use the balancer with AxGen
+const gen = new AxGen(`question -> answer`)
+const res = await gen.forward(balancer, { question: 'Hello!' })
+```
+
+### Multi-Service Router
+
+The router lets you use multiple AI services through a single interface, automatically routing requests to the right service based on the model specified.
+
+```typescript
+import { AxAI, AxMultiServiceRouter, AxAIOpenAIModel } from '@ax-llm/ax'
+
+// Setup OpenAI with model list
+const openai = new AxAI({
+  name: 'openai',
+  apiKey: process.env.OPENAI_APIKEY,
+  models: [
+    {
+      key: 'basic',
+      model: AxAIOpenAIModel.GPT4OMini,
+      description: 'Fast model for simple tasks',
+    },
+    {
+      key: 'expert',
+      model: AxAIOpenAIModel.GPT4O,
+      description: 'Expert model for specialized tasks',
+    }
+  ]
+})
+
+// Setup Gemini with model list
+const gemini = new AxAI({
+  name: 'google-gemini',
+  apiKey: process.env.GOOGLE_APIKEY,
+  models: [
+    {
+      key: 'basic',
+      model: 'gemini-2.0-flash',
+      description: 'Basic Gemini model for simple tasks',
+    },
+    {
+      key: 'expert',
+      model: 'gemini-2.0-pro',
+      description: 'Expert Gemini model for complex tasks',
+    }
+  ]
+})
+
+const ollama = new AxAI({
+  name: 'ollama',
+  config: { model: "nous-hermes2" }
+})
+
+const secretService = {
+  key: 'sensitive-secret',
+  service: ollama,
+  description: 'Ollama model for sensitive secrets tasks'
 }
-``` -->
+
+// Create a router with all services
+const router = new AxMultiServiceRouter([openai, gemini, secretService])
+
+// Route to OpenAI's expert model
+const openaiResponse = await router.chat({
+  chatPrompt: [{ role: 'user', content: 'Hello!' }],
+  model: 'expert'
+})
+
+// Or use the router with AxGen
+const gen = new AxGen(`question -> answer`)
+const res = await gen.forward(router, { question: 'Hello!' })
+```
+
+The load balancer is ideal for high availability, while the router is perfect when you need specific models for specific tasks. Both can be used with any of Ax's features like streaming, function calling, and chain-of-thought prompting.
+
+**They can also be used together**
+
+You can also use the balancer and the router together: either multiple balancers can be used with the router, or the router can be used with the balancer.
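
A minimal sketch of that combination, assuming `AxBalancer` and `AxMultiServiceRouter` both expose the common AI service interface and can therefore wrap one another (keys and descriptions below are illustrative):

```typescript
import { AxAI, AxBalancer, AxMultiServiceRouter } from '@ax-llm/ax'

// Balance two cloud services for availability
const openai = new AxAI({ name: 'openai', apiKey: process.env.OPENAI_APIKEY as string })
const gemini = new AxAI({ name: 'google-gemini', apiKey: process.env.GOOGLE_APIKEY as string })
const balancer = new AxBalancer([openai, gemini])

// Keep sensitive work on a local model, send everything else to the balanced pair
const ollama = new AxAI({ name: 'ollama', config: { model: 'nous-hermes2' } })
const router = new AxMultiServiceRouter([
  { key: 'sensitive-secret', service: ollama, description: 'Local model for sensitive tasks' },
  { key: 'general', service: balancer, description: 'Balanced cloud services for general tasks' },
])

const res = await router.chat({
  chatPrompt: [{ role: 'user', content: 'Hello!' }],
  model: 'general',
})
```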

 ## Vercel AI SDK Integration

@@ -397,7 +468,7 @@ const result = await streamUI({

 ## OpenTelemetry support

-The ability to trace and observe your llm workflow is critical to building production workflows. OpenTelemetry is an industry-standard, and we support the new `gen_ai` attribute namespace.
+The ability to trace and observe your llm workflow is critical to building production workflows. OpenTelemetry is an industry-standard, and we support the new `gen_ai` attribute namespace.

 ```typescript
 import { trace } from '@opentelemetry/api';
@@ -531,7 +602,7 @@ console.log(res);

 ## Check out all the examples

-Use the `tsx` command to run the examples. It makes the node run typescript code. It also supports using an `.env` file to pass the AI API Keys instead of putting them in the command line.
+Use the `tsx` command to run the examples. It makes the node run typescript code. It also supports using an `.env` file to pass the AI API Keys instead of putting them in the command line.

 ```shell
 OPENAI_APIKEY=openai_key npm run tsx ./src/examples/marketing.ts
@@ -640,7 +711,7 @@ const cot = new AxGen(ai, `question:string -> answer:string`, { functions });
 ## Enable debug logs

 ```ts
-const ai = new AxOpenAI({ apiKey: process.env.OPENAI_APIKEY } as AxOpenAIArgs);
+const ai = new AxAI({ name: "openai", apiKey: process.env.OPENAI_APIKEY } as AxOpenAIArgs);
 ai.setOptions({ debug: true });
 ```

@@ -681,6 +752,6 @@ conf.model = OpenAIModel.GPT4Turbo;

 ## Monorepo tips & tricks

-It is essential to remember that we should only run `npm install` from the root directory. This prevents the creation of nested `package-lock.json` files and avoids non-deduplicated `node_modules`.
+It is essential to remember that we should only run `npm install` from the root directory. This prevents the creation of nested `package-lock.json` files and avoids non-deduplicated `node_modules`.

-Adding new dependencies in packages should be done with e.g. `npm install lodash --workspace=ax` (or just modify the appropriate `package.json` and run `npm install` from root).
+Adding new dependencies in packages should be done with e.g. `npm install lodash --workspace=ax` (or just modify the appropriate `package.json` and run `npm install` from root).

customFrontmatter.mjs (+11, -7)
@@ -34,19 +34,23 @@ function replaceAndFormat(input) {
   return input.replace(
     /(\[`?[^`\]]+`?\]\()([^)]+)(\))/g,
     (match, linkText, path, closing) => {
+
+      if (path.startsWith('https://')) {
+        return path;
+      }
+
       // Remove file extension
       let transformedPath = path.replace(/\.md$/, '');
-
+
       // Remove special characters like dots and convert to lowercase
       transformedPath = transformedPath
        .toLowerCase()
        .replace(/[^a-z0-9-]/g, '');
-
-      // Add hashtag prefix if it doesn't exist
-      if (!transformedPath.startsWith('#')) {
-        transformedPath = '#apidocs/' + transformedPath;
-      }
-
+
+      transformedPath = '/api/#03-apidocs/' + transformedPath;
+
+      console.log(transformedPath)
+
       return `${linkText}${transformedPath}${closing}`;
     }
   );
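
Given the new logic above, the rewritten links look roughly like this (the input string is illustrative; `replaceAndFormat` is the function shown in this hunk):

```typescript
// Illustrative input: one local API-doc link and one absolute link.
const input = 'See [`AxBalancer`](AxBalancer.md) and [docs](https://example.com/page)'

// Local links are lowercased, stripped of `.md`, and prefixed, e.g.
//   [`AxBalancer`](/api/#03-apidocs/axbalancer)
// while an https:// link now hits the new early-return branch, which replaces
// the whole matched link with just the raw path string.
const output = replaceAndFormat(input)
console.log(output)
```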

src/ai-sdk-provider/provider.ts (+1, -1)
@@ -135,7 +135,7 @@ export class AxAIProvider implements LanguageModelV1 {
   constructor(ai: AxAIService, config?: Readonly<AxConfig>) {
     this.ai = ai
     this.config = config
-    this.modelId = this.ai.getModelInfo().name
+    this.modelId = this.ai.getName()
   }

   get provider(): string {

src/ax/ai/anthropic/api.ts (+2, -2)
@@ -2,7 +2,7 @@ import type { AxAPI } from '../../util/apicall.js'
 import { AxBaseAI, axBaseAIDefaultConfig } from '../base.js'
 import { GoogleVertexAuth } from '../google-vertex/auth.js'
 import type {
-  AxAIModelList,
+  AxAIInputModelList,
   AxAIServiceImpl,
   AxAIServiceOptions,
   AxChatRequest,
@@ -41,7 +41,7 @@ export interface AxAIAnthropicArgs {
   region?: string
   config?: Readonly<Partial<AxAIAnthropicConfig>>
   options?: Readonly<AxAIServiceOptions>
-  models?: AxAIModelList<AxAIAnthropicModel | AxAIAnthropicVertexModel>
+  models?: AxAIInputModelList<AxAIAnthropicModel | AxAIAnthropicVertexModel>
 }

 class AxAIAnthropicImpl
