diff --git a/README.md b/README.md index 7acfc1c9..5aedeab2 100644 --- a/README.md +++ b/README.md @@ -42,11 +42,13 @@ We showcase an impressive voice agent called Astra, powered by TEN, demonstratin #### Prerequisites -- Agora App ID and App Certificate([read here on how](https://docs.agora.io/en/video-calling/get-started/manage-agora-account?platform=web)) -- Azure's [speech-to-text](https://azure.microsoft.com/en-us/products/ai-services/speech-to-text) and [text-to-speech](https://azure.microsoft.com/en-us/products/ai-services/text-to-speech) API keys -- [OpenAI](https://openai.com/index/openai-api/) API key -- [Docker](https://www.docker.com/) / [Docker Compose](https://docs.docker.com/compose/) -- [Node.js(LTS) v18](https://nodejs.org/en) +- **Keys** + - Agora App ID and App Certificate([read here on how](https://docs.agora.io/en/video-calling/get-started/manage-agora-account?platform=web)) + - Azure's [speech-to-text](https://azure.microsoft.com/en-us/products/ai-services/speech-to-text) and [text-to-speech](https://azure.microsoft.com/en-us/products/ai-services/text-to-speech) API keys + - [OpenAI](https://openai.com/index/openai-api/) API key +- **Downloads** + - [Docker](https://www.docker.com/) / [Docker Compose](https://docs.docker.com/compose/) + - [Node.js(LTS) v18](https://nodejs.org/en) #### Docker setting on apple silicon You will need to uncheck "Use Rosetta for x86_64/amd64 emulation on apple silicon" option for Docker if you are on Apple Silicon, otherwise the server is not gonna work. @@ -59,22 +61,23 @@ You will need to uncheck "Use Rosetta for x86_64/amd64 emulation on apple silico #### 1. Prepare config files - +In the root of the project, create these files from the examples. They will be used to store information for Docker Compose later. ```bash # Create property.json from the example cp ./agents/property.json.example ./agents/property.json + # Create .env from the example cp ./.env.example ./.env ``` -#### 2. 
Setup API keys & Environment varialbes in .env file +#### 2. Setup API keys & Environment variables in .env file +Open the `.env` file and fill in the keys and regions. This is also where you can choose to use any different extensions: ``` -... # Agora App ID and Agora App Certificate # required: this variable must be set AGORA_APP_ID= AGORA_APP_CERTIFICATE= -... + # Extension: agora_rtc # Azure STT key and region AZURE_STT_KEY= @@ -84,21 +87,21 @@ AZURE_STT_REGION= # Azure TTS key and region AZURE_TTS_KEY= AZURE_TTS_REGION= -... + # Extension: openai_chatgpt # OpenAI API key OPENAI_API_KEY= ``` #### 3. Start agent builder toolkit containers - +In the same directory, run the `docker` command to compose containers: ```bash # Execute docker compose up to start the services docker compose up ``` #### 4. Build your agent and start server - +Open up a separate terminal window, build the agent and start the server: ```bash # Enter container to build agent docker exec -it astra_agents_dev bash @@ -112,26 +115,6 @@ make run-server You can open `localhost:3000` in your browser to test your own agent, or open `localhost:3001` in your browser to build your workflow by Graph Designer. -
-

Voice agent architecture

- -To explore further, the voice agent is an excellent starting point. It incorporates various extensions, some of which are interchangeable. Feel free to select the ones that best suit your needs and maximize its capabilities. - - -| Extension | Feature | Description | -| ------------------ | -------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| openai_chatgpt | LLM | [ GPT-4o ](https://platform.openai.com/docs/models/gpt-4o), [ GPT-4 Turbo ](https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4), [ GPT-3.5 Turbo ](https://platform.openai.com/docs/models/gpt-3-5-turbo) | -| elevenlabs_tts | Text-to-speech | [ElevanLabs text to speech](https://elevenlabs.io/) converts text to audio | -| azure_tts | Text-to-speech | [Azure text to speech](https://azure.microsoft.com/en-us/products/ai-services/text-to-speech) converts text to audio | -| azure_stt | Speech-to-text | [Azure speech to text](https://azure.microsoft.com/en-us/products/ai-services/speech-to-text) converts audio to text | -| chat_transcriber | Transcriber | A utility ext to forward chat logs into channel | -| agora_rtc | Transporter | A low latency transporter powered by agora_rtc | -| interrupt_detector | Interrupter | A utility ext to help interrupt agent | - -

Voice Agent Diagram

- -![voice agent diagram](./images/image-2.png) -

TEN Service

Discover More

diff --git a/docs/readmes/README-CN.md b/docs/readmes/README-CN.md index 096943dc..f8d6c75d 100644 --- a/docs/readmes/README-CN.md +++ b/docs/readmes/README-CN.md @@ -40,11 +40,10 @@

How to build Astra locally

#### Prerequisites - - Agora App ID and App Certificate([click here for details](https://docs.agora.io/en/video-calling/get-started/manage-agora-account?platform=web)) -- Azure's [STT](https://azure.microsoft.com/en-us/products/ai-services/speech-to-text) and [TTS](https://azure.microsoft.com/en-us/products/ai-services/text-to-speech) API key -- [OpenAI](https://openai.com/index/openai-api/) API key -- [Docker](https://www.docker.com/) +- Azure's [STT](https://azure.microsoft.com/en-us/products/ai-services/speech-to-text) and [TTS](https://azure.microsoft.com/en-us/products/ai-services/text-to-speech) API keys +- [OpenAI](https://openai.com/index/openai-api/) API keys +- [Docker](https://www.docker.com/) - [Node.js(LTS) v18](https://nodejs.org/en) #### Docker setting on Apple Silicon @@ -64,115 +63,59 @@ $ go env -w GO111MODULE=on $ go env -w GOPROXY=https://goproxy.cn,direct ``` -#### 1. Create the manifest config file - -After cloning the code, create the config file from the project root with the command below. - ```bash -# Create manifest.json from the example file on the command line -cp ./agents/manifest.json.example ./agents/manifest.json -``` +#### 1. Prepare config files +After cloning the project, run the commands below in the project root to create `property.json` and `.env`: ```bash +# Create the property.json file +cp ./agents/property.json.example ./agents/property.json -#### 2. Customize - -After you `cd` into `/agents` you will find `manifest.json`, where you can customize `prompt` and `greeting`. - -```js -// In manifest.json you can edit the prompt and greeting directly -"property": { - "base_url": "", - "api_key": "", - "frequency_penalty": 0.9, - "model": "gpt-3.5-turbo", - "max_tokens": 512, - "prompt": "", // edit the prompt here - "proxy_url": "", - "greeting": "Astra agent connected. How can I help you today?", // edit the greeting here - "max_memory_length": 10 -} +# Create the .env file +cp ./.env.example ./.env ``` -#### 3. 
Build the agent in a Docker container - -On the command line, run the commands below one by one. -```bash -# Pull the Docker image that ships the dev tools and mount the current folder as the workspace -docker run -itd -v $(pwd):/app -w /app -p 8080:8080 --name astra_agents_dev ghcr.io/rte-design/astra_agents_build - -# For Windows Git Bash -# docker run -itd -v //$(pwd):/app -w //app -p 8080:8080 --name astra_agents_dev ghcr.io/rte-design/astra_agents_build - -# Enter the Docker container -docker exec -it astra_agents_dev bash - -# Build the agent inside the container -make build +#### 2. Bind the extension keys +Open the `.env` file and fill in the keys for the corresponding extensions; configuring different keys here lets you choose different extensions: ``` - -#### 4. Start the local server - - -```bash # Agora App ID and Agora App Certificate -export AGORA_APP_ID= -export AGORA_APP_CERTIFICATE= - -# OpenAI API key -export OPENAI_API_KEY= +AGORA_APP_ID= +AGORA_APP_CERTIFICATE= +# Extension: agora_rtc # Azure STT key and region -export AZURE_STT_KEY= -export AZURE_STT_REGION= +AZURE_STT_KEY= +AZURE_STT_REGION= +# Extension: azure_tts # Azure TTS key and region -export AZURE_TTS_KEY= -export AZURE_TTS_REGION= +AZURE_TTS_KEY= +AZURE_TTS_REGION= -# Port 8080 -make run-server +# Extension: openai_chatgpt +# OpenAI API key +OPENAI_API_KEY= ``` -#### 5. Run the voice agent UI - -Meanwhile, open another Terminal window and use the commands below to create the environment file and start the UI. - +#### 3. Start the Docker containers +In the same directory, build the Docker containers from the Docker images: ```bash -# Create a local environment file -cd playground -cp .env.example .env - -# Install dependencies and start the UI -npm install && npm run dev +# Start the Docker containers: +docker compose up ``` -#### 6. Verify your customized voice agent 🎉 - -Open `localhost:3000` in your browser; you should see a voice agent just like the one in the example project, but this time with your customizations. - - -
-

Voice agent architecture

-To explore further, the voice agent is an excellent starting point. It includes the following extensions, some of which will become interchangeable in the near future. Feel free to choose the ones that best suit your needs and maximize ASTRA's capabilities. - -| Extension | Feature | Description | -| ------------------ | -------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| openai_chatgpt | LLM | [ GPT-4o ](https://platform.openai.com/docs/models/gpt-4o), [ GPT-4 Turbo ](https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4), [ GPT-3.5 Turbo ](https://platform.openai.com/docs/models/gpt-3-5-turbo) | -| elevenlabs_tts | Text-to-speech | [ElevanLabs text to speech](https://elevenlabs.io/) converts text to audio | -| azure_tts | Text-to-speech | [Azure text to speech](https://azure.microsoft.com/en-us/products/ai-services/text-to-speech) converts text to audio | -| azure_stt | Speech-to-text | [Azure speech to text](https://azure.microsoft.com/en-us/products/ai-services/speech-to-text) converts audio to text | -| chat_transcriber | Transcriber | A utility that forwards chat logs to the channel | -| agora_rtc | Transporter | A low-latency transporter powered by agora_rtc | -| interrupt_detector | Interrupter | A utility that helps interrupt the voice assistant | - -

Voice agent diagram

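- -![ASTRA voice agent architecture diagram](../../images/image-2.png) +#### 4. Build the agent and start the server +Open another Terminal window, enter the Docker container with the commands below, then build the agent and start the server: +```bash +# Enter the container to build the agent +docker exec -it astra_agents_dev bash +make build +# Start the server on port 8080 +make run-server +``` -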
-

ASTRA Service

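+#### 5. Agent built: try it out 🎉 -Now that you have created your first AI voice agent, the creativity does not stop here. To develop more AI agents, you will need a deeper understanding of how ASTRA works. See the [ ASTRA architecture documentation ](./docs/astra-architecture.md). +You can now open `localhost:3000` in your browser to try the Astra voice assistant, and open `localhost:3001` to try Graph Designer.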

Star the repository