In this tutorial, we’ll show you how to deploy AutoRAG with Kotaemon to create a functional chat UI. By the end, you’ll be able to use a RAG pipeline optimized with AutoRAG through a chat interface.
This tutorial covers the following steps:
- Optimize RAG using AutoRAG.
- Run the API server from the optimized RAG.
- Deploy the AutoRAG x Kotaemon web app on fly.io.
- Connect and use the API server in the web app.
Before you start, you will need:
- Git installed on your system
- Homebrew (for macOS users)
- fly.io account
- Completion of optimization using AutoRAG
First, you need an optimized RAG pipeline. See the AutoRAG optimization tutorial for instructions on optimizing with AutoRAG.
To run the AutoRAG API server locally, use the following command:
autorag run_api --trial_dir /trial/dir/0 --host 0.0.0.0 --port 8000
The trial directory is a subdirectory of your project directory that is created during optimization, typically named with a number (for example, 0). Specify the trial directory you want to use as the backend for the chat interface.
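If a project contains several trial directories, a small helper can pick the most recent one. This is a minimal sketch that assumes trial directories are named with plain integers, as described above; the function name `latest_trial_dir` and the `./my_project` path are hypothetical.

```shell
# Print the highest-numbered trial directory under a project directory.
# Assumption: trial directories are named with plain integers (0, 1, 2, ...).
latest_trial_dir() {
  project_dir="$1"
  latest=""
  for dir in "$project_dir"/*/; do
    [ -d "$dir" ] || continue
    name=$(basename "$dir")
    # Skip anything that is not purely numeric (e.g. "data").
    case "$name" in
      ''|*[!0-9]*) continue ;;
    esac
    if [ -z "$latest" ] || [ "$name" -gt "$latest" ]; then
      latest="$name"
    fi
  done
  # Fails (non-zero status) when no numeric directory was found.
  [ -n "$latest" ] && printf '%s\n' "${project_dir%/}/$latest"
}

# Example (hypothetical project path):
# autorag run_api --trial_dir "$(latest_trial_dir ./my_project)" --host 0.0.0.0 --port 8000
```

The highest number is not necessarily the best trial, so check which trial you actually want to serve before relying on this.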
To make the API server publicly accessible, AutoRAG uses ngrok. When the server starts, the public URL appears in the logs:
INFO [api.py:199] >> Public API URL: https://8a31-14-52-132-205.ngrok-free.app
Note down the URL displayed in the terminal. You will enter it in Kotaemon later.
First, clone the AutoRAG Kotaemon repository:
git clone https://github.com/vkehfdl1/AutoRAG-web-kotaemon.git
cd AutoRAG-web-kotaemon
Then set up Fly.io:
- Install the Fly.io CLI tool:
brew install flyctl
This command is for macOS users. For other operating systems, refer to the Fly.io installation documentation.
- Authenticate with Fly.io:
fly auth login
- Deploy on Fly.io:
fly launch
Configure the deployment when prompted. You can set the region, app name, and other options as desired.
Note: The initial deployment may take around 10-15 minutes.
Also, a minimum of 1GB memory is recommended for smooth operation.
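If the launched app has less memory than recommended, `fly scale memory` can raise it. The wrapper below is a sketch: the function name `set_fly_memory` and the `FLY` override variable are hypothetical conveniences, and it assumes you run it from the `AutoRAG-web-kotaemon` directory after `fly launch`.

```shell
# FLY defaults to the real CLI; set FLY=echo to preview the command instead.
FLY="${FLY:-fly}"

# Set the memory (in MB) for the app's machines; defaults to 1024 MB (1GB).
set_fly_memory() {
  "$FLY" scale memory "${1:-1024}"
}
```

Calling `set_fly_memory` with no argument requests 1024 MB. Pass a larger value, such as `set_fly_memory 2048`, if the app still struggles.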
Once deployed, you’ll see the Fly URL. If you don’t see it in the CLI, you can find it in the Fly.io dashboard. Clicking on it will open Kotaemon’s initial setup screen.
On first launch, you’ll see the initial setup screen. Here, you can set your OpenAI or Cohere API key, or proceed without setting one by pressing the red button.
Without setting an API key, you won’t be able to use the “Automatic Conversation Title” feature. For private data, avoid setting an API key and proceed to the next step by pressing the red button.
Next, you’ll see the login screen. For the first run, set both the ID and password to admin. This will allow you to use the service without issues.
After logging in, open the Settings tab at the top left and go to the Reasoning settings tab.
In the AutoRAG API Endpoint URL tab, enter the API server URL you noted down earlier. Ensure it ends with .app and do not add a / at the end.
Finally, press the Save Changes button!
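A copied URL often carries a trailing slash, so it can help to clean it up before pasting it into the AutoRAG API Endpoint URL setting. The following sketch strips trailing slashes and checks the `.app` suffix; the function name `normalize_endpoint` is hypothetical, and the check only looks at the URL's shape, not whether the server is reachable.

```shell
# Strip trailing slashes and check that the URL ends with ".app".
normalize_endpoint() {
  url="$1"
  # Remove every trailing "/" (handles ".app/" as well as ".app//").
  while [ "${url%/}" != "$url" ]; do
    url="${url%/}"
  done
  case "$url" in
    *.app) printf '%s\n' "$url" ;;
    *)
      printf 'warning: %s does not end with .app\n' "$url" >&2
      return 1
      ;;
  esac
}
```

For example, `normalize_endpoint 'https://8a31-14-52-132-205.ngrok-free.app/'` prints the URL without the trailing slash, which is the form to enter in the settings.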
Now you can use the optimized RAG pipeline with Kotaemon!
Since Fly.io is a paid service, it’s best to stop the deployment when it is not in use.
To stop an application on Fly.io, use:
fly scale count 0
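To bring the app back later, the machine count can be scaled up again. The helpers below are a sketch: the names `stop_app` and `start_app` and the `FLY` override are hypothetical, and resuming with a single machine is an assumption, so adjust the count to your needs.

```shell
# FLY defaults to the real CLI; set FLY=echo to preview the commands instead.
FLY="${FLY:-fly}"

# Stop the app by scaling it down to zero machines.
stop_app() {
  "$FLY" scale count 0
}

# Resume the app by scaling back up (assumes one machine is enough).
start_app() {
  "$FLY" scale count 1
}
```

Run these from the `AutoRAG-web-kotaemon` directory so that the Fly.io CLI picks up the app from its configuration file.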