readme: add Azure deployment description + single Docker improvements and fixes (cohere-ai#76)

* readme: add Azure deploy description + single Docker improvements

* readme: add Azure deploy description + single Docker improvements

* Add Sqlite v.3.45.3 for Chroma DB

deployment: add docker compose down command in Makefile. (cohere-ai#65)

Signed-off-by: ifuryst <[email protected]>

coral_web: Add is_available check to tools (cohere-ai#82)

* add is_available check to tools

* add tool error message as tooltip

* disable unavailable tools, show error message if description does not exist

Setup: fix key error (cohere-ai#84)

docs: Update README.md links (cohere-ai#83)

Update README.md links

Some links were still pointing to the old `cohere-ia/toolkit` repository, instead of `cohere-ai/cohere-toolkit`.

docs: clarify setup env for development. (cohere-ai#64)

Signed-off-by: ifuryst <[email protected]>

coral-web: update the starter card options (cohere-ai#73)

* add new start options

* set start option prompts

* clean up

* remove welcome message

* remove notification message

* visual nits

* center start options, fade out when convo is populated

* remove streaming message check

coral-web: include conversationId in file upload (cohere-ai#85)

include conversationId in file upload

Deployment: add local model deployment option (cohere-ai#77)

* Deployment: add local model deployment option

* lint

* add tests

* lint

* fix cohere prompts

Docs: add env setup instructions (cohere-ai#88)

Cli: add dummy tests (cohere-ai#89)

* Cli: add dummy tests

* move cli to backend

backend: Set up next.js to proxy requests to the API (cohere-ai#86)

Set up next.js to proxy requests to the API

tools: Update default NEXT_PUBLIC_API_HOSTNAME for the new api routing (cohere-ai#94)

* Update default NEXT_PUBLIC_API_HOSTNAME for the new api routing

* Also update NEXT_PUBLIC_API_HOSTNAME in README and .env-template

fix: broken backend URL in cli (cohere-ai#93)

Update main.py

Co-authored-by: Tianjing Li <[email protected]>

changed logo and pager header

implemented openAI adapter

added env variable for oai key

implemented working chatgpt

1

fixed

fix

fixed conversation order bug

fixed bugs with incorrect chat history.

added exrract script

other

changes to msgs

msgRow has text duplication bug

fix?

big update

pls

close!

cool

cool

font change

big fixes for message row highlight selection now working perfectly.

small fix for code selection (remaining bug in nodes textContent tooltip)

big commit

big commit

HUGE bug fix for laggy composer !

HUGE bug fix for laggy composer !

implemented annotation prompting

less spam

working annotaiton schema setup

WORKING annotation saving!

working annotation saving in db

huge update, cookies, ua and many fixes for highlights
EugeneLightsOn authored and da03 committed Jul 22, 2024
1 parent 2764fc2 commit 2503714
Showing 83 changed files with 4,638 additions and 660 deletions.
3 changes: 2 additions & 1 deletion .env-template
@@ -1,5 +1,5 @@
# REQUIRED VARIABLES
NEXT_PUBLIC_API_HOSTNAME=http://localhost:8000
NEXT_PUBLIC_API_HOSTNAME=http://backend:8000
DATABASE_URL=postgresql+psycopg2://postgres:postgres@db:5432

# TOOLS
@@ -13,6 +13,7 @@ WOLFRAM_ALPHA_APP_ID=<APP_ID_HERE>

# 1 - Cohere Platform
COHERE_API_KEY=<API_KEY_HERE>
OPENAI_API_KEY=<OAI_KEY_HERE>

# 2 - SageMaker
SAGE_MAKER_PROFILE_NAME=<PROFILE NAME>
4 changes: 3 additions & 1 deletion Makefile
@@ -1,5 +1,7 @@
dev:
@docker compose watch
down:
@docker compose down
run-tests:
docker compose run --build backend poetry run pytest src/backend/tests/$(file)
run-community-tests:
@@ -19,7 +21,7 @@ reset-db:
docker volume rm cohere_toolkit_db
setup:
poetry install --only setup --verbose
poetry run python3 cli/main.py
poetry run python3 src/backend/cli/main.py
lint:
poetry run black .
poetry run isort .
110 changes: 106 additions & 4 deletions README.md
@@ -37,14 +37,98 @@ make first-run

Follow the instructions to configure the model - either AWS Sagemaker, Azure, or Cohere's platform. This can also be done by running `make setup` (See Option 2 below), which will help generate a file for you, or by manually creating a `.env` file and copying the contents of the provided `.env-template`. Then replacing the values with the correct ones.

#### Detailed environment setup

<details>
<summary>Windows</summary>

1. Install [docker](https://docs.docker.com/desktop/install/windows-install/)
2. Install [git]https://git-scm.com/download/win
3. In PowerShell (Terminal), install [scoop](https://scoop.sh/). After installing, run scoop bucket add extras
4. Install pipx
```bash
scoop install pipx
pipx ensurepath
```
5. Install poetry >= 1.7.1 using
```bash
pipx install poetry
```
6. Install miniconda using
```bash
scoop install miniconda3
conda init powershell
```
7. Restart PowerShell
8. Install the following:
```bash
scoop install postgresql
scoop install make
```
9. Create a new virtual environment with Python 3.11
```bash
conda create -n toolkit python=3.11
conda activate toolkit
```
10. Clone the repo
11. Alternatively to `make first-run` or `make setup`, run
```bash
poetry install --only setup --verbose
poetry run python cli/main.py
make migrate
make dev
```
12. Navigate to https://localhost:4000 in your browser

</details>

<details>
<summary>MacOS</summary>

1. Install Xcode. This can be done from the App Store or terminal
```bash
xcode-select --install
```
2. Install [docker desktop](https://docs.docker.com/desktop/install/mac-install/)
3. Install [homebrew](https://brew.sh/)
4. Install [pipx](https://github.com/pypa/pipx). This is useful for installing poetry later.
```bash
brew install pipx
pipx ensurepath
```
5. Install [postgres](brew install postgresql)
6. Install conda using [miniconda](https://docs.anaconda.com/free/miniconda/index.html)
7. Use your environment manager to create a new virtual environment with Python 3.11
```bash
conda create -n toolkit python=3.11
```
8. Install [poetry >= 1.7.1](https://python-poetry.org/docs/#installing-with-pipx)
```bash
pipx install poetry
```
To test if poetry has been installed correctly,
```bash
conda activate toolkit
poetry --version
```
You should see the version of poetry (e.g. 1.8.2). If poetry is not found, try
```bash
export PATH="$HOME/.local/bin:$PATH"
```
And then retry `poetry --version`
9. Clone the repo and run `make first-run`
10. Navigate to https://localhost:4000 in your browser

</details>

<details>
<summary>Environment variables</summary>

### Cohere Platform

- `COHERE_API_KEY`: If your application will interface with Cohere's API, you will need to supply an API key. Not required if using AWS Sagemaker or Azure.
Sign up at https://dashboard.cohere.com/ to create an API key.
- `NEXT_PUBLIC_API_HOSTNAME`: The backend URL which the frontend will communicate with. Defaults to http://localhost:8000
- `NEXT_PUBLIC_API_HOSTNAME`: The backend URL which the frontend will communicate with. Defaults to http://backend:8000 for use with `docker compose`
- `DATABASE_URL`: Your PostgreSQL database connection string for SQLAlchemy, should follow the format `postgresql+psycopg2://USER:PASSWORD@HOST:PORT`.
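
The `DATABASE_URL` format above can be assembled programmatically. This is a minimal sketch, assuming credentials may contain characters such as `@` that need URL-escaping. `build_database_url` is a hypothetical helper, not part of the Toolkit:

```python
from urllib.parse import quote_plus


def build_database_url(user: str, password: str, host: str, port: int) -> str:
    """Assemble postgresql+psycopg2://USER:PASSWORD@HOST:PORT, escaping credentials."""
    # quote_plus escapes characters like '@' or ':' that would otherwise break URL parsing.
    return f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}"


# Values mirror the defaults in .env-template (the docker compose `db` service).
url = build_database_url("postgres", "postgres", "db", 5432)
```

With the template defaults this should reproduce the `DATABASE_URL` value shipped in `.env-template`.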

### AWS Sagemaker
@@ -143,6 +227,15 @@ You can deploy Toolkit with one click to Microsoft Azure Platform:

[<img src="https://aka.ms/deploytoazurebutton" height="48px">](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2Fcohere-ai%2Fcohere-toolkit%2Fmain%2Fazuredeploy.json)

This deployment type uses Azure Container Instances to host the Toolkit.
After your deployment is complete click "Go to resource" button.
1) Check the logs to see if the container is running successfully:
- click on the "Containers" button on the left side of the screen
- click on the container name
- click on "Logs" tab to see the logs
2) Navigate to the "Overview" tab to see the FQDN of the container instance
3) Open the \<FQDN\>:4000 in your browser to access the Toolkit

## Setup for Development

### Setting up Poetry
@@ -162,6 +255,13 @@ poetry run black .
poetry run isort .
```

## Setting up the Environment Variables
**Please confirm that you have at least one configuration of the Cohere Platform, SageMaker or Azure.**

You have two methods to set up the environment variables:
1. Run `make setup` and follow the instructions to configure it.
2. Run `cp .env-template .env` and adjust the values in the `.env` file according to your situation.

### Setting up Your Local Database

The docker-compose file should spin up a local `db` container with a PostgreSQL server. The first time you setup this project, and whenever new migrations are added, you will need to run:
@@ -284,9 +384,11 @@ A model deployment is a running version of one of the Cohere command models. The
- This model deployment calls into your Azure deployment. To get an Azure deployment [follow these steps](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-cohere-command). Once you have a model deployed you will need to get the endpoint URL and API key from the azure AI studio https://ai.azure.com/build/ -> Project -> Deployments -> Click your deployment -> You will see your URL and API Key. Note to use the Cohere SDK you need to add `/v1` to the end of the url.
- SageMaker (model_deployments/sagemaker.py)
- This deployment option calls into your SageMaker deployment. To create a SageMaker endpoint [follow the steps here](https://docs.cohere.com/docs/amazon-sagemaker-setup-guide), alternatively [follow a command notebook here](https://github.com/cohere-ai/cohere-aws/tree/main/notebooks/sagemaker). Note your region and endpoint name when executing the notebook as these will be needed in the environment variables.
- Local models with LlamaCPP (community/model_deployments/local_model.py)
- This deployment option calls into a local model. To use this deployment you will need to download a model. You can use Cohere command models or choose between a range of other models that you can see [here](https://github.com/ggerganov/llama.cpp). You will need to enable community features to use this deployment by setting `USE_COMMUNITY_FEATURES=True` in your .env file.
- To add your own deployment:
1. Create a deployment file, add it to [/community/model_deployments](https://github.com/cohere-ai/toolkit/tree/main/src/community/model_deployments) folder, implement the function calls from `BaseDeployment` similar to the other deployments.
2. Add the deployment to [src/community/config/deployments.py](https://github.com/cohere-ai/toolkit/blob/main/src/community/config/deployments.py)
1. Create a deployment file, add it to [/community/model_deployments](https://github.com/cohere-ai/cohere-toolkit/tree/main/src/community/model_deployments) folder, implement the function calls from `BaseDeployment` similar to the other deployments.
2. Add the deployment to [src/community/config/deployments.py](https://github.com/cohere-ai/cohere-toolkit/blob/main/src/community/config/deployments.py)
3. Add the environment variables required to the env template.
- To add a Cohere private deployment, use the steps above copying the cohere platform implementation changing the base_url for your private deployment and add in custom auth steps.

@@ -310,7 +412,7 @@ Currently the core chat interface is the Coral frontend. To add your own interfa

### How to add a connector to the Toolkit

If you have already created a [connector](https://docs.cohere.com/docs/connectors), it can be used in the toolkit with `ConnectorRetriever`. Add in your configuration and then add the definition in [community/config/tools.py](https://github.com/cohere-ai/toolkit/blob/main/src/community/config/tools.py) similar to `Arxiv` implementation with the category `Category.DataLoader`. You can now use the Coral frontend and API with the connector.
If you have already created a [connector](https://docs.cohere.com/docs/connectors), it can be used in the toolkit with `ConnectorRetriever`. Add in your configuration and then add the definition in [community/config/tools.py](https://github.com/cohere-ai/cohere-toolkit/blob/main/src/community/config/tools.py) similar to `Arxiv` implementation with the category `Category.DataLoader`. You can now use the Coral frontend and API with the connector.

### How to set up web search with the Toolkit

6 changes: 3 additions & 3 deletions docker_scripts/env-defaults
@@ -13,6 +13,6 @@ DB_TEMPLATE=${DB_TEMPLATE:-template1}
DB_EXTENSION=${DB_EXTENSION:-}

# Defaults for the toolkit
NEXT_PUBLIC_API_HOSTNAME=${NEXT_PUBLIC_API_HOSTNAME:-http://localhost:8000}
PYTHON_INTERPRETER_URL=${PYTHON_INTERPRETER_URL:-http://localhost:8080}
DATABASE_URL=${DATABASE_URL:-postgresql+psycopg2://postgre:postgre@localhost:5432/toolkit}
export NEXT_PUBLIC_API_HOSTNAME=${NEXT_PUBLIC_API_HOSTNAME:-http://localhost:8000}
export PYTHON_INTERPRETER_URL=${PYTHON_INTERPRETER_URL:-http://localhost:8080}
export DATABASE_URL=${DATABASE_URL:-postgresql+psycopg2://postgre:postgre@localhost:5432/toolkit}
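
The `${VAR:-default}` expansions above fall back to the default when the variable is unset or empty, and the added `export` makes the resulting values visible to child processes. Below is a minimal Python sketch of the same fallback rule. `with_default` and the sample environment are hypothetical and only illustrate the behavior:

```python
import os


def with_default(name: str, default: str, env=None) -> str:
    """Mimic shell ${NAME:-default}: use the default when NAME is unset or empty."""
    env = os.environ if env is None else env
    value = env.get(name, "")
    return value if value else default


# Illustrative environment: one variable set but empty, one set to a custom value.
sample_env = {
    "NEXT_PUBLIC_API_HOSTNAME": "",
    "PYTHON_INTERPRETER_URL": "http://interpreter:8080",
}
api_host = with_default("NEXT_PUBLIC_API_HOSTNAME", "http://localhost:8000", sample_env)
interpreter = with_default("PYTHON_INTERPRETER_URL", "http://localhost:8080", sample_env)
```

An empty value is treated like a missing one, which is the difference between `:-` and the plain `-` form of the expansion.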
11 changes: 10 additions & 1 deletion docs/deployment_guides/single_container.md
@@ -11,7 +11,16 @@ You can deploy Toolkit with one click in Microsoft Azure Platform:

[<img src="https://aka.ms/deploytoazurebutton" height="30px">](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2Fcohere-ai%2Fcohere-toolkit%2Fmain%2Fazuredeploy.json)

### AWS ECS(Fargate) Deployment guide
This deployment type uses Azure Container Instances to host the Toolkit.
After your deployment is complete click "Go to resource" button.
1) Check the logs to see if the container is running successfully:
- click on the "Containers" button on the left side of the screen
- click on the container name
- click on "Logs" tab to see the logs
2) Navigate to the "Overview" tab to see the FQDN of the container instance
3) Open the \<FQDN\>:4000 in your browser to access the Toolkit

### AWS ECS Deployment guides
- [AWS ECS Fargate Deployment](aws_ecs_single_container.md): Deploy the Toolkit single container to AWS ECS(Fargate).
- [AWS ECS EC2 Deployment](docs/deployment_guides/aws_ecs_single_container_ec2.md): Deploy the Toolkit single container to AWS ECS(EC2).

2 changes: 1 addition & 1 deletion docs/postman/Toolkit.postman_collection.json
@@ -455,7 +455,7 @@
{
"key": "file",
"type": "file",
"src": "/Users/luisa/Downloads/Aya_dataset__ACL_edition.pdf"
"src": "/Users/luisa/Downloads/Aya_dataset.pdf"
}
]
},
77 changes: 77 additions & 0 deletions extract.py
@@ -0,0 +1,77 @@
from src.backend.crud.conversation import extract_conversations, Conversation

from dotenv import load_dotenv
from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session, sessionmaker

import json
from datetime import datetime

load_dotenv()

SQLALCHEMY_DATABASE_URL = "postgresql+psycopg2://postgres:postgres@localhost:5433"

engine = create_engine(
    SQLALCHEMY_DATABASE_URL, echo=False
)

db = Session(autocommit=False, autoflush=False, bind=engine)

def run_script():

    """
    Saves all conversations in the database in format:
    \n`conv_id` : {conversation attributes}
    """

    conversations = extract_conversations(db)

    file_path = "conversations.txt"

    data = {}

    #Format the data and assemble the new conversation dictionary
    for conv in conversations:

        id, p_conv = parse_conversation(conv)
        data[id] = p_conv

    print(conversations[-1].description)
    print(conversations[-1].messages[-1].text)

    #Save it
    with open(file_path, "w") as file:
        json.dump(data, file)

    print(f"Succesfully saved file at {file_path}! Saved {len(conversations)} conversations!")
    print("Checking if data can be successfully loaded . . .")

    #Check to see if we can load data without errors.
    try:
        with open(file_path, "r") as file:
            loaded_data = json.load(file)
        print("Sucess!")
    except Exception as e:
        print("We were unable to load the data, this means it isnt being saved properly and is corrupted.")
        print(f"Error message: {e}")

#Turns a conversation into something we can store.
def parse_conversation(conv : Conversation) -> tuple[str, dict]:
    """
    Returns a conversation_id and dictionary of all conversation data.
    """

    parsed_messages = [{'role' : msg.agent, 'text' : msg.text, 'm_id' : msg.id, 'annotations' : [{'a_id' : annot.id, 'htext' : annot.htext, 'annotation' : annot.annotation, 'start' : annot.start, 'end' : annot.end} for annot in msg.annotations], 'position' : msg.position} for msg in conv.messages]

    return conv.id, {
        'date' : conv.created_at.strftime("%Y-%m-%d"),
        'user_id' : conv.user_id,
        'messages' : parsed_messages
    }

if __name__ == "__main__":
    run_script()
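
The export written by `extract.py` maps each conversation id to its date, user id, and messages, with annotations nested per message. Here is a minimal sketch of consuming that structure, assuming the file follows the shape built in `parse_conversation`. `count_annotations` and the sample values are hypothetical:

```python
import json


def count_annotations(data: dict) -> dict:
    """Return {conversation_id: total annotations across its messages}."""
    return {
        conv_id: sum(len(msg["annotations"]) for msg in conv["messages"])
        for conv_id, conv in data.items()
    }


# Hand-written sample in the shape parse_conversation produces (values are made up).
sample = {
    "conv-1": {
        "date": "2024-07-22",
        "user_id": "user-1",
        "messages": [
            {"role": "USER", "text": "hi", "m_id": "m1", "annotations": [], "position": 0},
            {
                "role": "CHATBOT",
                "text": "hello there",
                "m_id": "m2",
                "annotations": [
                    {"a_id": "a1", "htext": "hello", "annotation": "greeting", "start": 0, "end": 5}
                ],
                "position": 1,
            },
        ],
    }
}

# Round-trip through JSON, as the script does when it writes and re-reads the file.
loaded = json.loads(json.dumps(sample))
counts = count_annotations(loaded)
```

For a real export, `json.load` on `conversations.txt` would stand in for the in-memory round trip.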
35 changes: 34 additions & 1 deletion poetry.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions pyproject.toml
@@ -62,6 +62,7 @@ llama-index = "^0.10.11"
wolframalpha = "^5.0.0"
transformers = "^4.40.1"
torch = "^2.3.0"
llama-cpp-python = "^0.2.67"

[build-system]
requires = ["poetry-core"]