6 changes: 5 additions & 1 deletion README.md
@@ -85,10 +85,14 @@ To run the application in a container; download the [source](https://github.com/
1. [Configure](https://oracle-samples.github.io/ai-optimizer/client/configuration/index.html) the **AI Optimizer**.

#### Got OCI?
The **AI Optimizer** can be deployed with an Oracle Autonomous Database 23ai using infrastructure as code. Deploy the **AI Optimizer** in Oracle Cloud Infrastructure using OCI Resource Manager:

The **AI Optimizer** can be deployed in Oracle Cloud Infrastructure (OCI) using Infrastructure as Code (IaC).

Choose either a light-weight Virtual Machine or robust Oracle Kubernetes Engine deployment, both with an Oracle Autonomous Database 23ai:
[![Deploy to Oracle Cloud][magic_button]][magic_arch_stack]

For more information, please visit the [IaC Documentation](https://oracle-samples.github.io/ai-optimizer/advanced/iac/index.html).

## Contributing

This project welcomes contributions from the community. Before submitting a pull request, please [review our contribution guide](./CONTRIBUTING.md).
86 changes: 86 additions & 0 deletions docs/content/advanced/iac.md
@@ -0,0 +1,86 @@
+++
title = 'Infrastructure as Code'
weight = 1
+++

<!--
Copyright (c) 2024, 2025, Oracle and/or its affiliates.
Licensed under the Universal Permissive License v1.0 as shown at http://oss.oracle.com/licenses/upl.

spell-checker: ignore opentofu Ollama
-->

The {{< full_app_ref >}} can easily be deployed in Oracle Cloud Infrastructure (**OCI**) using Infrastructure as Code (**IaC**) provided in the source [opentofu](https://github.com/oracle-samples/ai-optimizer/tree/main/opentofu) directory.

Choose between deploying a light-weight [Virtual Machine](#virtual-machine) or robust [Oracle Kubernetes Engine (**OKE**)](#oracle-kubernetes-engine) along with the Oracle Autonomous Database for a fully configured {{< short_app_ref >}} environment, ready to use.

Although users with prior experience can run the **IaC** from the command line, the steps outlined here use [Oracle Cloud Resource Manager](https://docs.oracle.com/en-us/iaas/Content/ResourceManager/Concepts/resourcemanager.htm) to simplify the process. To get started:

{{< imagelink url="https://cloud.oracle.com/resourcemanager/stacks/create?zipUrl=https://github.com/oracle-samples/ai-optimizer/releases/latest/download/ai-optimizer-stack.zip" src="https://oci-resourcemanager-plugin.plugins.oci.oraclecloud.com/latest/deploy-to-oracle-cloud.svg" alt="Deploy to Oracle Cloud" >}}

## Virtual Machine

The Virtual Machine (VM) deployment provisions both the {{< short_app_ref >}} API Server and GUI Client together in an "All-in-One" configuration for experimentation and development. As part of the deployment, one local Large Language Model and one Embedding Model are made available out-of-the-box. However, because these models run on a CPU rather than a GPU, expect slow response times.

### Configure Variables

After clicking the "Deploy to Oracle Cloud" button and authenticating to your tenancy, you will be presented with the {{< short_app_ref >}} stack information.

1. Review the Terms, tick the box to accept (if you do), and click "Next" to Configure Variables

![Stack Information](../images/iac_stack_information.png)

1. Change the Infrastructure to "VM"

![Stack - AI Optimizer](../images/iac_stack_optimizer.png)

#### Access Control

Most of the other configuration options are self-explanatory, but let's highlight those important for the **Security** of your deployment.

* The {{< short_app_ref >}} is often configured with authentication details for your OCI Tenancy, Autonomous Database, and API Keys for AI Models. Since these details are accessible via the Application GUI, access must be restricted to a limited set of CIDR blocks.

* The {{< short_app_ref >}} REST endpoints require API token authentication, providing some protection. However, you should still restrict access to a limited set of CIDR blocks where possible for added security.

* The Oracle Autonomous Database requires mTLS authentication with a wallet, providing strong initial protection. However, it's recommended to further restrict access to a limited set of CIDR blocks.

![Stack - Access Control](../images/iac_stack_access_control.png)

To restrict access, provide a comma-separated list of CIDR blocks, for example: `192.168.1.0/24,10.0.0.0/16,203.0.113.42/32`

In this example:
* `192.168.1.0/24` – Allows access from all IPs in the range 192.168.1.0 to 192.168.1.255 (a typical subnet).
* `10.0.0.0/16` – Allows access from 10.0.0.0 to 10.0.255.255 (a broader range).
* `203.0.113.42/32` – Allows access from a single public IP address only. The /32 denotes a single host.
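
Before pasting a list into the stack form, it can help to check what each block actually covers. The following is a minimal sketch using Python's standard `ipaddress` module; the helper name `describe_cidrs` is hypothetical and is not part of the {{< short_app_ref >}}.

```python
import ipaddress


def describe_cidrs(cidr_list: str) -> list[tuple[str, str, str, int]]:
    """Parse a comma-separated CIDR list.

    Returns one (cidr, first_address, last_address, address_count) tuple per block.
    """
    summary = []
    for item in cidr_list.split(","):
        # strict=True rejects entries with host bits set, e.g. 192.168.1.5/24
        network = ipaddress.ip_network(item.strip(), strict=True)
        summary.append(
            (str(network), str(network[0]), str(network[-1]), network.num_addresses)
        )
    return summary


if __name__ == "__main__":
    example = "192.168.1.0/24,10.0.0.0/16,203.0.113.42/32"
    for cidr, first, last, size in describe_cidrs(example):
        print(f"{cidr}: {first} to {last} ({size} addresses)")
```

An entry such as `192.168.1.5/24` raises a `ValueError` here because its host bits are set. Whether the stack form rejects such input is not something this sketch can confirm.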

### Review and Apply

After configuring the variables, click "Next" to review and apply the stack.

![Stack - Review and Apply](../images/iac_stack_review_apply.png)

Tick the Apply box and click "Create".

### Job Details

The next screen will show the progress of the Apply job. Once the job has Succeeded, the {{< short_app_ref >}} has been deployed!

The Application Information tab provides the URLs to access the {{< short_app_ref >}} GUI and API Server. In the "All-in-One" deployment on the VM, the API Server will only become accessible after visiting the GUI at least once.

![Stack - VM Application Information](../images/iac_stack_vm_info.png)

{{% notice style="code" title="502 Bad Gateway: Communication Breakdown!" icon="fire" %}}
Although the infrastructure is deployed, the {{< short_app_ref >}} may still be initializing, which can result in a 502 Bad Gateway error when accessing the URLs. Please allow up to 10 minutes for the configuration to complete.
{{% /notice %}}
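
Rather than refreshing the browser by hand, the wait can be scripted. The sketch below polls a URL until it stops returning an error such as 502, giving up after 10 minutes. The function names, the 15-second interval, and the treatment of any 2xx/3xx status as "ready" are assumptions made for illustration and are not taken from the {{< short_app_ref >}} itself.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout_s: float = 10.0) -> int:
    """Return the HTTP status code for url, or 0 if the connection itself fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Raised for 4xx/5xx responses, e.g. 502 while the app is still initializing
        return err.code
    except (urllib.error.URLError, OSError):
        return 0


def wait_until_ready(
    probe: Callable[[], int],
    max_wait_s: float = 600.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call probe() repeatedly until it returns a 2xx/3xx status or max_wait_s elapses."""
    deadline = clock() + max_wait_s
    while True:
        if 200 <= probe() < 400:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

For example, `wait_until_ready(lambda: http_status(gui_url))` returns `True` once the GUI responds, where `gui_url` is the address shown on the Application Information tab.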

To get a better understanding of how the API Server works and to obtain the API Key for making REST calls, review the [API Server documentation](client/api_server/).

### Cleanup

To destroy the {{< short_app_ref >}} infrastructure, in **OCI** navigate to `Developer Services` -> `Stacks`. Choose the Compartment the {{< short_app_ref >}} was deployed into and select the stack Name. Click on the "Destroy" button.

## Oracle Kubernetes Engine

{{% notice style="code" title="Documentation is Hard!" icon="circle-info" %}}
More information coming soon... 11-June-2025
{{% /notice %}}
Binary file added docs/content/advanced/images/iac_stack_vm.png
4 changes: 2 additions & 2 deletions docs/content/client/api_server/_index.md
@@ -7,11 +7,11 @@ Copyright (c) 2024, 2025, Oracle and/or its affiliates.
Licensed under the Universal Permissive License v1.0 as shown at http://oss.oracle.com/licenses/upl.
-->

The {{< full_app_ref >}} is powered by an API Server to allow for any client to access its features. The API Server can be run as part of the provided {{< short_app_ref >}} GUI client or as a separate, independent process.
The {{< full_app_ref >}} is powered by an API Server to allow for any client to access its features. The API Server can be run as part of the provided {{< short_app_ref >}} GUI client (referred to as the "All-in-One" deployment) or as a separate, independent process.

All clients connected to the API Server, including the {{< short_app_ref >}} GUI client, share the same configuration but maintain their own settings. Database, Model, OCI, and Prompt configurations are available to all clients, but the selected database, models, OCI profile, and prompt set are specific to each client.

When started as part of the {{< short_app_ref >}} client, you can change the Port it listens on and the API Server Key. A restart is required for the changes to take effect.
When started as part of the {{< short_app_ref >}} "All-in-One" deployment, you can change the Port it listens on and the API Server Key. A restart is required for the changes to take effect.

![Server Configuration](images/api_server_config.png)

2 changes: 1 addition & 1 deletion docs/content/client/configuration/model_config.md
@@ -6,7 +6,7 @@ weight = 10
Copyright (c) 2024, 2025, Oracle and/or its affiliates.
Licensed under the Universal Permissive License v1.0 as shown at http://oss.oracle.com/licenses/upl.

spell-checker:ignore ollama, mxbai, nomic, thenlper, minilm, uniqueid, huggingface, hftei, openai, pplx
spell-checker:ignore ollama, mxbai, nomic, thenlper, minilm, uniqueid, huggingface, hftei, openai, pplx, genai, ocid, configfile
-->

## Supported Models
3 changes: 3 additions & 0 deletions docs/layouts/shortcodes/imagelink.html
@@ -0,0 +1,3 @@
<a href="{{ .Get "url" }}">
<img src="{{ .Get "src" }}" alt="{{ .Get "alt" }}">
</a>
1 change: 0 additions & 1 deletion opentofu/main.tf
@@ -92,7 +92,6 @@ module "vm" {
adb_password = local.adb_password
streamlit_client_port = local.streamlit_client_port
fastapi_server_port = local.fastapi_server_port
source_repository = var.source_repository
compute_os_ver = var.compute_os_ver
compute_cpu_ocpu = var.compute_cpu_ocpu
compute_cpu_shape = var.compute_cpu_shape
7 changes: 7 additions & 0 deletions opentofu/modules/network/main.tf
@@ -19,6 +19,13 @@ resource "oci_core_default_security_list" "lockdown" {
compartment_id = oci_core_vcn.vcn.compartment_id
display_name = format("%s-default-sec-list", var.label_prefix)
manage_default_resource_id = oci_core_vcn.vcn.default_security_list_id
egress_security_rules {
description = "Egress for Bastion Access"
destination = "0.0.0.0/0"
destination_type = "CIDR_BLOCK"
protocol = "all"
stateless = false
}
}

// Public Subnet
2 changes: 1 addition & 1 deletion opentofu/modules/network/variables.tf
@@ -17,7 +17,7 @@ variable "infra" {
variable "vcn_cidr" {
type = map(any)
default = {
"VM" = ["10.42.0.0/28"]
"VM" = ["10.42.0.0/27"]
"Kubernetes" = ["10.42.0.0/16"]
}
}
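
The only change in this hunk is the VM VCN default, which goes from a `/28` to a `/27`. The diff does not state the motivation. As a quick check of what the change means in address terms, here is a small sketch using Python's standard `ipaddress` module; the helper name `address_count` is hypothetical.

```python
import ipaddress


def address_count(cidr: str) -> int:
    """Return the total number of addresses in a CIDR block."""
    return ipaddress.ip_network(cidr).num_addresses


if __name__ == "__main__":
    for cidr in ("10.42.0.0/28", "10.42.0.0/27", "10.42.0.0/16"):
        print(f"{cidr}: {address_count(cidr)} addresses")
```

Each bit removed from the prefix doubles the block, so the new VM default holds twice as many addresses as the old one.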
8 changes: 8 additions & 0 deletions opentofu/modules/vm/data.tf
@@ -21,4 +21,12 @@ data "oci_core_images" "images" {

data "oci_core_vcn" "vcn" {
vcn_id = var.vcn_id
}

data "oci_core_services" "core_services" {
filter {
name = "name"
values = ["All .* Services In Oracle Services Network"]
regex = true
}
}
1 change: 0 additions & 1 deletion opentofu/modules/vm/locals.tf
@@ -9,6 +9,5 @@ locals {
oci_region = var.region
db_name = var.adb_name
db_password = var.adb_password
source_code = var.source_repository
})
}
9 changes: 9 additions & 0 deletions opentofu/modules/vm/nsgs.tf
@@ -26,6 +26,15 @@ resource "oci_core_network_security_group_security_rule" "vcn_icmp_ingress" {
source_type = "CIDR_BLOCK"
}

resource "oci_core_network_security_group_security_rule" "vcn_services_egress" {
network_security_group_id = oci_core_network_security_group.compute.id
description = "Compute OCI Services - All Ingress."
direction = "INGRESS"
protocol = "all"
source = data.oci_core_services.core_services.services.0.cidr_block
source_type = "SERVICE_CIDR_BLOCK"
}

resource "oci_core_network_security_group_security_rule" "vcn_icmp_egress" {
network_security_group_id = oci_core_network_security_group.compute.id
description = "Compute Path Discovery - ICMP Egress."
77 changes: 53 additions & 24 deletions opentofu/modules/vm/templates/cloudinit-compute.tpl
@@ -2,24 +2,44 @@
# Copyright (c) 2024, 2025, Oracle and/or its affiliates.
# All rights reserved. The Universal Permissive License (UPL), Version 1.0 as shown at http://oss.oracle.com/licenses/upl
# spell-checker: disable

package_update: false
packages:
- git
- python3.11

users:
- default
- name: oracleai
uid: 10001
gid: 10001
shell: /bin/bash
homedir: /app

package_update: false
packages:
- python36-oci-cli
- python3.11

write_files:
- path: /etc/systemd/system/ai-optimizer.service
permissions: '0644'
content: |
[Unit]
Description=Run app start script
After=network.target

[Service]
Type=simple
ExecStart=/bin/bash /app/start.sh
User=oracleai
Group=oracleai
WorkingDirectory=/app
Environment="HOME=/app"
Restart=on-failure

[Install]
WantedBy=multi-user.target

- path: /tmp/root_setup.sh
permissions: '0755'
content: |
#!/usr/bin/env bash
mkdir -p /app
chown oracleai:oracleai /app
curl -fsSL https://ollama.com/install.sh | sh
systemctl enable ollama
systemctl daemon-reload
@@ -28,31 +48,28 @@ write_files:
firewall-offline-cmd --zone=public --add-port 8501/tcp
firewall-offline-cmd --zone=public --add-port 8000/tcp
systemctl start firewalld.service
append: false
defer: false

- path: /tmp/app_setup.sh
permissions: '0755'
content: |
#!/bin/bash
# Setup for Instance Principals
export OCI_CLI_AUTH=instance_principal

# Setup oci config.ini to indicate to app to use instance_principal
# mkdir -p /app/.oci
# echo -e '[DEFAULT]\nregion=${oci_region}\ntenancy=${tenancy_id}' > /app/.oci/config
# oci setup repair-file-permissions --file /app/.oci/config

# Download/Setup Source Code
curl -L -o /tmp/source.tar.gz ${source_code}.tar.gz
curl -s https://api.github.com/repos/oracle-samples/ai-optimizer/releases/latest \
| grep tarball_url \
| cut -d '"' -f 4 \
| xargs curl -L -o /tmp/source.tar.gz
tar zxf /tmp/source.tar.gz --strip-components=2 -C /app '*/src'
cd /app
python3.11 -m venv .venv
source .venv/bin/activate
pip3.11 install --upgrade pip wheel setuptools oci-cli
pip3.11 install --upgrade pip wheel setuptools
pip3.11 install torch==2.6.0+cpu -f https://download.pytorch.org/whl/cpu/torch
pip3.11 install -e ".[all]" --quiet --no-input &
INSTALL_PID=$!

# Wait for Database and Download Wallet
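# Fix the deadline once; comparing SECONDS to SECONDS + 600 is always true and never times out
DEADLINE=$((SECONDS + 600))
while [ $SECONDS -lt $DEADLINE ]; do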
echo "Waiting for Database... ${db_name}"
@@ -64,7 +81,7 @@ write_files:
break
fi
sleep 15
done
done
mkdir -p /app/tns_admin
unzip -o /tmp/wallet.zip -d /app/tns_admin

@@ -75,18 +92,30 @@ write_files:
# Wait for python modules to finish
wait $INSTALL_PID

# Startup application
- path: /app/start.sh
permissions: '0750'
content: |
#!/bin/bash
export OCI_CLI_AUTH=instance_principal
export DB_USERNAME='ADMIN'
export DB_PASSWORD='${db_password}'
export DB_DSN='${db_name}_TP'
export DB_WALLET_PASSWORD='${db_password}'
export ON_PREM_OLLAMA_URL=http://127.0.0.1:11434
export LOG_LEVEL=DEBUG
nohup streamlit run launch_client.py --server.port 8501 --server.address 0.0.0.0 &
append: false
defer: false
# Clean Cache
find /app -type d -name "__pycache__" -exec rm -rf {} \;
find /app -type d -name ".numba_cache" -exec rm -rf {} \;
find /app -name "*.nbc" -delete
# Set venv and start
source /app/.venv/bin/activate
streamlit run /app/launch_client.py --server.port 8501 --server.address 0.0.0.0

runcmd:
- /tmp/root_setup.sh
- su - oracleai -c '/tmp/app_setup.sh'
- rm /tmp/app_setup.sh /tmp/root_setup.sh /tmp/source.tar.gz /tmp/wallet.zip
- rm /tmp/app_setup.sh /tmp/root_setup.sh /tmp/source.tar.gz /tmp/wallet.zip
- chown oracleai:oracleai /app/start.sh
- systemctl daemon-reexec
- systemctl daemon-reload
- systemctl enable ai-optimizer.service
- systemctl start ai-optimizer.service
4 changes: 0 additions & 4 deletions opentofu/modules/vm/variables.tf
@@ -52,10 +52,6 @@ variable "adb_password" {
type = string
}

variable "source_repository" {
type = string
}

variable "streamlit_client_port" {
type = number
}
3 changes: 1 addition & 2 deletions opentofu/schema.yaml
@@ -29,7 +29,6 @@ variableGroups:
variables:
- adb_version
- k8s_version
- source_repository
- compute_gpu_shape
- compute_os_ver
visible: false
@@ -236,7 +235,7 @@ variables:

adb_whitelist_cidrs:
type: array
title: "ADB Access Control"
title: "Access Control for the Autonomous Database"
required: true
default: "0.0.0.0/0"
pattern: "((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9]).(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9]).(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9]).(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])\\/(3[0-2]|[1-2]?[0-9])(,?)( ?)){1,}$"
5 changes: 0 additions & 5 deletions opentofu/variables.tf
@@ -49,11 +49,6 @@ variable "private_key_path" {
default = ""
}

variable "source_repository" {
description = "Code that will pulled onto compute; ensure correct branch/tag."
default = "https://github.com/oracle-samples/ai-optimizer/archive/refs/heads/main"
}

// Infrastructure Type/Label
variable "infrastructure" {
description = "Choose between a full Kubernetes or a light-weight Virtual Machine deployment."