diff --git a/README.md b/README.md
index 36b464e3..a9ed6607 100644
--- a/README.md
+++ b/README.md
@@ -85,10 +85,14 @@ To run the application in a container; download the [source](https://github.com/
 1. [Configure](https://oracle-samples.github.io/ai-optimizer/client/configuration/index.html) the **AI Optimizer**.
 
 #### Got OCI?
-The **AI Optimizer** can be deployed with an Oracle Autonomous Database 23ai using infrastructure as code. Deploy the **AI Optimizer** in Oracle Cloud Infrastructure using OCI Resource Manager:
+The **AI Optimizer** can be deployed in Oracle Cloud Infrastructure (OCI) using Infrastructure as Code (IaC).
+
+Choose either a light-weight Virtual Machine or robust Oracle Kubernetes Engine deployment, both with an Oracle Autonomous Database 23ai:
 
 [![Deploy to Oracle Cloud][magic_button]][magic_arch_stack]
 
+For more information, please visit the [IaC Documentation](https://oracle-samples.github.io/ai-optimizer/advanced/iac/index.html).
+
 ## Contributing
 
 This project welcomes contributions from the community. Before submitting a pull request, please [review our contribution guide](./CONTRIBUTING.md).
diff --git a/docs/content/advanced/iac.md b/docs/content/advanced/iac.md
new file mode 100644
index 00000000..09fba874
--- /dev/null
+++ b/docs/content/advanced/iac.md
@@ -0,0 +1,86 @@
++++
+title = 'Infrastructure as Code'
+weight = 1
++++
+
+
+
+The {{< full_app_ref >}} can easily be deployed in Oracle Cloud Infrastructure (**OCI**) using Infrastructure as Code (**IaC**) provided in the source [opentofu](https://github.com/oracle-samples/ai-optimizer/tree/main/opentofu) directory.
+
+Choose between deploying a light-weight [Virtual Machine](#virtual-machine) or robust [Oracle Kubernetes Engine (**OKE**)](#oracle-kubernetes-engine) along with the Oracle Autonomous Database for a fully configured {{< short_app_ref >}} environment, ready to use.
+
+While the **IaC** can be run from a command-line with prior experience, the steps outlined here use [Oracle Cloud Resource Manager](https://docs.oracle.com/en-us/iaas/Content/ResourceManager/Concepts/resourcemanager.htm) to simplify the process. To get started:
+
+{{< imagelink url="https://cloud.oracle.com/resourcemanager/stacks/create?zipUrl=https://github.com/oracle-samples/ai-optimizer/releases/latest/download/ai-optimizer-stack.zip" src="https://oci-resourcemanager-plugin.plugins.oci.oraclecloud.com/latest/deploy-to-oracle-cloud.svg" alt="Deploy to Oracle Cloud" >}}
+
+## Virtual Machine
+
+The Virtual Machine (VM) deployment provisions both the {{< short_app_ref >}} API Server and GUI Client together in an "All-in-One" configuration for experimentation and development. As part of the deployment, one local Large Language Model and one Embedding Model is made available out-of-the-box. However, as these models will be running on a CPU VM, their performance will be very poor.
+
+### Configure Variables
+
+After clicking the "Deploy to Oracle Cloud" button and authenticating to your tenancy; you will be presented with the {{< short_app_ref >}} stack information.
+
+1. Review the Terms, tick the box to accept (if you do), and click "Next" to Configure Variables
+
+    ![Stack Information](../images/iac_stack_information.png)
+
+1. Change the Infrastructure to "VM"
+
+    ![Stack - AI Optimizer](../images/iac_stack_optimizer.png)
+
+#### Access Control
+
+Most of the other configuration options are self-explanatory, but let's highlight those important for the **Security** of your deployment.
+
+* The {{< short_app_ref >}} is often configured with authentication details for your OCI Tenancy, Autonomous Database, and API Keys for AI Models. Since these details are accessible via the Application GUI, access must be restricted to a limited set of CIDR blocks.
+
+* The {{< short_app_ref >}} REST endpoints require API token authentication, providing some protection.
+However, you should still restrict access to a limited set of CIDR blocks where possible for added security.
+
+* The Oracle Autonomous Database requires mTLS authentication with a wallet, providing strong initial protection. However, it's recommended to further restrict access to a limited set of CIDR blocks.
+
+![Stack - Access Control](../images/iac_stack_access_control.png)
+
+To restrict access, provide a comma-separated list of CIDR blocks, for example: `192.168.1.0/24,10.0.0.0/16,203.0.113.42/32`
+
+In this example:
+* `192.168.1.0/24` – Allows access from all IPs in the range 192.168.1.0 to 192.168.1.255 (a typical subnet).
+* `10.0.0.0/16` – Allows access from 10.0.0.0 to 10.0.255.255 (a broader range).
+* `203.0.113.42/32` – Allows access from a single public IP address only. The /32 denotes a single host.
+
+### Review and Apply
+
+After configuring the variables, click "Next" to review and apply the stack.
+
+![Stack - Review and Apply](../images/iac_stack_review_apply.png)
+
+Tick the Apply box and click "Create".
+
+### Job Details
+
+The next screen will show the progress of the Apply job. Once the job has Succeeded, the {{< short_app_ref >}} has been deployed!
+
+The Application Information tab will provide the URL's to access the {{< short_app_ref >}} GUI and API Server. In the "All-in-One" deployment on the VM, the API Server will only become accessible after visiting the GUI at least once.
+
+![Stack - VM Application Information](../images/iac_stack_vm_info.png)
+
+{{% notice style="code" title="502 Bad Gateway: Communication Breakdown!" icon="fire" %}}
+Although the infrastructure is deployed, the {{< short_app_ref >}} may still be initializing, which can result in a 502 Bad Gateway error when accessing the URLs. Please allow up to 10 minutes for the configuration to complete.
+{{% /notice %}}
+
+To get a better understanding of how the API Server works and to obtain the API Key for making REST calls, review the [API Server documentation](client/api_server/).
+
+### Cleanup
+
+To destroy the {{< short_app_ref >}} infrastructure, in **OCI** navigate to `Developer Services` -> `Stacks`. Choose the Compartment the {{< short_app_ref >}} was deployed into and select the stack Name. Click on the "Destroy" button.
+
+## Oracle Kubernetes Engine
+
+{{% notice style="code" title="Documentation is Hard!" icon="circle-info" %}}
+More information coming soon... 11-June-2025
+{{% /notice %}}
\ No newline at end of file
diff --git a/docs/content/advanced/images/iac_stack_access_control.png b/docs/content/advanced/images/iac_stack_access_control.png
new file mode 100644
index 00000000..ce13bd95
Binary files /dev/null and b/docs/content/advanced/images/iac_stack_access_control.png differ
diff --git a/docs/content/advanced/images/iac_stack_information.png b/docs/content/advanced/images/iac_stack_information.png
new file mode 100644
index 00000000..2031e21d
Binary files /dev/null and b/docs/content/advanced/images/iac_stack_information.png differ
diff --git a/docs/content/advanced/images/iac_stack_optimizer.png b/docs/content/advanced/images/iac_stack_optimizer.png
new file mode 100644
index 00000000..299eb98a
Binary files /dev/null and b/docs/content/advanced/images/iac_stack_optimizer.png differ
diff --git a/docs/content/advanced/images/iac_stack_review_apply.png b/docs/content/advanced/images/iac_stack_review_apply.png
new file mode 100644
index 00000000..b1867c07
Binary files /dev/null and b/docs/content/advanced/images/iac_stack_review_apply.png differ
diff --git a/docs/content/advanced/images/iac_stack_vm.png b/docs/content/advanced/images/iac_stack_vm.png
new file mode 100644
index 00000000..7afac7a4
Binary files /dev/null and b/docs/content/advanced/images/iac_stack_vm.png differ
diff --git a/docs/content/advanced/images/iac_stack_vm_info.png b/docs/content/advanced/images/iac_stack_vm_info.png
new file mode 100644
index 00000000..3fda07e5
Binary files /dev/null and b/docs/content/advanced/images/iac_stack_vm_info.png differ
diff --git a/docs/content/client/api_server/_index.md b/docs/content/client/api_server/_index.md
index 76c50247..29648ee7 100644
--- a/docs/content/client/api_server/_index.md
+++ b/docs/content/client/api_server/_index.md
@@ -7,11 +7,11 @@ Copyright (c) 2024, 2025, Oracle and/or its affiliates.
 Licensed under the Universal Permissive License v1.0 as shown at http://oss.oracle.com/licenses/upl.
 -->
 
-The {{< full_app_ref >}} is powered by an API Server to allow for any client to access its features. The API Server can be run as part of the provided {{< short_app_ref >}} GUI client or as a separate, independent process.
+The {{< full_app_ref >}} is powered by an API Server to allow for any client to access its features. The API Server can be run as part of the provided {{< short_app_ref >}} GUI client (referred to as the "All-in-One" deployment) or as a separate, independent process.
 
 Each client connected to the API Server, including those from the {{< short_app_ref >}} GUI client, share the same configuration but maintain their own settings. Database, Model, OCI, and Prompt configurations are used across all clients; but which database, models, OCI profile, and prompts set are specific to each client.
 
-When started as part of the {{< short_app_ref >}} client, you can change the Port it listens on and the API Server Key. A restart is required for the changes to take effect.
+When started as part of the {{< short_app_ref >}} "All-in-One" deployment, you can change the Port it listens on and the API Server Key. A restart is required for the changes to take effect.
 
 ![Server Configuration](images/api_server_config.png)
 
diff --git a/docs/content/client/configuration/model_config.md b/docs/content/client/configuration/model_config.md
index 68123af6..0c05a5f4 100644
--- a/docs/content/client/configuration/model_config.md
+++ b/docs/content/client/configuration/model_config.md
@@ -6,7 +6,7 @@ weight = 10
 Copyright (c) 2024, 2025, Oracle and/or its affiliates.
 Licensed under the Universal Permissive License v1.0 as shown at http://oss.oracle.com/licenses/upl.
 
-spell-checker:ignore ollama, mxbai, nomic, thenlper, minilm, uniqueid, huggingface, hftei, openai, pplx
+spell-checker:ignore ollama, mxbai, nomic, thenlper, minilm, uniqueid, huggingface, hftei, openai, pplx, genai, ocid, configfile
 -->
 
 ## Supported Models
diff --git a/docs/layouts/shortcodes/imagelink.html b/docs/layouts/shortcodes/imagelink.html
new file mode 100644
index 00000000..b6b85960
--- /dev/null
+++ b/docs/layouts/shortcodes/imagelink.html
@@ -0,0 +1,3 @@
+
+ {{ .Get
+
\ No newline at end of file
diff --git a/opentofu/main.tf b/opentofu/main.tf
index fff9a3fb..c471f261 100644
--- a/opentofu/main.tf
+++ b/opentofu/main.tf
@@ -92,7 +92,6 @@ module "vm" {
   adb_password = local.adb_password
   streamlit_client_port = local.streamlit_client_port
   fastapi_server_port = local.fastapi_server_port
-  source_repository = var.source_repository
   compute_os_ver = var.compute_os_ver
   compute_cpu_ocpu = var.compute_cpu_ocpu
   compute_cpu_shape = var.compute_cpu_shape
diff --git a/opentofu/modules/network/main.tf b/opentofu/modules/network/main.tf
index 255d4898..759c9ade 100644
--- a/opentofu/modules/network/main.tf
+++ b/opentofu/modules/network/main.tf
@@ -19,6 +19,13 @@ resource "oci_core_default_security_list" "lockdown" {
   compartment_id = oci_core_vcn.vcn.compartment_id
   display_name = format("%s-default-sec-list", var.label_prefix)
   manage_default_resource_id = oci_core_vcn.vcn.default_security_list_id
+  egress_security_rules {
+    description = "Egress for Bastion Access"
+    destination = "0.0.0.0/0"
+    destination_type = "CIDR_BLOCK"
+    protocol = "all"
+    stateless = "false"
+  }
 }
 
 // Public Subnet
diff --git a/opentofu/modules/network/variables.tf b/opentofu/modules/network/variables.tf
index 349cf400..7234f471 100644
--- a/opentofu/modules/network/variables.tf
+++ b/opentofu/modules/network/variables.tf
@@ -17,7 +17,7 @@ variable "infra" {
 variable "vcn_cidr" {
   type = map(any)
   default = {
-    "VM" = ["10.42.0.0/28"]
+    "VM" = ["10.42.0.0/27"]
     "Kubernetes" = ["10.42.0.0/16"]
   }
 }
\ No newline at end of file
diff --git a/opentofu/modules/vm/data.tf b/opentofu/modules/vm/data.tf
index 3baee406..ed199803 100644
--- a/opentofu/modules/vm/data.tf
+++ b/opentofu/modules/vm/data.tf
@@ -21,4 +21,12 @@ data "oci_core_images" "images" {
 
 data "oci_core_vcn" "vcn" {
   vcn_id = var.vcn_id
+}
+
+data "oci_core_services" "core_services" {
+  filter {
+    name = "name"
+    values = ["All .* Services In Oracle Services Network"]
+    regex = true
+  }
 }
\ No newline at end of file
diff --git a/opentofu/modules/vm/locals.tf b/opentofu/modules/vm/locals.tf
index 6f7fb10d..26ef0a0c 100644
--- a/opentofu/modules/vm/locals.tf
+++ b/opentofu/modules/vm/locals.tf
@@ -9,6 +9,5 @@ locals {
     oci_region = var.region
     db_name = var.adb_name
     db_password = var.adb_password
-    source_code = var.source_repository
   })
 }
\ No newline at end of file
diff --git a/opentofu/modules/vm/nsgs.tf b/opentofu/modules/vm/nsgs.tf
index ae3f9eaa..1de8840c 100644
--- a/opentofu/modules/vm/nsgs.tf
+++ b/opentofu/modules/vm/nsgs.tf
@@ -26,6 +26,15 @@ resource "oci_core_network_security_group_security_rule" "vcn_icmp_ingress" {
   source_type = "CIDR_BLOCK"
 }
 
+resource "oci_core_network_security_group_security_rule" "vcn_services_egress" {
+  network_security_group_id = oci_core_network_security_group.compute.id
+  description = "Compute OCI Services - All Ingress."
+  direction = "INGRESS"
+  protocol = "all"
+  source = data.oci_core_services.core_services.services.0.cidr_block
+  source_type = "SERVICE_CIDR_BLOCK"
+}
+
 resource "oci_core_network_security_group_security_rule" "vcn_icmp_egress" {
   network_security_group_id = oci_core_network_security_group.compute.id
   description = "Compute Path Discovery - ICMP Egress."
diff --git a/opentofu/modules/vm/templates/cloudinit-compute.tpl b/opentofu/modules/vm/templates/cloudinit-compute.tpl
index c70f4e9f..c4f3fbd2 100644
--- a/opentofu/modules/vm/templates/cloudinit-compute.tpl
+++ b/opentofu/modules/vm/templates/cloudinit-compute.tpl
@@ -2,24 +2,44 @@
 # Copyright (c) 2024, 2025, Oracle and/or its affiliates.
 # All rights reserved. The Universal Permissive License (UPL), Version 1.0 as shown at http://oss.oracle.com/licenses/upl
 # spell-checker: disable
-
-package_update: false
-packages:
-  - git
-  - python3.11
-
 users:
+  - default
   - name: oracleai
     uid: 10001
-    gid: 10001
     shell: /bin/bash
     homedir: /app
 
+package_update: false
+packages:
+  - python36-oci-cli
+  - python3.11
+
 write_files:
+  - path: /etc/systemd/system/ai-optimizer.service
+    permissions: '0644'
+    content: |
+      [Unit]
+      Description=Run app start script
+      After=network.target
+
+      [Service]
+      Type=simple
+      ExecStart=/bin/bash /app/start.sh
+      User=oracleai
+      Group=oracleai
+      WorkingDirectory=/app
+      Environment="HOME=/app"
+      Restart=on-failure
+
+      [Install]
+      WantedBy=multi-user.target
+
   - path: /tmp/root_setup.sh
     permissions: '0755'
     content: |
       #!/bin/env bash
+      mkdir -p /app
+      chown oracleai:oracleai /app
       curl -fsSL https://ollama.com/install.sh | sh
       systemctl enable ollama
       systemctl daemon-reload
@@ -28,8 +48,7 @@ write_files:
       firewall-offline-cmd --zone=public --add-port 8501/tcp
       firewall-offline-cmd --zone=public --add-port 8000/tcp
       systemctl start firewalld.service
-    append: false
-    defer: false
+
   - path: /tmp/app_setup.sh
     permissions: '0755'
     content: |
@@ -37,22 +56,20 @@ write_files:
       # Setup for Instance Principles
       export OCI_CLI_AUTH=instance_principal
 
-      # Setup oci config.ini to indicate to app to use instance_principal
-      # mkdir -p /app/.oci
-      # echo -e '[DEFAULT]\nregion=${oci_region}\ntenancy=${tenancy_id}' > /app/.oci/config
-      # oci setup repair-file-permissions --file /app/.oci/config
-
       # Download/Setup Source Code
-      curl -L -o /tmp/source.tar.gz ${source_code}.tar.gz
+      curl -s https://api.github.com/repos/oracle-samples/ai-optimizer/releases/latest \
+      | grep tarball_url \
+      | cut -d '"' -f 4 \
+      | xargs curl -L -o /tmp/source.tar.gz
       tar zxf /tmp/source.tar.gz --strip-components=2 -C /app '*/src'
       cd /app
       python3.11 -m venv .venv
       source .venv/bin/activate
-      pip3.11 install --upgrade pip wheel setuptools oci-cli
+      pip3.11 install --upgrade pip wheel setuptools
       pip3.11 install torch==2.6.0+cpu -f https://download.pytorch.org/whl/cpu/torch
       pip3.11 install -e ".[all]" --quiet --no-input &
       INSTALL_PID=$!
-
+
       # Wait for Database and Download Wallet
       while [ $SECONDS -lt $((SECONDS + 600)) ]; do
         echo "Waiting for Database... ${db_name}"
@@ -64,7 +81,7 @@ write_files:
           break
         fi
         sleep 15
-      done
+      done
 
       mkdir -p /app/tns_admin
       unzip -o /tmp/wallet.zip -d /app/tns_admin
@@ -75,18 +92,30 @@ write_files:
       # Wait for python modules to finish
       wait $INSTALL_PID
 
-      # Startup application
+  - path: /app/start.sh
+    permissions: '0750'
+    content: |
+      #!/bin/bash
+      export OCI_CLI_AUTH=instance_principal
       export DB_USERNAME='ADMIN'
       export DB_PASSWORD='${db_password}'
       export DB_DSN='${db_name}_TP'
       export DB_WALLET_PASSWORD='${db_password}'
       export ON_PREM_OLLAMA_URL=http://127.0.0.1:11434
-      export LOG_LEVEL=DEBUG
-      nohup streamlit run launch_client.py --server.port 8501 --server.address 0.0.0.0 &
-    append: false
-    defer: false
+      # Clean Cache
+      find /app -type d -name "__pycache__" -exec rm -rf {} \;
+      find /app -type d -name ".numba_cache" -exec rm -rf {} \;
+      find /app -name "*.nbc" -delete
+      # Set venv and start
+      source /app/.venv/bin/activate
+      streamlit run /app/launch_client.py --server.port 8501 --server.address 0.0.0.0
 
 runcmd:
   - /tmp/root_setup.sh
   - su - oracleai -c '/tmp/app_setup.sh'
-  - rm /tmp/app_setup.sh /tmp/root_setup.sh /tmp/source.tar.gz /tmp/wallet.zip
\ No newline at end of file
+  - rm /tmp/app_setup.sh /tmp/root_setup.sh /tmp/source.tar.gz /tmp/wallet.zip
+  - chown oracleai:oracleai /app/start.sh
+  - systemctl daemon-reexec
+  - systemctl daemon-reload
+  - systemctl enable ai-optimizer.service
+  - systemctl start ai-optimizer.service
\ No newline at end of file
diff --git a/opentofu/modules/vm/variables.tf b/opentofu/modules/vm/variables.tf
index 08cba3d7..5efcfad2 100644
--- a/opentofu/modules/vm/variables.tf
+++ b/opentofu/modules/vm/variables.tf
@@ -52,10 +52,6 @@ variable "adb_password" {
   type = string
 }
 
-variable "source_repository" {
-  type = string
-}
-
 variable "streamlit_client_port" {
   type = number
 }
diff --git a/opentofu/schema.yaml b/opentofu/schema.yaml
index d9ef7b2a..6b8e9095 100644
--- a/opentofu/schema.yaml
+++ b/opentofu/schema.yaml
@@ -29,7 +29,6 @@ variableGroups:
     variables:
       - adb_version
       - k8s_version
-      - source_repository
       - compute_gpu_shape
       - compute_os_ver
     visible: false
@@ -236,7 +235,7 @@ variables:
 
   adb_whitelist_cidrs:
     type: array
-    title: "ADB Access Control"
+    title: "Access Control for the Autonomous Database"
     required: true
     default: "0.0.0.0/0"
     pattern: "((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9]).(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9]).(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9]).(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])\\/(3[0-2]|[1-2]?[0-9])(,?)( ?)){1,}$"
diff --git a/opentofu/variables.tf b/opentofu/variables.tf
index f43499fa..0b5b9643 100644
--- a/opentofu/variables.tf
+++ b/opentofu/variables.tf
@@ -49,11 +49,6 @@ variable "private_key_path" {
   default = ""
 }
 
-variable "source_repository" {
-  description = "Code that will pulled onto compute; ensure correct branch/tag."
-  default = "https://github.com/oracle-samples/ai-optimizer/archive/refs/heads/main"
-}
-
 // Infrastructure Type/Label
 variable "infrastructure" {
   description = "Choose between a full Kubernetes or a light-weight Virtual Machine deployment."
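
Review note on the access-control input: the `adb_whitelist_cidrs` pattern in `opentofu/schema.yaml` and the "Access Control" section of `docs/content/advanced/iac.md` both describe a comma-separated list of CIDR blocks. The following is only a minimal sketch, assuming Python's standard `ipaddress` module, of how such a list could be checked locally and expanded into the address ranges the documentation describes. The helper name `parse_cidr_list` is hypothetical and not part of this repository or of the stack itself.

```python
import ipaddress


def parse_cidr_list(value: str) -> list[ipaddress.IPv4Network]:
    """Split a comma-separated CIDR string and parse each block.

    Hypothetical helper for illustration only. Raises ValueError for a
    malformed block or for a block with host bits set (e.g. 10.0.0.1/16).
    """
    # Tolerate optional spaces after commas and ignore empty entries.
    blocks = [item.strip() for item in value.split(",") if item.strip()]
    return [ipaddress.IPv4Network(block) for block in blocks]


if __name__ == "__main__":
    # Example value taken from the documentation in this change.
    example = "192.168.1.0/24,10.0.0.0/16,203.0.113.42/32"
    for net in parse_cidr_list(example):
        # Show the first and last address covered by each block.
        print(f"{net} covers {net[0]} to {net[-1]} ({net.num_addresses} addresses)")
```

Note that `ipaddress` is stricter than the schema pattern: it rejects blocks whose host bits are set, which the regular expression alone would accept.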