diff --git a/ci/vale/styles/config/vocabularies/nat/accept.txt b/ci/vale/styles/config/vocabularies/nat/accept.txt index 5cb4cda48..b427e7a15 100644 --- a/ci/vale/styles/config/vocabularies/nat/accept.txt +++ b/ci/vale/styles/config/vocabularies/nat/accept.txt @@ -57,6 +57,7 @@ Dynatrace [Ee]val [Ee]xplainability Faiss +Gantt [Gg]eneratable GitHub glog diff --git a/examples/notebooks/4_observability_evaluation_and_profiling.ipynb b/examples/notebooks/4_observability_evaluation_and_profiling.ipynb index f2dd7bdcb..05a2e18b5 100644 --- a/examples/notebooks/4_observability_evaluation_and_profiling.ipynb +++ b/examples/notebooks/4_observability_evaluation_and_profiling.ipynb @@ -2,34 +2,1183 @@ "cells": [ { "cell_type": "markdown", + "metadata": { + "id": "PjRuzfwyImeC" + }, + "source": [ + "# Tracing, Evaluating, and Profiling your Agent\n", + "\n", + "In this notebook, we will walk through the advanced capabilities of NVIDIA NeMo Agent toolkit (NAT) for observability, evaluation, and profiling, from setting up Phoenix tracing to running comprehensive workflow assessments and performance analysis." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "p4b2tXeEB5MH" + }, + "source": [ + "## Prerequisites\n", + "\n", + "- **Platform:** Linux, macOS, or Windows\n", + "- **Python:** version 3.11, 3.12, or 3.13\n", + "- **Python Packages:** `pip`" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "PzjU1lTaE3gW" + }, + "source": [ + "### API Keys" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3g2OD3D3TAuN" + }, + "source": [ + "For this notebook, you will need the following API key to run all examples end-to-end:\n", + "\n", + "- **NVIDIA Build:** You can obtain an NVIDIA Build API Key by creating an [NVIDIA Build](https://build.nvidia.com) account and generating a key at https://build.nvidia.com/settings/api-keys\n", + "\n", + "Then you can run the cell below:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import getpass\n", + "import os\n", + "\n", + "if \"NVIDIA_API_KEY\" not in os.environ:\n", + "    nvidia_api_key = getpass.getpass(\"Enter your NVIDIA API key: \")\n", + "    os.environ[\"NVIDIA_API_KEY\"] = nvidia_api_key" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "GBMnVYQ7E75x" + }, + "source": [ + "### Obtaining the Dataset\n", + "\n", + "Several data files are required for this example. To keep the notebook stand-alone, the files are included here as cells that can be run to create them." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ELmZ_Pdz-qX7" + }, + "source": [ + "The following cells:\n", + "* create the `data` directory as well as a `rag` subdirectory\n", + "* write the `data/retail_sales_data.csv` file\n", + "* write the RAG product catalog file, `data/product_catalog.md`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!mkdir -p data/rag" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# @title\n", + "%%writefile data/retail_sales_data.csv\n", + "Date,StoreID,Product,UnitsSold,Revenue,Promotion\n", + "2024-01-01,S001,Laptop,1,1000,No\n", + "2024-01-01,S001,Phone,9,4500,No\n", + "2024-01-01,S001,Tablet,2,600,No\n", + "2024-01-01,S002,Laptop,9,9000,No\n", + "2024-01-01,S002,Phone,10,5000,No\n", + "2024-01-01,S002,Tablet,5,1500,No\n", + "2024-01-02,S001,Laptop,4,4000,No\n", + "2024-01-02,S001,Phone,11,5500,No\n", + "2024-01-02,S001,Tablet,7,2100,No\n", + "2024-01-02,S002,Laptop,7,7000,No\n", + "2024-01-02,S002,Phone,6,3000,No\n", + "2024-01-02,S002,Tablet,9,2700,No\n", + "2024-01-03,S001,Laptop,6,6000,No\n", + "2024-01-03,S001,Phone,7,3500,No\n", + "2024-01-03,S001,Tablet,8,2400,No\n", + "2024-01-03,S002,Laptop,3,3000,No\n", + "2024-01-03,S002,Phone,16,8000,No\n", + "2024-01-03,S002,Tablet,5,1500,No\n", + "2024-01-04,S001,Laptop,5,5000,No\n", + "2024-01-04,S001,Phone,11,5500,No\n", + "2024-01-04,S001,Tablet,9,2700,No\n", + "2024-01-04,S002,Laptop,2,2000,No\n", + "2024-01-04,S002,Phone,12,6000,No\n", + "2024-01-04,S002,Tablet,7,2100,No\n", + "2024-01-05,S001,Laptop,8,8000,No\n", + "2024-01-05,S001,Phone,18,9000,No\n", + "2024-01-05,S001,Tablet,5,1500,No\n", + "2024-01-05,S002,Laptop,7,7000,No\n", + "2024-01-05,S002,Phone,10,5000,No\n", + "2024-01-05,S002,Tablet,10,3000,No\n", + "2024-01-06,S001,Laptop,9,9000,No\n", + "2024-01-06,S001,Phone,11,5500,No\n", + 
"2024-01-06,S001,Tablet,5,1500,No\n", + "2024-01-06,S002,Laptop,5,5000,No\n", + "2024-01-06,S002,Phone,14,7000,No\n", + "2024-01-06,S002,Tablet,10,3000,No\n", + "2024-01-07,S001,Laptop,2,2000,No\n", + "2024-01-07,S001,Phone,15,7500,No\n", + "2024-01-07,S001,Tablet,6,1800,No\n", + "2024-01-07,S002,Laptop,0,0,No\n", + "2024-01-07,S002,Phone,7,3500,No\n", + "2024-01-07,S002,Tablet,12,3600,No\n", + "2024-01-08,S001,Laptop,5,5000,No\n", + "2024-01-08,S001,Phone,8,4000,No\n", + "2024-01-08,S001,Tablet,5,1500,No\n", + "2024-01-08,S002,Laptop,4,4000,No\n", + "2024-01-08,S002,Phone,11,5500,No\n", + "2024-01-08,S002,Tablet,9,2700,No\n", + "2024-01-09,S001,Laptop,6,6000,No\n", + "2024-01-09,S001,Phone,9,4500,No\n", + "2024-01-09,S001,Tablet,8,2400,No\n", + "2024-01-09,S002,Laptop,7,7000,No\n", + "2024-01-09,S002,Phone,11,5500,No\n", + "2024-01-09,S002,Tablet,8,2400,No\n", + "2024-01-10,S001,Laptop,6,6000,No\n", + "2024-01-10,S001,Phone,11,5500,No\n", + "2024-01-10,S001,Tablet,5,1500,No\n", + "2024-01-10,S002,Laptop,8,8000,No\n", + "2024-01-10,S002,Phone,5,2500,No\n", + "2024-01-10,S002,Tablet,6,1800,No\n", + "2024-01-11,S001,Laptop,5,5000,No\n", + "2024-01-11,S001,Phone,7,3500,No\n", + "2024-01-11,S001,Tablet,5,1500,No\n", + "2024-01-11,S002,Laptop,4,4000,No\n", + "2024-01-11,S002,Phone,10,5000,No\n", + "2024-01-11,S002,Tablet,4,1200,No\n", + "2024-01-12,S001,Laptop,2,2000,No\n", + "2024-01-12,S001,Phone,10,5000,No\n", + "2024-01-12,S001,Tablet,9,2700,No\n", + "2024-01-12,S002,Laptop,8,8000,No\n", + "2024-01-12,S002,Phone,10,5000,No\n", + "2024-01-12,S002,Tablet,14,4200,No\n", + "2024-01-13,S001,Laptop,3,3000,No\n", + "2024-01-13,S001,Phone,6,3000,No\n", + "2024-01-13,S001,Tablet,9,2700,No\n", + "2024-01-13,S002,Laptop,1,1000,No\n", + "2024-01-13,S002,Phone,12,6000,No\n", + "2024-01-13,S002,Tablet,7,2100,No\n", + "2024-01-14,S001,Laptop,4,4000,Yes\n", + "2024-01-14,S001,Phone,16,8000,Yes\n", + "2024-01-14,S001,Tablet,4,1200,Yes\n", + "2024-01-14,S002,Laptop,5,5000,Yes\n", + 
"2024-01-14,S002,Phone,14,7000,Yes\n", + "2024-01-14,S002,Tablet,6,1800,Yes\n", + "2024-01-15,S001,Laptop,9,9000,No\n", + "2024-01-15,S001,Phone,6,3000,No\n", + "2024-01-15,S001,Tablet,11,3300,No\n", + "2024-01-15,S002,Laptop,5,5000,No\n", + "2024-01-15,S002,Phone,10,5000,No\n", + "2024-01-15,S002,Tablet,4,1200,No\n", + "2024-01-16,S001,Laptop,6,6000,No\n", + "2024-01-16,S001,Phone,11,5500,No\n", + "2024-01-16,S001,Tablet,5,1500,No\n", + "2024-01-16,S002,Laptop,4,4000,No\n", + "2024-01-16,S002,Phone,7,3500,No\n", + "2024-01-16,S002,Tablet,4,1200,No\n", + "2024-01-17,S001,Laptop,6,6000,No\n", + "2024-01-17,S001,Phone,14,7000,No\n", + "2024-01-17,S001,Tablet,7,2100,No\n", + "2024-01-17,S002,Laptop,3,3000,No\n", + "2024-01-17,S002,Phone,7,3500,No\n", + "2024-01-17,S002,Tablet,6,1800,No\n", + "2024-01-18,S001,Laptop,7,7000,Yes\n", + "2024-01-18,S001,Phone,10,5000,Yes\n", + "2024-01-18,S001,Tablet,6,1800,Yes\n", + "2024-01-18,S002,Laptop,5,5000,Yes\n", + "2024-01-18,S002,Phone,16,8000,Yes\n", + "2024-01-18,S002,Tablet,8,2400,Yes\n", + "2024-01-19,S001,Laptop,4,4000,No\n", + "2024-01-19,S001,Phone,12,6000,No\n", + "2024-01-19,S001,Tablet,7,2100,No\n", + "2024-01-19,S002,Laptop,3,3000,No\n", + "2024-01-19,S002,Phone,12,6000,No\n", + "2024-01-19,S002,Tablet,8,2400,No\n", + "2024-01-20,S001,Laptop,6,6000,No\n", + "2024-01-20,S001,Phone,8,4000,No\n", + "2024-01-20,S001,Tablet,6,1800,No\n", + "2024-01-20,S002,Laptop,8,8000,No\n", + "2024-01-20,S002,Phone,9,4500,No\n", + "2024-01-20,S002,Tablet,8,2400,No\n", + "2024-01-21,S001,Laptop,3,3000,No\n", + "2024-01-21,S001,Phone,9,4500,No\n", + "2024-01-21,S001,Tablet,5,1500,No\n", + "2024-01-21,S002,Laptop,8,8000,No\n", + "2024-01-21,S002,Phone,15,7500,No\n", + "2024-01-21,S002,Tablet,7,2100,No\n", + "2024-01-22,S001,Laptop,1,1000,No\n", + "2024-01-22,S001,Phone,15,7500,No\n", + "2024-01-22,S001,Tablet,5,1500,No\n", + "2024-01-22,S002,Laptop,11,11000,No\n", + "2024-01-22,S002,Phone,4,2000,No\n", + 
"2024-01-22,S002,Tablet,4,1200,No\n", + "2024-01-23,S001,Laptop,3,3000,No\n", + "2024-01-23,S001,Phone,8,4000,No\n", + "2024-01-23,S001,Tablet,8,2400,No\n", + "2024-01-23,S002,Laptop,6,6000,No\n", + "2024-01-23,S002,Phone,12,6000,No\n", + "2024-01-23,S002,Tablet,12,3600,No\n", + "2024-01-24,S001,Laptop,2,2000,No\n", + "2024-01-24,S001,Phone,14,7000,No\n", + "2024-01-24,S001,Tablet,6,1800,No\n", + "2024-01-24,S002,Laptop,1,1000,No\n", + "2024-01-24,S002,Phone,5,2500,No\n", + "2024-01-24,S002,Tablet,7,2100,No\n", + "2024-01-25,S001,Laptop,7,7000,No\n", + "2024-01-25,S001,Phone,11,5500,No\n", + "2024-01-25,S001,Tablet,11,3300,No\n", + "2024-01-25,S002,Laptop,6,6000,No\n", + "2024-01-25,S002,Phone,11,5500,No\n", + "2024-01-25,S002,Tablet,5,1500,No\n", + "2024-01-26,S001,Laptop,5,5000,Yes\n", + "2024-01-26,S001,Phone,22,11000,Yes\n", + "2024-01-26,S001,Tablet,7,2100,Yes\n", + "2024-01-26,S002,Laptop,6,6000,Yes\n", + "2024-01-26,S002,Phone,24,12000,Yes\n", + "2024-01-26,S002,Tablet,3,900,Yes\n", + "2024-01-27,S001,Laptop,7,7000,Yes\n", + "2024-01-27,S001,Phone,20,10000,Yes\n", + "2024-01-27,S001,Tablet,6,1800,Yes\n", + "2024-01-27,S002,Laptop,4,4000,Yes\n", + "2024-01-27,S002,Phone,8,4000,Yes\n", + "2024-01-27,S002,Tablet,6,1800,Yes\n", + "2024-01-28,S001,Laptop,10,10000,No\n", + "2024-01-28,S001,Phone,15,7500,No\n", + "2024-01-28,S001,Tablet,12,3600,No\n", + "2024-01-28,S002,Laptop,6,6000,No\n", + "2024-01-28,S002,Phone,11,5500,No\n", + "2024-01-28,S002,Tablet,10,3000,No\n", + "2024-01-29,S001,Laptop,3,3000,No\n", + "2024-01-29,S001,Phone,16,8000,No\n", + "2024-01-29,S001,Tablet,5,1500,No\n", + "2024-01-29,S002,Laptop,6,6000,No\n", + "2024-01-29,S002,Phone,17,8500,No\n", + "2024-01-29,S002,Tablet,2,600,No\n", + "2024-01-30,S001,Laptop,3,3000,No\n", + "2024-01-30,S001,Phone,11,5500,No\n", + "2024-01-30,S001,Tablet,2,600,No\n", + "2024-01-30,S002,Laptop,6,6000,No\n", + "2024-01-30,S002,Phone,16,8000,No\n", + "2024-01-30,S002,Tablet,8,2400,No\n", + 
"2024-01-31,S001,Laptop,5,5000,Yes\n", + "2024-01-31,S001,Phone,22,11000,Yes\n", + "2024-01-31,S001,Tablet,9,2700,Yes\n", + "2024-01-31,S002,Laptop,3,3000,Yes\n", + "2024-01-31,S002,Phone,14,7000,Yes\n", + "2024-01-31,S002,Tablet,4,1200,Yes\n", + "2024-02-01,S001,Laptop,2,2000,No\n", + "2024-02-01,S001,Phone,7,3500,No\n", + "2024-02-01,S001,Tablet,11,3300,No\n", + "2024-02-01,S002,Laptop,6,6000,No\n", + "2024-02-01,S002,Phone,11,5500,No\n", + "2024-02-01,S002,Tablet,5,1500,No\n", + "2024-02-02,S001,Laptop,2,2000,No\n", + "2024-02-02,S001,Phone,9,4500,No\n", + "2024-02-02,S001,Tablet,7,2100,No\n", + "2024-02-02,S002,Laptop,5,5000,No\n", + "2024-02-02,S002,Phone,9,4500,No\n", + "2024-02-02,S002,Tablet,12,3600,No\n", + "2024-02-03,S001,Laptop,9,9000,No\n", + "2024-02-03,S001,Phone,12,6000,No\n", + "2024-02-03,S001,Tablet,9,2700,No\n", + "2024-02-03,S002,Laptop,10,10000,No\n", + "2024-02-03,S002,Phone,6,3000,No\n", + "2024-02-03,S002,Tablet,10,3000,No\n", + "2024-02-04,S001,Laptop,6,6000,No\n", + "2024-02-04,S001,Phone,5,2500,No\n", + "2024-02-04,S001,Tablet,8,2400,No\n", + "2024-02-04,S002,Laptop,6,6000,No\n", + "2024-02-04,S002,Phone,10,5000,No\n", + "2024-02-04,S002,Tablet,10,3000,No\n", + "2024-02-05,S001,Laptop,7,7000,No\n", + "2024-02-05,S001,Phone,13,6500,No\n", + "2024-02-05,S001,Tablet,11,3300,No\n", + "2024-02-05,S002,Laptop,8,8000,No\n", + "2024-02-05,S002,Phone,11,5500,No\n", + "2024-02-05,S002,Tablet,8,2400,No\n", + "2024-02-06,S001,Laptop,5,5000,No\n", + "2024-02-06,S001,Phone,14,7000,No\n", + "2024-02-06,S001,Tablet,4,1200,No\n", + "2024-02-06,S002,Laptop,2,2000,No\n", + "2024-02-06,S002,Phone,11,5500,No\n", + "2024-02-06,S002,Tablet,7,2100,No\n", + "2024-02-07,S001,Laptop,6,6000,No\n", + "2024-02-07,S001,Phone,7,3500,No\n", + "2024-02-07,S001,Tablet,9,2700,No\n", + "2024-02-07,S002,Laptop,2,2000,No\n", + "2024-02-07,S002,Phone,8,4000,No\n", + "2024-02-07,S002,Tablet,9,2700,No\n", + "2024-02-08,S001,Laptop,5,5000,No\n", + 
"2024-02-08,S001,Phone,12,6000,No\n", + "2024-02-08,S001,Tablet,3,900,No\n", + "2024-02-08,S002,Laptop,8,8000,No\n", + "2024-02-08,S002,Phone,5,2500,No\n", + "2024-02-08,S002,Tablet,8,2400,No\n", + "2024-02-09,S001,Laptop,6,6000,Yes\n", + "2024-02-09,S001,Phone,18,9000,Yes\n", + "2024-02-09,S001,Tablet,5,1500,Yes\n", + "2024-02-09,S002,Laptop,7,7000,Yes\n", + "2024-02-09,S002,Phone,18,9000,Yes\n", + "2024-02-09,S002,Tablet,5,1500,Yes\n", + "2024-02-10,S001,Laptop,9,9000,No\n", + "2024-02-10,S001,Phone,6,3000,No\n", + "2024-02-10,S001,Tablet,8,2400,No\n", + "2024-02-10,S002,Laptop,7,7000,No\n", + "2024-02-10,S002,Phone,5,2500,No\n", + "2024-02-10,S002,Tablet,6,1800,No\n", + "2024-02-11,S001,Laptop,6,6000,No\n", + "2024-02-11,S001,Phone,11,5500,No\n", + "2024-02-11,S001,Tablet,2,600,No\n", + "2024-02-11,S002,Laptop,7,7000,No\n", + "2024-02-11,S002,Phone,5,2500,No\n", + "2024-02-11,S002,Tablet,9,2700,No\n", + "2024-02-12,S001,Laptop,5,5000,No\n", + "2024-02-12,S001,Phone,5,2500,No\n", + "2024-02-12,S001,Tablet,4,1200,No\n", + "2024-02-12,S002,Laptop,1,1000,No\n", + "2024-02-12,S002,Phone,14,7000,No\n", + "2024-02-12,S002,Tablet,15,4500,No\n", + "2024-02-13,S001,Laptop,3,3000,No\n", + "2024-02-13,S001,Phone,18,9000,No\n", + "2024-02-13,S001,Tablet,8,2400,No\n", + "2024-02-13,S002,Laptop,5,5000,No\n", + "2024-02-13,S002,Phone,8,4000,No\n", + "2024-02-13,S002,Tablet,6,1800,No\n", + "2024-02-14,S001,Laptop,4,4000,No\n", + "2024-02-14,S001,Phone,9,4500,No\n", + "2024-02-14,S001,Tablet,6,1800,No\n", + "2024-02-14,S002,Laptop,4,4000,No\n", + "2024-02-14,S002,Phone,6,3000,No\n", + "2024-02-14,S002,Tablet,7,2100,No\n", + "2024-02-15,S001,Laptop,4,4000,Yes\n", + "2024-02-15,S001,Phone,26,13000,Yes\n", + "2024-02-15,S001,Tablet,5,1500,Yes\n", + "2024-02-15,S002,Laptop,2,2000,Yes\n", + "2024-02-15,S002,Phone,14,7000,Yes\n", + "2024-02-15,S002,Tablet,6,1800,Yes\n", + "2024-02-16,S001,Laptop,7,7000,No\n", + "2024-02-16,S001,Phone,9,4500,No\n", + "2024-02-16,S001,Tablet,1,300,No\n", 
+ "2024-02-16,S002,Laptop,6,6000,No\n", + "2024-02-16,S002,Phone,12,6000,No\n", + "2024-02-16,S002,Tablet,10,3000,No\n", + "2024-02-17,S001,Laptop,5,5000,No\n", + "2024-02-17,S001,Phone,8,4000,No\n", + "2024-02-17,S001,Tablet,14,4200,No\n", + "2024-02-17,S002,Laptop,4,4000,No\n", + "2024-02-17,S002,Phone,13,6500,No\n", + "2024-02-17,S002,Tablet,7,2100,No\n", + "2024-02-18,S001,Laptop,6,6000,Yes\n", + "2024-02-18,S001,Phone,22,11000,Yes\n", + "2024-02-18,S001,Tablet,9,2700,Yes\n", + "2024-02-18,S002,Laptop,2,2000,Yes\n", + "2024-02-18,S002,Phone,10,5000,Yes\n", + "2024-02-18,S002,Tablet,12,3600,Yes\n", + "2024-02-19,S001,Laptop,6,6000,No\n", + "2024-02-19,S001,Phone,12,6000,No\n", + "2024-02-19,S001,Tablet,3,900,No\n", + "2024-02-19,S002,Laptop,3,3000,No\n", + "2024-02-19,S002,Phone,4,2000,No\n", + "2024-02-19,S002,Tablet,7,2100,No\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# @title\n", + "%%writefile data/rag/product_catalog.md\n", + "# Product Catalog: Smartphones, Laptops, and Tablets\n", + "\n", + "## Smartphones\n", + "\n", + "The Veltrix Solis Z9 is a flagship device in the premium smartphone segment. It builds on a decade of design iterations that prioritize screen-to-body ratio, minimal bezels, and high refresh rate displays. The 6.7-inch AMOLED panel with 120Hz refresh rate delivers immersive visual experiences, whether in gaming, video streaming, or augmented reality applications. The display's GorillaGlass Fusion coating provides scratch resistance and durability, and the thin form factor is engineered using a titanium-aluminum alloy chassis to reduce weight without compromising rigidity.\n", + "\n", + "Internally, the Solis Z9 is powered by the OrionEdge V14 chipset, a 4nm process SoC designed for high-efficiency workloads. Its AI accelerator module handles on-device tasks such as voice transcription, camera optimization, and intelligent background app management. 
The inclusion of 12GB LPDDR5 RAM and a 256GB UFS 3.1 storage system allows for seamless multitasking, instant app launching, and rapid data access. The device supports eSIM and dual physical SIM configurations, catering to global travelers and hybrid network users.\n", + "\n", + "Photography and videography are central to the Solis Z9 experience. The triple-camera system incorporates a periscope-style 8MP telephoto lens with 5x optical zoom, a 12MP ultra-wide sensor with macro capabilities, and a 64MP main sensor featuring optical image stabilization (OIS) and phase detection autofocus (PDAF). Night mode and HDRX+ processing enable high-fidelity image capture in challenging lighting conditions.\n", + "\n", + "Software-wise, the device ships with LunOS 15, a lightweight Android fork optimized for modular updates and privacy compliance. The system supports secure containers for work profiles and AI-powered notifications that summarize app alerts across channels. Facial unlock is augmented by a 3D IR depth sensor, providing reliable biometric security alongside the ultrasonic in-display fingerprint scanner.\n", + "\n", + "The Solis Z9 is a culmination of over a decade of design experimentation in mobile form factors, ranging from curved-edge screens to under-display camera arrays. Its balance of performance, battery efficiency, and user-centric software makes it an ideal daily driver for content creators, mobile gamers, and enterprise users.\n", + "\n", + "## Laptops\n", + "\n", + "The Cryon Vanta 16X represents the latest evolution of portable computing power tailored for professional-grade workloads.\n", + "\n", + "The Vanta 16X features a unibody chassis milled from aircraft-grade aluminum using CNC machining. The thermal design integrates vapor chamber cooling and dual-fan exhaust architecture to support sustained performance under high computational loads. 
The 16-inch 4K UHD display is color-calibrated at the factory and supports HDR10+, making it suitable for cinematic video editing and high-fidelity CAD modeling.\n", + "\n", + "Powering the device is Intel's Core i9-13900H processor, which includes 14 cores with a hybrid architecture combining performance and efficiency cores. This allows the system to dynamically balance power consumption and raw speed based on active workloads. The dedicated Zephira RTX 4700G GPU features 8GB of GDDR6 VRAM and is optimized for CUDA and Tensor Core operations, enabling applications in real-time ray tracing, AI inference, and 3D rendering.\n", + "\n", + "The Vanta 16X includes a 2TB PCIe Gen 4 NVMe SSD, delivering sequential read/write speeds above 7GB/s, and 32GB of high-bandwidth DDR5 RAM. The machine supports hardware-accelerated virtualization and dual-booting, and ships with VireoOS Pro pre-installed, with official drivers available for Fedora, Ubuntu LTS, and NebulaOS.\n", + "\n", + "Input options are expansive. The keyboard features per-key RGB lighting and programmable macros, while the haptic touchpad supports multi-gesture navigation and palm rejection. Port variety includes dual Thunderbolt 4 ports, a full-size SD Express card reader, HDMI 2.1, 2.5G Ethernet, three USB-A 3.2 ports, and a 3.5mm TRRS audio jack. A fingerprint reader is embedded in the power button and supports biometric logins via Windows Hello.\n", + "\n", + "The history of the Cryon laptop line dates back to the early 2010s, when the company launched its first ultrabook aimed at mobile developers. Since then, successive generations have introduced carbon fiber lids, modular SSD bays, and convertible form factors. 
The Vanta 16X continues this tradition by integrating a customizable BIOS, a modular fan assembly, and a trackpad optimized for creative software like Blender and Adobe Creative Suite.\n", + "\n", + "Designed for software engineers, data scientists, film editors, and 3D artists, the Cryon Vanta 16X is a workstation-class laptop in a portable shell.\n", + "\n", + "## Tablets\n", + "\n", + "The Nebulyn Ark S12 Ultra reflects the current apex of tablet technology, combining high-end hardware with software environments tailored for productivity and creativity.\n", + "\n", + "The Ark S12 Ultra is built around a 12.9-inch OLED display that supports 144Hz refresh rate and HDR10+ dynamic range. With a resolution of 2800 x 1752 pixels and a contrast ratio of 1,000,000:1, the screen delivers vibrant color reproduction ideal for design and media consumption. The display supports true tone adaptation and low blue-light filtering for prolonged use.\n", + "\n", + "Internally, the tablet uses Qualcomm's Snapdragon 8 Gen 3 SoC, which includes an Adreno 750 GPU and an NPU for on-device AI tasks. The device ships with 16GB LPDDR5X RAM and 512GB of storage with support for NVMe expansion via a proprietary magnetic dock. The 11200mAh battery enables up to 15 hours of typical use and recharges to 80 percent in 45 minutes via 45W USB-C PD.\n", + "\n", + "The Ark's history traces back to the original Nebulyn Tab, which launched in 2014 as an e-reader and video streaming device. Since then, the line has evolved through multiple iterations that introduced stylus support, high-refresh screens, and multi-window desktop modes. The current model supports NebulynVerse, a DeX-like environment that allows external display mirroring and full multitasking with overlapping windows and keyboard shortcuts.\n", + "\n", + "Input capabilities are central to the Ark S12 Ultra’s appeal. The Pluma Stylus 3 features magnetic charging, 4096 pressure levels, and tilt detection. 
It integrates haptic feedback to simulate traditional pen strokes and brush textures. The device also supports a SnapCover keyboard that includes a trackpad and programmable shortcut keys. With the stylus and keyboard, users can effectively transform the tablet into a mobile workstation or digital sketchbook.\n", + "\n", + "Camera hardware includes a 13MP main sensor and a 12MP ultra-wide front camera with center-stage tracking and biometric unlock. Microphone arrays with beamforming enable studio-quality call audio. Connectivity includes Wi-Fi 7, Bluetooth 5.3, and optional LTE/5G with eSIM.\n", + "\n", + "Software support is robust. The device runs NebulynOS 6.0, based on Android 14L, and supports app sandboxing, multi-user profiles, and remote device management. Integration with cloud services, including SketchNimbus and ThoughtSpace, allows for real-time collaboration and syncing of content across devices.\n", + "\n", + "This tablet is targeted at professionals who require a balance between media consumption, creativity, and light productivity. Typical users include architects, consultants, university students, and UX designers.\n", + "\n", + "## Comparative Summary\n", + "\n", + "Each of these devices—the Veltrix Solis Z9, Cryon Vanta 16X, and Nebulyn Ark S12 Ultra—represents a best-in-class interpretation of its category. The Solis Z9 excels in mobile photography and everyday communication. The Vanta 16X is tailored for high-performance applications such as video production and AI prototyping. The Ark S12 Ultra provides a canvas for creativity, note-taking, and hybrid productivity use cases.\n", + "\n", + "## Historical Trends and Design Evolution\n", + "\n", + "Design across all three categories is converging toward modularity, longevity, and environmental sustainability. Recycled materials, reparability scores, and software longevity are becoming integral to brand reputation and product longevity. 
Future iterations are expected to feature tighter integration with wearable devices, ambient AI experiences, and cross-device workflows." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "0IUUGtXSFB5G" + }, + "source": [ + "## Installing NeMo Agent Toolkit" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "OSICVNHGGm9l" + }, + "source": [ + "The recommended way to install NAT is through `pip` or `uv pip`.\n", + "\n", + "First, we will install `uv`, which offers parallel downloads and faster dependency resolution." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!pip install uv" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "EBV2Gh9NIC8R" + }, + "source": [ + "NeMo Agent toolkit can be installed through the PyPI `nvidia-nat` package.\n", + "\n", + "There are several optional subpackages available for NAT. For this example, we will rely on four subpackages:\n", + "* The `langchain` subpackage contains useful components for integrating and running within [LangChain](https://python.langchain.com/docs/introduction/).\n", + "* The `llama-index` subpackage contains useful components for integrating and running within [LlamaIndex](https://developers.llamaindex.ai/python/framework/).\n", + "* The `phoenix` subpackage contains components for integrating with [Phoenix](https://phoenix.arize.com/).\n", + "* The `profiling` subpackage contains common components for profiling with NeMo Agent toolkit." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!uv pip install \"nvidia-nat[langchain,llama-index,phoenix,profiling]\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "qrl3St-WWBQ2" + }, + "source": [ + "## Installing the Workflow\n", + "\n", + "In the previous notebook, we went through a complex multi-agent example with several new tools. We will reuse this same example." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!nat workflow create retail_sales_agent" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "iSDMOrSQKtBr" + }, + "source": [ + "### Adding Tools\n", + "\n", + "The following cells add additional tools to the workflow and register them.\n", + "\n", + "* Sales Per Day Tool\n", + "* Detect Outliers Tool\n", + "* Total Product Sales Data Tool\n", + "* LlamaIndex RAG Tool\n", + "* Data Visualization Tools\n", + "* Tool Registration" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# @title\n", + "%%writefile retail_sales_agent/src/retail_sales_agent/total_product_sales_data_tool.py\n", + "from pydantic import Field\n", + "\n", + "from nat.builder.builder import Builder\n", + "from nat.builder.framework_enum import LLMFrameworkEnum\n", + "from nat.builder.function_info import FunctionInfo\n", + "from nat.cli.register_workflow import register_function\n", + "from nat.data_models.function import FunctionBaseConfig\n", + "\n", + "\n", + "class GetTotalProductSalesDataConfig(FunctionBaseConfig, name=\"get_total_product_sales_data\"):\n", + "    \"\"\"Get total sales data by product.\"\"\"\n", + "    data_path: str = Field(description=\"Path to the data file\")\n", + "\n", + "\n", + "@register_function(config_type=GetTotalProductSalesDataConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])\n", + "async def get_total_product_sales_data_function(config: GetTotalProductSalesDataConfig, _builder: Builder):\n", + "    \"\"\"Get total sales data for a specific product.\"\"\"\n", + "    import pandas as pd\n", + "\n", + "    df = pd.read_csv(config.data_path)\n", + "\n", + "    async def _get_total_product_sales_data(product_name: str) -> str:\n", + "        \"\"\"\n", + "        Retrieve total sales data for a specific product.\n", + "\n", + "        Args:\n", + "            product_name: Name of the product\n", + "\n", + "        
Returns:\n", + "            String message containing total sales data\n", + "        \"\"\"\n", + "        product_name = product_name.lower()\n", + "        df['Product'] = df[\"Product\"].apply(lambda x: x.lower())\n", + "        revenue = df[df['Product'] == product_name]['Revenue'].sum()\n", + "        units_sold = df[df['Product'] == product_name]['UnitsSold'].sum()\n", + "\n", + "        return f\"Revenue for {product_name} is {revenue} and total units sold is {units_sold}\"\n", + "\n", + "    yield FunctionInfo.from_fn(\n", + "        _get_total_product_sales_data,\n", + "        description=_get_total_product_sales_data.__doc__)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# @title\n", + "%%writefile retail_sales_agent/src/retail_sales_agent/sales_per_day_tool.py\n", + "from pydantic import Field\n", + "\n", + "from nat.builder.builder import Builder\n", + "from nat.builder.framework_enum import LLMFrameworkEnum\n", + "from nat.builder.function_info import FunctionInfo\n", + "from nat.cli.register_workflow import register_function\n", + "from nat.data_models.function import FunctionBaseConfig\n", + "\n", + "\n", + "class GetSalesPerDayConfig(FunctionBaseConfig, name=\"get_sales_per_day\"):\n", + "    \"\"\"Get total sales across all products per day.\"\"\"\n", + "    data_path: str = Field(description=\"Path to the data file\")\n", + "\n", + "\n", + "@register_function(config_type=GetSalesPerDayConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])\n", + "async def sales_per_day_function(config: GetSalesPerDayConfig, builder: Builder):\n", + "    \"\"\"Get total sales across all products per day.\"\"\"\n", + "    import pandas as pd\n", + "\n", + "    df = pd.read_csv(config.data_path)\n", + "    df['Product'] = df[\"Product\"].apply(lambda x: x.lower())\n", + "\n", + "    async def _get_sales_per_day(date: str, product: str) -> str:\n", + "        \"\"\"\n", + "        Calculate total sales for a specific product on a specific date.\n", + "\n", + "        Args:\n", + "            date: Date in YYYY-MM-DD format\n", + "            product: Product 
name\n", + "\n", + " Returns:\n", + " String message with the total sales for the day\n", + " \"\"\"\n", + " if date == \"None\":\n", + " return \"Please provide a date in YYYY-MM-DD format.\"\n", + " total_revenue = df[(df['Date'] == date) & (df['Product'] == product)]['Revenue'].sum()\n", + " total_units_sold = df[(df['Date'] == date) & (df['Product'] == product)]['UnitsSold'].sum()\n", + "\n", + " return f\"Total revenue for {date} is {total_revenue} and total units sold is {total_units_sold}\"\n", + "\n", + " yield FunctionInfo.from_fn(\n", + " _get_sales_per_day,\n", + " description=_get_sales_per_day.__doc__)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# @title\n", + "%%writefile retail_sales_agent/src/retail_sales_agent/detect_outliers_tool.py\n", + "from pydantic import Field\n", + "\n", + "from nat.builder.builder import Builder\n", + "from nat.builder.framework_enum import LLMFrameworkEnum\n", + "from nat.builder.function_info import FunctionInfo\n", + "from nat.cli.register_workflow import register_function\n", + "from nat.data_models.function import FunctionBaseConfig\n", + "\n", + "\n", + "class DetectOutliersIQRConfig(FunctionBaseConfig, name=\"detect_outliers_iqr\"):\n", + " \"\"\"Detect outliers in sales data using IQR method.\"\"\"\n", + " data_path: str = Field(description=\"Path to the data file\")\n", + "\n", + "\n", + "@register_function(config_type=DetectOutliersIQRConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])\n", + "async def detect_outliers_iqr_function(config: DetectOutliersIQRConfig, _builder: Builder):\n", + " \"\"\"Detect outliers in sales data using the Interquartile Range (IQR) method.\"\"\"\n", + " import pandas as pd\n", + "\n", + " df = pd.read_csv(config.data_path)\n", + "\n", + " async def _detect_outliers_iqr(metric: str) -> str:\n", + " \"\"\"\n", + " Detect outliers in retail data using the IQR method.\n", + "\n", + " Args:\n", + " metric: 
Specific metric to check for outliers\n", + "\n", + "        Returns:\n", + "            String describing any outliers found\n", + "        \"\"\"\n", + "        if metric == \"None\":\n", + "            column = \"Revenue\"\n", + "        else:\n", + "            column = metric\n", + "\n", + "        q1 = df[column].quantile(0.25)\n", + "        q3 = df[column].quantile(0.75)\n", + "        iqr = q3 - q1\n", + "        outliers = df[(df[column] < q1 - 1.5 * iqr) | (df[column] > q3 + 1.5 * iqr)]\n", + "\n", + "        return f\"Outliers in {column} are {outliers.to_dict('records')}\"\n", + "\n", + "    yield FunctionInfo.from_fn(\n", + "        _detect_outliers_iqr,\n", + "        description=_detect_outliers_iqr.__doc__)\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# @title\n", + "%%writefile retail_sales_agent/src/retail_sales_agent/llama_index_rag_tool.py\n", + "import logging\n", + "import os\n", + "\n", + "from pydantic import Field\n", + "\n", + "from nat.builder.builder import Builder\n", + "from nat.builder.framework_enum import LLMFrameworkEnum\n", + "from nat.builder.function_info import FunctionInfo\n", + "from nat.cli.register_workflow import register_function\n", + "from nat.data_models.component_ref import EmbedderRef\n", + "from nat.data_models.component_ref import LLMRef\n", + "from nat.data_models.function import FunctionBaseConfig\n", + "\n", + "logger = logging.getLogger(__name__)\n", + "\n", + "\n", + "class LlamaIndexRAGConfig(FunctionBaseConfig, name=\"llama_index_rag\"):\n", + "\n", + "    llm_name: LLMRef = Field(description=\"The name of the LLM to use for the RAG engine.\")\n", + "    embedder_name: EmbedderRef = Field(description=\"The name of the embedder to use for the RAG engine.\")\n", + "    data_dir: str = Field(description=\"The directory containing the data to use for the RAG engine.\")\n", + "    description: str = Field(description=\"A description of the knowledge included in the RAG system.\")\n", + "    collection_name: str = Field(default=\"context\", 
description=\"The name of the collection to use for the RAG engine.\")\n", + "\n", + "\n", + "def _walk_directory(root: str):\n", + " for root, dirs, files in os.walk(root):\n", + " for file_name in files:\n", + " yield os.path.join(root, file_name)\n", + "\n", + "\n", + "@register_function(config_type=LlamaIndexRAGConfig, framework_wrappers=[LLMFrameworkEnum.LLAMA_INDEX])\n", + "async def llama_index_rag_tool(config: LlamaIndexRAGConfig, builder: Builder):\n", + " from llama_index.core import Settings\n", + " from llama_index.core import SimpleDirectoryReader\n", + " from llama_index.core import StorageContext\n", + " from llama_index.core import VectorStoreIndex\n", + " from llama_index.core.node_parser import SentenceSplitter\n", + "\n", + " llm = await builder.get_llm(config.llm_name, wrapper_type=LLMFrameworkEnum.LLAMA_INDEX)\n", + " embedder = await builder.get_embedder(config.embedder_name, wrapper_type=LLMFrameworkEnum.LLAMA_INDEX)\n", + "\n", + " Settings.embed_model = embedder\n", + " Settings.llm = llm\n", + "\n", + " files = list(_walk_directory(config.data_dir))\n", + " docs = SimpleDirectoryReader(input_files=files).load_data()\n", + " logger.info(\"Loaded %s documents from %s\", len(docs), config.data_dir)\n", + "\n", + " parser = SentenceSplitter(\n", + " chunk_size=400,\n", + " chunk_overlap=20,\n", + " separator=\" \",\n", + " )\n", + " nodes = parser.get_nodes_from_documents(docs)\n", + "\n", + " index = VectorStoreIndex(nodes)\n", + "\n", + " query_engine = index.as_query_engine(similarity_top_k=3, )\n", + "\n", + " async def _arun(inputs: str) -> str:\n", + " \"\"\"\n", + " Search product catalog for information about tablets, laptops, and smartphones\n", + " Args:\n", + " inputs: user query about product specifications\n", + " \"\"\"\n", + " try:\n", + " response = query_engine.query(inputs)\n", + " return str(response.response)\n", + "\n", + " except Exception as e:\n", + " logger.error(\"RAG query failed: %s\", e)\n", + " return f\"Sorry, I 
couldn't retrieve information about that product. Error: {str(e)}\"\n", + "\n", + " yield FunctionInfo.from_fn(_arun, description=config.description)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# @title\n", + "%%writefile retail_sales_agent/src/retail_sales_agent/data_visualization_tools.py\n", + "from pydantic import Field\n", + "\n", + "from nat.builder.builder import Builder\n", + "from nat.builder.framework_enum import LLMFrameworkEnum\n", + "from nat.builder.function_info import FunctionInfo\n", + "from nat.cli.register_workflow import register_function\n", + "from nat.data_models.component_ref import LLMRef\n", + "from nat.data_models.function import FunctionBaseConfig\n", + "\n", + "\n", + "class PlotSalesTrendForStoresConfig(FunctionBaseConfig, name=\"plot_sales_trend_for_stores\"):\n", + " \"\"\"Plot sales trend for a specific store.\"\"\"\n", + " data_path: str = Field(description=\"Path to the data file\")\n", + "\n", + "\n", + "@register_function(config_type=PlotSalesTrendForStoresConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])\n", + "async def plot_sales_trend_for_stores_function(config: PlotSalesTrendForStoresConfig, _builder: Builder):\n", + " \"\"\"Create a visualization of sales trends over time.\"\"\"\n", + " import matplotlib.pyplot as plt\n", + " import pandas as pd\n", + "\n", + " df = pd.read_csv(config.data_path)\n", + "\n", + " async def _plot_sales_trend_for_stores(store_id: str) -> str:\n", + " if store_id not in df[\"StoreID\"].unique():\n", + " data = df\n", + " title = \"Sales Trend for All Stores\"\n", + " else:\n", + " data = df[df[\"StoreID\"] == store_id]\n", + " title = f\"Sales Trend for Store {store_id}\"\n", + "\n", + " plt.figure(figsize=(10, 5))\n", + " trend = data.groupby(\"Date\")[\"Revenue\"].sum()\n", + " trend.plot(title=title)\n", + " plt.xlabel(\"Date\")\n", + " plt.ylabel(\"Revenue\")\n", + " plt.tight_layout()\n", + " 
plt.savefig(\"sales_trend.png\")\n", + "\n", + "        return \"Sales trend plot saved to sales_trend.png\"\n", + "\n", + "    yield FunctionInfo.from_fn(\n", + "        _plot_sales_trend_for_stores,\n", + "        description=(\n", + "            \"This tool can be used to plot the sales trend for a specific store or all stores. \"\n", + "            \"It takes in a store ID and creates and saves an image of a plot of the revenue trend for that store.\"))\n", + "\n", + "\n", + "class PlotAndCompareRevenueAcrossStoresConfig(FunctionBaseConfig, name=\"plot_and_compare_revenue_across_stores\"):\n", + "    \"\"\"Plot and compare revenue across stores.\"\"\"\n", + "    data_path: str = Field(description=\"Path to the data file\")\n", + "\n", + "\n", + "@register_function(config_type=PlotAndCompareRevenueAcrossStoresConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])\n", + "async def plot_revenue_across_stores_function(config: PlotAndCompareRevenueAcrossStoresConfig, _builder: Builder):\n", + "    \"\"\"Create a visualization comparing sales trends between stores.\"\"\"\n", + "    import matplotlib.pyplot as plt\n", + "    import pandas as pd\n", + "\n", + "    df = pd.read_csv(config.data_path)\n", + "\n", + "    async def _plot_revenue_across_stores(arg: str) -> str:\n", + "        pivot = df.pivot_table(index=\"Date\", columns=\"StoreID\", values=\"Revenue\", aggfunc=\"sum\")\n", + "        pivot.plot(figsize=(12, 6), title=\"Revenue Trends Across Stores\")\n", + "        plt.xlabel(\"Date\")\n", + "        plt.ylabel(\"Revenue\")\n", + "        plt.legend(title=\"StoreID\")\n", + "        plt.tight_layout()\n", + "        plt.savefig(\"revenue_across_stores.png\")\n", + "\n", + "        return \"Revenue trends across stores plot saved to revenue_across_stores.png\"\n", + "\n", + "    yield FunctionInfo.from_fn(\n", + "        _plot_revenue_across_stores,\n", + "        description=(\n", + "            \"This tool can be used to plot and compare the revenue trends across stores. 
Use this tool only if the \"\n", + "            \"user asks for a comparison of revenue trends across stores. \"\n", + "            \"It takes in a single string as input (which is ignored) and creates and saves an image of a plot of the revenue trends across stores.\"\n", + "        ))\n", + "\n", + "\n", + "class PlotAverageDailyRevenueConfig(FunctionBaseConfig, name=\"plot_average_daily_revenue\"):\n", + "    \"\"\"Plot average daily revenue for stores and products.\"\"\"\n", + "    data_path: str = Field(description=\"Path to the data file\")\n", + "\n", + "\n", + "@register_function(config_type=PlotAverageDailyRevenueConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])\n", + "async def plot_average_daily_revenue_function(config: PlotAverageDailyRevenueConfig, _builder: Builder):\n", + "    \"\"\"Create a grouped bar chart showing average daily revenue per store by product.\"\"\"\n", + "    import matplotlib.pyplot as plt\n", + "    import pandas as pd\n", + "\n", + "    df = pd.read_csv(config.data_path)\n", + "\n", + "    async def _plot_average_daily_revenue(arg: str) -> str:\n", + "        daily_revenue = df.groupby([\"StoreID\", \"Product\", \"Date\"])[\"Revenue\"].sum().reset_index()\n", + "\n", + "        avg_daily_revenue = daily_revenue.groupby([\"StoreID\", \"Product\"])[\"Revenue\"].mean().unstack()\n", + "\n", + "        avg_daily_revenue.plot(kind=\"bar\", figsize=(12, 6), title=\"Average Daily Revenue per Store by Product\")\n", + "        plt.ylabel(\"Average Revenue\")\n", + "        plt.xlabel(\"Store ID\")\n", + "        plt.xticks(rotation=0)\n", + "        plt.legend(title=\"Product\", bbox_to_anchor=(1.05, 1), loc='upper left')\n", + "        plt.tight_layout()\n", + "        plt.savefig(\"average_daily_revenue.png\")\n", + "\n", + "        return \"Average daily revenue plot saved to average_daily_revenue.png\"\n", + "\n", + "    yield FunctionInfo.from_fn(\n", + "        _plot_average_daily_revenue,\n", + "        description=(\"This tool can be used to plot the average daily revenue for stores and products. \"\n", + "                     \"It takes in a single string as input and creates and saves an 
image of a grouped bar chart \"\n", + " \"of the average daily revenue\"))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# @title\n", + "%%writefile -a retail_sales_agent/src/retail_sales_agent/register.py\n", + "\n", + "from . import sales_per_day_tool\n", + "from . import detect_outliers_tool\n", + "from . import total_product_sales_data_tool\n", + "from . import llama_index_rag_tool\n", + "from . import data_visualization_tools" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "KAGE-pJ_OZ_P" + }, + "source": [ + "### Workflow Configuration File\n", + "\n", + "The following cell creates a basic workflow configuration file" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# @title\n", + "%%writefile retail_sales_agent/configs/config.yml\n", + "llms:\n", + " nim_llm:\n", + " _type: nim\n", + " model_name: meta/llama-3.3-70b-instruct\n", + " temperature: 0.0\n", + " max_tokens: 2048\n", + " context_window: 32768\n", + " api_key: $NVIDIA_API_KEY\n", + "\n", + "embedders:\n", + " nim_embedder:\n", + " _type: nim\n", + " model_name: nvidia/nv-embedqa-e5-v5\n", + " truncate: END\n", + " api_key: $NVIDIA_API_KEY\n", + "\n", + "functions:\n", + " total_product_sales_data:\n", + " _type: get_total_product_sales_data\n", + " data_path: data/retail_sales_data.csv\n", + " sales_per_day:\n", + " _type: get_sales_per_day\n", + " data_path: data/retail_sales_data.csv\n", + " detect_outliers:\n", + " _type: detect_outliers_iqr\n", + " data_path: data/retail_sales_data.csv\n", + "\n", + " data_analysis_agent:\n", + " _type: tool_calling_agent\n", + " tool_names:\n", + " - total_product_sales_data\n", + " - sales_per_day\n", + " - detect_outliers\n", + " llm_name: nim_llm\n", + " max_history: 10\n", + " max_iterations: 15\n", 
+ " description: |\n", + " A helpful assistant that can answer questions about the retail sales CSV data.\n", + " Use the tools to answer the questions.\n", + " Input is a single string.\n", + " verbose: false\n", + "\n", + " product_catalog_rag:\n", + " _type: llama_index_rag\n", + " llm_name: nim_llm\n", + " embedder_name: nim_embedder\n", + " collection_name: product_catalog_rag\n", + " data_dir: data/rag/\n", + " description: \"Search product catalog for TabZen tablet, AeroBook laptop, NovaPhone specifications\"\n", + "\n", + " rag_agent:\n", + " _type: react_agent\n", + " llm_name: nim_llm\n", + " tool_names: [product_catalog_rag]\n", + " max_history: 3\n", + " max_iterations: 5\n", + " max_retries: 2\n", + " description: |\n", + " An assistant that can only answer questions about products.\n", + " Use the product_catalog_rag tool to answer questions about products.\n", + " Do not make up any information.\n", + " verbose: false\n", + "\n", + " plot_sales_trend_for_stores:\n", + " _type: plot_sales_trend_for_stores\n", + " data_path: data/retail_sales_data.csv\n", + " plot_and_compare_revenue_across_stores:\n", + " _type: plot_and_compare_revenue_across_stores\n", + " data_path: data/retail_sales_data.csv\n", + " plot_average_daily_revenue:\n", + " _type: plot_average_daily_revenue\n", + " data_path: data/retail_sales_data.csv\n", + "\n", + " data_visualization_agent:\n", + " _type: react_agent\n", + " llm_name: nim_llm\n", + " tool_names:\n", + " - plot_sales_trend_for_stores\n", + " - plot_and_compare_revenue_across_stores\n", + " - plot_average_daily_revenue\n", + " max_history: 10\n", + " max_iterations: 15\n", + " description: |\n", + " You are a data visualization expert.\n", + " You can only create plots and visualizations based on user requests.\n", + " Only use available tools to generate plots.\n", + " You cannot analyze any data.\n", + " verbose: false\n", + " handle_parsing_errors: true\n", + " max_retries: 2\n", + " retry_parsing_errors: true\n", + 
"\n", + "workflow:\n", + " _type: react_agent\n", + " tool_names: [data_analysis_agent, data_visualization_agent, rag_agent]\n", + " llm_name: nim_llm\n", + " verbose: true\n", + " handle_parsing_errors: true\n", + " max_retries: 2\n", + " system_prompt: |\n", + " Answer the following questions as best you can.\n", + " You may communicate and collaborate with various experts to answer the questions.\n", + "\n", + " {tools}\n", + "\n", + " You may respond in one of two formats.\n", + " Use the following format exactly to communicate with an expert:\n", + "\n", + " Question: the input question you must answer\n", + " Thought: you should always think about what to do\n", + " Action: the action to take, should be one of [{tool_names}]\n", + " Action Input: the input to the action (if there is no required input, include \"Action Input: None\")\n", + " Observation: wait for the expert to respond, do not assume the expert's response\n", + "\n", + " ... (this Thought/Action/Action Input/Observation can repeat N times.)\n", + " Use the following format once you have the final answer:\n", + "\n", + " Thought: I now know the final answer\n", + " Final Answer: the final answer to the original input question" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9ugVMpgoSlb_" + }, + "source": [ + "### Verifying Workflow Installation\n", + "\n", + "You can verify the workflow was successfully set up by running the following example:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ - "# Tracing, Evaluating and Profiling your Agent" + "!nat run --config_file retail_sales_agent/configs/config.yml \\\n", + " --input \"What is the Ark S12 Ultra tablet and what are its specifications?\" \\\n", + " --input \"How do laptop sales compare to phone sales?\" \\\n", + " --input \"Plot average daily revenue\"" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "ItzxNviJof2Q" + }, "source": [ - "### 
Observing a Workflow with Phoenix\n", + "## Observing a Workflow with Phoenix\n", "\n", - "We can now go through the steps to enable observability in a workflow using Phoenix for tracing and logging.\n", + "> **Note:** _This portion of the example will only work when the notebook is run locally. It may not work through Google Colab and other online notebook environments._" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "b-7r6YUhOWAs" + }, + "source": [ + "Phoenix is an open-source observability platform designed for monitoring, debugging, and improving LLM applications and AI agents. It provides a web-based interface for visualizing and analyzing traces from LLM applications, agent workflows, and ML pipelines. Phoenix automatically captures key metrics such as latency, token usage, and costs, and displays the inputs and outputs at each step, making it invaluable for debugging complex agent behaviors and identifying performance bottlenecks in AI workflows." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "drjEt3WkyK8l" + }, + "source": [ + "### Updating the Workflow Configuration\n", "\n", - "NeMo Agent toolkit provides comprehensive tracing that automatically monitors all registered functions in your workflow, LLM interactions, and any custom functions decorated with @track_function, capturing their inputs, outputs, and execution flow to provide complete visibility into how your agent processes requests. The lightweight `@track_function` decorator can be applied to any Python function to gain execution insights without requiring full function registration—this is particularly valuable when you want to monitor utility functions, data processing steps, or business logic that doesn't need to be a full NAT component. 
All tracing data flows into a unified observability system that integrates seamlessly with popular monitoring platforms like Phoenix, OpenTelemetry, and LangSmith, enabling real-time monitoring, performance analysis, and debugging of your entire agent workflow from high-level function calls down to individual processing steps." + "We will need to update the workflow configuration file to support telemetry tracing with Phoenix." ] }, { "cell_type": "markdown", + "metadata": { + "id": "hF8z4R1Vyr4_" + }, + "source": [ + "To do this, we will first copy the original configuration:" + ] + }, + { + "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ - "To enable tracing, update your workflow configuration file to include the telemetry settings." + "!cp retail_sales_agent/configs/config.yml retail_sales_agent/configs/phoenix_config.yml" ] }, { "cell_type": "markdown", + "metadata": { + "id": "cBuWIqYHyzhJ" + }, + "source": [ + "Then we will append necessary configuration components to the `phoenix_config.yml` file:" + ] + }, + { + "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ - "```yaml\n", + "%%writefile -a retail_sales_agent/configs/phoenix_config.yml\n", + "\n", "general:\n", " telemetry:\n", " logging:\n", @@ -40,22 +1189,43 @@ " phoenix:\n", " _type: phoenix\n", " endpoint: http://localhost:6006/v1/traces\n", - " project: retail_sales_agent\n", - "```" + " project: retail_sales_agent\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "kzGYACji_eh3" + }, + "source": [ + "### Start Phoenix Server" ] }, { "cell_type": "markdown", + "metadata": { + "id": "Uk0fRgMY6RX9" + }, + "source": [ + "First, we will ensure the service is publicly accessible:" + ] + }, + { + "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ - "Run the following command to start Phoenix server locally:\n", - "\n", - "```bash\n", - "phoenix serve\n", - 
"```\n", - "Phoenix should now be accessible at http://localhost:6006.\n", - "\n", - "Run this using the following command and observe the traces at the URL above.\n" + "%env PHOENIX_HOST=0.0.0.0" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "e2ajQ08B9jGG" + }, + "source": [ + "Then we will start the server:" ] }, { @@ -64,20 +1234,20 @@ "metadata": {}, "outputs": [], "source": [ - "import getpass\n", - "import os\n", - "\n", - "if \"NVIDIA_API_KEY\" not in os.environ:\n", - " nvidia_api_key = getpass.getpass(\"Enter your NVIDIA API key: \")\n", - " os.environ[\"NVIDIA_API_KEY\"] = nvidia_api_key\n", - "\n", - "if \"TAVILY_API_KEY\" not in os.environ:\n", - " tavily_api_key = getpass.getpass(\"Enter your Tavily API key: \")\n", - " os.environ[\"TAVILY_API_KEY\"] = tavily_api_key\n", + "%%bash --bg\n", + "# phoenix will run on port 6006\n", + "phoenix serve" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "pCScuDXVziTi" + }, + "source": [ + "### Running the Workflow\n", "\n", - "if \"OPENAI_API_KEY\" not in os.environ:\n", - " openai_api_key = getpass.getpass(\"Enter your OpenAI API key: \")\n", - " os.environ[\"OPENAI_API_KEY\"] = openai_api_key" + "Instead of the original workflow configuration, we will run with the updated `phoenix_config.yml` file:" ] }, { @@ -86,33 +1256,65 @@ "metadata": {}, "outputs": [], "source": [ - "!nat run --config_file retail_sales_agent/configs/config_tracing.yml \\\n", - " --input \"How do laptop sales compare to phone sales?\"" + "!nat run --config_file retail_sales_agent/configs/phoenix_config.yml \\\n", + " --input \"What is the Ark S12 Ultra tablet and what are its specifications?\" \\\n", + " --input \"How do laptop sales compare to phone sales?\" \\\n", + " --input \"Plot average daily revenue\"" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "Ka6DC7YC-JbJ" + }, + "source": [ + "### Viewing the trace\n", + "\n", + "You can access the Phoenix server at 
http://localhost:6006" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "j8q7dYytOqX4" + }, "source": [ - "### Evaluating a Workflow using `nat eval`" + "## Evaluating a Workflow" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "hci41nsrhgo6" + }, "source": [ - "**Please refer to [this documentation](https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/evaluate.html) for a detailed guide on evaluating a workflow.**" + "After setting up observability, the next step is to evaluate your workflow's performance against a test dataset. NAT provides a powerful evaluation framework that can assess your agent's responses using various metrics and evaluators.\n", + "\n", + "For detailed information on evaluation, please refer to the [Evaluating NVIDIA NeMo Agent Toolkit Workflows](https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/evaluate.html).\n" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "vO9wbpgNhgo6" + }, "source": [ - "For evaluating this workflow, we create a sample [dataset](./retail_sales_agent/data/eval_data.json)\n", + "### Evaluation Dataset\n", + "\n", + "For evaluating this workflow, we will create a sample dataset.\n", "\n", - "```json\n", + "The dataset will contain three test cases covering different query types. Each entry contains a question and the expected answer that the agent should provide.\n" ] }, { "cell_type": "code", "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%writefile retail_sales_agent/data/eval_data.json\n", "[\n", - " { \n", + " {\n", + " \"id\": \"1\",\n", + " \"question\": \"How do laptop sales compare to phone sales?\",\n", + " \"answer\": \"Phone sales are higher than laptop sales in terms of both revenue and units sold. 
Phones generated a revenue of 561,000 with 1,122 units sold, whereas laptops generated a revenue of 512,000 with 512 units sold.\"\n", @@ -127,8 +1329,27 @@ " \"question\": \"What were the laptop sales on Feb 16th 2024?\",\n", " \"answer\": \"On February 16th, 2024, the total laptop sales were 13 units, generating a total revenue of $13,000.\"\n", " }\n", - "]\n", - "```" + "]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "FWxbhiB9SK8K" + }, + "source": [ + "### Updating the Workflow Configuration\n", + "\n", + "Workflow configuration files can contain extra settings relevant for evaluation and profiling." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "v7QmbGpvUDkZ" + }, + "source": [ + "To do this, we will first copy the original configuration:" ] }, { @@ -137,64 +1358,229 @@ "metadata": {}, "outputs": [], "source": [ - "!nat eval --config_file retail_sales_agent/configs/config_evaluation_and_profiling.yml" + "!cp retail_sales_agent/configs/config.yml retail_sales_agent/configs/config_eval.yml" ] }, { "cell_type": "markdown", + "metadata": { + "id": "Gsrj4FUSUDka" + }, + "source": [ + "*Then* we will append necessary configuration components to the `config_eval.yml` file:" + ] + }, + { + "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ - "The `nat eval` command runs the workflow on all the entries in the dataset. The output of these runs is stored in a file named `workflow_output.json` under the `output_dir` specified in the configuration file.\n", + "%%writefile -a retail_sales_agent/configs/config_eval.yml\n", "\n", - "Each evaluator provides an average score across all the entries in the dataset. The evaluator output also includes the score for each entry in the dataset along with the reasoning for the score. 
The score is a floating point number between 0 and 1, where 1 indicates a perfect match between the expected output and the generated output.\n", + "eval:\n", + " general:\n", + " output_dir: ./eval_output\n", + " verbose: true\n", + " dataset:\n", + " _type: json\n", + " file_path: ./retail_sales_agent/data/eval_data.json\n", "\n", - "The output of each evaluator is stored in a separate file under the `output_dir` specified in the configuration file." + " evaluators:\n", + " rag_accuracy:\n", + " _type: ragas\n", + " metric: AnswerAccuracy\n", + " llm_name: nim_llm\n", + " rag_groundedness:\n", + " _type: ragas\n", + " metric: ResponseGroundedness\n", + " llm_name: nim_llm\n", + " rag_relevance:\n", + " _type: ragas\n", + " metric: ContextRelevance\n", + " llm_name: nim_llm\n", + " trajectory_accuracy:\n", + " _type: trajectory\n", + " llm_name: nim_llm\n" ] }, { "cell_type": "markdown", + "metadata": { + "id": "kpr0vte_hgo6" + }, + "source": [ + "### Running the Evaluation\n", + "\n", + "The `nat eval` command executes the workflow against all entries in the dataset and evaluates the results using configured evaluators. 
Run the cell below to evaluate the retail sales agent workflow.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], + "source": [ + "!nat eval --config_file retail_sales_agent/configs/config_eval.yml\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "1hM9ObwXhgo7" + }, + "source": [ + "### Understanding Evaluation Results\n", + "\n", + "The `nat eval` command runs the workflow on all entries in the dataset and produces several output files:\n", + "\n", + "- **`workflow_output.json`**: Contains the raw outputs from the workflow for each input in the dataset\n", + "- **Evaluator-specific files**: Each configured evaluator generates its own output file with scores and reasoning\n", + "\n", + "#### Evaluation Scores\n", + "\n", + "Each evaluator provides:\n", + "- An **average score** across all dataset entries (0-1 scale, where 1 is perfect)\n", + "- **Individual scores** for each entry with detailed reasoning\n", + "- **Performance metrics** to help identify areas for improvement\n", + "\n", + "All evaluation results are stored in the `output_dir` specified in the configuration file.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ouCqR1daVg59" + }, + "source": [ + "## Profiling a Workflow\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "P06nWQI6hgo7" + }, + "source": [ + "Profiling provides deep insights into your workflow's performance characteristics, helping you identify bottlenecks, optimize resource usage, and improve overall efficiency.\n", + "\n", + "For detailed information on profiling, please refer to the [Profiling and Performance Monitoring of NVIDIA NeMo Agent Toolkit Workflows](https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/profiler.html).\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "kBUu8wVzYT93" + }, "source": [ - "### Profiling a Workflow" + "### Updating the Workflow Configuration\n", + "\n", + "Workflow 
configuration files can contain extra settings relevant for evaluation and profiling." ] }, { "cell_type": "markdown", + "metadata": { + "id": "IREct15KYT94" + }, + "source": [ + "To do this, we will first copy the original configuration:" + ] + }, + { + "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ - "**Please refer to [this documentation](https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/profiler.html) for a detailed guide on profiling a workflow.**" + "!cp retail_sales_agent/configs/config.yml retail_sales_agent/configs/config_profile.yml" ] }, { "cell_type": "markdown", + "metadata": { + "id": "8iONd0KTYT94" + }, + "source": [ + "*Then* we will append necessary configuration components to the `config_profile.yml` file:" + ] + }, + { + "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], + "source": [ + "%%writefile -a retail_sales_agent/configs/config_profile.yml\n", + "\n", + "eval:\n", + " general:\n", + " output_dir: ./profile_output\n", + " verbose: true\n", + " dataset:\n", + " _type: json\n", + " file_path: ./retail_sales_agent/data/eval_data.json\n", + "\n", + " profiler:\n", + " token_uniqueness_forecast: true\n", + " workflow_runtime_forecast: true\n", + " compute_llm_metrics: true\n", + " csv_exclude_io_text: true\n", + " prompt_caching_prefixes:\n", + " enable: true\n", + " min_frequency: 0.1\n", + " bottleneck_analysis:\n", + " enable_nested_stack: true\n", + " concurrency_spike_analysis:\n", + " enable: true\n", + " spike_threshold: 7\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nNLdnyc1hgo7" + }, "source": [ - "The profiler can be run through the `nat eval` command and can be configured through the `profiler` section of the workflow configuration file.\n", + "### Profiler Configuration\n", + "\n", + "We will reuse the same configuration as evaluation.\n", + "\n", + "The profiler is configured through the `profiler` section of your workflow 
configuration file. It runs alongside the `nat eval` command and offers several analysis options:\n", "\n", - "Please also note the `output_dir` parameter which specifies the directory where the profiler output will be stored. \n", + "#### Key Configuration Options:\n", "\n", - "Let us explore the profiler configuration options:\n", + "- **`token_uniqueness_forecast`**: Computes the inter-query token uniqueness forecast, predicting the expected number of unique tokens in the next query based on tokens used in previous queries\n", "\n", - "- `token_uniqueness_forecast`: Compute the inter-query token uniqueness forecast. This computes the expected number of unique tokens in the next query based on the tokens used in the previous queries.\n", + "- **`workflow_runtime_forecast`**: Calculates the expected workflow runtime based on historical query performance\n", "\n", - "- `workflow_runtime_forecast`: Compute the expected workflow runtime forecast. This computes the expected runtime of the workflow based on the runtime of the previous queries.\n", + "- **`compute_llm_metrics`**: Computes inference optimization metrics including latency, throughput, and other performance indicators\n", "\n", - "- `compute_llm_metrics`: Compute inference optimization metrics. This computes workflow-specific metrics for performance analysis (e.g., latency, throughput, etc.).\n", + "- **`csv_exclude_io_text`**: Prevents large text from being dumped into output CSV files, preserving CSV structure and readability\n", "\n", - "- `csv_exclude_io_text`: Avoid dumping large text into the output CSV. This is helpful to not break the structure of the CSV output.\n", + "- **`prompt_caching_prefixes`**: Identifies common prompt prefixes that can be pre-populated in KV caches for improved performance\n", "\n", - "- `prompt_caching_prefixes`: Identify common prompt prefixes. 
This is helpful for identifying if you have commonly repeated prompts that can be pre-populated in KV caches\n", + "- **`bottleneck_analysis`**: Analyzes workflow performance measures such as bottlenecks, latency, and concurrency spikes\n", + " - `simple_stack`: Provides a high-level analysis\n", + " - `nested_stack`: Offers detailed analysis of nested bottlenecks (e.g., tool calls inside other tool calls)\n", "\n", - "- `bottleneck_analysis`: Analyze workflow performance measures such as bottlenecks, latency, and concurrency spikes. This can be set to `simple_stack` for a simpler analysis. Nested stack will provide a more detailed analysis identifying nested bottlenecks like tool calls inside other tools calls.\n", + "- **`concurrency_spike_analysis`**: Identifies concurrency spikes in your workflow. The `spike_threshold` parameter (e.g., 7) determines when to flag spikes based on the number of concurrent running functions\n", "\n", - "- `concurrency_spike_analysis`: Analyze concurrency spikes. This will identify if there are any spikes in the number of concurrent tool calls. At a `spike_threshold` of `7`, the profiler will identify any spikes where the number of concurrent running functions is greater than or equal to `7`. Those are surfaced to the user in a dedicated section of the workflow profiling report." + "#### Output Directory\n", + "\n", + "The `output_dir` parameter specifies where all profiler outputs will be stored for later analysis.\n" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "A1wwbC_Lhgo7" + }, "source": [ - "Run the profiler for our created workflow using the following command:" + "### Running the Profiler\n", + "\n", + "The profiler runs as part of the `nat eval` command. 
When properly configured, it will collect performance data across all evaluation runs and generate comprehensive profiling reports.\n" ] }, { @@ -203,44 +1589,151 @@ "metadata": {}, "outputs": [], "source": [ - "!nat eval --config_file retail_sales_agent/configs/config_evaluation_and_profiling.yml" + "!nat eval --config_file retail_sales_agent/configs/config_profile.yml\n" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "FvwCiUrqaOaf" + }, + "source": [ + "### Profiler Output Files\n", + "\n", + "Based on the profiler configuration, the following files will be generated in the `output_dir`:" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "_YFrbGAWhgo7" + }, + "source": [ + "#### Core Output Files:\n", + "\n", + "1. **`all_requests_profiler_traces.json`**: Raw usage statistics collected by the profiler, including:\n", + " - Raw traces of LLM interactions\n", + " - Tool input and output data\n", + " - Runtime measurements\n", + " - Execution metadata\n", + "\n", + "2. **`inference_optimization.json`**: Workflow-specific performance metrics with confidence intervals:\n", + " - 90%, 95%, and 99% confidence intervals for latency\n", + " - Throughput statistics\n", + " - Workflow runtime predictions\n", + "\n", + "3. **`standardized_data_all.csv`**: Standardized usage data in CSV format containing:\n", + " - Prompt tokens and completion tokens\n", + " - LLM input/output\n", + " - Framework information\n", + " - Additional metadata\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "QMcdVjOZaQkD" + }, "source": [ - "This will, based on the above configuration, produce the following files in the `output_dir` specified in the configuration file:\n", + "#### Advanced Analysis Files\n", "\n", - "- `all_requests_profiler_traces.json`: This file contains the raw usage statistics collected by the profiler. Includes raw traces of LLM and tool input, runtimes, and other metadata.\n", + "4. 
**Analysis Reports**: JSON files and text reports for any advanced techniques enabled:\n", + " - Concurrency analysis results\n", + " - Bottleneck analysis reports\n", + " - PrefixSpan pattern mining results\n", + "\n", + "These files provide comprehensive insights into your workflow's performance and can be used for optimization and debugging." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Bf7ICQLiaTje" + }, + "source": [ + "#### Gantt Chart\n", "\n", - "- `inference_optimization.json`: This file contains the computed workflow-specific metrics. This includes 90%, 95%, and 99% confidence intervals for latency, throughput, and workflow runtime.\n", + "We can also view a Gantt chart of the profile run:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from IPython.display import Image\n", "\n", - "- `standardized_data_all.csv`: This file contains the standardized usage data including prompt tokens, completion tokens, LLM input, framework, and other metadata.\n", + "Image(\"profile_output/gantt_chart.png\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "iIUhnLt-hgo7" + }, + "source": [ + "## Summary\n", "\n", - "- You’ll also find a JSON file and text report of any advanced or experimental techniques you ran including concurrency analysis, bottleneck analysis, or PrefixSpan." 
+ "In this notebook, we covered the complete workflow for observability, evaluation, and profiling in the NeMo Agent toolkit:" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "sSo0k_JUatEe" + }, + "source": [ + "### Observability with Phoenix\n", + "- Configured tracing in the workflow configuration\n", + "- Started the Phoenix server for real-time monitoring\n", + "- Executed workflows with automatic trace capture\n", + "- Visualized agent execution flow and LLM interactions\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "QsVf_g5Qaxbe" + }, + "source": [ + "### Evaluation with `nat eval`\n", + "- Created a comprehensive evaluation dataset\n", + "- Ran automated evaluations across multiple test cases\n", + "- Reviewed evaluation metrics and scores\n", + "- Analyzed workflow performance against expected outputs\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "qCn1i8ghazKp" + }, + "source": [ + "### Profiling for Performance Optimization\n", + "- Configured advanced profiling options\n", + "- Collected performance metrics and usage statistics\n", + "- Generated detailed profiling reports\n", + "- Identified bottlenecks and optimization opportunities\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "0dF8JoWda0bl" + }, + "source": [ + "These three pillars—observability, evaluation, and profiling—work together to provide a complete picture of your agent's behavior, accuracy, and performance, enabling you to build production-ready AI applications with confidence." 
] } ], "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.12.9" + "name": "python" } }, "nbformat": 4, - "nbformat_minor": 4 + "nbformat_minor": 0 }