feat: Improve the developer journey for example notebooks (part 2) #913

willkill07 · 2025-10-05T22:15:37Z

Description

Updates the Observability, Evaluation, and Profiling example notebook.
Closes

By Submitting this PR I confirm:

I am familiar with the Contributing Guidelines.
We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not Signed-Off will not be accepted.
When the PR is ready for review, new or existing tests cover these changes.
When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

New Features
- Added an end-to-end notebook for observability, evaluation, and profiling with Phoenix-based tracing.
- Introduced a unified workflow combining data analysis, visualization, and RAG agents.
- Enabled evaluation and profiling runs with metrics, profiler outputs, and charts.
Documentation
- Added step-by-step setup, installation, API key handling, and run commands.
- Clarified local vs. hosted execution and observability guidance.
Chores
- Included sample retail sales data, product catalog, and evaluation dataset.
- Added ready-to-use workflow, evaluation, and profiling configurations.
- Expanded accepted vocabulary to include "Gantt".

Signed-off-by: Will Killian <[email protected]>

coderabbitai · 2025-10-05T22:15:57Z

Walkthrough

Expands the observability/evaluation/profiling notebook to add Phoenix-based telemetry, NAT tool definitions, workflow configs, evaluation/profiling configurations, datasets, and instructions to run workflows, evals, and profiling end-to-end, including artifact generation and output directory organization.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Notebook expansion: `examples/notebooks/4_observability_evaluation_and_profiling.ipynb` | Major content expansion: prerequisites, API keys, installation, Phoenix observability setup, workflow orchestration, evaluation and profiling steps, artifact generation. |
| NAT tool modules: `retail_sales_agent/tools/*` | New tool definitions for data analysis, RAG, and visualization (configs, `register_function` decorators, async handlers) to integrate with NAT. |
| Workflow configurations: `retail_sales_agent/config.yml` | Adds llm/embedders, function registrations, and components for data analysis, visualization, and RAG agents; unified workflow wiring. |
| Evaluation dataset: `retail_sales_agent/data/eval_data.json` | Introduces evaluation dataset scaffold with multiple test cases. |
| Evaluation configuration: `retail_sales_agent/config_eval.yml` | Adds evaluators: rag_accuracy, rag_groundedness, rag_relevance, trajectory_accuracy; eval run setup. |
| Profiling configuration: `retail_sales_agent/config_profile.yml` | Adds profiler options (token/runtime forecasts, LLM metrics, concurrency/bottleneck analyses) and output paths. |
| Phoenix observability config: `phoenix_config.yml` | Augmentation steps and copy/append workflow to enable telemetry tracing for Phoenix. |
| RAG content: `retail_sales_agent/rag/product_catalog.md` | Adds product catalog content used for RAG. |
| Sample data: `retail_sales_agent/data/retail_sales_data.csv` | Adds retail sales CSV with Date, StoreID, Product, UnitsSold, Revenue, Promotion. |
| Vale vocabulary: `ci/vale/styles/config/vocabularies/nat/accept.txt` | Adds accepted vocabulary entry `Gantt`. |
| Artifacts/output paths: `.../profile_output/*`, `gantt_chart.png` | Describes profiler outputs and chart generation; notebook cells generate these artifacts. |
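
To make the sample-data row concrete, here is a minimal sketch of reading a CSV with the columns listed above (Date, StoreID, Product, UnitsSold, Revenue, Promotion) and totalling revenue per product. The rows are invented for illustration and are not taken from `retail_sales_data.csv`, and the helper name `revenue_by_product` is hypothetical rather than one of the notebook's tools.

```python
import csv
import io
from collections import defaultdict

# Invented rows that only mirror the column layout described in the table.
SAMPLE_CSV = """Date,StoreID,Product,UnitsSold,Revenue,Promotion
2024-01-01,S001,Laptop,2,2000.0,0
2024-01-01,S002,Phone,3,1500.0,1
2024-01-02,S001,Laptop,1,1000.0,0
"""


def revenue_by_product(csv_text: str) -> dict[str, float]:
    """Sum the Revenue column per Product."""
    totals: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["Product"]] += float(row["Revenue"])
    return dict(totals)


totals = revenue_by_product(SAMPLE_CSV)
print(totals)
```

A data-analysis tool in the workflow would presumably do something similar against the real file, most likely with a dataframe library rather than the standard-library `csv` module used here.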

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User
  participant NB as Notebook
  participant NAT as NAT CLI/Runtime
  participant WF as Workflow (Agents + Tools)
  participant PH as Phoenix
  participant DS as Data

  U->>NB: Run setup cells
  NB->>NAT: nat run -c config.yml
  NAT->>WF: Initialize agents/tools
  WF->>DS: Load CSV / product_catalog
  WF->>PH: Emit telemetry (traces/logs)
  WF->>WF: Analyze data / RAG / visualize
  WF-->>NAT: Results + artifacts
  NAT-->>NB: Output_dir with results
  NB-->>U: Display results/paths

  rect rgb(235,245,255)
  note over PH: Phoenix observability (new)
  end

sequenceDiagram
  autonumber
  participant U as User
  participant NB as Notebook
  participant NAT as NAT CLI
  participant EV as Evaluators
  participant PH as Phoenix

  U->>NB: Trigger eval
  NB->>NAT: nat eval -c config_eval.yml --data eval_data.json
  NAT->>EV: Run rag_* and trajectory evaluators
  EV->>PH: Send telemetry (optional)
  EV-->>NAT: Metrics/summaries
  NAT-->>NB: Eval reports

sequenceDiagram
  autonumber
  participant U as User
  participant NB as Notebook
  participant NAT as NAT CLI
  participant PR as Profiler
  participant PH as Phoenix

  U->>NB: Trigger profiling
  NB->>NAT: nat profile -c config_profile.yml
  NAT->>PR: Collect runtime/LLM/concurrency data
  PR->>PH: Emit traces/metrics
  PR-->>NB: profile_output + gantt_chart.png
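
The concurrency analysis and Gantt chart mentioned for profiling both start from timed spans. As a rough illustration only (this is not the NAT profiler's implementation, and the `Span` type, span names, and timings are made up), peak concurrency can be derived from start and end times with a sweep over sorted events:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    name: str
    start: float  # seconds since the run began
    end: float


def peak_concurrency(spans: list[Span]) -> int:
    """Return the maximum number of spans active at the same time."""
    events: list[tuple[float, int]] = []
    for span in spans:
        events.append((span.start, 1))
        events.append((span.end, -1))
    # Sorting tuples puts -1 before +1 at equal timestamps, so a span that
    # ends exactly when another starts does not count as overlapping.
    events.sort()
    active = 0
    peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


spans = [
    Span("llm_call", 0.0, 2.0),
    Span("rag_lookup", 1.0, 3.0),
    Span("plot", 3.0, 4.0),
]
print(peak_concurrency(spans))
```

The same list of spans is what a Gantt chart draws as horizontal bars, one per span, positioned by `start` and sized by `end - start`.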

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

feature request, non-breaking

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title uses the imperative verb “Improve,” clearly communicates the main objective of enhancing the developer journey, and is concise at 66 characters. It correctly includes the “feat:” prefix to denote a new feature as per repository conventions. The descriptive scope of “example notebooks (part 2)” aligns with the changes in the pull request.
Docstring Coverage	✅ Passed	No functions found in the changes. Docstring coverage check skipped.


📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1aef2ac and ac37cad.

📒 Files selected for processing (1)

ci/vale/styles/config/vocabularies/nat/accept.txt (1 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards.
- For Python code, follow PEP 20 and PEP 8 for style guidelines.
- Check for security vulnerabilities and potential issues.
- Python methods should use type hints for all parameters and return values.
  Example:
  def my_function(param1: int, param2: str) -> bool:
      pass
- For Python exception handling, ensure proper stack trace preservation:
  - When re-raising exceptions: use bare raise statements to maintain the original stack trace, and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
  - When catching and logging exceptions without re-raising: always use logger.exception() to capture the full stack trace information.
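
The two exception-handling rules above could look like the following. This is a sketch: `load_config`, `read_optional`, and the logger name are illustrative and do not come from the PR.

```python
import logging
from typing import Optional

logger = logging.getLogger("example")


def load_config(path: str) -> str:
    """Re-raising case: log with logger.error() and use a bare raise."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError:
        # The caller still receives the original traceback, so the stack
        # trace is not logged a second time here.
        logger.error("Could not read config %s", path)
        raise


def read_optional(path: str) -> Optional[str]:
    """Non-re-raising case: logger.exception() records the full stack trace."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError:
        logger.exception("Could not read %s; continuing without it", path)
        return None
```

The distinction matters because `logger.exception()` attaches the traceback to the log record; using it and then re-raising tends to print the same trace twice once the outer handler logs it.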
# Documentation Review Instructions
- Verify that documentation and comments are clear and comprehensive.
- Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum".
- Verify that the documentation doesn't contain any offensive or outdated terms.
- Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

# Misc.
- All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0, and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.

Files:

ci/vale/styles/config/vocabularies/nat/accept.txt

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: CI Pipeline / Check


coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

examples/notebooks/4_observability_evaluation_and_profiling.ipynb (1)

1219-1240: Avoid binding Phoenix to 0.0.0.0 by default

Setting PHOENIX_HOST=0.0.0.0 exposes the Phoenix UI on every network interface. On shared or cloud notebook environments this opens an unauthenticated observability surface to anyone who can reach the machine, which is risky. Default to 127.0.0.1 (loopback) and add an explicit warning or opt-in instructions if external access is truly required.
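
One way to act on this suggestion in a notebook cell is sketched below. `PHOENIX_HOST` is the variable named in the comment; `PHOENIX_ALLOW_EXTERNAL` is a hypothetical opt-in flag invented for this example and is not a Phoenix setting, and `configure_phoenix_host` is likewise an illustrative helper.

```python
import os
import warnings
from collections.abc import MutableMapping

LOOPBACK = "127.0.0.1"


def configure_phoenix_host(env: MutableMapping[str, str]) -> str:
    """Set PHOENIX_HOST to loopback unless external access is explicitly requested."""
    # PHOENIX_ALLOW_EXTERNAL is a made-up opt-in flag for this sketch.
    if env.get("PHOENIX_ALLOW_EXTERNAL") == "1":
        warnings.warn(
            "Phoenix UI will listen on all interfaces (0.0.0.0).",
            stacklevel=2,
        )
        env["PHOENIX_HOST"] = "0.0.0.0"
    else:
        env["PHOENIX_HOST"] = LOOPBACK
    return env["PHOENIX_HOST"]


# Apply to the current process before launching Phoenix from the notebook.
print(configure_phoenix_host(os.environ))
```

Passing the environment as a mapping keeps the helper easy to exercise with a plain `dict`, without touching the real process environment.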

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b96d3d3 and 1aef2ac.

📒 Files selected for processing (1)

examples/notebooks/4_observability_evaluation_and_profiling.ipynb (7 hunks)

🧰 Additional context used

📓 Path-based instructions (2)

**/*

⚙️ CodeRabbit configuration file


Files:

examples/notebooks/4_observability_evaluation_and_profiling.ipynb

examples/**/*

⚙️ CodeRabbit configuration file

examples/**/*:
- This directory contains example code and usage scenarios for the toolkit, at a minimum an example should contain a README.md or file README.ipynb.
- If an example contains Python code, it should be placed in a subdirectory named src/ and should contain a pyproject.toml file. Optionally, it might also contain scripts in a scripts/ directory.
- If an example contains YAML files, they should be placed in a subdirectory named configs/.
- If an example contains sample data files, they should be placed in a subdirectory named data/, and should be checked into git-lfs.

Files:

examples/notebooks/4_observability_evaluation_and_profiling.ipynb

🪛 Ruff (0.13.3)

examples/notebooks/4_observability_evaluation_and_profiling.ipynb

66-66: Redefinition of unused FunctionInfo from line 22

Remove definition: FunctionInfo

(F811)

77-77: Unused function argument: builder

(ARG001)

113-113: Redefinition of unused FunctionInfo from line 66

Remove definition: FunctionInfo

(F811)

166-166: Redefinition of unused FunctionInfo from line 113

Remove definition: FunctionInfo

(F811)

185-185: Loop control variable root overrides iterable it iterates

(B020)

185-185: Loop control variable dirs not used within loop body

Rename unused dirs to _dirs

(B007)

229-229: Do not catch blind exception: Exception

(BLE001)

230-230: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

231-231: Use explicit conversion flag

Replace with conversion flag

(RUF010)

242-242: Redefinition of unused FunctionInfo from line 166

Remove definition: FunctionInfo

(F811)

299-299: Unused function argument: arg

(ARG001)

332-332: Unused function argument: arg

(ARG001)

360-360: Redefinition of unused llama_index_rag_tool from line 191

(F811)
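
Findings B020 and B007 at line 185 are consistent with a loop shaped like `for root, dirs, files in os.walk(root)`. That shape is a guess, since the notebook cell itself is not shown here. A sketch of one way to satisfy both rules, using a hypothetical helper `list_files`:

```python
import os
import tempfile


def list_files(root: str) -> list[str]:
    """Return paths of all files under root, relative to root, sorted."""
    found: list[str] = []
    # A distinct loop name avoids shadowing the `root` argument (B020);
    # the leading underscore marks the unused directory list (B007).
    for current_dir, _dirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(current_dir, name)
            found.append(os.path.relpath(full_path, root))
    return sorted(found)


# Small demonstration against a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "rag"))
    for rel in ("a.md", os.path.join("rag", "b.md")):
        with open(os.path.join(tmp, rel), "w", encoding="utf-8") as handle:
            handle.write("demo")
    print(list_files(tmp))
```

The repeated F811 findings would typically be resolved separately, by importing `FunctionInfo` once per generated file rather than once per notebook cell that writes into it, though that depends on how the notebook assembles its modules.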

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: CI Pipeline / Check

examples/notebooks/4_observability_evaluation_and_profiling.ipynb

Signed-off-by: Will Killian <[email protected]>

dagardner-nv · 2025-10-05T22:43:10Z

/merge

willkill07 · 2025-10-05T22:45:08Z

/merge

feat: Improve the developer journey for example notebooks (part 2)

1aef2ac

Signed-off-by: Will Killian <[email protected]>

willkill07 requested a review from a team as a code owner October 5, 2025 22:15

willkill07 added the improvement (Improvement to existing functionality) and non-breaking (Non-breaking change) labels Oct 5, 2025

coderabbitai bot added the feature request (New feature or request) label Oct 5, 2025

willkill07 removed the feature request (New feature or request) label Oct 5, 2025

coderabbitai bot reviewed Oct 5, 2025


dagardner-nv approved these changes Oct 5, 2025


chore: appease Vale

ac37cad

Signed-off-by: Will Killian <[email protected]>

rapids-bot bot merged commit 14cfda4 into NVIDIA:release/1.3 Oct 5, 2025
17 checks passed

willkill07 deleted the wkk_observability-notebook branch October 23, 2025 18:16
