Skip to content

feat(infrastructure): enable optional AML diagnostic logs#400

Merged
katriendg merged 3 commits into
microsoft:mainfrom
fbeltrao:feature/infra-aml-diagnostics
Apr 8, 2026
Merged

feat(infrastructure): enable optional AML diagnostic logs#400
katriendg merged 3 commits into
microsoft:mainfrom
fbeltrao:feature/infra-aml-diagnostics

Conversation

@fbeltrao
Copy link
Copy Markdown
Contributor

@fbeltrao fbeltrao commented Apr 7, 2026

Description

Closes #399

Enable an opt-in AML workspace diagnostic setting that forwards all AML resource logs to the platform Log Analytics workspace. Thread should_enable_aml_diagnostic_logs from the root module into module.platform, create azurerm_monitor_diagnostic_setting.ml_workspace_logs in modules/platform/azureml.tf, and document the new flag in the Terraform reference files and example tfvars.

Add Terraform tests that verify the diagnostic setting is absent by default, present when enabled, and configured with the standard name plus category_group = "allLogs".

Type of Change

  • 🐛 Bug fix (non-breaking change fixing an issue)
  • ✨ New feature (non-breaking change adding functionality)
  • 💥 Breaking change (fix or feature causing existing functionality to change)
  • 📚 Documentation update
  • 🏗️ Infrastructure change (Terraform/IaC)
  • ♻️ Refactoring (no functional changes)

Component(s) Affected

  • infrastructure/terraform/prerequisites/ - Azure subscription setup
  • infrastructure/terraform/ - Terraform infrastructure
  • infrastructure/setup/ - OSMO control plane / Helm
  • workflows/ - Training and evaluation workflows
  • training/ - Training pipelines and scripts
  • docs/ - Documentation

Testing Performed

  • Terraform plan reviewed (no unexpected changes)
  • Terraform apply tested in dev environment
  • Training scripts tested locally with Isaac Sim
  • OSMO workflow submitted successfully
  • Smoke tests passed (smoke_test_azure.py)

Documentation Impact

  • No documentation changes needed
  • Documentation updated in this PR
  • Documentation issue filed

Bug Fix Checklist

Complete this section for bug fix PRs. Skip for other contribution types.

  • Linked to issue being fixed
  • Regression test included, OR
  • Justification for no regression test:

Checklist

- add AML workspace diagnostic setting behind a default-off Terraform flag
- update Terraform tests, example tfvars, and docs references

🔒 - Generated by Copilot
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@fbeltrao fbeltrao requested a review from a team as a code owner April 7, 2026 09:09
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 50.48%. Comparing base (1230fe0) to head (3178393).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #400   +/-   ##
=======================================
  Coverage   50.48%   50.48%           
=======================================
  Files         267      267           
  Lines       18188    18188           
  Branches     1855     1855           
=======================================
  Hits         9182     9182           
  Misses       8716     8716           
  Partials      290      290           
Flag Coverage Δ *Carryforward flag
pester 81.21% <ø> (ø)
pytest 6.89% <ø> (ø) Carriedforward from aeae624
pytest-dataviewer 61.97% <ø> (ø)
vitest 50.72% <ø> (ø)

*This pull request uses carry forward flags. Click here to find out more.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Collaborator

@katriendg katriendg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @fbeltrao this is looking great. And thanks for the updates in docs already as well.
Left one minor comment, not a blocker, see if you feel it's something you want to add.

Comment thread infrastructure/terraform/modules/platform/azureml.tf
fbeltrao and others added 2 commits April 7, 2026 18:27
- add explicit disabled diagnostic metric block for AML workspace settings
- align with PR review feedback while preserving passing terraform tests

�� - Generated by Copilot
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@katriendg katriendg merged commit 58dd8db into microsoft:main Apr 8, 2026
29 checks passed
WilliamBerryiii pushed a commit that referenced this pull request Apr 8, 2026
🤖 I have created a release *beep* *boop*
---


##
[0.6.0](v0.5.0...v0.6.0)
(2026-04-08)


### ✨ Features

* **build:** add terraform-docs generation pipeline
([#378](#378))
([78e90d0](78e90d0))
* **infrastructure:** enable optional AML diagnostic logs
([#400](#400))
([58dd8db](58dd8db))
* **scripts:** consolidate scripts library paths and enhance dataviewer
([#383](#383))
([176d9c9](176d9c9))


### 🐛 Bug Fixes

* **build:** remediate CVEs, enforce equality pinning, repair Dependabot
config
([#391](#391))
([0c29148](0c29148))
* **infrastructure:** add Storage File Data Privileged Contributor role
for ML identity
([#380](#380))
([378f7ed](378f7ed))
* **infrastructure:** replace hardcoded NAT Gateway availability zones
with variable
([#356](#356))
([a1397bd](a1397bd))
* **infrastructure:** resolve TFLint violations and enable hard-fail
([#376](#376))
([dfb55cd](dfb55cd))
* **scripts:** add dot-source guard to Invoke-MsDateFreshnessCheck.ps1
([#397](#397))
([f6f22c3](f6f22c3))
* **training:** validate AzureML and OSMO RL submissions end to end
([#372](#372))
([49904d3](49904d3))


### 📚 Documentation

* **infrastructure:** add terraform-docs tooling and improve developer
experience
([#365](#365))
([a0fb03a](a0fb03a))
* **reference:** centralize workflow template docs and convert workflow
READMEs to pointer index
([#379](#379))
([68097e4](68097e4))


### 🔧 Miscellaneous

* **deps-dev:** bump the npm_and_yarn group across 1 directory with 2
updates
([#374](#374))
([d848c8b](d848c8b))
* **deps-dev:** bump vite from 6.4.1 to 6.4.2 in
/data-management/viewer/frontend in the npm_and_yarn group across 1
directory
([#395](#395))
([6ec7f19](6ec7f19))
* **deps:** bump the github-actions group across 1 directory with 7
updates
([#370](#370))
([4d1b951](4d1b951))
* **deps:** bump the uv group across 2 directories with 1 update
([#373](#373))
([ba66ed9](ba66ed9))


### 🔒 Security

* **deps-dev:** bump brace-expansion from 1.1.12 to 1.1.13 in
/docs/docusaurus in the npm_and_yarn group across 1 directory
([#389](#389))
([27129d9](27129d9))
* **deps-dev:** bump the npm_and_yarn group across 2 directories with 2
updates
([#363](#363))
([aeae624](aeae624))
* **deps-dev:** bump the python-dependencies group with 5 updates
([#403](#403))
([bb85560](bb85560))
* **deps:** bump cryptography from 46.0.5 to 46.0.6 in /training/rl
([#367](#367))
([a82dd68](a82dd68))
* **deps:** bump the inference-dependencies group in /evaluation with 2
updates
([#401](#401))
([c88d253](c88d253))
* **deps:** bump the pip group across 4 directories with 2 updates
([#411](#411))
([1230fe0](1230fe0))
* **deps:** bump the training-dependencies group across 1 directory with
67 updates
([#375](#375))
([8e05172](8e05172))
* **deps:** bump the uv group across 2 directories with 1 update
([#382](#382))
([b6c7aea](b6c7aea))
* **deps:** update marshmallow requirement from &lt;4.3.0,&gt;=3.5 to
&gt;=3.5,&lt;4.4.0 in /evaluation in the inference-dependencies group
([#393](#393))
([599c7eb](599c7eb))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: physical-ai-toolchain-release[bot] <267194360+physical-ai-toolchain-release[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@fbeltrao fbeltrao deleted the feature/infra-aml-diagnostics branch April 10, 2026 16:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(infrastructure): enable optional AML workspace diagnostic logs

3 participants