Add an azdo failure skill #123913
Merged
Changes from 1 commit
53 commits (author: lewing)
0365ee9 Add an azdo failure skill
3c3adec Improve Get-HelixFailures.ps1 script
c040d21 Fix PowerShell variable scope issue
e5f365b Add build error extraction and failure classification
2bdfcc5 Additional improvements to failure analysis skill
80911b5 Fix URL parsing issues in Get-HelixFailures.ps1
94a8925 Add Docker image pull failure pattern to classification
b9144e8 Merge best features from both skill PRs
53724c2 Add support for local test failures and test run URL extraction
eeec65c Add Azure DevOps CLI support for fetching failed test names
3c86381 Include Helix console log links in failure output
f501745 Add known issue search using Build Analysis label
983b399 Document known issue search feature
3c1f01b Add build links alongside log URLs for failed jobs
4bc9376 Add request caching for faster repeated analysis
35b4f7b Show build status (in-progress/completed) in output
271fbca Add cache cleanup (-ClearCache parameter and auto-cleanup on startup)
c3268d3 Add cross-platform temp directory detection
a71d363 Improve error handling and caching behavior
be505ff Be more conservative about transient failure guidance
7a2ba86 Address PR review comments
09e593c Add guidance to read PR context before analyzing failures
39840e3 Add build and log URLs to failure output
2d1e21b Refactor: organize script with regions and remove whitespace
0bfa4aa Improve C++/native build error detection
d0525db Analyze all failing builds for a PR, not just the first
1ed1241 Add Build Analysis check for known issues
b4defa5 Update SKILL.md with Build Analysis and multi-build docs
ebf291b Fix: use explicit default value for Context parameter
021de3e Add PR change correlation for failure analysis
22ec99f Update SKILL.md examples to include links
2158808 Simplify skill: reduce cache TTL to 30s, remove severity classification
bed5862 Add MihuBot semantic search integration for related issues
ab7e37b Highlight binlog artifacts and add MSBuild analysis guidance
bacfb86 Restructure skill following Anthropic best practices
83e5ace Add canceled job detection and smart retry recommendations
b5b1caa Generalize skill examples for all dotnet repositories
0f02fda Address PR review comments
d2e6302 Address additional PR review comments
b98b0c0 Improve known issue search for local test failures
8e43185 Fix indentation in test failure extraction block
f789e9e Address PR review comments (batch 3)
3499f2f Fix artifact file property name (Name -> FileName)
60d1419 Add helix-artifacts.md reference documentation
5fe1450 Simplify helix-artifacts.md - focus on patterns not specifics
ff97b6d Address Copilot review: security and robustness improvements
a07dc98 Use relative paths in skill documentation
af34913 Address remaining code quality review items
da7a287 Fix documentation errors
822621e Improve azdo-helix-failures skill per Agent Skills spec
319efed Add guidance for reviewing facts before presenting conclusions
756505f Update .github/skills/azdo-helix-failures/scripts/Get-HelixFailures.ps1
70c4aa5 Add -FindBinlogs parameter and AzDO build artifacts docs
305 changes: 305 additions & 0 deletions
.github/skills/azdo-helix-failures/Get-HelixFailures.ps1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,305 @@ | ||
| <# | ||
| .SYNOPSIS | ||
| Retrieves test failures from Azure DevOps builds and Helix test runs. | ||
|
|
||
| .DESCRIPTION | ||
| This script queries Azure DevOps for failed jobs in a build and retrieves | ||
| the corresponding Helix console logs to show detailed test failure information. | ||
|
|
||
| .PARAMETER BuildId | ||
| The Azure DevOps build ID to query. | ||
|
|
||
| .PARAMETER PRNumber | ||
| The GitHub PR number to find the associated build. | ||
|
|
||
| .PARAMETER Organization | ||
| The Azure DevOps organization. Default: dnceng-public | ||
|
|
||
| .PARAMETER Project | ||
| The Azure DevOps project GUID. Default: cbb18261-c48f-4abb-8651-8cdcb5474649 | ||
|
|
||
| .PARAMETER ShowLogs | ||
| If specified, fetches and displays the Helix console logs for failed tests. | ||
|
|
||
| .PARAMETER MaxJobs | ||
| Maximum number of failed jobs to process. Default: 5 | ||
|
|
||
| .EXAMPLE | ||
| .\Get-HelixFailures.ps1 -BuildId 1276327 | ||
|
|
||
| .EXAMPLE | ||
| .\Get-HelixFailures.ps1 -PRNumber 123445 -ShowLogs | ||
| #> | ||
|
|
||
| [CmdletBinding(DefaultParameterSetName = 'BuildId')] | ||
| param( | ||
| [Parameter(ParameterSetName = 'BuildId', Mandatory = $true)] | ||
| [int]$BuildId, | ||
|
|
||
| [Parameter(ParameterSetName = 'PRNumber', Mandatory = $true)] | ||
| [int]$PRNumber, | ||
|
|
||
| [string]$Organization = "dnceng-public", | ||
| [string]$Project = "cbb18261-c48f-4abb-8651-8cdcb5474649", | ||
| [switch]$ShowLogs, | ||
| [int]$MaxJobs = 5 | ||
| ) | ||
|
|
||
| $ErrorActionPreference = "Stop" | ||
|
|
||
| function Get-AzDOBuildIdFromPR { | ||
| param([int]$PR) | ||
|
|
||
| Write-Host "Finding build for PR #$PR..." -ForegroundColor Cyan | ||
|
|
||
| # Use gh cli to get the checks | ||
| $checksOutput = gh pr checks $PR --repo dotnet/runtime 2>&1 | ||
|
|
||
| # Find the runtime build URL | ||
| $runtimeCheck = $checksOutput | Select-String -Pattern "runtime\s+fail.*buildId=(\d+)" | Select-Object -First 1 | ||
| if ($runtimeCheck) { | ||
| if ($runtimeCheck -match "buildId=(\d+)") { | ||
| return [int]$Matches[1] | ||
| } | ||
| } | ||
|
|
||
| # Try to find any failing build | ||
| $anyBuild = $checksOutput | Select-String -Pattern "buildId=(\d+)" | Select-Object -First 1 | ||
| if ($anyBuild -and $anyBuild -match "buildId=(\d+)") { | ||
| return [int]$Matches[1] | ||
|
|
||
| } | ||
|
|
||
| throw "Could not find Azure DevOps build for PR #$PR" | ||
| } | ||
|
|
||
| function Get-AzDOTimeline { | ||
| param([int]$Build) | ||
|
|
||
| $url = "https://dev.azure.com/$Organization/$Project/_apis/build/builds/$Build/timeline?api-version=7.0" | ||
| Write-Host "Fetching build timeline..." -ForegroundColor Cyan | ||
|
|
||
| try { | ||
| $response = Invoke-RestMethod -Uri $url -Method Get | ||
| return $response | ||
| } | ||
| catch { | ||
| throw "Failed to fetch build timeline: $_" | ||
| } | ||
| } | ||
|
|
||
| function Get-FailedJobs { | ||
| param($Timeline) | ||
|
|
||
| $failedJobs = $Timeline.records | Where-Object { | ||
| $_.type -eq "Job" -and $_.result -eq "failed" | ||
| } | ||
|
|
||
| return $failedJobs | ||
| } | ||
|
|
||
| function Get-HelixJobInfo { | ||
| param($Timeline, $JobId) | ||
|
|
||
| # Find tasks in this job that mention Helix | ||
| $helixTasks = $Timeline.records | Where-Object { | ||
| $_.parentId -eq $JobId -and | ||
| $_.name -like "*Helix*" -and | ||
| $_.result -eq "failed" | ||
| } | ||
|
|
||
| return $helixTasks | ||
| } | ||
|
|
||
| function Get-BuildLog { | ||
| param([int]$Build, [int]$LogId) | ||
|
|
||
| $url = "https://dev.azure.com/$Organization/$Project/_apis/build/builds/$Build/logs/${LogId}?api-version=7.0" | ||
|
|
||
|
|
||
| try { | ||
| $response = Invoke-RestMethod -Uri $url -Method Get | ||
| return $response | ||
| } | ||
| catch { | ||
| Write-Warning "Failed to fetch log ${LogId}: $_" | ||
|
|
||
| return $null | ||
| } | ||
| } | ||
|
|
||
| function Extract-HelixUrls { | ||
| param([string]$LogContent) | ||
|
|
||
| $urls = @() | ||
|
|
||
| # Match Helix console log URLs | ||
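| # Avoid assigning to $matches: PowerShell variable names are case-insensitive, so it would overwrite the automatic $Matches variable | ||
| $urlMatches = [regex]::Matches($LogContent, 'https://helix\.dot\.net/api/[^/]+/jobs/[a-f0-9-]+/workitems/[^/\s]+/console') | ||
| foreach ($match in $urlMatches) { | ||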
| $urls += $match.Value | ||
| } | ||
|
|
||
| return $urls | Select-Object -Unique | ||
| } | ||
|
|
||
| function Extract-TestFailures { | ||
| param([string]$LogContent) | ||
|
|
||
| $failures = @() | ||
|
|
||
| # Match test failure patterns from MSBuild output | ||
| $pattern = 'error\s*:\s*.*Test\s+(\S+)\s+has failed' | ||
| $failureMatches = [regex]::Matches($LogContent, $pattern, [System.Text.RegularExpressions.RegexOptions]::IgnoreCase) | ||
|
|
||
| foreach ($match in $failureMatches) { | ||
| $failures += @{ | ||
| TestName = $match.Groups[1].Value | ||
| FullMatch = $match.Value | ||
| } | ||
| } | ||
|
|
||
| return $failures | ||
| } | ||
|
|
||
| function Get-HelixConsoleLog { | ||
| param([string]$Url) | ||
|
|
||
| try { | ||
| $response = Invoke-RestMethod -Uri $Url -Method Get | ||
| return $response | ||
| } | ||
| catch { | ||
| Write-Warning "Failed to fetch Helix log from ${Url}: $_" | ||
|
|
||
| return $null | ||
| } | ||
| } | ||
|
|
||
| function Format-TestFailure { | ||
| param([string]$LogContent) | ||
|
|
||
| # Extract the key failure information | ||
| $lines = $LogContent -split "`n" | ||
| $inFailure = $false | ||
| $failureLines = @() | ||
|
|
||
| foreach ($line in $lines) { | ||
| if ($line -match '\[FAIL\]') { | ||
| $inFailure = $true | ||
| } | ||
|
|
||
| if ($inFailure) { | ||
| $failureLines += $line | ||
|
|
||
| # Cap the excerpt at about 20 lines after the first [FAIL] marker (the end of the stack trace is not detected) | ||
| if ($failureLines.Count -gt 20) { | ||
| break | ||
| } | ||
| } | ||
| } | ||
|
|
||
| return $failureLines -join "`n" | ||
| } | ||
|
|
||
| # Main execution | ||
| try { | ||
| # Get build ID if using PR number | ||
| if ($PSCmdlet.ParameterSetName -eq 'PRNumber') { | ||
| $BuildId = Get-AzDOBuildIdFromPR -PR $PRNumber | ||
| Write-Host "Found build ID: $BuildId" -ForegroundColor Green | ||
| } | ||
|
|
||
| Write-Host "`n=== Azure DevOps Build $BuildId ===" -ForegroundColor Yellow | ||
| Write-Host "URL: https://dev.azure.com/$Organization/$Project/_build/results?buildId=$BuildId" -ForegroundColor Gray | ||
|
|
||
| # Get timeline | ||
| $timeline = Get-AzDOTimeline -Build $BuildId | ||
|
|
||
| # Get failed jobs | ||
| $failedJobs = Get-FailedJobs -Timeline $timeline | ||
|
|
||
| if (-not $failedJobs -or $failedJobs.Count -eq 0) { | ||
| Write-Host "`nNo failed jobs found in build $BuildId" -ForegroundColor Green | ||
| exit 0 | ||
| } | ||
|
|
||
| Write-Host "`nFound $($failedJobs.Count) failed job(s):" -ForegroundColor Red | ||
|
|
||
| $processedJobs = 0 | ||
| foreach ($job in $failedJobs) { | ||
| if ($processedJobs -ge $MaxJobs) { | ||
| Write-Host "`n... and $($failedJobs.Count - $MaxJobs) more failed jobs (use -MaxJobs to see more)" -ForegroundColor Yellow | ||
| break | ||
| } | ||
|
|
||
| Write-Host "`n--- $($job.name) ---" -ForegroundColor Cyan | ||
|
|
||
| # Get Helix tasks for this job | ||
| $helixTasks = Get-HelixJobInfo -Timeline $timeline -JobId $job.id | ||
|
|
||
| if ($helixTasks) { | ||
| foreach ($task in $helixTasks) { | ||
| if ($task.log) { | ||
| Write-Host " Fetching Helix task log..." -ForegroundColor Gray | ||
| $logContent = Get-BuildLog -Build $BuildId -LogId $task.log.id | ||
|
|
||
| if ($logContent) { | ||
| # Extract test failures | ||
| $failures = Extract-TestFailures -LogContent $logContent | ||
|
|
||
| if ($failures.Count -gt 0) { | ||
| Write-Host " Failed tests:" -ForegroundColor Red | ||
| foreach ($failure in $failures) { | ||
| Write-Host " - $($failure.TestName)" -ForegroundColor White | ||
| } | ||
| } | ||
|
|
||
| # Extract and optionally fetch Helix URLs | ||
| $helixUrls = Extract-HelixUrls -LogContent $logContent | ||
|
|
||
| if ($helixUrls.Count -gt 0 -and $ShowLogs) { | ||
| Write-Host "`n Helix Console Logs:" -ForegroundColor Yellow | ||
|
|
||
| foreach ($url in $helixUrls | Select-Object -First 3) { | ||
| Write-Host "`n $url" -ForegroundColor Gray | ||
|
|
||
| $helixLog = Get-HelixConsoleLog -Url $url | ||
| if ($helixLog) { | ||
| $failureInfo = Format-TestFailure -LogContent $helixLog | ||
| if ($failureInfo) { | ||
| Write-Host $failureInfo -ForegroundColor White | ||
| } | ||
| } | ||
| } | ||
| } | ||
| elseif ($helixUrls.Count -gt 0) { | ||
| Write-Host "`n Helix logs available (use -ShowLogs to fetch):" -ForegroundColor Yellow | ||
| foreach ($url in $helixUrls | Select-Object -First 3) { | ||
| Write-Host " $url" -ForegroundColor Gray | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
| else { | ||
| Write-Host " No Helix tasks found for this job" -ForegroundColor Gray | ||
|
|
||
| # Check if it's a build failure, not test failure | ||
| $buildTasks = $timeline.records | Where-Object { | ||
| $_.parentId -eq $job.id -and $_.result -eq "failed" | ||
| } | ||
|
|
||
| foreach ($task in $buildTasks | Select-Object -First 3) { | ||
| Write-Host " Failed task: $($task.name)" -ForegroundColor Red | ||
| } | ||
| } | ||
|
|
||
| $processedJobs++ | ||
| } | ||
|
|
||
| Write-Host "`n=== Summary ===" -ForegroundColor Yellow | ||
| Write-Host "Total failed jobs: $($failedJobs.Count)" -ForegroundColor Red | ||
| Write-Host "Build URL: https://dev.azure.com/$Organization/$Project/_build/results?buildId=$BuildId" -ForegroundColor Cyan | ||
|
|
||
| } | ||
| catch { | ||
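| # -ErrorAction Continue keeps Write-Error non-terminating under $ErrorActionPreference = "Stop", so the exit below is reached | ||
| Write-Error "Error: $_" -ErrorAction Continue | ||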
| exit 1 | ||
| } | ||
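The two regex-based helpers in the script (`Extract-HelixUrls` and `Extract-TestFailures`) can be tried without network access. The following is a minimal sketch that applies the same two patterns to a made-up log excerpt; the sample text, job GUID, API version segment, and test name are all hypothetical and exist only for illustration.

```powershell
# Hypothetical log excerpt: the URL, job GUID, and test name are invented.
$sampleLog = @'
error : Test System.Foo.Tests has failed. Check the Helix console log.
https://helix.dot.net/api/2019-06-17/jobs/0a1b2c3d-1111-2222-3333-444455556666/workitems/System.Foo.Tests/console
https://helix.dot.net/api/2019-06-17/jobs/0a1b2c3d-1111-2222-3333-444455556666/workitems/System.Foo.Tests/console
'@

# Same pattern as Extract-HelixUrls; duplicate URLs collapse via Select-Object -Unique.
$urlPattern = 'https://helix\.dot\.net/api/[^/]+/jobs/[a-f0-9-]+/workitems/[^/\s]+/console'
$urls = @([regex]::Matches($sampleLog, $urlPattern) |
    ForEach-Object { $_.Value } |
    Select-Object -Unique)

# Same pattern as Extract-TestFailures; capture group 1 holds the test name.
$testPattern = 'error\s*:\s*.*Test\s+(\S+)\s+has failed'
$ignoreCase = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase
$tests = @([regex]::Matches($sampleLog, $testPattern, $ignoreCase) |
    ForEach-Object { $_.Groups[1].Value })

Write-Output "Unique Helix URLs: $($urls.Count)"
Write-Output "Failed tests: $($tests -join ', ')"
```

Wrapping each pipeline in `@(...)` keeps `.Count` meaningful even when only a single item comes back, which is a common PowerShell pitfall when a pipeline yields one object.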