Tests: Adds benchmark for SubtreeEvaluator Expression.Compile memory impact#5586
Closed
kirankumarkolli wants to merge 28 commits intomasterfrom
Closed
Tests: Adds benchmark for SubtreeEvaluator Expression.Compile memory impact#5586kirankumarkolli wants to merge 28 commits intomasterfrom
kirankumarkolli wants to merge 28 commits intomasterfrom
Conversation
This plan defines a comprehensive workflow for GitHub Copilot agents to: - Triage GitHub issues (bugs, features, questions, documentation) - Reproduce issues against Cosmos DB emulator - Analyze code using local tools and Bluebird code graph - Propose workarounds and draft PRs - Validate fixes through local and remote CI Key sections: - Model configuration (Claude Opus 4.5) - Issue classification matrix - Reproduction workflow with multi-version testing - Performance benchmark validation - Remote CI validation (Azure Pipelines gates) - Branch naming convention (users/<name>/<feature>) - Lessons learned from Issue #5547 case study 🤖 This plan was created with GitHub Copilot assistance.
…cing - Expanded PR template to include full investigation details (not just summary) - Added issue summary table, root cause analysis section, before/after output - Added workaround documentation section - Added test results table and reproduction test template - Added Section 7.3.1: PR Reference Guidelines for proper linking - PR description now includes all information (no separate investigation issue needed)
- Added Phase 2.5: Expectations Validation - Validate user expectations against official docs before investigating - Check Microsoft Docs, API reference, SDK samples, test files - Determine if issue is bug vs. expected behavior vs. undocumented - Added learnings from PR #5583 CI monitoring: - PR title lint format: 'Category: (Adds|Fixes|Refactors|Removes) Description' - gh CLI installation and authentication steps - CI gate duration estimates (up to 90 minutes) - Monitoring loop workflow with 5-10 minute intervals - Added Section 16.5: CI Monitoring Workflow with check categories and timing
- Added Section 16.5.1: Azure DevOps Build Re-run via API - Documented PAT authentication requirements - Added PowerShell script for requeuing builds - Clarified limitation: 'Rerun failed jobs only' NOT available via API - Recommended approach: Use Azure DevOps web UI for failed-only reruns - Added Azure DevOps MCP Server reference (official Microsoft package)
- Section 0: MCP server setup (GitHub, Azure DevOps, Bluebird) - Section 16.7-16.12: Session learnings (flaky tests, draft PR workflow, parallel agents, templates) - Updated branch naming: users/<name>/copilot-<issue>-<feature> - Simplified CI retry to MCP-only approach
- Section 16.8.1: Explicit iterate-until-GREEN workflow - Categorize failures: code/flaky/infrastructure/unrelated - Response actions for each failure type - Success criteria and escalation paths - Quick reference commands for monitoring
- Async/await & CancellationToken patterns - Error handling with CosmosException - Logging with DefaultTrace - Breaking change detection (contracts) - Security review checklist - Performance considerations - Rollback strategy - Testing patterns (unit, emulator, mocking)
- Emulator-first validation before CI push - ~85% of tests run locally with emulator - Only ~15% require secrets (CI-only) - Quick commands for local testing - Emulator setup instructions
- Step-by-step setup for new machine (15-20 min) - Test workflow with real issue - Minimal 5-minute verification script - Troubleshooting common issues
- Microsoft.Azure.Cosmos.Tests runs without emulator - Removed unnecessary --filter from unit test commands - Split test coverage: unit (no deps) vs emulator vs CI-only
- Must show test output (Passed/Failed/Skipped counts) - Gate: DO NOT create PR if any tests fail - Added to PR checklist: PROOF OF LOCAL TESTS SHOWN
- Build/test with default AND /p:IsPreview=true - PREVIEW defines: PREVIEW;ENCRYPTIONPREVIEW - Proof required for BOTH configurations - Gate: fail in ANY configuration blocks PR
- GitHub MCP Server BLOCKED by SAML for Azure org - gh CLI WORKS (browser OAuth completes SAML SSO) - web_fetch as fallback for reading content - Added quick reference commands
- winget install GitHub.cli - gh auth login --web (browser SAML flow) - Required token scopes: repo, read:org - Verification commands
- Fix for 'Resource protected by organization SAML enforcement' - Steps: logout → login → authorize Azure org - Manual SSO authorization via github.com/settings/tokens - Alternative: github.com/orgs/Azure/sso
- Test gh issue list, detect SAML error - Auto logout and re-login sequence - PowerShell script for copy-paste
- Do NOT wait for user confirmation - Mark ready as soon as all checks green - CI passing = code validated, ready for review
- Rebase on master before marking ready - dotnet-v3-ci is PRIMARY gate (must be SUCCESS) - Other checks are secondary/informational
- phase_2_code_review: Run code-review agent before creating PR - Must address: bugs, security, null refs, resource leaks - Can ignore: style, minor refactoring - Gate: No high-signal issues before PR publish
- ALL benchmarks in Microsoft.Azure.Cosmos.Samples/Tools/Benchmark - Do not add benchmarks elsewhere - Added commands for running benchmarks
- Start next issue investigation during CI wait time - State tracking per issue (branch, PR, CI status) - Session plan.md format for tracking active issues - Recommended max 2-3 parallel issues - Commands for switching between issues
- Copy-paste prompt to start workflow - Guides through setup and customer-reported issues - Alternative prompt for specific issue number
- Step 0: gh pr list --search 'fixes #XXXX' - Skip investigation if PR already exists - Consider reviewing existing PR instead
…pact (#5487) This benchmark validates the fix proposed in PR #5488 by demonstrating: - Expression.Compile() takes 101ms for 1000 iterations (emits IL, JITs code) - Expression.Compile(preferInterpretation: true) takes 4ms (25.2x faster) The performance difference proves that interpreted mode avoids DynamicMethod IL generation, which is the root cause of unbounded native memory growth in long-running services using the LINQ provider. Related: #5487, PR #5488
Adds Section 17 with reusable patterns: - Performance/benchmark PR requirements (measurements required) - Pre-existing test failure documentation approach - Check for existing PRs before starting work - CI gate not started troubleshooting - Test filter patterns from CI YAML - Benchmark test template pattern - Workflow summary for performance issues - Session learnings capture as final workflow step (17.8) Updated Quick Start to include step 9: Capture learnings
- Emulator tests now required (not optional) - Removed area-specific tests section (too issue-specific) - Updated proof format to show emulator test results
Code review finding: test was using hardcoded input that didn't match the dynamically generated filter patterns. Now uses matching input and asserts the result is correct.
Member
Author
|
Superseded by #5588 which includes the full fix + benchmark |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Test: Add benchmark for SubtreeEvaluator Expression.Compile memory impact
Description
Related to #5487, supports PR #5488
This PR adds a benchmark test that validates the fix proposed in PR #5488 for the LINQ provider memory leak issue.
Issue Summary
Root Cause Analysis
Code Path
SubtreeEvaluator.cs:36 - EvaluateConstant() └─> SubtreeEvaluator.cs:119-120 - lambda.Compile()Root Cause
SubtreeEvaluator.EvaluateConstant()callsExpression.Lambda().Compile()which:Benchmark Results
Environment
Results
Compile()(Before)Compile(preferInterpretation: true)(After)Why This Proves the Fix
Changes Made
Microsoft.Azure.Cosmos.Tests/Linq/SubtreeEvaluatorMemoryBenchmark.csTesting
Local Test Results
*Pre-existing failures, not related to this change
New Tests Added
SubtreeEvaluatorMemoryBenchmarkTests.CompareCompileStrategies_PerformanceImpactSubtreeEvaluatorMemoryBenchmarkTests.SimulateLongRunningService_WithInterpretationExternal References
Generated by GitHub Copilot CLI Agent