Skip to content

[AI] Sample: Detail page, semantic search, streaming, and UI polish#34576

Merged
mattleibow merged 19 commits intomainfrom
dev/ai-sample-improvements
Apr 2, 2026
Merged

[AI] Sample: Detail page, semantic search, streaming, and UI polish#34576
mattleibow merged 19 commits intomainfrom
dev/ai-sample-improvements

Conversation

@mattleibow
Copy link
Copy Markdown
Member

@mattleibow mattleibow commented Mar 19, 2026

Note

Are you waiting for the changes in this PR to be merged?
It would be very helpful if you could test the resulting artifacts from this PR and let us know in a comment if this change resolves your issue. Thank you!

Summary

Major improvements to the Essentials AI sample app: new Landmark Detail page with AI features, semantic search, streaming response handler improvements, and comprehensive UI polish across all pages for a modern, edge-to-edge experience on iOS and MacCatalyst.

New Features

  • Landmark Detail Page — New intermediate page between browsing and trip planning with AI-generated travel tips, similar destinations via semantic search, animated hashtag tags, language picker, full-bleed hero image with scrolling gradient overlay, and Plan Trip navigation
  • Semantic Search — Search bar on Landmarks page filters continent groups using semantic similarity with Timer-based debounce (300ms) and tracks recent searches for contextual AI descriptions
  • ISemanticSearchService abstraction — Clean interface backed by EmbeddingSearchService (Apple NL embeddings + cosine similarity + hybrid keyword boost + sentence chunking)
  • StreamingResponseHandler passthrough mode — Supports streaming without buffering, with new device tests (DeliversMultipleIncrementalUpdates, ConcatenatedText)
  • PromptBasedSchemaClient — Prompt-based JSON schema middleware for Phi Silica compatibility

UI Polish

  • Edge-to-edge layout — All pages use SafeAreaEdges="None" on root Grid with back buttons wrapped in SafeAreaEdges="Container"
  • Scrolling gradient pattern — Fixed gradient overlay + scrolling gradient that transitions to solid background
  • Custom search entry — Replaced SearchBar with borderless Entry in rounded Border with native border/focus ring removed on iOS/MacCatalyst
  • Edge-to-edge horizontal scrollers — Scroll to screen edges, align with page content padding at rest
  • Removed broken BoxView global style — Was breaking gradient overlays
  • Added missing Gray700 color — Prevented silent navigation crash
  • Background → Background property migration — Replaced BackgroundColor with Background across all pages

Code Quality

  • IDispatcher injection instead of MainThread.BeginInvokeOnMainThread
  • Only-once AI initialization — Guard against re-entry in LandmarkDetailViewModel
  • Null safety — Fixed CS8602 warnings in DataService search methods
  • Removed debug logging

Navigation Flow

LandmarksPage (browse + search) → LandmarkDetailPage (details + AI tips) → TripPlanningPage (itinerary generation)

Deleted Files

  • LandmarkDescriptionView, LandmarkTripView — Replaced by new pages
  • LanguagePreferenceService — Refactored into inline language array

New Test Coverage

  • StreamingResponseHandlerTests/Passthrough.cs — Unit tests for passthrough streaming mode
  • ChatClientStreamingTestsBase — Updated device tests for streaming scenarios

Copilot AI review requested due to automatic review settings March 19, 2026 17:59
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 19, 2026

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.sh | bash -s -- 34576

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.ps1) } 34576"

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR evolves the Essentials.AI.Sample app’s browsing → detail → planning flow by adding a new Landmark Detail page with AI-driven content, introducing a semantic-search abstraction + implementation, and updating UX/styling across the landmarks and itinerary experiences.

Changes:

  • Added LandmarkDetailPage + LandmarkDetailViewModel and updated navigation flow from Landmarks → Detail → Trip planning.
  • Introduced ISemanticSearchService and an in-memory EmbeddingSearchService, and refactored DataService to index/search via the abstraction.
  • Updated UI/UX styling and interaction behavior across several Views/Pages (search UI, backgrounds, transparency/input handling).

Reviewed changes

Copilot reviewed 31 out of 31 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/AI/samples/Essentials.AI.Sample/Views/Landmarks/LandmarkListItemView.xaml Makes list item background transparent and prevents child hit-testing so border tap works reliably.
src/AI/samples/Essentials.AI.Sample/Views/Landmarks/LandmarkHorizontalListView.xaml Adds padding to horizontal landmarks list layout.
src/AI/samples/Essentials.AI.Sample/Views/Landmarks/LandmarkFeaturedItemView.xaml Aligns featured item styling with list items (transparent bg, input transparency).
src/AI/samples/Essentials.AI.Sample/Views/Itinerary/LandmarkTripView.xaml.cs Deleted (trip view merged into TripPlanningPage).
src/AI/samples/Essentials.AI.Sample/Views/Itinerary/LandmarkTripView.xaml Deleted (trip view merged into TripPlanningPage).
src/AI/samples/Essentials.AI.Sample/Views/Itinerary/LandmarkDescriptionView.xaml.cs Deleted (replaced by LandmarkDetailPage).
src/AI/samples/Essentials.AI.Sample/Views/Itinerary/LandmarkDescriptionView.xaml Deleted (replaced by LandmarkDetailPage).
src/AI/samples/Essentials.AI.Sample/Views/Itinerary/ItineraryView.xaml Updates rationale card colors/background styling.
src/AI/samples/Essentials.AI.Sample/Views/Itinerary/DayView.xaml Switches Border to Background usage.
src/AI/samples/Essentials.AI.Sample/Views/Itinerary/ActivityListView.xaml Switches Border to Background usage.
src/AI/samples/Essentials.AI.Sample/Views/ChatOverlayView.xaml Switches multiple Borders/Buttons to Background usage.
src/AI/samples/Essentials.AI.Sample/ViewModels/TripPlanningViewModel.cs Removes initial state flow; auto-starts itinerary generation on load.
src/AI/samples/Essentials.AI.Sample/ViewModels/LandmarksViewModel.cs Adds debounced search query support + recent searches tracking + dispatcher usage.
src/AI/samples/Essentials.AI.Sample/ViewModels/LandmarkDetailViewModel.cs New VM powering the detail page: similar destinations, tags, AI travel tip, language preference.
src/AI/samples/Essentials.AI.Sample/Services/TaggingService.cs Switches tag generation from structured JSON response to comma-separated parsing.
src/AI/samples/Essentials.AI.Sample/Services/ISemanticSearchService.cs New semantic search abstraction + result record.
src/AI/samples/Essentials.AI.Sample/Services/EmbeddingSearchService.cs New in-memory embedding search implementation (chunking + cosine similarity + keyword boost).
src/AI/samples/Essentials.AI.Sample/Services/DataService.cs Refactors indexing/search to use ISemanticSearchService and readiness tasks.
src/AI/samples/Essentials.AI.Sample/Services/ChatService.cs Uses dispatcher for navigation callback; updates constructor dependencies.
src/AI/samples/Essentials.AI.Sample/Resources/Styles/Styles.xaml Removes global BoxView background style.
src/AI/samples/Essentials.AI.Sample/Resources/Styles/Colors.xaml Adds Gray700.
src/AI/samples/Essentials.AI.Sample/Pages/TripPlanningPage.xaml.cs Removes chat overlay wiring; triggers itinerary initialization on load.
src/AI/samples/Essentials.AI.Sample/Pages/TripPlanningPage.xaml Inlines trip planning UI and updates layout for full-bleed header + back button.
src/AI/samples/Essentials.AI.Sample/Pages/LandmarksPage.xaml.cs Adds iOS/MacCatalyst handler tweaks for search Entry; navigates to new detail page.
src/AI/samples/Essentials.AI.Sample/Pages/LandmarksPage.xaml Adds search UI; hides featured section during search; style adjustments.
src/AI/samples/Essentials.AI.Sample/Pages/LandmarkDetailPage.xaml.cs New page code-behind handling back, plan trip, and similar landmark navigation.
src/AI/samples/Essentials.AI.Sample/Pages/LandmarkDetailPage.xaml New detail page UI: hero image, tags, AI tip, similar destinations, language picker, plan trip CTA.
src/AI/samples/Essentials.AI.Sample/Models/PointOfInterest.cs Removes stored embeddings from model.
src/AI/samples/Essentials.AI.Sample/Models/Landmark.cs Removes stored embeddings from model.
src/AI/samples/Essentials.AI.Sample/MauiProgram.cs Registers new page/VM; registers ISemanticSearchService for iOS/MacCatalyst.
src/AI/samples/Essentials.AI.Sample/AppShell.xaml.cs Registers new LandmarkDetailPage route.

@mattleibow mattleibow changed the title [AI] Sample: Landmark detail page, semantic search, and UX improvements [AI] Sample: Detail page, semantic search, streaming, and UI polish Mar 19, 2026
@mattleibow mattleibow requested a review from Copilot March 19, 2026 23:14
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR significantly expands and refines the Essentials.AI.Sample app’s UX and navigation flow (Landmarks → new Detail page → Trip planning), while also enhancing streaming support in Essentials.AI with a passthrough mode and accompanying tests.

Changes:

  • Added a new LandmarkDetailPage + LandmarkDetailViewModel with AI travel tips, tags, weather, similar-destination search, and language selection.
  • Introduced semantic search abstractions (ISemanticSearchService) and an embedding-backed in-memory implementation, then refactored DataService and UI to use it.
  • Updated StreamingResponseHandler to support “no chunker” passthrough streaming, plus new unit/device tests around incremental streaming behavior.

Reviewed changes

Copilot reviewed 42 out of 44 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
src/AI/src/Essentials.AI/Platform/StreamingResponseHandler.cs Adds nullable chunker + passthrough streaming mode and adjusts flush/reset behavior accordingly.
src/AI/tests/Essentials.AI.UnitTests/Tests/StreamingResponseHandlerTests/Passthrough.cs New unit tests validating passthrough (no chunker) streaming semantics.
src/AI/tests/Essentials.AI.DeviceTests/Tests/ChatClientStreamingTestsBase.cs Adds device tests intended to validate incremental updates and streaming/non-streaming parity.
src/AI/samples/Essentials.AI.Sample/Pages/LandmarksPage.xaml Adds a custom search entry UI; hides featured section during search; edge-to-edge layout tweaks.
src/AI/samples/Essentials.AI.Sample/Pages/LandmarksPage.xaml.cs Adds platform handler tweaks to remove native borders/focus rings; routes taps to new detail page and passes recent searches.
src/AI/samples/Essentials.AI.Sample/ViewModels/LandmarksViewModel.cs Adds debounced search query pipeline + recent-search tracking; switches UI thread dispatching to injected IDispatcher.
src/AI/samples/Essentials.AI.Sample/Pages/LandmarkDetailPage.xaml New landmark detail UI with hero image, tags, AI tip panel, similar destinations, language picker, and bottom CTA bar.
src/AI/samples/Essentials.AI.Sample/Pages/LandmarkDetailPage.xaml.cs New page code-behind: init/cancel wiring, plan-trip navigation, similar-destination navigation.
src/AI/samples/Essentials.AI.Sample/ViewModels/LandmarkDetailViewModel.cs New VM: loads similar destinations, weather, tags, and AI travel tip concurrently with cancellation support.
src/AI/samples/Essentials.AI.Sample/Pages/TripPlanningPage.xaml Reworks trip planning to match new edge-to-edge/gradient pattern; removes embedded old trip view.
src/AI/samples/Essentials.AI.Sample/Pages/TripPlanningPage.xaml.cs Simplifies page (removes chat overlay hookup) and switches to fire-and-forget initialization.
src/AI/samples/Essentials.AI.Sample/ViewModels/TripPlanningViewModel.cs Refactors trip planning state machine; removes initial/tags/button flow; auto-starts itinerary generation; adds query language input.
src/AI/samples/Essentials.AI.Sample/Services/DataService.cs Refactors from “embedding generation” to ISemanticSearchService indexing + search; adds POI name dictionary.
src/AI/samples/Essentials.AI.Sample/Services/ISemanticSearchService.cs New interface to abstract semantic indexing/search and readiness signaling.
src/AI/samples/Essentials.AI.Sample/Services/EmbeddingSearchService.cs New in-memory embedding search using cosine similarity + hybrid keyword boost + sentence chunking.
src/AI/samples/Essentials.AI.Sample/MauiProgram.cs Registers new page/vm; removes language preference service; registers semantic search service for iOS/MacCatalyst Apple embeddings.
src/AI/samples/Essentials.AI.Sample/AppShell.xaml.cs Registers LandmarkDetailPage route for Shell navigation.
src/AI/samples/Essentials.AI.Sample/Services/ItineraryService.cs Removes legacy landmark/dayCount overload (language preference removed).
src/AI/samples/Essentials.AI.Sample/Services/ChatService.cs Removes “set language” tool; switches navigation dispatch to injected IDispatcher.
src/AI/samples/Essentials.AI.Sample/Services/TaggingService.cs Switches tag generation from structured JSON to comma-separated text parsing for broader compatibility.
src/AI/samples/Essentials.AI.Sample/Services/PromptBasedSchemaClient.cs New delegating client translating JSON schema requests into prompt instructions and stripping markdown fences.
src/AI/samples/Essentials.AI.Sample/Services/WeatherService.cs Ensures invariant-culture formatting for lat/long query parameters.
src/AI/samples/Essentials.AI.Sample/AI/WorkflowModels.cs Updates workflow DTOs to use shorter JSON property names (via JsonPropertyName).
src/AI/samples/Essentials.AI.Sample/AI/ItineraryWorkflowExtensions.cs Updates agent instructions to reflect new JSON field names; tweaks itinerary planner temperature.
src/AI/samples/Essentials.AI.Sample/Views/Landmarks/LandmarkListItemView.xaml UI polish for list item shadow/background + input transparency adjustments.
src/AI/samples/Essentials.AI.Sample/Views/Landmarks/LandmarkFeaturedItemView.xaml UI polish for featured card shadow/background + input transparency adjustments.
src/AI/samples/Essentials.AI.Sample/Views/Landmarks/LandmarkHorizontalListView.xaml Adds padding to avoid shadow clipping and align edge-to-edge scroller behavior.
src/AI/samples/Essentials.AI.Sample/Views/ChatOverlayView.xaml(.cs) Updates bubble styling and scroll behavior to better follow streaming updates.
src/AI/samples/Essentials.AI.Sample/Views/Itinerary/DayView.xaml Adds WinUI placeholder for map and uses Background instead of BackgroundColor.
src/AI/samples/Essentials.AI.Sample/Views/Itinerary/ActivityListView.xaml Migrates from BackgroundColor to Background for borders.
src/AI/samples/Essentials.AI.Sample/Views/Itinerary/ItineraryView.xaml Updates itinerary rationale card styling and theme colors.
src/AI/samples/Essentials.AI.Sample/Resources/Styles/Styles.xaml Removes global BoxView background style that interfered with gradients.
src/AI/samples/Essentials.AI.Sample/Resources/Styles/Colors.xaml Adds missing Gray700 color resource.
src/AI/samples/Essentials.AI.Sample/Models/Landmark.cs Removes per-landmark embedding storage.
src/AI/samples/Essentials.AI.Sample/Models/PointOfInterest.cs Removes per-POI embedding storage.
src/AI/samples/Essentials.AI.Sample/Resources/Raw/landmarkData.json Adds a new landmark entry (Bunaken Marine Park).
src/AI/samples/Essentials.AI.Sample/Resources/Raw/Images/Thumbnails/1023_thumb.jpg Adds thumbnail for the new landmark.
src/AI/samples/Essentials.AI.Sample/Services/LanguagePreferenceService.cs Deleted (language now selected in-detail and passed via navigation).
src/AI/samples/Essentials.AI.Sample/Views/Itinerary/LandmarkTripView.xaml(.cs) Deleted (replaced by updated TripPlanningPage).
src/AI/samples/Essentials.AI.Sample/Views/Itinerary/LandmarkDescriptionView.xaml(.cs) Deleted (moved into new detail/flow).

@MauiBot MauiBot added s/agent-changes-requested AI agent recommends changes - found a better alternative or issues s/agent-fix-win AI found a better alternative fix than the PR s/agent-reviewed PR was reviewed by AI agent workflow (full 4-phase review) labels Mar 23, 2026
@dotnet dotnet deleted a comment from MauiBot Mar 23, 2026
mattleibow and others added 8 commits March 26, 2026 04:10
- New LandmarkDetailPage with AI travel tips, similar destinations, tags
- Semantic search bar on LandmarksPage with debounced filtering
- ISemanticSearchService abstraction with EmbeddingSearchService and AppContentIndexerSearchService
- Refactored DataService to use ISemanticSearchService instead of IEmbeddingGenerator
- TaggingService switched to plaintext comma-separated parsing
- TravelPlannerExecutor uses prompt-based JSON instead of structured output
- TripPlanningPage auto-starts itinerary generation (no initial state)
- Merged LandmarkTripView into TripPlanningPage, deleted unused views
- Removed NonFunctionInvokingChatClient reference
- Language picker moved to LandmarkDetailPage

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix navigation crash: add missing Gray700 color to Colors.xaml
- Fix gradient overlays: remove broken BoxView global style, use transparent-white gradients
- Replace SearchBar with borderless Entry, remove native border/focus ring on MacCatalyst
- Edge-to-edge horizontal scrollers with aligned padding
- Scrolling gradient + solid background pattern on detail and trip planning pages
- Safe area: use SafeAreaEdges=None on root Grids, Container on back buttons
- Debounced search with Timer instead of per-keystroke async work
- Use IDispatcher instead of MainThread.BeginInvokeOnMainThread
- Only-once AI initialization on LandmarkDetailPage
- Migrate BackgroundColor to Background across pages and views
- Fix null dereference warnings in DataService

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… Bunaken landmark

- PromptBasedSchemaClient middleware for prompt-based structured output
- Fixed chat bubble alignment and scroll-to-end on Windows
- Removed LanguagePreferenceService — language passed via navigation
- Added Temperature for creative agents (0.8) and travel tips (0.9)
- Fixed WeatherService locale issue (InvariantCulture)
- Added weather display to LandmarkDetailPage
- Map placeholder on Windows (OnPlatform)
- TravelPlanResult uses short JsonPropertyName for SLM reliability
- StreamingResponseHandler passthrough mode + 7 unit tests
- Added Bunaken Marine Park (Manado, Indonesia) landmark
- Fixed agent instruction examples

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tenatedText)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mattleibow mattleibow force-pushed the dev/ai-sample-improvements branch from 4d54ba8 to b342d97 Compare March 26, 2026 02:14
PromptBasedSchemaClient reads DisplayName for prompt-based JSON schema
generation, which works across all platforms including Phi Silica.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

🧪 PR Test Evaluation

Overall Verdict: ✅ Tests are adequate

The 7 new passthrough unit tests directly cover all modified code paths in StreamingResponseHandler. The device test base class is a solid addition for future platform-specific streaming tests.

👍 / 👎 — Was this evaluation helpful? React to let us know!

📊 Expand Full Evaluation

PR Test Evaluation Report

PR: #34576 — Add passthrough mode to StreamingResponseHandler (Windows Phi Silica support)
Test files evaluated: 2 (Passthrough.cs, ChatClientStreamingTestsBase.cs)
Fix files: 1 core (StreamingResponseHandler.cs) + 38 sample app files


Overall Verdict

Tests are adequate

The unit tests directly exercise all three modified code paths in StreamingResponseHandler (passthrough in ProcessContent, tool-call guard, and complete guard). The device test base class adds a useful integration scaffold, though no concrete Phi Silica subclass is wired up yet.


1. Fix Coverage — ✅

The fix changes three code paths in StreamingResponseHandler.cs:

  1. ProcessContent_chunker is not null ? _chunker.Process(text) : text
  2. ProcessToolCall — flush/reset guarded by if (_chunker is not null)
  3. Complete — flush guarded by if (_chunker is not null)

All three are exercised by the passthrough unit tests:

  • ProcessContent_WithoutChunker_PassesDeltasDirectly → path 1
  • ToolCall_WithoutChunker_NoFlushNeeded → path 2 (verifies no flush/reset happens)
  • Complete_WithoutChunker_NoExtraFlush → path 3

These tests would fail if the fix were reverted (e.g., a NullReferenceException on _chunker.Process without the null check).


2. Edge Cases & Gaps — ✅

Covered:

  • Null text is ignored (not emitted)
  • Empty string is ignored (not emitted)
  • Duplicate content passes through as-is (verifies no delta computation in passthrough mode)
  • Tool calls work without pre-flush in passthrough mode
  • Error propagation via CompleteWithError

Minor gaps (low priority):

  • ProcessToolResult is not explicitly tested with the passthrough constructor — though this method doesn't use _chunker at all, so both modes behave identically
  • Cancellation tests (CancellationTests) use the chunker constructor only; a passthrough cancellation test would be symmetric but the behavior is identical since cancellation is handled by the channel, not the chunker
  • The device test base class ChatClientStreamingTestsBase adds integration tests for real AI model streaming, but there is no concrete PhiSilicaChatClientStreamingTests subclass in this PR — this leaves Phi Silica end-to-end streaming untested at the device level

3. Test Type Appropriateness — ✅

Unit tests (Passthrough.cs): ✅ Perfect fit. StreamingResponseHandler is pure logic using in-memory channels — no platform context, no UI, no I/O. Unit tests are the ideal choice.

Device tests (ChatClientStreamingTestsBase): ✅ Appropriate. The base class tests IChatClient.GetStreamingResponseAsync which requires a real AI model to produce actual responses. Device tests are correct here. The base class is already used by AppleIntelligenceChatClientStreamingTests and OpenAIChatClientStreamingTests.


4. Convention Compliance — ✅

  • Uses [Fact] attributes throughout ✅
  • Follows existing partial class StreamingResponseHandlerTests pattern ✅
  • Nested class naming (PassthroughTests) matches sibling files (ContentTests, ToolTests, etc.) ✅
  • Device test base class uses [Fact] and generic type constraint where T : class, IChatClient, new() — consistent with other base classes in the project ✅

No convention issues found.


5. Flakiness Risk — ✅ Low (unit) / ⚠️ Medium (device)

Unit tests: Very low risk. All operations are in-memory Channel(T) reads/writes with no timing dependencies.

Device tests: Medium inherent risk. Tests depend on real AI model responses and are subject to model latency, network conditions, and response variability. The test GetStreamingResponseAsync_DeliversMultipleIncrementalUpdates asserts textUpdates.Count > 2 — this could be flaky if a model returns a very short response in few chunks.


6. Duplicate Coverage — ✅ No duplicates

The passthrough tests (PassthroughTests) are complementary to the existing chunker-based tests (ContentTests, ToolTests). They test the same public API surface but via a different constructor code path. No redundancy.


7. Platform Scope — ✅

The fix is cross-platform (no #if directives). Unit tests run on all platforms. The device test base class is activated per-platform via subclasses. Coverage is appropriate given the scope of the fix.


8. Assertion Quality — ✅

Assertions are precise and specific:

  • Exact count: Assert.Equal(2, updates.Count)
  • Exact text: Assert.Equal("Hello", updates[0].Contents.OfType(TextContent)().Single().Text)
  • Type checking: Assert.IsType(FunctionCallContent)(updates[1].Contents[0])
  • Role checking: Assert.Equal(ChatRole.Assistant, ...)

No vague or meaningless assertions found.


9. Fix-Test Alignment — ✅

Tests target StreamingResponseHandler directly — exactly the class modified by the fix. The device test base class tests the public IChatClient streaming interface, which is the contract that StreamingResponseHandler ultimately supports.


Recommendations

  1. Low priority — Add a PhiSilicaChatClientStreamingTests subclass when the Windows Phi Silica IChatClient is merged, to give the new ChatClientStreamingTestsBase a concrete test run on the target platform.
  2. Low priority — Consider adding a passthrough cancellation test for symmetry with CancellationTests, even though the behavior is identical to the chunker path (channel handles it regardless).
  3. Not required — The 38 sample app changes don't need automated tests; samples are demo code.

Warning

⚠️ Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • dc.services.visualstudio.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "dc.services.visualstudio.com"

See Network Configuration for more information.

Note

🔒 Integrity filtering filtered 1 item

Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.

🧪 Test evaluation by Evaluate PR Tests

DisplayName does not work on properties for JSON schema generation —
it only applies at the class/type level. JsonPropertyName is needed
for PromptBasedSchemaClient to emit the correct short property names.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

🧪 PR Test Evaluation

Overall Verdict: ⚠️ Tests need improvement

The unit tests for StreamingResponseHandler passthrough mode are excellent and comprehensively cover the core fix. The two new device tests have assertion quality and flakiness concerns worth addressing.

👍 / 👎 — Was this evaluation helpful? React to let us know!

📊 Expand Full Evaluation

PR Test Evaluation Report

PR: #34576 — AI Sample improvements + StreamingResponseHandler passthrough mode
Test files evaluated: 2
Fix files: 40 (1 StreamingResponseHandler.cs + 39 sample app files)


Overall Verdict

⚠️ Tests need improvement

The unit tests are excellent for the core fix. The two new device tests have weak assertions and flakiness risk that should be addressed before merge.


1. Fix Coverage — ✅

The Passthrough.cs unit tests cover all the key code paths added in StreamingResponseHandler.cs:

  • ProcessContent passthrough path (_chunker is not null ? ... : text) — covered by PassesDeltasDirectly and DoesNotAccumulateOrDelta
  • Null guard in ProcessContent — covered by NullIgnored
  • string.IsNullOrEmpty guard in ProcessContent — covered by EmptyStringIgnored
  • Complete() without chunker (no extra flush) — covered by NoExtraFlush
  • ProcessToolCall without chunker (no flush, no reset) — covered by ToolCall_WithoutChunker_NoFlushNeeded
  • CompleteWithError passthrough — covered by Error_WithoutChunker_PropagatesException

The 39 sample app files have no tests, but this is appropriate — sample code does not typically require automated test coverage.


2. Edge Cases & Gaps — ⚠️

Covered:

  • Null content input ignored
  • Empty string content ignored
  • Duplicate delta values pass through (not deduplicated like chunker mode)
  • Tool calls don't flush in passthrough mode
  • Error propagation via CompleteWithError

Missing:

  • ProcessToolResult in passthrough mode — Not tested, though the method has no _chunker gate so behavior is identical in both modes; a brief note-test would still confirm no regression.
  • Device test assertion gapConcatenatedTextMatchesNonStreaming collects both streaming and non-streaming text but only asserts IsNotNullOrEmpty on each — it never compares them for equality (despite the method name implying it should). The critical assertion streamingText == nonStreamingText (or at least that both contain "Paris") is absent.

3. Test Type Appropriateness — ✅

Core fix (StreamingResponseHandler):
Current: Unit Tests + Device Tests
Recommendation: Correct. The passthrough logic is pure C# with no native dependencies — unit tests are the right choice. Device tests add end-to-end confidence on real AI hardware, which is appropriate for streaming behavior that depends on platform AI models.

Sample app changes (39 files):
No tests — appropriate for sample/demo code.


4. Convention Compliance — ✅

  • [Fact] attributes used correctly ✅
  • partial class StreamingResponseHandlerTests used properly with shared ReadAll() helper in StreamingResponseHandlerTests.cs
  • No issues flagged by automated script ✅

5. Flakiness Risk — ⚠️ Medium

Unit tests: Very low risk ✅ — deterministic, no I/O, no timing dependencies.

Device tests: Moderate risk ⚠️

Test Risk Reason
DeliversMultipleIncrementalUpdates Medium Asserts Count > 2 on a real AI response. AI models may return large chunks reducing update count on slow hardware or with short context windows.
ConcatenatedTextMatchesNonStreaming Medium Makes two separate AI calls and asserts both are non-empty. Non-deterministic AI responses across calls make stronger equality assertions risky, but the current assertions don't actually test the stated invariant.

6. Duplicate Coverage — ✅ No duplicates

No existing tests for passthrough mode existed before this PR. The new Passthrough.cs cleanly parallels the existing Content.cs test class (which tests the chunker-based path).


7. Platform Scope — ✅

  • Unit tests: Run on all platforms — correct, since StreamingResponseHandler is pure C#.
  • Device tests in ChatClientStreamingTestsBase(T): Run on platforms with a concrete IChatClient implementation (iOS/macOS for Apple Intelligence, Windows for Phi Silica). Cross-platform coverage matches the intent of the passthrough fix (Windows Phi Silica uses passthrough mode).

8. Assertion Quality — ⚠️

Unit tests: Strong ✅

Assert.Equal(2, updates.Count);
Assert.Equal("Hello", updates[0].Contents.OfType(TextContent)().Single().Text);

Specific, will catch regressions.

Device tests: Weak ⚠️

// ConcatenatedTextMatchesNonStreaming — the test name implies equality, but:
Assert.False(string.IsNullOrEmpty(streamingText.ToString()), "Streaming response should not be empty");
Assert.False(string.IsNullOrEmpty(nonStreamingText), "Non-streaming response should not be empty");

These assertions only check non-empty strings — they don't verify the streaming and non-streaming results agree on content. The stated invariant ("Both should contain the answer (Paris)") is never actually asserted.


9. Fix-Test Alignment — ✅

The new Passthrough.cs unit tests target exactly StreamingResponseHandler with the new parameterless constructor. The device tests test the higher-level streaming API (GetStreamingResponseAsync) which ultimately flows through the fixed code on each platform. Alignment is good.


Recommendations

  1. Fix ConcatenatedTextMatchesNonStreaming assertions — either assert that both responses contain the expected answer (e.g., Assert.Contains("Paris", streamingText.ToString(), StringComparison.OrdinalIgnoreCase)), or rename the test to accurately reflect what it tests. Currently the test name is misleading.

  2. Consider adding a tolerance to DeliversMultipleIncrementalUpdates — the Count > 2 threshold may be flaky on slower devices or with AI models that return large chunks. Consider lowering to Count > 1 or adding a note explaining the threshold.

  3. Optional: Add a brief ProcessToolResult smoke test in passthrough mode — the method is unchanged but a test would confirm no regression during future refactors.

Warning

⚠️ Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • dc.services.visualstudio.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "dc.services.visualstudio.com"

See Network Configuration for more information.

Note

🔒 Integrity filtering filtered 1 item

Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.

🧪 Test evaluation by Evaluate PR Tests

@github-actions
Copy link
Copy Markdown
Contributor

🧪 PR Test Evaluation

Overall Verdict: ⚠️ Tests need improvement

The StreamingResponseHandler passthrough mode is well-tested with 7 focused unit tests, but the 40+ sample app file changes (including the IsPositiveConverter bug fix) have no test coverage.

👍 / 👎 — Was this evaluation helpful? React to let us know!

📊 Expand Full Evaluation

PR Test Evaluation Report

PR: #34576 — AI Sample improvements + StreamingResponseHandler passthrough mode
Test files evaluated: 2 (Passthrough.cs unit tests, ChatClientStreamingTestsBase.cs device tests)
Fix files: 46 (1 non-sample platform file: StreamingResponseHandler.cs; 40+ sample app files; 3 binary/data assets)


Overall Verdict

⚠️ Tests need improvement

The core platform fix (StreamingResponseHandler passthrough mode) has excellent, focused unit test coverage. However, the IsPositiveConverter bug fix and the large sample app changes (40+ files) are completely untested, with no unit tests covering converter logic or ViewModel behavior.


1. Fix Coverage — ⚠️ Partial

StreamingResponseHandler (platform fix) — ✅ Well covered

The change makes _chunker optional, adds a no-arg constructor, and guards all _chunker usages with null checks. The Passthrough.cs test file directly exercises every branch of the new passthrough mode: ProcessContent, ProcessToolCall, Complete, CompleteWithError.

IsPositiveConverter (sample app bug fix) — ❌ Not covered

The commit message describes a real behavioral fix: IsNotNullConverter was incorrectly used for SimilarDestinations.Count (an int is never null, so it always returned true even when count was 0). The new IsPositiveConverter fixes this. No test was added for this converter logic.

Sample app changes (40 files) — ❌ Not covered

ViewModels, services, XAML bindings, and UI logic changes have no associated tests.


2. Edge Cases & Gaps — ⚠️ Mostly covered, one gap

Covered by Passthrough.cs:

  • null input ignored ✅
  • Empty string ignored ✅
  • Multiple identical tokens pass through (no delta computation) ✅
  • Tool call with preceding and following content ✅
  • Error propagation via CompleteWithError
  • Complete() emits no extra flush update ✅
  • Direct delta passthrough (no accumulation) ✅

Missing:

  • ProcessToolResult in passthrough mode — Tools.cs tests this for chunker mode only; no test for the passthrough path. (The code for ProcessToolResult is actually the same in both modes — it doesn't guard on _chunker — but a test would confirm this.)
  • IsPositiveConverter edge cases: 0, negative int, non-int input — none tested.

3. Test Type Appropriateness — ✅ Appropriate

Passthrough.cs — Unit test for StreamingResponseHandler: correct. Pure in-memory logic, no platform context needed. ⭐

ChatClientStreamingTestsBase.cs — Device test base class for IChatClient streaming: correct. Testing real AI model streaming behavior requires a platform context and actual hardware/SDK integration. ⭐⭐

IsPositiveConverter — if a test were added, a unit test would be sufficient (simple value converter logic, no platform needed). ⭐


4. Convention Compliance — ✅ Pass

  • [Fact] attributes used correctly (xUnit) ✅
  • Partial class pattern used properly for StreamingResponseHandlerTests
  • Device test uses abstract base class pattern with [Fact] — appropriate for integration ✅
  • No convention issues detected by automated script ✅
  • ChatClientStreamingTestsBase(T) is an abstract generic base; concrete subclasses will register the [Fact] tests — this is an acceptable pattern for shared test contracts ✅

5. Flakiness Risk — ⚠️ Medium (device tests only)

Unit tests — ✅ Low risk. Fully deterministic, no I/O, no timing dependencies.

Device tests⚠️ Medium risk. These call real AI model APIs (GetStreamingResponseAsync, GetResponseAsync) which have non-deterministic outputs. The test has been correctly loosened (from >= 2 chunks to >= 1) and the ConcatenatedTextMatchesNonStreaming test checks for "paris" case-insensitively. However:

  • AI model latency variability may cause timeouts on CI
  • The "paris" assertion in GetStreamingResponseAsync_ConcatenatedTextMatchesNonStreaming assumes a specific answer — while this is a simple factual question, model behavior can vary

6. Duplicate Coverage — ✅ No duplicates

Passthrough.cs tests a new code path (no-chunker constructor) not previously tested. Existing files (Content.cs, Tools.cs, Completion.cs, etc.) all test the chunker path only. No overlap.


7. Platform Scope — ✅ Appropriate

  • StreamingResponseHandler is cross-platform; unit tests run on all platforms ✅
  • Device tests are platform-specific by design (require hardware AI client) ✅
  • Sample app changes are cross-platform but are sample code — not typically unit tested ✅

8. Assertion Quality — ✅ Good

Unit tests: Assertions are precise — exact text values, exact counts, specific content types (TextContent, FunctionCallContent). Would catch regressions in passthrough behavior.

Device tests: Appropriately loose for non-deterministic AI responses — checks non-empty, >= 1 update, contains expected answer. The ConcatenatedTextMatchesNonStreaming test is the strongest — it validates that streaming and non-streaming produce consistent output.


9. Fix-Test Alignment — ⚠️ Partial

Passthrough.cs aligns tightly with the StreamingResponseHandler.cs diff — every new code path is exercised. ✅

IsPositiveConverter.cs changed to fix a behavioral bug (always-true for int binding) but has no corresponding test. ⚠️

The 40 sample app files are sample code changes (new features, UI polish, new landmark data). While coverage of sample app logic is not strictly required, the converter fix is a real behavioral bug that could silently regress. ⚠️


Recommendations

  1. Add a unit test for IsPositiveConverter — The bug fix (replacing IsNotNullConverter with IsPositiveConverter for SimilarDestinations.Count) is a real behavioral fix. A simple xUnit test with values 0, 1, 5, -1 would prevent silent regression and takes under 10 lines.

  2. Add ProcessToolResult passthrough test — Small gap: confirm ProcessToolResult works in passthrough mode (it does, since there's no _chunker guard, but a test makes the contract explicit).

  3. Consider loosening ConcatenatedTextMatchesNonStreaming further — The "paris" assertion is a reasonable heuristic, but if the model returns "Paris, France" (and the assertion is Contains("paris", OrdinalIgnoreCase)), this should work. Document why "paris" is chosen as the expected answer.

Warning

⚠️ Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • dc.services.visualstudio.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "dc.services.visualstudio.com"

See Network Configuration for more information.

Note

🔒 Integrity filtering filtered 1 item

Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.

🧪 Test evaluation by Evaluate PR Tests

…nd test assertions

- Replace IsNotNullConverter with new IsPositiveConverter for
  SimilarDestinations.Count bindings (int is never null, so
  IsNotNull always returned true even for 0 items)
- Loosen streaming test assertion from >2 chunks to >=1 (streaming
  contract does not guarantee minimum chunk count)
- Strengthen concatenated-text test to assert both responses contain
  'Paris' instead of only checking non-empty

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mattleibow mattleibow force-pushed the dev/ai-sample-improvements branch from 3ad33f1 to e61cdee Compare March 31, 2026 14:04
@github-actions
Copy link
Copy Markdown
Contributor

🧪 PR Test Evaluation

Overall Verdict: ✅ Tests are adequate

The StreamingResponseHandler passthrough mode fix is well-tested with 7 focused unit tests. The large sample app changes (40+ files) are intentionally untested, which is appropriate for sample/demo code.

👍 / 👎 — Was this evaluation helpful? React to let us know!

📊 Expand Full Evaluation

PR Test Evaluation Report

PR: #34576 — AI sample improvements + StreamingResponseHandler passthrough mode
Test files evaluated: 2 (Passthrough.cs, ChatClientStreamingTestsBase.cs)
Fix files: 41 (40 sample app files + StreamingResponseHandler.cs)


Overall Verdict

Tests are adequate

The core fix — adding a parameterless constructor to StreamingResponseHandler for Windows Phi Silica passthrough mode — is comprehensively covered by 7 well-written unit tests. The sample app changes are expected to be untested.


1. Fix Coverage — ✅

The Passthrough.cs unit tests directly exercise the new code paths in StreamingResponseHandler.cs:

  • The new parameterless constructor (StreamingResponseHandler())
  • The passthrough branch in ProcessContent: _chunker is not null ? _chunker.Process(text) : text
  • The null-guard branches in ProcessToolCall and Complete

All 7 tests would fail to compile if the parameterless constructor were removed, proving strong coupling to the fix.

The ChatClientStreamingTestsBase.cs device test base class validates the higher-level end-to-end streaming contract for any IChatClient implementation.

The 40 sample app files have no automated test coverage, but sample/demo code is not expected to have unit tests.


2. Edge Cases & Gaps — ✅

Covered:

  • Null content ignored (ProcessContent_WithoutChunker_NullIgnored)
  • Empty string ignored (ProcessContent_WithoutChunker_EmptyStringIgnored)
  • Duplicate content passes through without delta computation
  • Tool calls in passthrough mode (no flush needed)
  • Error propagation via CompleteWithError
  • Complete does not emit extra flush

Potentially missing (minor):

  • Cancellation token behavior in passthrough mode — existing Cancellation.cs covers chunker mode, but not explicitly passthrough. Low priority since the channel behavior is the same either way.
  • ProcessToolResult in passthrough mode — but this method has no chunker dependency, so it behaves identically in both modes. Not a real gap.

3. Test Type Appropriateness — ✅

Unit Tests (Passthrough.cs): ⭐ Ideal choice. StreamingResponseHandler is pure logic with no platform dependencies. Unit tests are exactly right here.

Device Tests (ChatClientStreamingTestsBase.cs): ⭐⭐ Appropriate. This abstract base class tests IChatClient streaming behavior which requires actual platform-native AI client implementations (Apple Intelligence, Phi Silica, OpenAI). A device test is the correct level.

No suggestions to lighten — test types match the code being tested.


4. Convention Compliance — ✅

Script detected 0 convention issues.

  • [Fact] attributes used correctly (xUnit) ✅
  • Partial class structure consistent with existing StreamingResponseHandlerTests pattern ✅
  • ChatClientStreamingTestsBase(T) follows the existing *TestsBase(T) generic base class convention ✅

5. Flakiness Risk — ⚠️ Medium (device tests only)

Unit tests: ✅ Very low risk. Pure in-memory channel operations, no timing dependencies.

Device tests (concern):

  • GetStreamingResponseAsync_ConcatenatedTextMatchesNonStreaming calls the AI model twice (streaming + non-streaming) and uses Assert.Contains("paris", ...). AI responses are generally consistent for simple factual queries, but there's inherent non-determinism.
  • Tests like GetStreamingResponseAsync_DeliversMultipleIncrementalUpdates only assert Count >= 1, which reduces flakiness risk at the cost of assertion strength.
  • These device tests require a live AI model service — they'll be skipped or fail in CI without the appropriate hardware (Apple Intelligence on iOS/macOS, Phi Silica on Windows).

6. Duplicate Coverage — ✅ No duplicates

The new Passthrough.cs tests are distinct from existing StreamingResponseHandlerTests files (Content.cs, Tools.cs, Cancellation.cs, etc.) which all test the chunker mode. The passthrough mode is a new code path.

ChatClientStreamingTestsBase is a new base class and doesn't duplicate any existing base classes.


7. Platform Scope — ✅

  • The StreamingResponseHandler fix is cross-platform (the passthrough mode enables Windows Phi Silica)
  • The unit tests have no platform guards and run on all platforms ✅
  • The device test base class is platform-agnostic; concrete subclasses provide platform-specific clients ✅
  • The sample app changes are cross-platform

8. Assertion Quality — ⚠️ (device tests)

Unit tests: ✅ Specific and strong

Assert.Equal(2, updates.Count);
Assert.Equal("Hello", updates[0].Contents.OfType(TextContent)().Single().Text);
Assert.IsType(FunctionCallContent)(updates[1].Contents[0]);
await Assert.ThrowsAsync(InvalidOperationException)(...)

Device tests: ⚠️ Some assertions are vague

Assert.True(receivedUpdate, "Should receive at least one streaming update");  // binary
Assert.True(updates.Count > 0, "Should receive streaming updates");           // weak
Assert.NotNull(update);                                                         // minimal

These are acceptable trade-offs for AI model tests where output is non-deterministic, but GetStreamingResponseAsync_UpdatesHaveContents could be strengthened to also check that at least one content is a TextContent.


9. Fix-Test Alignment — ✅

Fix File Tests It
StreamingResponseHandler.cs (passthrough constructor + null guards) Passthrough.cs (7 unit tests) ✅
Sample app files (40 files) None — expected for demo/sample code ✅
Platform ChatClient uses of passthrough mode ChatClientStreamingTestsBase.cs (end-to-end) ✅

Recommendations

  1. Low priority: In ChatClientStreamingTestsBase.GetStreamingResponseAsync_UpdatesHaveContents, consider asserting that at least one content item is a TextContent rather than just checking Contents.Count > 0. This would better verify the streaming contract.

  2. Consider: Add a test for passthrough + cancellation token (ReadAllAsync cancellation). The existing Cancellation.cs covers only the chunker path, and while the channel behavior should be identical, explicit coverage would be more thorough.

  3. No action needed: The 40 sample app file changes without tests are appropriate — sample/demo code is expected to be untested.

Warning

⚠️ Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • dc.services.visualstudio.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "dc.services.visualstudio.com"

See Network Configuration for more information.

Note

🔒 Integrity filtering filtered 1 item

Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.

🧪 Test evaluation by Evaluate PR Tests

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 44 out of 46 changed files in this pull request and generated 9 comments.

Comment on lines +130 to +143
var prompt = $"""
{searchContext}Write a brief, engaging 2-3 sentence travel tip for someone visiting {Landmark.Name} ({Landmark.Continent}).
Mention the best time to visit and one must-do activity. Be warm and inviting.
""";

var messages = new List<ChatMessage>
{
new(ChatRole.System, "You are a friendly travel guide who gives concise, helpful tips."),
new(ChatRole.User, prompt)
};

var response = await chatClient.GetResponseAsync(messages, new ChatOptions { Temperature = 0.75f }, ct);
if (!ct.IsCancellationRequested)
AiTravelTip = response.Text;
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SelectedLanguage is exposed (and the UI lets the user pick it), but GenerateAiTravelTipAsync doesn't use SelectedLanguage in the prompt, so the travel tip will always be generated in the model's default language. Incorporate SelectedLanguage into the prompt/system message (e.g., “Respond in {SelectedLanguage}”) and consider regenerating the tip when the language changes.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already fixed in the earlier commit (3ad33f1) — \SelectedLanguage\ is now used in the prompt.

Comment on lines +45 to +66
partial void OnSearchQueryChanged(string? value)
{
languagePreference.SelectedLanguage = value;
OnPropertyChanged(nameof(IsSearching));
_debounceTimer?.Dispose();
_debounceTimer = new Timer(_ => dispatcher.Dispatch(() => _ = FilterLandmarksAsync(value)), null, 300, Timeout.Infinite);
}

public async Task InitializeAsync()
{
if (IsLoading || ContinentGroups.Count > 0)
if (IsLoading || _allGroups.Count > 0)
return;

SelectedLanguage = languagePreference.SelectedLanguage;
await LoadLandmarksAsync();
await WaitForEmbeddingsAsync();
}

async Task FilterLandmarksAsync(string? query)
{
_searchCts?.Cancel();
_searchCts = new CancellationTokenSource();
var ct = _searchCts.Token;

Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OnSearchQueryChanged creates a new Timer on every keystroke. If the user navigates away while a timer is pending, the Timer can keep the view model rooted (and the callback can still run) because the timer/CTS are never disposed when the page is left. Consider adding a Dispose/Cancel method that disposes _debounceTimer and _searchCts, and call it from the page's NavigatingFrom/Unloaded handler.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in aa6fa0f — added \CancelPendingSearch()\ to \LandmarksViewModel\ that disposes the timer and cancels the CTS. Called from \LandmarksPage.OnDisappearing().

Comment on lines +37 to +44
return responseText
.Split([',', '\n', '\r'], StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
.SelectMany(tag => tag.Split(' ', StringSplitOptions.RemoveEmptyEntries))
.Select(tag => tag.TrimStart('#', '-', '•').Trim())
.Where(tag => tag.Length > 1 && !tag.Contains(' ', StringComparison.Ordinal))
.Distinct()
.Take(5)
.ToList();
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GenerateTagsAsync splits each tag on spaces (SelectMany(tag => tag.Split(' ', ...))). If the model returns a single hashtag phrase with spaces (or a leading bullet), this will produce multiple unrelated words that then pass the filters and show up as separate tags. Prefer treating each comma/newline-separated item as one tag; if it contains spaces after trimming, drop/normalize the whole tag instead of splitting it into words.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in aa6fa0f — removed the \SelectMany\ space-split. Each comma/newline-separated item is now treated as one tag; entries containing whitespace are dropped instead of being fragmented into words.

Comment on lines 12 to 22
Padding="0"
StrokeShape="RoundRectangle 16"
StrokeThickness="0"
BackgroundColor="Transparent"
Shadow="{Shadow Brush=Black, Offset='0,2', Radius=8, Opacity=0.3}">
<Border.GestureRecognizers>
<TapGestureRecognizer Tapped="OnLandmarkTapped" />
</Border.GestureRecognizers>

<Grid>
<Grid InputTransparent="True">
<Image
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This introduces Border.BackgroundColor again, but the PR description calls out migrating from BackgroundColor to Background across the UI. Border supports Background (Brush), so consider using Background="Transparent" here for consistency with the rest of the PR and to avoid mixing the two patterns.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in aa6fa0f — changed to \Background=\Transparent\ for consistency with the rest of the migration.

Comment on lines 10 to 20
Padding="0"
StrokeShape="RoundRectangle 16"
StrokeThickness="0"
BackgroundColor="Transparent"
Shadow="{Shadow Brush=Black, Offset='0,2', Radius=8, Opacity=0.3}">
<Border.GestureRecognizers>
<TapGestureRecognizer Tapped="OnFeaturedLandmarkTapped" />
</Border.GestureRecognizers>

<Grid HeightRequest="250">
<Grid HeightRequest="250" InputTransparent="True">
<Image
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the list item view: this reintroduces BackgroundColor on Border, while most of the PR migrates to the Background property. Consider using Background="Transparent" for consistency (and to keep the stated migration complete).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in the same commit.

Comment on lines 6 to +18
/// <summary>
/// Result from the Travel Planner Agent - raw extraction of user intent.
/// These values should be extracted exactly as the user stated them, with no interpretation or expansion.
/// Short JSON names (dest/days/lang) prevent misspelling by small language models.
/// </summary>
public record TravelPlanResult(
[property: DisplayName("destinationName")]
[property: Description("The exact place/location name as written in the user's request. Extract the raw text only - do NOT interpret, expand, or look up actual landmarks. Example: 'Maui' not 'Maui, Hawaii' or 'Haleakala National Park'.")]
[property: JsonPropertyName("place")]
[property: Description("The destination name mentioned by the user.")]
string DestinationName,
[property: DisplayName("dayCount")]
[property: Description("The exact number of days mentioned by the user. Use 3 as default only if no number is specified.")]
[property: JsonPropertyName("days")]
[property: Description("Number of days for the trip. Default is 3.")]
int DayCount,
[property: DisplayName("language")]
[property: Description("The exact output language mentioned by the user. Use 'English' as default only if no language is specified.")]
[property: JsonPropertyName("language")]
[property: Description("Output language for the itinerary. Default is English.")]
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says “Short JSON names (dest/days/lang) prevent misspelling…”, but TravelPlanResult uses JsonPropertyName("place"), "days", and "language". Consider updating the remark to match the actual short names (place/days/language) to avoid confusion when editing prompts/schema.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in aa6fa0f — updated comment to say \place/days/language\ to match the actual \JsonPropertyName\ values.

mattleibow and others added 2 commits March 31, 2026 17:15
…nd test assertions

- Replace IsNotNullConverter with new IsPositiveConverter for
  SimilarDestinations.Count bindings (int is never null, so
  IsNotNull always returned true even for 0 items)
- Incorporate SelectedLanguage into AI travel tip prompt so the
  language picker on the detail page actually affects output
- Loosen streaming test assertion from >2 chunks to >=1 (streaming
  contract does not guarantee minimum chunk count)
- Strengthen concatenated-text test to assert both responses contain
  'Paris' instead of only checking non-empty

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move the ISemanticSearchService registration from the iOS/MacCatalyst
#if block into the common services section so it resolves on all
platforms (Apple NL embeddings or OpenAI). This lets DataService take
a required dependency instead of a nullable optional, removing the
null-forgiving operators and the null guard in IndexContentAsync.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

🧪 PR Test Evaluation

Overall Verdict: ✅ Tests are adequate

The new StreamingResponseHandler passthrough mode is well-covered by 7 focused unit tests. A few minor gaps and a potential flakiness risk in device tests are worth noting but don't materially weaken coverage.

👍 / 👎 — Was this evaluation helpful? React to let us know!

📊 Expand Full Evaluation

PR Test Evaluation Report

PR: #34576 — AI Sample improvements + StreamingResponseHandler passthrough mode
Test files evaluated: 2 (Passthrough.cs, ChatClientStreamingTestsBase.cs)
Fix files: 41 (1 library file + 40 sample app files)


Overall Verdict

✅ Tests are adequate

The core library change — adding a parameterless constructor and null-guard passthrough mode to StreamingResponseHandler — is thoroughly covered by 7 well-written unit tests. The sample app changes are demo/showcase code and don't require automated tests.


1. Fix Coverage — ✅

The only library change is src/AI/src/Essentials.AI/Platform/StreamingResponseHandler.cs, which adds:

  • A new parameterless constructor (passthrough mode, no chunker)
  • Null-guards in ProcessContent, ProcessToolCall, and Complete for the chunker-less path

Passthrough.cs directly targets all of these code paths:

  • ProcessContent with deltas → both pass through (not accumulated)
  • ProcessContent with null/empty → ignored
  • ProcessToolCall without chunker → no flush
  • CompleteWithError → exception propagates
  • Complete without chunker → no extra flush emitted

The 40 sample app files are UI/demo code; they don't need automated test coverage.


2. Edge Cases & Gaps — ⚠️

Covered:

  • Null content → ignored
  • Empty string → ignored
  • Duplicate/identical delta values → both pass through (contrast with chunker mode, which would emit "" for the second)
  • Tool call interleaved with content (no flush in passthrough mode)
  • Error propagation via CompleteWithError
  • Empty handler completion (no content → no extra update)

Missing (minor):

  • ProcessToolResult is not tested in passthrough mode. The method is straightforward (no chunker interaction), but given the other passthrough tests, a corresponding test for ProcessToolResult would round out coverage.
  • Complete() on a completely empty handler (zero calls to ProcessContent) is not explicitly tested for passthrough mode, though the Complete_WithoutChunker_NoExtraFlush test partially validates this path.

3. Test Type Appropriateness — ✅

Current: Unit Tests (xUnit) + Device Tests (xUnit, platform AI client)

Passthrough.cs (Unit Tests): ⭐ Ideal choice. StreamingResponseHandler is a pure in-memory channel/chunker orchestrator with no platform or UI dependency. Unit tests are exactly the right type here.

ChatClientStreamingTestsBase.cs (Device Tests): ⭐⭐ Appropriate. These tests exercise real AI inference on-device (Phi Silica on Windows, Apple Intelligence on iOS/Mac), which genuinely requires a platform context and cannot be faked in a unit test.


4. Convention Compliance — ✅

  • [Fact] attribute used correctly on all test methods ✅
  • Nested class grouping pattern (PassthroughTests, CompletionTests, etc.) is consistent with existing test files ✅
  • No convention issues reported by automated checks ✅
  • Device test base class follows existing generic (T) pattern ✅

5. Flakiness Risk — ⚠️ Medium (device tests only)

Unit tests: ✅ Very low — deterministic channel operations, no timing dependencies.

Device tests: ⚠️ Medium — Two risks:

  1. GetStreamingResponseAsync_ConcatenatedTextMatchesNonStreaming calls a real AI model and asserts Assert.Contains("paris", ...). While "What is the capital of France?" is a reliable factual prompt, AI responses are non-deterministic by nature. The model could respond "It is Paris" or include other text. This test is inherently at risk on platforms where the AI model behaves differently.

  2. GetStreamingResponseAsync_DeliversMultipleIncrementalUpdates expects textUpdates.Count >= 1 — the minimum is already just 1, which is very lenient and unlikely to fail, but the comment says "we generally expect multiple chunks" without asserting it. This weakens the intent of the test name.


6. Duplicate Coverage — ✅ No duplicates

No similar existing tests were found by automated analysis. The passthrough unit tests are new (the existing StreamingResponseHandlerTests files all use new StreamingResponseHandler(new PlainTextStreamChunker())).


7. Platform Scope — ✅

  • StreamingResponseHandler is internal cross-platform code. Unit tests run on all platforms ✅
  • Device tests exercise platform-specific AI clients (e.g., Phi Silica on Windows, Apple Intelligence on iOS/Mac) — platform-scoped by design ✅
  • The 40 sample app files have no platform guards affecting test coverage

8. Assertion Quality — ✅

Unit tests: Excellent. All assertions check exact text values:

Assert.Equal("Hello", updates[0].Contents.OfType(TextContent)().Single().Text);
Assert.Equal(" world", updates[1].Contents.OfType(TextContent)().Single().Text);

Device tests: ⚠️ Variable. GetStreamingResponseAsync_DeliversMultipleIncrementalUpdates only asserts Count >= 1 and non-whitespace concatenated text — acceptable for AI output, but doesn't validate the incremental nature (multiple updates) which the test name implies.


9. Fix-Test Alignment — ✅

The library fix is in StreamingResponseHandler (passthrough constructor + null-guards). All 7 Passthrough.cs tests directly instantiate new StreamingResponseHandler() (no chunker), exercising every changed code path:

Changed Code Path Covered By
Parameterless constructor All passthrough tests
ProcessContent: _chunker is not null ? ... : text ProcessContent_WithoutChunker_PassesDeltasDirectly, DoesNotAccumulateOrDelta
ProcessContent: null guard ProcessContent_WithoutChunker_NullIgnored
ProcessContent: empty string guard ProcessContent_WithoutChunker_EmptyStringIgnored
ProcessToolCall: skip flush when no chunker ToolCall_WithoutChunker_NoFlushNeeded
Complete: skip flush when no chunker Complete_WithoutChunker_NoExtraFlush
CompleteWithError Error_WithoutChunker_PropagatesException

Recommendations

  1. Add ProcessToolResult passthrough test — the method is simple, but the suite tests every other passthrough path; a quick [Fact] for ProcessToolResult without chunker would complete the coverage picture.

  2. Consider weakening or annotating the AI-response assertion in GetStreamingResponseAsync_ConcatenatedTextMatchesNonStreaming — asserting Contains("paris", ...) on a live AI model could be flaky. If flakiness occurs, consider using a prompt with a more constrained expected output, or add a [Trait("Flaky", "true")] annotation.

  3. Rename or strengthen GetStreamingResponseAsync_DeliversMultipleIncrementalUpdates — the test asserts only Count >= 1 but the name implies multiple incremental updates. Either assert Count > 1 or rename to reflect the weaker assertion.

Warning

⚠️ Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • dc.services.visualstudio.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "dc.services.visualstudio.com"

See Network Configuration for more information.

Note

🔒 Integrity filtering filtered 1 item

Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.

🧪 Test evaluation by Evaluate PR Tests

mattleibow and others added 2 commits March 31, 2026 17:42
- Add CancelPendingSearch to LandmarksViewModel, called from
  OnDisappearing to dispose the debounce timer and cancel any
  in-flight search when navigating away
- Fix TaggingService tag parsing: drop the SelectMany space-split
  that fragmented multi-word hashtags; instead treat each
  comma/newline-separated item as a single tag and reject entries
  with whitespace
- Use Background instead of BackgroundColor on LandmarkListItemView
  and LandmarkFeaturedItemView for consistency with the rest of the
  BackgroundColor-to-Background migration
- Fix WorkflowModels comment to match actual JSON property names
  (place/days/language not dest/days/lang)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…e tip, and test assertions"

This reverts commit 1bd8424.
@github-actions
Copy link
Copy Markdown
Contributor

🧪 PR Test Evaluation

Overall Verdict: ✅ Tests are adequate

The unit tests for StreamingResponseHandler passthrough mode are well-written and thoroughly cover the new code paths. The large majority of changed files are sample app improvements that don't require automated tests.

👍 / 👎 — Was this evaluation helpful? React to let us know!

📊 Expand Full Evaluation

PR Test Evaluation Report

PR: #34576 — AI Sample Improvements
Test files evaluated: 2
Fix files: 41 (40 sample app, 1 library: StreamingResponseHandler.cs)


Overall Verdict

Tests are adequate

The one non-sample library change — adding passthrough mode to StreamingResponseHandler — is thoroughly covered by 7 focused unit tests. The sample app changes don't require automated tests.


1. Fix Coverage — ✅

The key library fix makes StreamChunkerBase optional in StreamingResponseHandler, adding:

  • A parameterless constructor (passthrough mode)
  • Null-guards in ProcessContent, ProcessToolCall, and Complete

The Passthrough.cs unit test directly exercises all these new code paths with 7 facts covering the full lifecycle. The tests would fail if the fix were reverted — e.g., ProcessContent_WithoutChunker_PassesDeltasDirectly expects the parameterless constructor to work, and ToolCall_WithoutChunker_NoFlushNeeded verifies ProcessToolCall skips the chunker flush.

The 40 sample app changes (ViewModels, Pages, Services, XAML) are UI/sample code that is not typically covered by automated tests, which is appropriate.


2. Edge Cases & Gaps — ✅

Covered:

  • Null input to ProcessContent → ignored (not passed downstream)
  • Empty string input → ignored
  • Duplicate content (same string twice) → both pass through (no delta accumulation)
  • Tool calls in passthrough mode → no spurious chunker flush
  • Error propagation via CompleteWithError → exception propagates through ReadAllAsync
  • Multiple tokens, no extra flush on Complete

Minor gaps (non-critical):

  • ProcessToolResult is not explicitly tested in passthrough mode, but the method has no chunker dependency, so behavior is identical in both modes
  • Cancellation token behavior of ReadAllAsync not tested (but this is existing behavior, not changed by the fix)

3. Test Type Appropriateness — ✅

Test Current Type Recommendation
Passthrough.cs Unit test ✅ Correct — StreamingResponseHandler is pure C# with no platform dependencies
ChatClientStreamingTestsBase.cs Device test ✅ Correct — requires actual AI inference with a real chat client

The device test is an abstract base class inherited by AppleIntelligenceChatClientStreamingTests and OpenAIChatClientStreamingTests, so it will run on real device tests for those platforms.


4. Convention Compliance — ✅

  • xUnit [Fact] used correctly in unit tests
  • No naming, attribute, or anti-pattern issues detected by automated checks

5. Flakiness Risk — ⚠️ Medium

Unit tests: Low risk — pure in-memory Channel operations, deterministic.

Device tests: Medium risk — ConcatenatedTextMatchesNonStreaming and DeliversMultipleIncrementalUpdates rely on real AI model responses. ConcatenatedTextMatchesNonStreaming asserts Contains("paris", ...) which is reasonable but assumes model behavior. If an AI model is unavailable or returns an unexpected response, these tests could fail transiently in CI.

Consider adding a [Category] attribute to ChatClientStreamingTestsBase methods (or their concrete subclasses) to allow selective skipping when the AI backend is unavailable.


6. Duplicate Coverage — ✅ No duplicates

No similar existing tests found for the passthrough mode functionality.


7. Platform Scope — ✅

The passthrough mode is specifically for Windows Phi Silica, but StreamingResponseHandler is cross-platform C# logic. The unit tests run on all platforms, giving full coverage of the passthrough logic regardless of platform. A Windows-specific device test (exercising Phi Silica end-to-end) would be ideal long-term, but the unit tests are sufficient for the current fix.


8. Assertion Quality — ✅

Unit test assertions are specific and well-chosen:

  • Exact update counts (Assert.Equal(2, updates.Count))
  • Exact text values (Assert.Equal("Hello", updates[0].Contents...Text))
  • Type assertions for tool calls (Assert.IsType(FunctionCallContent)(...))
  • Exception type checking (Assert.ThrowsAsync(InvalidOperationException))

Device test assertions are appropriately loose given real AI model variability (e.g., Assert.True(updates.Count > 0) rather than an exact count).


9. Fix-Test Alignment — ✅

The Passthrough.cs unit tests map directly to the modified code:

Code change Test coverage
New StreamingResponseHandler() constructor All 7 passthrough tests
ProcessContent: _chunker is not null ? ... : text PassesDeltasDirectly, DoesNotAccumulateOrDelta, NullIgnored, EmptyStringIgnored
ProcessToolCall: null-guard for _chunker ToolCall_WithoutChunker_NoFlushNeeded
Complete: null-guard for _chunker Complete_WithoutChunker_NoExtraFlush
CompleteWithError: (unchanged) Error_WithoutChunker_PropagatesException

Recommendations

  1. Consider adding [Category] to concrete streaming device test classesAppleIntelligenceChatClientStreamingTests and OpenAIChatClientStreamingTests don't appear to have [Category] attributes (unlike sibling test classes), which could affect CI filtering.
  2. Long-term: A device test for Windows Phi Silica streaming (passthrough path end-to-end) would validate the full integration, but this is non-blocking given the thorough unit test coverage.

Warning

⚠️ Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • dc.services.visualstudio.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "dc.services.visualstudio.com"

See Network Configuration for more information.

Note

🔒 Integrity filtering filtered 1 item

Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.

🧪 Test evaluation by Evaluate PR Tests

@mattleibow
Copy link
Copy Markdown
Member Author

mattleibow commented Apr 1, 2026

/azp run maui-pr-devicetests

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

iOS does not fall back to Apple Color Emoji when a custom FontFamily
(OpenSansRegular) is set via global styles, causing all emoji to render
as [?] replacement characters.

Replace all emoji usage with FluentSystemIcons-Regular font glyphs:
- Chat FAB, header, delete, close, send buttons
- Back button FABs on detail and trip planning pages
- Weather icons (split into separate icon + text labels)
- Sparkle/AI indicators
- Globe and map icons on language picker and plan trip button
- Activity bullet points replaced with location pin icons

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

The MauiFont glob (Resources/Fonts/*) was consuming FluentUI.cs as a
font asset instead of a Compile item, causing CS0103 errors on Windows
and other CI platforms.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 1, 2026

🧪 PR Test Evaluation

Overall Verdict: ⚠️ Tests need improvement

The new passthrough mode in StreamingResponseHandler is well-covered by unit tests. However, device tests carry flakiness risk from live AI model dependencies, ProcessToolResult in passthrough mode has no dedicated test, and 44 sample app files have no automated tests (though that's typical for sample code).

👍 / 👎 — Was this evaluation helpful? React to let us know!

📊 Expand Full Evaluation

PR Test Evaluation Report

PR: #34576 — AI Sample improvements (streaming handler passthrough mode + sample app polish)
Test files evaluated: 2
Fix files: 45 (44 sample app, 1 production code — StreamingResponseHandler.cs)


Overall Verdict

⚠️ Tests need improvement

The new passthrough mode added to StreamingResponseHandler is well-covered by 7 focused unit tests. However, device tests are tied to live AI models (flakiness risk), ProcessToolResult has no passthrough-specific test, and the 44 sample app changes have no automated coverage (expected for sample code).


1. Fix Coverage — ✅

The one substantive production code change — adding passthrough mode to StreamingResponseHandler (no-chunker constructor) — is directly exercised by Passthrough.cs. Each newly reachable code path is covered:

  • ProcessContent with no chunker → direct delta emission ✅
  • Null and empty string filtering ✅
  • Complete() without extra flush ✅
  • ProcessToolCall without prior flush ✅
  • CompleteWithError propagation ✅

The device test base class (ChatClientStreamingTestsBase) adds two new tests (DeliversMultipleIncrementalUpdates, ConcatenatedTextMatchesNonStreaming) that test the streaming contract at the consumer level. These are inherited by AppleIntelligenceChatClientStreamingTests and OpenAIChatClientStreamingTests, which means they do run. ✅

The 44 sample app changes have no automated tests — this is normal for demo/sample code.


2. Edge Cases & Gaps — ⚠️

Covered:

  • Null content → ignored (no update emitted)
  • Empty string → ignored
  • Duplicate identical values pass through (no deduplication, unlike chunker mode)
  • Tool call without pending content to flush
  • Error propagation via CompleteWithError

Missing:

  • ProcessToolResult in passthrough mode: The only tests for ProcessToolResult (ToolResultEdgeCases.cs) use a chunker. Since ProcessToolResult doesn't reference _chunker, behavior is identical regardless of mode — but a passthrough-mode test would make the coverage explicit and catch any future regressions if _chunker branching is added.
  • Cancellation during passthrough streaming: Cancellation.cs likely covers this for the chunker mode; it's unclear if it also covers passthrough.
  • ConcatenatedTextMatchesNonStreaming on edge cases: Only tests a single-word answer ("Paris"). Doesn't test longer or multi-sentence responses where streaming behavior could differ.

3. Test Type Appropriateness — ✅

Test Type Verdict
Passthrough.cs Unit ✅ Correct — pure in-memory logic, no platform dependency
ChatClientStreamingTestsBase.cs Device ✅ Correct — tests live AI model responses, requires real platform runtime

The unit tests are well-suited for StreamingResponseHandler (a POCO with no platform dependencies). Device tests are necessary to validate that the streaming pipeline works end-to-end with actual AI backends.


4. Convention Compliance — ✅

  • Unit tests use [Fact]
  • Nested class structure (PassthroughTests inside StreamingResponseHandlerTests) follows existing patterns in the codebase ✅
  • No convention issues detected by the automated script ✅
  • Device test base class follows the established ChatClient*TestsBase(T) pattern already used for function calling, cancellation, etc. ✅

5. Flakiness Risk — ⚠️ Medium

Unit tests (Passthrough.cs): ✅ Very low — pure in-memory operations with deterministic behavior.

Device tests (ChatClientStreamingTestsBase): ⚠️ Medium risk:

  • Tests depend on live AI model responses (Apple Intelligence, OpenAI). If the AI service is unavailable or returns unexpected content, tests fail.
  • ConcatenatedTextMatchesNonStreaming asserts Contains("paris", ...) — reasonable but could fail if the model responds with "Paris, France" vs "Paris" in different configurations, or if the model's output format changes.
  • DeliversMultipleIncrementalUpdates asserts Count >= 1 — very lenient, but the comment acknowledges that the streaming contract doesn't guarantee a minimum chunk count. This is intentionally loose to avoid brittleness, but it means the test might not catch regressions in streaming granularity.

6. Duplicate Coverage — ✅ No duplicates

No similar tests found. The Passthrough.cs tests specifically address the new passthrough mode, which had no prior coverage. The new device test methods extend the existing streaming base class without duplicating existing tests.


7. Platform Scope — ✅

  • StreamingResponseHandler is cross-platform (no platform-specific code).
  • Unit tests run on all platforms.
  • Device tests run on Apple (iOS/macCatalyst) via AppleIntelligenceChatClientStreamingTests and can run with OpenAI on any platform via OpenAIChatClientStreamingTests.

8. Assertion Quality — ✅

Unit tests have specific, targeted assertions:

  • Assert.Equal(2, updates.Count) — exact count verification ✅
  • Assert.Equal("Hello", updates[0].Contents.OfType(TextContent)().Single().Text) — exact text match ✅
  • Assert.IsType(FunctionCallContent)(...) — type-specific assertion ✅

Device tests use appropriately loose assertions for live AI responses:

  • Assert.Contains("paris", ..., OrdinalIgnoreCase) — reasonable for non-mocked responses ✅
  • Assert.True(updates.Count >= 1, ...) — intentionally lenient, with explanatory message ✅

No magic numbers or trivial Assert.True(true) assertions found.


9. Fix-Test Alignment — ✅

The tests directly target the changed production code:

  • Passthrough.cs → tests StreamingResponseHandler passthrough constructor and all methods
  • ChatClientStreamingTestsBase.cs → tests the streaming interface that StreamingResponseHandler backs

The sample app files (44 of 45 "fix" files) have no corresponding automated tests, which is appropriate for sample/demo code.


Recommendations

  1. Add a passthrough-mode ProcessToolResult test — Even though the code path is transitively covered by chunker-mode tests, an explicit passthrough test would clarify intent and protect against future regressions if branching is added. Example: adapt ToolResultEdgeCases to include a version with new StreamingResponseHandler() (no chunker).

  2. Consider adding a cancellation test for passthrough mode — Check whether Cancellation.cs covers the no-chunker constructor path. If not, add one test to ensure ReadAllAsync(cancellationToken) respects cancellation in passthrough mode.

  3. Document the live-AI flakiness expectation — The device tests (DeliversMultipleIncrementalUpdates, ConcatenatedTextMatchesNonStreaming) rely on live AI responses. Consider adding a [Skip] or [Category] annotation that marks them as integration/live tests, so CI can conditionally skip them without AI credentials.

Warning

⚠️ Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • dc.services.visualstudio.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "dc.services.visualstudio.com"

See Network Configuration for more information.

Note

🔒 Integrity filtering filtered 1 item

Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.

🧪 Test evaluation by Evaluate PR Tests

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

The Essentials.AI.UnitTests project links Models/** from the sample via
Compile Include glob. Weather.cs referencing FluentUI broke that project
since it doesn't have FluentUI.cs. Move icon mapping to WeatherService
and keep the model as pure data.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@mattleibow mattleibow merged commit 74a6940 into main Apr 2, 2026
40 of 45 checks passed
@mattleibow mattleibow deleted the dev/ai-sample-improvements branch April 2, 2026 14:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

s/agent-changes-requested AI agent recommends changes - found a better alternative or issues s/agent-fix-win AI found a better alternative fix than the PR s/agent-reviewed PR was reviewed by AI agent workflow (full 4-phase review)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants