Skip to content

feat: add DrainErrorQueueAsync for SCPI error queue inspection#185

Merged
tylerkron merged 3 commits into
mainfrom
feat/drain-error-queue
May 5, 2026
Merged

feat: add DrainErrorQueueAsync for SCPI error queue inspection#185
tylerkron merged 3 commits into
mainfrom
feat/drain-error-queue

Conversation

@tylerkron
Copy link
Copy Markdown
Contributor

Closes #174.

Summary

  • Adds DaqifiDevice.DrainErrorQueueAsync — pops SYSTem:ERRor? entries until the device reports "No error" and returns the popped strings to the caller. This complements the existing inline ContainsScpiError check (last-command result) by surfacing what's queued on the device, including stale errors from prior commands or sessions.
  • Adds ScpiMessageProducer.GetSystemError (SYSTem:ERRor?) for use by the new method and any future callers.
  • Adds 10 unit tests covering: clean queue, N queued errors, max-iterations cap, +0,"No error" terminator variant, leading blank lines, trimming, empty-reply termination, command verification, cancellation, and argument validation.

Implementation notes (deviations from the issue's draft snippet)

  • Empty reply terminates the loop instead of being recorded as an error. Without this, a transport timeout would cause the loop to append empty strings until maxIterations was exhausted.
  • Default response timeout matches ExecuteTextCommandAsync's default (1000 ms) rather than 200 ms — 200 ms was tight enough to cause false terminations on a slightly loaded transport.
  • When maxIterations is reached without convergence, a Trace.WriteLine warning is emitted. Callers that want to treat non-convergence as a hard failure can compare Count to maxIterations.
  • Method lives on DaqifiDevice (the base class) so all subclasses inherit it without duplication.

Test plan

  • dotnet test --filter DrainErrorQueue — 10/10 passing
  • Full suite — 890/890 passing (2 pre-existing skips), no regressions

🤖 Generated with Claude Code

Closes #174.

Adds DaqifiDevice.DrainErrorQueueAsync, which pops SYSTem:ERRor? entries
from the device until the queue reports "No error" and returns the
popped strings to the caller. This complements the existing inline
ContainsScpiError check (last-command result) by exposing what is
queued on the device, including stale entries from prior sessions.

Also adds ScpiMessageProducer.GetSystemError for the SYSTem:ERRor?
query message.

Implementation notes that diverged from the issue's draft snippet:

- An empty reply terminates the loop (returning what's been popped so
  far) instead of being recorded as an error and continuing. Without
  this, a transport timeout would cause the loop to append empty
  strings until maxIterations was exhausted.
- Default response timeout matches the existing
  ExecuteTextCommandAsync default (1000 ms) rather than 200 ms, which
  was tight enough to cause false terminations on a slightly loaded
  transport.
- When maxIterations is reached without convergence, a warning is
  traced; callers that want to treat that as a hard failure can
  compare Count to maxIterations.
@tylerkron tylerkron requested a review from a team as a code owner May 5, 2026 02:24
@qodo-code-review
Copy link
Copy Markdown

Review Summary by Qodo

Add DrainErrorQueueAsync for SCPI error queue inspection

✨ Enhancement

Grey Divider

Walkthroughs

Description
• Adds DrainErrorQueueAsync method to inspect and pop queued SCPI errors
• Adds GetSystemError SCPI message producer for SYSTem:ERRor? queries
• Implements robust error queue draining with empty-reply termination
• Includes 10 comprehensive unit tests covering edge cases and error scenarios
Diagram
flowchart LR
  A["DaqifiDevice"] -->|new method| B["DrainErrorQueueAsync"]
  B -->|queries device| C["SYSTem:ERRor?"]
  C -->|via| D["ScpiMessageProducer.GetSystemError"]
  B -->|returns| E["List of error strings"]
  F["Unit Tests"] -->|verify| B
Loading

Grey Divider

File Changes

1. src/Daqifi.Core.Tests/Device/DaqifiDeviceDrainErrorQueueTests.cs 🧪 Tests +218/-0

Comprehensive unit tests for DrainErrorQueueAsync

• Adds 10 unit tests for DrainErrorQueueAsync covering clean queue, multiple errors, terminator
 variants, whitespace handling, empty replies, iteration caps, command verification, cancellation,
 and argument validation
• Implements SequencedReplyDevice test helper class that simulates device responses without real
 transport
• Tests verify correct error ordering, trimming behavior, early termination on empty replies, and
 max-iterations enforcement

src/Daqifi.Core.Tests/Device/DaqifiDeviceDrainErrorQueueTests.cs


2. src/Daqifi.Core/Communication/Producers/ScpiMessageProducer.cs ✨ Enhancement +10/-0

Add GetSystemError SCPI message producer

• Adds GetSystemError static property that creates a SYSTem:ERRor? query message
• Includes XML documentation explaining the SCPI error queue behavior and response format
• Enables callers to pop entries from the device's error queue

src/Daqifi.Core/Communication/Producers/ScpiMessageProducer.cs


3. src/Daqifi.Core/Device/DaqifiDevice.cs ✨ Enhancement +77/-0

Implement DrainErrorQueueAsync for error queue inspection

• Adds DrainErrorQueueAsync method that iteratively queries SYSTem:ERRor? until "No error" is
 received
• Implements empty-reply termination to prevent timeout loops and transport hammering
• Includes maxIterations safety cap (default 256) with trace warnings on non-convergence
• Provides comprehensive XML documentation covering use cases, concurrency considerations, and
 parameter details

src/Daqifi.Core/Device/DaqifiDevice.cs


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review Bot commented May 5, 2026

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (0) 📎 Requirement gaps (0)

Grey Divider


Remediation recommended

1. Unsynchronized text exchange 🐞 Bug ☼ Reliability
Description
DrainErrorQueueAsync repeatedly calls ExecuteTextCommandAsync, which stops/restarts the shared
protobuf consumer and reads from the shared transport stream without any synchronization, so
concurrent drains/other text commands can interleave and corrupt consumption. The new public API
increases the likelihood of this unsupported concurrent usage while only documenting (not enforcing)
the constraint.
Code

src/Daqifi.Core/Device/DaqifiDevice.cs[R518-526]

+            var popped = new List<string>();
+            for (int i = 0; i < maxIterations; i++)
+            {
+                cancellationToken.ThrowIfCancellationRequested();
+
+                var lines = await ExecuteTextCommandAsync(
+                    () => Send(ScpiMessageProducer.GetSystemError),
+                    cancellationToken: cancellationToken).ConfigureAwait(false);
+
Evidence
ExecuteTextCommandAsync mutates shared device state (unsubscribing/resubscribing handlers,
stopping/starting the shared consumer, and creating a temporary text consumer on the same stream)
but there is no lock/gate around this sequence. DrainErrorQueueAsync introduces a new public
entrypoint that performs multiple such exchanges, widening the window for concurrent calls to
collide.

src/Daqifi.Core/Device/DaqifiDevice.cs[319-467]
src/Daqifi.Core/Device/DaqifiDevice.cs[491-546]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`ExecuteTextCommandAsync` (used by `DrainErrorQueueAsync`) switches the device between protobuf-consumer mode and temporary line-based text-consumer mode on a shared stream. If two callers enter this flow concurrently, they can interfere with each other’s consumer lifecycle and stream reads.

## Issue Context
The new `DrainErrorQueueAsync` method is public and performs many `ExecuteTextCommandAsync` calls in a loop. Even though the XML doc warns against concurrent use, the implementation does not prevent it.

## Fix Focus Areas
- src/Daqifi.Core/Device/DaqifiDevice.cs[319-467]
- src/Daqifi.Core/Device/DaqifiDevice.cs[511-548]

## Suggested change
- Add a private `SemaphoreSlim` (or equivalent async lock) on `DaqifiDevice` to serialize all text exchanges.
- Acquire it at the start of `ExecuteTextCommandAsync` (preferred, protects all callers) and release in `finally`.
- Ensure the semaphore wait observes `cancellationToken`.
- Keep existing behavior otherwise (stop consumer, run setupAction, collect lines, restart consumer).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Loose terminator detection🐞 Bug ≡ Correctness
Description
DrainErrorQueueAsync stops draining when the reply merely contains the substring "No error", which
can incorrectly treat a non-terminator SCPI error as queue-empty and silently leave queued errors
undiscovered. Terminator detection should be based on the SCPI numeric code (0 or +0) rather than a
substring search.
Code

src/Daqifi.Core/Device/DaqifiDevice.cs[R527-543]

+                var reply = lines.FirstOrDefault(l => !string.IsNullOrWhiteSpace(l));
+                if (reply == null)
+                {
+                    // No reply at all — either the queue is drained and the
+                    // device elected to stay silent, or the device is
+                    // unresponsive. Either way, hammering further would not
+                    // help; stop and return what we have.
+                    Trace.WriteLine($"[DrainErrorQueueAsync] Empty reply on iteration {i}; terminating after {popped.Count} popped entries.");
+                    return popped;
+                }
+
+                if (reply.Contains("No error", StringComparison.OrdinalIgnoreCase))
+                {
+                    return popped;
+                }
+
+                popped.Add(reply.Trim());
Evidence
The implementation uses a substring match on the reply to decide termination, even though SCPI
error-queue responses are documented in this codebase as a numeric code followed by a message; using
the numeric code is the unambiguous terminator signal.

src/Daqifi.Core/Device/DaqifiDevice.cs[523-544]
src/Daqifi.Core/Communication/Producers/ScpiMessageProducer.cs[41-49]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`DrainErrorQueueAsync` treats any reply containing the substring `"No error"` as the terminator. This is more permissive than needed and can terminate incorrectly.

## Issue Context
SCPI error-queue replies are shaped as `<code>,"<message>"` (with code `0` / `+0` indicating empty queue). The codebase even documents this format in `ScpiMessageProducer.GetSystemError`.

## Fix Focus Areas
- src/Daqifi.Core/Device/DaqifiDevice.cs[523-544]
- src/Daqifi.Core/Communication/Producers/ScpiMessageProducer.cs[41-49]

## Suggested change
- Trim the reply once (`var trimmed = reply.Trim();`).
- Parse the numeric prefix up to the first comma (or whole string if no comma).
- Terminate only when parsed code is `0` (support `+0` via `int.TryParse`).
- Keep returning the trimmed full line for non-zero codes.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

tylerkron added 2 commits May 4, 2026 20:27
- Drop overspecified _SkipsLeadingBlankLines test that asserted
  defensive behavior the line parser doesn't actually exercise.
- Tighten 4-line empty-reply comment to 2.
- Remove unneeded Connect() in arg-validation test.
- Fix SYST:ERR? -> SYSTem:ERRor? doc inconsistency.
Per Qodo review on PR #185: replying to "No error" via substring
match could in theory terminate on a real error whose message text
happens to contain that phrase. Parse the numeric prefix (0 or +0)
before the comma and only terminate on code 0. Adds a test that
proves the substring-match behavior is no longer present.
@tylerkron
Copy link
Copy Markdown
Contributor Author

Replying to @qodo-code-review's review:

Issue 1 — Loose terminator detection: fixed in b4fc3ad.

Agreed. Replaced the Contains("No error") substring match with numeric-code parsing: parse the prefix up to the first comma and only treat code 0 (or +0, via int.TryParse) as the terminator. Added a regression test (DrainErrorQueueAsync_TreatsErrorMessageContainingNoErrorPhraseAsRealError) that fails under the old behavior — it queues a real error whose message text contains "No error handler installed" and asserts it's captured rather than swallowed as the terminator.

Issue 2 — Unsynchronized text exchange: valid concern, but out of scope for this PR. Tracked as #186.

The lack of locking is in ExecuteTextCommandAsync itself, which predates this PR. Every existing method that uses it (InitializeAsync, GetSdCardFilesAsync, DeleteSdCardFilesAsync, DownloadSdCardFileAsync, plus the SD-card paths in DaqifiStreamingDevice) shares the same vulnerability. The right fix is a SemaphoreSlim inside ExecuteTextCommandAsync so all callers benefit — bolting one onto DrainErrorQueueAsync alone would still leave the other paths exposed. Filed #186 to do this properly across the board.

@tylerkron
Copy link
Copy Markdown
Contributor Author

Real-hardware verification

Tested against a Nyquist Nq1 FW 3.4.4 over USB serial (/dev/cu.usbmodem101 @ 115200).

What worked

  • Connect + InitializeAsync + drain run cleanly end-to-end
  • Multiple sequential DrainErrorQueueAsync calls — protobuf consumer pauses/restarts correctly each time, no transport corruption
  • Raw SYSTem:ERRor? reply on a clean queue is exactly 0,"No error" — the numeric-prefix terminator parsing introduced for Qodo issue feat: init of project structure with CICD #1 hits the right pattern

Firmware caveat (FW 3.4.4)

On this firmware, errors are reported inline only and are NOT enqueued in SYST:ERR?. Every bad command I tried produced an inline **ERROR: line but left the queue at 0,"No error":

Command Inline reply Queue after
SYSTem:DOES:NOT:EXIST? **ERROR: -113, "Undefined header" 0,"No error"
SYSTem:POWer:STATe 99 **ERROR: -224, "Illegal parameter value" 0,"No error"
SD:GET "nonexistent.bin" **ERROR: -200, "Execution error" 0,"No error"
Standby + StartStreamData (exact issue-174 reproducer) **ERROR: -200, "Execution error" 0,"No error"

The original issue mentioned >10 stale errors queued on FW 3.4.6b1, so queue-population is a newer firmware behavior. DrainErrorQueueAsync will return empty on 3.4.4 but should populate correctly on firmware that implements the queue (terminator parsing already verified against the standard 0,"No error" reply shape).

What I couldn't prove on this hardware

  • "N popped entries when N are queued" — would need 3.4.6b1+ to seed and verify

The mechanical primitives all work; the populated-list path is exercised by the unit tests.

@tylerkron tylerkron merged commit 9bf3889 into main May 5, 2026
1 check passed
@tylerkron tylerkron deleted the feat/drain-error-queue branch May 5, 2026 03:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(scpi): add SYST:ERR? drain helper returning popped errors for logging/diagnostics

1 participant