fix: addToRecentSpools mutex guard, drain write queue before restart (#106, #107)#118
Conversation
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 15 minutes and 28 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughThe changes add thread-safe queue management to NFCManager by introducing mutex-protected snapshot logic in Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/NFCManager.cpp (1)
2257-2320:⚠️ Potential issue | 🔴 CriticalRace condition:
recentSpoolsarray modified after mutex release.The mutex is released at line 2300 before modifying the
recentSpoolsarray (lines 2302-2319). However,getRecentSpools()andupdateRecentSpoolSyncStatus()both accessrecentSpoolsundertagMutexprotection. This creates a data race where concurrent calls could read the array whileaddToRecentSpools()is shuffling entries.The mutex should be held for the entire operation, including the array modifications.
🔒 Proposed fix: hold mutex for entire operation
void NFCManager::addToRecentSpools() { if (tagMutex == nullptr) return; if (xSemaphoreTake(tagMutex, pdMS_TO_TICKS(50)) != pdTRUE) return; if (!currentSpool.tag_data_valid) { xSemaphoreGive(tagMutex); return; } // Check if this spool already exists in recent list int existingIndex = -1; for (size_t i = 0; i < recentSpoolsCount; i++) { if (strcmp(recentSpools[i].spool_id, currentSpool.spool_id) == 0) { existingIndex = i; break; } } - // Snapshot data under mutex RecentSpoolEntry newEntry; memset(&newEntry, 0, sizeof(newEntry)); strncpy(newEntry.spool_id, currentSpool.spool_id, sizeof(newEntry.spool_id) - 1); opt_get_material_type(¤tSpool.tag_data, &newEntry.material_type); opt_get_primary_color(¤tSpool.tag_data, newEntry.color); if (opt_get_brand_name(¤tSpool.tag_data, newEntry.manufacturer, sizeof(newEntry.manufacturer)) != OPT_OK) { newEntry.manufacturer[0] = '\0'; } float full_weight = 0.0f, consumed = 0.0f; opt_get_actual_full_weight(¤tSpool.tag_data, &full_weight); opt_get_consumed_weight(¤tSpool.tag_data, &consumed); newEntry.grams_remaining = (int)(full_weight - consumed); newEntry.last_seen = time(nullptr); newEntry.valid = true; newEntry.synced_to_spoolman = false; int32_t smId = -1; opt_get_gp_spoolman_id(¤tSpool.tag_data, &smId); newEntry.spoolman_id = smId; - xSemaphoreGive(tagMutex); - if (existingIndex >= 0) { // Spool exists - shift entries to remove it from current position for (size_t i = existingIndex; i > 0; i--) { recentSpools[i] = recentSpools[i - 1]; } } else { // New spool - shift all entries down to make room size_t shiftCount = (recentSpoolsCount < MAX_RECENT_SPOOLS) ? recentSpoolsCount : (MAX_RECENT_SPOOLS - 1); for (size_t i = shiftCount; i > 0; i--) { recentSpools[i] = recentSpools[i - 1]; } if (recentSpoolsCount < MAX_RECENT_SPOOLS) { recentSpoolsCount++; } } // Place new entry at front recentSpools[0] = newEntry; + + xSemaphoreGive(tagMutex); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/NFCManager.cpp` around lines 2257 - 2320, addToRecentSpools currently releases tagMutex before shuffling/modifying the shared recentSpools array, creating a data race with getRecentSpools() and updateRecentSpoolSyncStatus(); fix it by holding tagMutex for the entire operation: move the xSemaphoreGive(tagMutex) call from before the array-shift logic to after recentSpools/recentSpoolsCount updates (i.e., only release the mutex once recentSpools[0] = newEntry and any count changes are complete), and ensure all early-return paths still release the mutex where appropriate (preserve the existing releases when currentSpool.tag_data_valid is false and when xSemaphoreTake fails).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/WebServerManager.cpp`:
- Around line 784-790: The code pauses the scan task via
NFCManager::getInstance().pauseScanTask() before waiting for the write queue to
drain, but processWriteQueue() runs inside scanLoop(), so pausing prevents the
queue from ever emptying; fix by reversing the sequence: wait (with a bounded
loop and small delays) for NFCManager::getInstance().isWriteQueueEmpty() to
become true (or time out) first, then call
NFCManager::getInstance().pauseScanTask(), and only after confirming the queue
is empty proceed to delay and call ESP.restart(); keep the existing timeout
behavior to avoid blocking indefinitely and reference NFCManager::getInstance(),
isWriteQueueEmpty(), pauseScanTask(), and scanLoop() when making the change.
- Around line 850-855: The write-queue drain is happening after the scan task
was paused, so isWriteQueueEmpty() may always report empty; move the queue-drain
logic to the start of the upload flow so the queue is drained before pausing the
scan task: in handleApiUploadFirmwareChunk() (when transitioning to
UPLOAD_FILE_START) call the same write-queue drain loop that checks
NFCManager::getInstance().isWriteQueueEmpty() and delays until drained, then
pause the scan task; keep the existing drain in
handleApiUploadFirmwareComplete() only as a safety check.
---
Outside diff comments:
In `@src/NFCManager.cpp`:
- Around line 2257-2320: addToRecentSpools currently releases tagMutex before
shuffling/modifying the shared recentSpools array, creating a data race with
getRecentSpools() and updateRecentSpoolSyncStatus(); fix it by holding tagMutex
for the entire operation: move the xSemaphoreGive(tagMutex) call from before the
array-shift logic to after recentSpools/recentSpoolsCount updates (i.e., only
release the mutex once recentSpools[0] = newEntry and any count changes are
complete), and ensure all early-return paths still release the mutex where
appropriate (preserve the existing releases when currentSpool.tag_data_valid is
false and when xSemaphoreTake fails).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9d4a999e-d828-4ea0-add0-1858ae93cd3f
📒 Files selected for processing (3)
src/NFCManager.cppsrc/NFCManager.hsrc/WebServerManager.cpp
| // Wait for write queue to drain (#107) | ||
| for (int i = 0; i < 20 && !NFCManager::getInstance().isWriteQueueEmpty(); i++) { | ||
| delay(100); | ||
| } | ||
| delay(200); | ||
| ESP.restart(); |
There was a problem hiding this comment.
Same issue: scan task already paused during OTA upload.
The scan task was paused at UPLOAD_FILE_START in handleApiUploadFirmwareChunk() (line 821), so by the time handleApiUploadFirmwareComplete() runs, the scan task is already suspended and the write queue cannot drain.
For OTA, this is less critical since writes during firmware upload are uncommon, but for consistency and to fully address issue #107, the drain should happen before pausing.
🔧 Proposed fix: drain queue at start of upload before pausing
In handleApiUploadFirmwareChunk(), drain the queue before pausing:
if (upload.status == UPLOAD_FILE_START) {
Serial.printf("OTA: Upload start: %s\n", upload.filename.c_str());
+ // Wait for pending NFC writes to complete before pausing (`#107`)
+ for (int i = 0; i < 20 && !NFCManager::getInstance().isWriteQueueEmpty(); i++) {
+ delay(100);
+ }
// Pause NFC scan task during upload to prevent interference
NFCManager::getInstance().pauseScanTask();Then the drain loop in handleApiUploadFirmwareComplete() becomes a safety check for any writes somehow queued during upload.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/WebServerManager.cpp` around lines 850 - 855, The write-queue drain is
happening after the scan task was paused, so isWriteQueueEmpty() may always
report empty; move the queue-drain logic to the start of the upload flow so the
queue is drained before pausing the scan task: in handleApiUploadFirmwareChunk()
(when transitioning to UPLOAD_FILE_START) call the same write-queue drain loop
that checks NFCManager::getInstance().isWriteQueueEmpty() and delays until
drained, then pause the scan task; keep the existing drain in
handleApiUploadFirmwareComplete() only as a safety check.
5cf02c3 to
85fd991
Compare
Summary
tagMutexbefore readingcurrentSpooldata to prevent stale reads from concurrent access (bug: addToRecentSpools reads currentSpool without tagMutex #106)Test plan
Summary by CodeRabbit