The fix is already in place from previous commits (47c3396, c047131). This commit documents the solution for future reference.

The stuck detection now correctly distinguishes between:
- beadsCompleted: all beads processed (including timed-out/deferred)
- beadsSucceeded: successful completions only
- beadsTimedOut: timed-out/deferred beads

Stuck reason text now clearly shows metrics: '100 processed but 0 successful completions (all timed out/deferred)'

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Fix for beadsCompleted vs Stuck Detection Metric Discrepancy
Problem
The /api/workers API returned confusing data:
- beadsCompleted: 285 (counting bead.released events, including timed-out/deferred)
- stuck: true, stuckReason: 'Running for 2311m with only 1 completion(s)'
The two fields appeared to contradict each other: 285 completions, yet "only 1 completion".
Root Cause
The stuck detection was using a different metric than what was displayed:
- beadsCompleted counted all bead.released events (including timed-out/deferred)
- The stuck detection counted successful completions (bead.completed events only)
When all beads timed out or were deferred, beadsCompleted would increment but the stuck detector would see zero successful completions.
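The mismatch can be reproduced with a tiny tally. This is only an illustrative sketch: the event log is hypothetical, and it assumes (as the description above suggests) that every processed bead emits bead.released while only a successful one also emits bead.completed.

```typescript
// Hypothetical event log: three releases, only one of which also completed.
const events: string[] = [
  "bead.released", // timed out
  "bead.released", // deferred
  "bead.released", // successful release
  "bead.completed",
];

// Count how many events of a given type appear in the log.
const countOf = (type: string): number =>
  events.filter((e) => e === type).length;

// What the API displayed as beadsCompleted before the fix.
const displayed = countOf("bead.released");
// What the stuck detector compared against its threshold.
const seenByDetector = countOf("bead.completed");

// displayed is 3 while seenByDetector is 1: same worker, two different numbers.
console.log({ displayed, seenByDetector });
```
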
Solution
Three metrics were unified in the WorkerInfo type:
- beadsCompleted - All beads processed (bead.released events with release_success)
  - Includes timed-out and deferred beads
- beadsSucceeded - Successful completions only (bead.completed events)
  - Excludes timed-out/deferred releases
- beadsTimedOut - Timed-out or deferred beads (subset of beadsCompleted)
  - Tracked separately for clarity
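A minimal sketch of how the three counters might be maintained from events. Only the counter names, the event names, and the TimedOut/Deferred outcomes come from this document; the WorkerCounters and BeadEvent shapes, the "Success" outcome, and the applyEvent helper are assumptions for illustration and are not the actual code in src/store.ts.

```typescript
// Hypothetical outcome values; only TimedOut and Deferred are named in the document.
type ReleaseOutcome = "Success" | "TimedOut" | "Deferred";

// The three counters described above (the real WorkerInfo type has more fields).
interface WorkerCounters {
  beadsCompleted: number; // every released bead, whatever the outcome
  beadsSucceeded: number; // bead.completed events only
  beadsTimedOut: number; // released beads that timed out or were deferred
}

// Hypothetical event shapes.
type BeadEvent =
  | { type: "bead.released"; outcome: ReleaseOutcome }
  | { type: "bead.completed" };

// Return updated counters for one event, leaving the input untouched.
function applyEvent(c: WorkerCounters, e: BeadEvent): WorkerCounters {
  if (e.type === "bead.completed") {
    return { ...c, beadsSucceeded: c.beadsSucceeded + 1 };
  }
  const timedOut = e.outcome === "TimedOut" || e.outcome === "Deferred";
  return {
    ...c,
    beadsCompleted: c.beadsCompleted + 1,
    beadsTimedOut: c.beadsTimedOut + (timedOut ? 1 : 0),
  };
}

const zero: WorkerCounters = { beadsCompleted: 0, beadsSucceeded: 0, beadsTimedOut: 0 };
const log: BeadEvent[] = [
  { type: "bead.released", outcome: "TimedOut" },
  { type: "bead.released", outcome: "Deferred" },
  { type: "bead.completed" },
];
const totals = log.reduce(applyEvent, zero);
```

With this log, totals holds two processed beads, both counted as timed out/deferred, and one success, so beadsTimedOut never exceeds beadsCompleted.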
Stuck Detection Update
The detectLongRunning function in stuckDetection.ts now:
- Uses beadsSucceeded for the stuck threshold (not beadsCompleted)
- Generates clear reason text distinguishing metrics:
  - "Running for 40m with 100 processed but 0 successful completions (all timed out/deferred)"
  - "Running for 30m with 50 processed but only 1 successful completion(s) (49 timed out/deferred)"
- Shows evidence array with all three metrics
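The behavior listed above could look roughly like the following. This is a sketch under stated assumptions, not the real detectLongRunning: the function name sketchLongRunning, the Metrics and StuckResult shapes, the evidence string format, and both threshold defaults are hypothetical. Only the reason-text wording is taken from the examples above.

```typescript
interface Metrics {
  beadsCompleted: number;
  beadsSucceeded: number;
  beadsTimedOut: number;
}

// Hypothetical result shape.
interface StuckResult {
  stuck: boolean;
  stuckReason?: string;
  evidence: string[];
}

function sketchLongRunning(
  m: Metrics,
  runtimeMinutes: number,
  minRuntimeMinutes = 30, // hypothetical threshold
  maxSuccesses = 1, // hypothetical threshold
): StuckResult {
  // Evidence always lists all three metrics, stuck or not.
  const evidence = [
    `beadsCompleted=${m.beadsCompleted}`,
    `beadsSucceeded=${m.beadsSucceeded}`,
    `beadsTimedOut=${m.beadsTimedOut}`,
  ];
  // The threshold is checked against successes, not against processed beads.
  if (runtimeMinutes < minRuntimeMinutes || m.beadsSucceeded > maxSuccesses) {
    return { stuck: false, evidence };
  }
  const prefix = `Running for ${runtimeMinutes}m with ${m.beadsCompleted} processed but `;
  const stuckReason =
    m.beadsSucceeded === 0
      ? `${prefix}0 successful completions (all timed out/deferred)`
      : `${prefix}only ${m.beadsSucceeded} successful completion(s) (${m.beadsTimedOut} timed out/deferred)`;
  return { stuck: true, stuckReason, evidence };
}

const allTimedOut = sketchLongRunning(
  { beadsCompleted: 100, beadsSucceeded: 0, beadsTimedOut: 100 },
  40,
);
const oneSuccess = sketchLongRunning(
  { beadsCompleted: 50, beadsSucceeded: 1, beadsTimedOut: 49 },
  30,
);
const healthy = sketchLongRunning(
  { beadsCompleted: 50, beadsSucceeded: 40, beadsTimedOut: 10 },
  60,
);
```

Here allTimedOut and oneSuccess are flagged as stuck with the two reason strings quoted above, while healthy is not flagged because its success count exceeds the assumed threshold.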
Acceptance Criteria Met
✅ Worker processing 100 timed-out beads shows clearly:
- beadsCompleted: 100
- beadsSucceeded: 0
- stuckReason: "100 processed but 0 successful completions (all timed out/deferred)"
Files Modified
- src/types.ts - Added beadsTimedOut field to WorkerInfo
- src/store.ts - Increment beadsTimedOut on bead.released with TimedOut/Deferred outcome
- src/tui/utils/stuckDetection.ts - Updated stuck detection to use unified metrics with clear messaging
- src/tui/utils/stuckDetection.test.ts - Added tests for the new behavior