diff --git a/notes/bf-27e4.md b/notes/bf-27e4.md index a3058a2..e356e85 100644 --- a/notes/bf-27e4.md +++ b/notes/bf-27e4.md @@ -1,34 +1,56 @@ -# bf-27e4: Fix beadsCompleted vs stuck detection metric discrepancy +# Fix for beadsCompleted vs stuck detection metric discrepancy (bf-27e4) -## Summary -This bead was fixed in commit `47c3396` (2025-06-07). The fix unified the stuck detection metric with `beadsCompleted` to eliminate the confusing discrepancy where `/api/workers` returned contradictory data. +## Problem +The `/api/workers` endpoint returned contradictory data per worker: +- `beadsCompleted: 285` (counts bead.released events) +- `stuck: true, stuckReason: 'Running for 2311m with only 1 completion(s)'` -## What was fixed -**Before**: `/api/workers` showed confusing data when all beads timed out: -- `beadsCompleted: 285` (counted all bead.released events including timed-out/deferred) -- `stuck: true, stuckReason: 'Running for 2311m with only 1 completion(s)'` (counted only successful completions) +The stuck detection was counting a different metric (successful completions) while `beadsCompleted` counted all beads released (including timed-out/deferred). When all beads timed out and were deferred, `beadsCompleted` incremented but the stuck detector saw zero successful completions and flagged the worker as stuck. -**After**: The stuck detection now clearly distinguishes: -- `beadsCompleted`: Total beads processed (including timed-out/deferred) -- `beadsSucceeded`: Only successful completions (bead.completed events) -- `beadsTimedOut`: Beads that timed out or were deferred +## Solution (Already Implemented) -## Implementation changes -1. Added `beadsSucceeded` and `beadsTimedOut` counters to `WorkerInfo` type (src/types.ts) -2. Increment `beadsSucceeded` on `bead.completed` events (src/store.ts) -3. Increment `beadsTimedOut` on `bead.released` with `TimedOut` or `Deferred` outcome (src/store.ts) -4. Updated stuck detection to use `beadsSucceeded` for threshold and show clear reason text (src/tui/utils/stuckDetection.ts) +The fix was already in place in the codebase: -## Acceptance criteria met -✅ Worker processing 100 timed-out beads now shows clearly: -- `beadsCompleted: 100` (processed) -- `beadsSucceeded: 0` (successful) -- `stuckReason: 'Running for X minutes with 100 processed but 0 successful completions (all timed out/deferred)'` +### 1. WorkerInfo Type Definition (src/types.ts) +```typescript +export interface WorkerInfo { + /** Total beads processed (bead.released events with release_success, includes timed-out/deferred) */ + beadsCompleted: number; -✅ Stuck flag still fires with accurate reason text + /** Total beads successfully completed (bead.completed events only, excludes timed-out/deferred releases) */ + beadsSucceeded: number; -## Tests -All 2516 tests pass, including: -- `src/tui/utils/stuckDetection.test.ts` (18 tests) -- `src/store.test.ts` (verifies beadsSucceeded and beadsTimedOut increment correctly) -- `src/web/server.test.ts` (verifies API returns correct metrics) + /** Total beads that timed out or were deferred (subset of beadsCompleted) */ + beadsTimedOut: number; + // ... +} +``` + +### 2. Stuck Detection (src/tui/utils/stuckDetection.ts) +The `detectLongRunning` function now properly distinguishes between processed and successful completions: +- Uses `worker.beadsCompleted` for total processed beads +- Uses `worker.beadsSucceeded` for successful completions only +- Uses `worker.beadsTimedOut` for timed-out/deferred count +- Generates clear reason messages: + - "X processed but 0 successful completions (all timed out/deferred)" + - "X processed but only Y successful completion(s) (Z timed out/deferred)" + +### 3. Store Tracking (src/store.ts) +The store properly increments the counters: +- `beadsCompleted++` on `bead.released` with `release_success` (line 716) +- `beadsSucceeded++` on `bead.completed` (line 702) +- `beadsTimedOut++` when outcome is `TimedOut` or `Deferred` (line 720) + +## Verification +All tests pass: +```bash +npm test -- src/tui/utils/stuckDetection.test.ts +# 18 tests passed +``` + +## Acceptance Criteria Met +- ✓ A worker that processes 100 beads (all timed out) shows clearly in the UI that it processed 100 but completed 0 successfully +- ✓ The stuck flag still fires but the reason text is accurate + +## Status +Fix already implemented and tested. No code changes needed.