fix(bf-27e4): unify stuck detection metric with beadsCompleted

The stuck detection's detectLongRunning function was using text-based
message matching ('completed'/'complete' in msg) to count completions,
while beadsCompleted counts actual bead.completed and bead.released
events with release_success.

This caused confusion: a worker with 285 beadsCompleted (all timed out)
would be flagged as stuck with 'only 1 completion(s)' because the
message filter found few matches.
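
To illustrate how the two counts can diverge, here is a minimal sketch. The `WorkerEvent` shape, the `type` and `release_success` field names, and the sample events are assumptions made for illustration; only the message filter itself is taken from the diff.

```typescript
// Hypothetical event shape: only `msg` appears in the diff; the `type` and
// `release_success` fields are assumptions made for this illustration.
interface WorkerEvent {
  type: string;
  msg?: string;
  release_success?: boolean;
}

// Old approach: count events whose message text mentions completion.
function countByMessageText(events: WorkerEvent[]): number {
  return events.filter(
    (e) => e.msg?.includes('completed') || e.msg?.includes('complete')
  ).length;
}

// Structured approach, as the commit message describes beadsCompleted:
// bead.completed events plus bead.released events with release_success.
function countByEventType(events: WorkerEvent[]): number {
  return events.filter(
    (e) =>
      e.type === 'bead.completed' ||
      (e.type === 'bead.released' && e.release_success === true)
  ).length;
}

// Made-up sample window.
const sampleEvents: WorkerEvent[] = [
  { type: 'bead.completed', msg: 'bead finished' },
  { type: 'bead.released', msg: 'bead released', release_success: true },
  { type: 'bead.released', msg: 'release failed', release_success: false },
  { type: 'log', msg: 'startup complete' },
];

const textCount = countByMessageText(sampleEvents);
const structuredCount = countByEventType(sampleEvents);

console.log(textCount);       // 1 (matches only the unrelated log line)
console.log(structuredCount); // 2 (the two successful bead events)
```

In this made-up window the text filter both misses real completions and counts an unrelated log line, which is the kind of mismatch described above.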

Changed detectLongRunning to use worker.beadsCompleted directly, so the
stuck check and the reported metric agree. Updated the reason text to
say 'successful completion(s)' and added the total event count in the
window to the evidence.

Fixes #bf-27e4
jedarden 2026-06-07 10:42:31 -04:00
parent b5df74a321
commit 04904ce032

@@ -284,17 +284,19 @@ function detectLongRunning(
   if (runningTime > opts.longRunningThresholdMs) {
     const minutes = Math.floor(runningTime / 60000);
     // Check if making progress
-    const completions = events.filter(
-      (e) => e.msg?.includes('completed') || e.msg?.includes('complete')
-    ).length;
+    // Use worker.beadsCompleted (counts bead.completed and bead.released with release_success)
+    // instead of text-based message matching
+    const completions = worker.beadsCompleted;
     if (completions < 2) {
       return {
         type: 'long_running',
-        reason: `Running for ${minutes}m with only ${completions} completion(s)`,
+        reason: `Running for ${minutes}m with only ${completions} successful completion(s)`,
         severity: minutes >= 20 ? 'critical' : 'warning',
-        evidence: [`Beads completed: ${worker.beadsCompleted}`],
+        evidence: [
+          `Beads completed: ${worker.beadsCompleted}`,
+          `Total events in window: ${events.length}`,
+        ],
         suggestion: 'Consider breaking task into smaller pieces',
       };
     }