I shipped 3 new ci-doctor rules. Then I ran them on my own repos.
ci-doctor is a free, MIT-licensed linter for GitHub Actions
workflows. Version 0.4.0 adds three new rules:
stale-cache-key - flags actions/cache steps whose key has no hashFiles(), so the cache never invalidates when deps change.
fail-fast-true - flags matrix jobs that don't explicitly set fail-fast: false, so the first cell that fails cancels every paid sibling.
always-run-on-pr - flags heavy steps (docker build+push, cypress, playwright, codeql) that run on every PR with no path filter or label gate.
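To make the first rule concrete, here is a minimal sketch of what a check like stale-cache-key could look like. This is not ci-doctor's actual implementation: the function name findStaleCacheKeys and the shape of the findings are hypothetical, and the sketch assumes the workflow YAML has already been parsed into a plain object.

```javascript
// Hypothetical sketch, not ci-doctor's real code: flag actions/cache steps
// whose key does not reference hashFiles().
// Assumes the workflow YAML is already parsed into a plain object.
function findStaleCacheKeys(workflow) {
  const findings = [];
  for (const [jobName, job] of Object.entries(workflow.jobs || {})) {
    (job.steps || []).forEach((step, index) => {
      const uses = String(step.uses || '');
      const isCacheStep =
        uses === 'actions/cache' || uses.startsWith('actions/cache@');
      if (!isCacheStep) return;
      const key = String((step.with && step.with.key) || '');
      // A key without hashFiles() does not change when lockfiles change.
      if (!key.includes('hashFiles(')) {
        findings.push({ rule: 'stale-cache-key', job: jobName, step: index, key });
      }
    });
  }
  return findings;
}

// Illustrative workflow: the second step has a static key, the third does not.
const exampleWorkflow = {
  jobs: {
    test: {
      steps: [
        { uses: 'actions/checkout@v4' },
        { uses: 'actions/cache@v4', with: { path: '~/.npm', key: 'npm-${{ runner.os }}' } },
        { uses: 'actions/cache@v4', with: { path: '~/.npm', key: "npm-${{ hashFiles('package-lock.json') }}" } },
      ],
    },
  },
};

const staleFindings = findStaleCacheKeys(exampleWorkflow);
console.log(staleFindings);
```

Only the step with the static key is reported. A real rule would likely need to handle more cases than this sketch does, such as keys built from environment variables.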
The first thing I did after publishing 0.4.0 was run it against the six repositories I maintain. Two real things came out of that, both in the next 30 minutes.
Finding 1: my own matrix had the smell I just shipped a rule for
Each of the five tool repos (depmedic,
ci-doctor, cursor-rules-init,
gha-budget, pin-actions) has the same test
workflow: matrix Node 18, 20, 22 on Ubuntu. None of them set
fail-fast: false.
This was not a bug. It was the default. But it is exactly the cost smell I had just shipped a rule for, and shipping a CLI that scolds other people about something my own repos do is the kind of thing that costs you trust the first time someone notices.
Five-line patch on each repo:
 jobs:
   test:
     strategy:
+      fail-fast: false
       matrix:
         node: [18, 20, 22]
After: npx ci-doctor projects/depmedic -> No findings.
All five tool repos. Verified.
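For readers curious how a rule like fail-fast-true might decide what to flag, the following is a rough sketch under stated assumptions. The function name findImplicitFailFast and the finding shape are hypothetical, not ci-doctor's actual code, and the workflow is assumed to be parsed into a plain object already. It relies on the fact that GitHub Actions treats strategy.fail-fast as true when the key is absent.

```javascript
// Hypothetical sketch, not ci-doctor's real code: flag matrix jobs that do
// not explicitly set fail-fast: false. GitHub Actions defaults
// strategy.fail-fast to true when the key is absent.
function findImplicitFailFast(workflow) {
  const findings = [];
  for (const [jobName, job] of Object.entries(workflow.jobs || {})) {
    const strategy = job.strategy;
    if (!strategy || !strategy.matrix) continue; // not a matrix job
    if (strategy['fail-fast'] !== false) {
      findings.push({ rule: 'fail-fast-true', job: jobName });
    }
  }
  return findings;
}

// The test workflow before and after the patch, as plain objects.
const beforePatch = {
  jobs: { test: { strategy: { matrix: { node: [18, 20, 22] } } } },
};
const afterPatch = {
  jobs: { test: { strategy: { 'fail-fast': false, matrix: { node: [18, 20, 22] } } } },
};

const findingsBefore = findImplicitFailFast(beforePatch);
const findingsAfter = findImplicitFailFast(afterPatch);
console.log(findingsBefore, findingsAfter);
```

The unpatched workflow yields one finding for the test job and the patched one yields none, which matches the before and after described above.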
Finding 2: the CLI itself crashed on a single-file argument
To dogfood, my first instinct was:
node bin/ci-doctor.js projects/depmedic/.github/workflows/test.yml
And it crashed:
Error: ENOTDIR: not a directory, scandir
'.../projects/depmedic/.github/workflows/test.yml'
Two things wrong here. First, the obvious one: a CLI with the word
"doctor" in its name should not give you a stack trace because you
passed it a single file. Second, the documentation in the README and
the blog posts shows examples like npx ci-doctor and
npx ci-doctor --fix, which both default to scanning a
directory. Nobody had ever exercised the single-file path because the
CLI didn't advertise it.
The fix was a four-line stat-and-branch in bin/ci-doctor.js:
// Default to the current directory when no path is given.
const target = args.positional[0] || process.cwd();
const stat = fs.existsSync(target) ? fs.statSync(target) : null;
if (stat && stat.isFile()) {
  // Single workflow file: read it and audit it directly.
  const src = fs.readFileSync(target, 'utf8');
  findings = auditWorkflow(src, ...);
} else {
  findings = auditDirectory(target, args);
}
Shipped as ci-doctor 0.4.1.
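The stat-and-branch idea can be tried in isolation. The sketch below is a standalone illustration, not the code in bin/ci-doctor.js: classifyTarget is a hypothetical helper, and reporting a missing path as its own case is my assumption about useful behavior rather than something the shipped fix does.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: classify a CLI target before deciding how to audit it.
function classifyTarget(target) {
  // throwIfNoEntry: false makes statSync return undefined for a missing path
  // instead of throwing ENOENT.
  const stat = fs.statSync(target, { throwIfNoEntry: false });
  if (!stat) return 'missing';
  if (stat.isFile()) return 'file';
  if (stat.isDirectory()) return 'directory';
  return 'other';
}

// Exercise all three paths against a throwaway temp directory.
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'ci-doctor-sketch-'));
const tmpFile = path.join(tmpDir, 'test.yml');
fs.writeFileSync(tmpFile, 'name: test\n');

const kinds = [
  classifyTarget(tmpFile),
  classifyTarget(tmpDir),
  classifyTarget(path.join(tmpDir, 'nope.yml')),
];
console.log(kinds);

fs.rmSync(tmpDir, { recursive: true, force: true });
```

Classifying first and dispatching second keeps the file, directory, and missing-path cases testable without running a full audit.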
What this is and isn't
This isn't an exciting bug. It's a small one. The reason I'm writing
about it is that the entire conceit of ci-doctor is
"small CLIs that catch the kind of thing nobody schedules time to
notice." If I don't run my own tools against my own work, the rules I
ship will drift away from reality. So this is going to be a habit:
every time a new rule lands, dogfood it the same day, fix anything it
surfaces, write down what happened.
The new aggregate dataset (across 20 popular OSS projects) shows
the three new rules find 42 additional smells beyond what 0.3.x
caught: 19 from stale-cache-key, 13 from fail-fast-true, and the
remaining 10 from always-run-on-pr. The
benchmarks page is updated.
Run it yourself
npx ci-doctor # audit a directory
npx ci-doctor path/to/file.yml # audit a single file (now works)
npx ci-doctor --fix # auto-apply the four safe fixes
Or paste a workflow at depmedicdev-byte.github.io/audit.html for the same audit in your browser, no install.
Want the full pattern set, not just what the CLI lints?
The Cut Your CI Bill cookbook is 30 paste-ready
patterns plus 5 hardened workflow templates. Pairs with
ci-doctor directly.
$19, one-time, MIT-licensed templates.