Last post I showed off the chainsaw I built myself — my build-autonomous loop, the padded room, the whole rig — and cut down a little tree with it: a Go CLI to boss around a smart plug. It was well worth building! I also think you should probably stop building your own. Yes, including me. Especially me.
You forge one by hand to understand it. You don't forge every one you'll ever use.
Here’s the thesis, and it’s a distinction I think a lot of us are getting wrong right now: building your own harness is how you learn. Adopting someone else’s is how you scale. Those are two different jobs. I conflated them, rode my own tool a few steps too far, and had to catch myself doing something I’d nag a junior engineer about — reinventing infrastructure out of pride.
Let me walk through how I got here, because the getting-here is the interesting part.
You Can’t Learn Without Doing
I stand by every word of the last post. Building build-autonomous myself was 100% the right move — for what it was for.
You cannot learn to swim by reading a book about swimming. I said that last time and I’ll say it forever, because it’s the truest thing I know about this whole field. I did not understand agentic loops — really understand them, in my hands, in my gut — until I built one. Wiring up the vision gate, the requirements gate, the test-first corridor, the three-way review — that building is where the knowledge lives. Reading about someone else’s loop would have taught me the vocabulary and none of the swimming.
So yeah. I built my own. On purpose. And I got exactly what I went in for: I now actually grok how these loops work, where they’re fragile, what makes a gate a gate. That knowledge is mine now and nobody can take it back.
The only way out is through. Get in the water. (Yes, this sticker again. It earns its spot.)
That’s the sticker from last time, and I’m putting it right back here because it’s the entire point of this section: get in the water. Build the thing. Flail around. Swallow some water.
But here’s what nobody puts on the sticker: once you know how to swim, you don’t have to keep building your own pool.
The Two Scaling Problems
Learning was job one, and it’s done. Now I’ve got a different problem. Two of them, actually.
Scaling across people. In my day job I lead a software org, and I want this way of working spread across the whole team, not as a party trick I do, but as the way we work. Here’s the inconvenient truth about that: we are not homogeneous on Claude Code. We’ve got a few different coding agents in play across the org. A harness that only I understand, that only lives in my ~/.claude, that only I can fix when it breaks: that scales to exactly one human. Me. That’s not leadership, that’s a bus factor of one wearing a cape.
Scaling myself. Even setting the org aside — do I want to be the guy maintaining a complex homegrown harness forever? Every skill, every gate, every edge case, on me. Every time Claude Code ships a change, on me. That’s a second job I didn’t ask for. The whole promise of this loop was to get toil off my plate, and somewhere in there I’d quietly signed up to babysit the toil-remover. Geesh.
Both problems have the same shape. The thing I built to learn is now a thing I have to maintain, and maintenance doesn’t scale — not across people, not across time, not across my own attention.
My Superpowers Mistake (Spoiler: It Was Ego)
Now the embarrassing part, because I don’t sell silver bullets and I don’t hide my own dumb moves.
I’d looked at Superpowers — Jesse Vincent’s open-source skills framework — early on. And I bounced right off it. “Too complicated,” I said. “Over-engineered.” So I did the thing that felt smart and was actually just ignorance in a trench coat: I tried to cherry-pick bits out of it. Grab a skill here, a pattern there, wire them into my own thing by hand.
Think about how dumb that is for a second. Superpowers has an atomic install and an atomic uninstall. One command in, one command out, no residue. And I was over here hand-copying pieces of it like I was salvaging parts off a car in a junkyard. Cherry-picking when there’s a clean install/remove mechanism isn’t clever. It’s a confession that you don’t yet understand how the thing works. I was manually extracting value from a system precisely because I hadn’t sat with the system long enough to see that it was a system. So I rejected it. From a position of not-getting-it.
Ugh. But that’s the honest timeline.
What Changed My Mind
Then I read Jesse’s “Superpowers 6” post. And the part that flipped me wasn’t the headline stuff — not the “uses fewer tokens” angle everyone latches onto. That’s nice, sure. What stopped me cold was how many environments it already supports.
That’s the gold. That’s the whole game.
Because breadth of support is a proxy for everything that actually matters when you’re trying to scale instead of learn:
- Adoption will be higher. People are already running it, in the tools they already have. I’m not asking my org to adopt Greg’s Weird Thing. I’m asking them to adopt the thing that already has a huge following across numerous platforms.
- It’ll get better faster. More users, more eyes, more bug reports, more contributors. My harness improves at exactly the rate I have free evenings. Theirs improves at the rate of a whole community — and Jesse is hiring staff to work on it. I cannot out-ship that. Nobody with a day job can.
- It’ll be more widely understood. When something is broadly adopted, the knowledge about it is broadly held: blog posts, coworkers who’ve already hit the sharp edge you’re about to hit, and more docs for AI agents to consume! My homegrown loop has exactly one expert on Earth, and he’s tired.
And that forced me to say the quiet part out loud: it is hubris to think I can build a better harness than a group of smart, dedicated people whose actual job is building it. I built mine to learn, and it was great for that. But insisting on running my own for real work — for anything past the learning — isn’t engineering judgment. It’s ego. And ego is not a winning strategy (unless you start rich enough, but that’s another story).
This is the “both can be true” thing I keep coming back to. Building it myself was right. Keeping it as my daily driver would be wrong. Both true. The trigger that flips one into the other is the word scale.
One person cannot run the whole boat. On the Russell we stood watches as a crew — and the crew is the point.
The Non-Negotiable Step: Frisk It First
Now — before you think I just rolled over — none of this means “trust it blind.” The exact opposite.
I wrote a whole section last time about how you frisk the chainsaw before you trust it: read the manifest, grep for the dangerous verbs, inventory every executable, read the auto-firing skills, check provenance, and run it in the jail first. All of that applies double to something I’m about to recommend across an entire org. A skill package is executable instructions running with my permissions on my machine. Adopting the popular one doesn’t change the threat model — it just means more people are pointing the same chainsaw around, which is a reason for more scrutiny, not less.
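Two of those frisk steps, "grep for the dangerous verbs" and "inventory every executable," are mechanical enough to sketch in Go. This is a rough illustration, not the procedure from the last post: the verb list, the `Finding` type, and the `scanText` and `frisk` names are all made up for the example, and a substring match will happily produce false positives. It tells you where to start reading, not whether the package is safe.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// dangerousVerbs is an illustrative list only; tune it for your own threat model.
var dangerousVerbs = []string{"curl ", "wget ", "sudo ", "rm -rf", "eval ", "base64 -d"}

// Finding records one line that mentions a dangerous verb.
type Finding struct {
	Path string
	Line int // 1-based line number
	Verb string
}

// scanText reports every (line, verb) pair where the line contains the verb.
func scanText(path, text string) []Finding {
	var out []Finding
	for i, line := range strings.Split(text, "\n") {
		for _, verb := range dangerousVerbs {
			if strings.Contains(line, verb) {
				out = append(out, Finding{Path: path, Line: i + 1, Verb: verb})
			}
		}
	}
	return out
}

// frisk walks root, lists files with any execute bit set, and scans every
// regular file for the verbs above. It skips the .git directory.
func frisk(root string) (findings []Finding, executables []string, err error) {
	err = filepath.WalkDir(root, func(p string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if d.IsDir() {
			if d.Name() == ".git" {
				return filepath.SkipDir
			}
			return nil
		}
		if !d.Type().IsRegular() {
			return nil // skip symlinks, sockets, devices
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		if info.Mode().Perm()&0o111 != 0 {
			executables = append(executables, p)
		}
		data, err := os.ReadFile(p)
		if err != nil {
			return err
		}
		findings = append(findings, scanText(p, string(data))...)
		return nil
	})
	return findings, executables, err
}

func main() {
	root := "."
	if len(os.Args) > 1 {
		root = os.Args[1]
	}
	findings, executables, err := frisk(root)
	if err != nil {
		fmt.Fprintln(os.Stderr, "frisk:", err)
		os.Exit(1)
	}
	for _, e := range executables {
		fmt.Println("executable:", e)
	}
	for _, f := range findings {
		fmt.Printf("%s:%d: contains %q\n", f.Path, f.Line, f.Verb)
	}
}
```

Point it at the unpacked skill package directory and read every hit by hand. The scan is the cheap first pass; the manifest, the provenance check, and the run in the jail are still on you.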
But here’s the thing that makes wide adoption possible: because Superpowers passes that frisk — because I can read it, test it, and inspect it — it becomes something I can actually advocate for. I can’t in good conscience tell my org “everybody run my personal thing that only I’ve reviewed.” I can say “here’s a widely-used, open, inspectable framework; here’s my review of it; here’s it passing the same tests I hold my own code to.” That’s a defensible recommendation. That’s how you move an org that isn’t even all on the same agent.
Inspection isn’t the tax you pay to adopt. Inspection is the thing that lets you adopt, out loud, with your name on the recommendation.
Apples to Apples: I Ran My Own Test Against It
Talk is cheap. So I did the obvious experiment: I ran Superpowers through the exact same test I used in the last post.
Same tree. The Shelly Plug US Gen4 controller CLI. I took the VISION.md from the original plugctl project — literally copied the file — dropped it into a fresh, clean project called plugctl2, and turned Superpowers loose on it. Same starting intent, different chainsaw. The most honest comparison I could set up.
And I’ll tell you what — it felt familiar in the best way.
- It asked good questions. One at a time, drawing intent out of me before writing a line of code. Exactly the discipline I’d hand-built into my own vision gate.
- It did the design, then asked me to review it. A human gate on the architecture before implementation. Again — that’s precisely how mine works. Two independently-built tools landing on the same discipline is a pretty strong signal that the discipline is right, not just mine.
- It offered me a choice: one-shot the whole build with sub-agents, or work sequentially in phases gated on my approval. Since I was doing apples-to-apples against my own loop, I picked one-shot with sub-agents. Let it rip, no phase gates.
And damned if it didn’t just… do it. Perfectly. Well. Kind of. The result is public: github.com/emergingrobotics/plugctl2.
Same tree, felled clean, by a chainsaw I didn’t have to build or maintain.
plugctl vs plugctl2: The Head-to-Head
Naturally I couldn’t leave it there. I had two implementations of the same spec — one from my homegrown loop, one from Superpowers — so I did what any recovering not-invented-here sufferer does: I put them side by side and had an agent do a cold, detailed comparison. Which one is subjectively better, and what are the concrete differences?
And here’s the twist I did not see coming, and I’m going to give it to you straight because I don’t polish over facts on this blog: my hand-built harness produced the better tool. Not the shiny new community one. Mine.
Let me back that up, because a claim like that is worthless without evidence.
First, the fair part: both work. Both compile, both pass go vet, both are gofmt-clean, both pass 100% of their own tests. Neither is broken. plugctl2 is a clean, correct, dependency-free Go program that does exactly what a smart-plug CLI should do. As a one-shot — design and build the whole thing in a single autonomous pass — it’s genuinely impressive.
But “works” and “complete” are different words. Here’s the shape of it:
| | plugctl (mine) | plugctl2 (Superpowers) |
|---|---|---|
| Total Go LOC | ~1,000 | ~700 |
| --timeout configurable | yes (+ PLUG_TIMEOUT) | no, hardcoded 5s |
| Switch/channel select (--id) | yes | no, hardcoded to 0 |
| status output | state + V/A/W + energy + temp + --json | V/A/W only |
| Real --help / -h | yes → stdout, exit 0 | no: -h is an unknown command, exits 2 |
| Response size cap (CWE-400) | yes (1 MiB LimitReader) | no, unbounded read |
| Host validation/sanitization | yes (net/url, rejects bad schemes) | no, string-formatted URL |
| go install-able module path | yes | no, module is bare plugctl |
| Verbose trace (-v) | no | yes (nice one!) |
Where mine pulls ahead is completeness and hardening: a configurable timeout (which, notably, was in the very VISION.md I copied over — so plugctl2 actually under-delivered on the spec there), channel selection, a richer status that reports relay state, energy, and temperature with a --json mode, a response-size cap so a broken or hostile device can’t blow up memory on a Raspberry Pi, and host validation so you can’t smuggle junk into the request line. It reads like operator tooling I’d actually ship. There’s even a protocol difference under the hood — mine POSTs the canonical JSON-RPC envelope to /rpc, plugctl2 uses the GET-with-query-params channel and leans on HTTP status codes for errors. Both are legal Shelly Gen2+. Mine handles the nastier “2xx response that’s secretly an error” case.
And two of plugctl2’s gaps are just plain bugs for a scriptable CLI: no --help (for a terminal tool!) and a module path that breaks go install.
Now — credit where due, because plugctl2 genuinely does some things better than mine: a cleaner idiomatic cmd/plugctl/ layout, a really nicely done -v verbose trace that dumps the resolved host, timeout, and exact RPC URL to stderr for debugging, higher test coverage on the client package (97% vs my 82%), a tidier conventional-commit history with the test-first discipline more visible in the log, and a thoughtful “manually verify the relay actually clicked” checklist in its README that mine doesn’t have. That’s not nothing. That’s taste I’d happily steal.
So the subjective verdict, honestly: on this one bench test, mine is the better tool. More complete, more defensive, more spec-ambitious.
So Why Am I Still Switching?
Because — and this is the whole post in one realization — the head-to-head measures the wrong thing.
Of course my harness won. I built it. It’s tuned to my taste, loaded with my defensive habits, aimed at exactly the kind of tool I like to ship. It’s a tailored suit. It fits me perfectly because it was cut for one body. That’s not an argument for scaling it across an org — it’s the exact reason it can’t scale. The thing that makes it win a solo bench test is the thing that makes it mine-and-only-mine.
And notice how it won: not on architecture, not on discipline — both harnesses asked good questions, both did design-then-review, both went test-first. They landed on the same process. Mine won on a handful of features and hardening details. That’s a gap you close with a config tweak and a couple of prompts. It is not a moat.
Meanwhile, plugctl2’s misses — the missing --help, the module path — are precisely the kind of thing a funded team with thousands of users fixes in a week, in a release I don’t have to author, test, or maintain. My tool’s edge is frozen at whatever I last had the evening to build. Theirs compounds. Give it two months and I’d bet the gap inverts, and I won’t have lifted a finger.
That’s the trade. A slightly-more-complete tool today that only I can maintain and only I understand — versus a slightly-less-complete tool today that a whole community is driving forward and half my org could already run. For learning, I’d pick mine every time. For scaling, that’s not even a close call.
Conclusion
Here’s the whole thing in one breath: you build your own to learn, and you adopt the shared one to scale. I needed to build build-autonomous to actually understand agentic loops — no shortcut exists for that, the swimming is the point. But the moment the goal changed from learning to scaling — across my org, across time, across my own finite attention — insisting on my homemade rig stopped being engineering and started being ego.
Superpowers wins the scaling job for reasons that have nothing to do with me being a worse engineer and everything to do with math: broader adoption, faster improvement, wider understanding, and a funded team whose whole job is making it better. I can’t out-ship a community. Neither can you. And that’s good news — it means the toil is genuinely somebody else’s problem now.
And yes — my homemade saw actually cut a slightly cleaner tree this round. I’m switching anyway, and if that sounds irrational, re-read the head-to-head: my edge is frozen and lonely, theirs compounds with a whole community behind it. I’m keeping my own loop as a teaching tool, a place to keep learning by building. But my daily driver — and what I’ll advocate across my org — is the thing I can inspect, that passed my frisk, that a hundred other people are making better while I sleep.
Build to learn. Adopt to scale. Know which job you’re on.
And if you’re still hand-copying pieces out of a tool that has a one-command install because it “feels more in control” — friend, I’ve been there, it’s ego wearing an engineer’s hat, and there’s an atomic uninstall waiting for you when you’re ready.
If this helped, drop me a note on LinkedIn. And remember: pay it forward!