I tracked Birdie's training in a Notes app for six weeks. Command-level entries, dates, whether she got it or didn't. By week three it was a mess.
The real problem: I had no baseline. I couldn't tell whether "sit" had actually improved or whether I just felt better about it because we'd been practicing it longer.
What I actually needed was a success rate per command over time. Not a journal. Not a log. A number I could look at and say: this is better, or it isn't.
I ended up building PawFormance after that experience. It tracks success rate per command across sessions. Birdie's "sit" went from 58% to 94% over six weeks. That's the number that told me the program was working.
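For anyone curious what that number looks like mechanically, here's a minimal sketch in Python. This is not PawFormance's actual code: the entry format, the sample data, and the `success_rates` name are all made up for illustration. The idea is just successes divided by attempts, grouped by command and week.

```python
from collections import defaultdict

# Hypothetical log entries: (week, command, succeeded).
# Sample data is invented for this example, not real training results.
attempts = [
    (1, "sit", True), (1, "sit", False), (1, "sit", False), (1, "sit", True),
    (6, "sit", True), (6, "sit", True), (6, "sit", True), (6, "sit", False),
    (1, "stay", False), (1, "stay", True),
]


def success_rates(attempts):
    """Return {(command, week): fraction of attempts that succeeded}."""
    counts = defaultdict(lambda: [0, 0])  # [successes, total attempts]
    for week, command, succeeded in attempts:
        tally = counts[(command, week)]
        tally[0] += int(succeeded)
        tally[1] += 1
    return {key: successes / total for key, (successes, total) in counts.items()}


rates = success_rates(attempts)
for (command, week), rate in sorted(rates.items()):
    print(f"{command} week {week}: {rate:.0%}")
```

Comparing the same command across two weeks gives the "better or not" answer without reading a single note.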
The insight that surprised me: I thought I needed more notes. I needed fewer notes and one specific metric.
Anyone else run into this with any kind of habit or performance tracking, where the log becomes noise instead of signal?
One thing I'd be careful with:
The interesting question may not be whether success rate is a better metric than notes.
It may be what decision that metric is actually being used to support.
Those sound similar, but they can lead to very different product decisions as more users arrive.
I wouldn't commit to either framing casually this early.