
We weren’t able to agree on a good way to measure this. Curious - what’s your opinion on code churn as a metric? If code simply persists over some number of months, is that an indication it’s good quality code?


I've seen code persist a long time because it is unmaintainable gloop that takes forever to understand and nobody is brave enough to rebuild it.

So no, I don't think persistence-through-time is a good metric. Probably better to look at cyclomatic complexity, and maybe, for a given code path or module or class hierarchy, how many calls it makes within itself vs. to things outside it - some measure of how many files you need to jump between to understand it.
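As a rough illustration of that second idea, something like the sketch below could work for a single-file Python module - it only counts plain-name calls and ignores import resolution entirely, so treat it as a toy, not a real metric:

    # Toy "locality" metric: fraction of calls that target names defined
    # in the same module. Attribute calls (obj.method()) are skipped.
    import ast

    def locality_ratio(source: str) -> float:
        tree = ast.parse(source)
        defined = {node.name for node in ast.walk(tree)
                   if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))}
        internal = external = 0
        for node in ast.walk(tree):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                if node.func.id in defined:
                    internal += 1
                else:
                    external += 1
        total = internal + external
        return internal / total if total else 1.0

A low ratio would suggest the code leans heavily on things defined elsewhere, i.e. more files to jump between.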


I second the point about persistence. Some of the most persistent code we own is untested and poorly written, but managed to become critical infrastructure early on. Most new tests are best-effort black box tests and guesswork, since the original authors left long ago.

Of course, feeding the code to an LLM makes it really go to town. And break every test in the process. Then you start babying it to do smaller and smaller changes, but at that point it’s faster to just do it manually.


You run a company that does AI code review, and you've never devised any metrics to assess the quality of code?


We have ways to approximate our impact on code quality, because we track:

- Change in number of revisions made between open and merge before vs. after greptile

- Percentage of greptile's PR comments that cause the developer to change the flagged lines

Assuming the author will only change their PR for the better, this tells us if we're impacting quality.

We haven't yet found a way to measure absolute quality, beyond that.
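For what it's worth, both proxies are easy to compute if you log PR data. A hypothetical sketch, with made-up field names just for illustration:

    from statistics import mean

    def revisions_delta(prs_before, prs_after):
        # Average revisions pushed between open and merge, before vs. after
        # the review bot was enabled (negative delta = fewer revisions).
        return (mean(pr["revisions"] for pr in prs_after)
                - mean(pr["revisions"] for pr in prs_before))

    def comment_action_rate(prs):
        # Share of bot comments whose flagged lines the author then changed.
        comments = [c for pr in prs for c in pr["bot_comments"]]
        acted_on = sum(1 for c in comments if c["flagged_lines_changed"])
        return acted_on / len(comments) if comments else 0.0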


Might be harder to track, but what about CFR (change failure rate) or some other metric to measure how many bugs get through review before versus after the introduction of your product?

You might respond that ultimately, developers need to stay in charge of the review process, but tracking that kind of thing reflects how the product is actually getting used. If you can prove it helps teams ship features faster, as opposed to just letting more LOC get past review (these are not the same thing!), then your product has much stronger demonstrable value.
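If "failure" is defined as a deploy that needed a hotfix or rollback (the usual DORA-style reading), the comparison is just two ratios over two time windows. A minimal sketch, assuming you already have a deployment log with that flag:

    def change_failure_rate(deploys):
        # Fraction of deployments that caused an incident, hotfix or rollback.
        if not deploys:
            return 0.0
        return sum(1 for d in deploys if d["caused_incident"]) / len(deploys)

    # cfr_before = change_failure_rate(deploys_before_rollout)
    # cfr_after  = change_failure_rate(deploys_after_rollout)

The hard part is attribution, not arithmetic - lots of other things change between the two windows.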


I've seen code entropy suggested as a heuristic to measure.
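"Code entropy" gets defined a few different ways; one simple reading is Shannon entropy over the token distribution of a file. A sketch under that assumption:

    import io, math, tokenize
    from collections import Counter

    def token_entropy(source: str) -> float:
        # Shannon entropy (bits) of the token distribution in a Python file.
        tokens = [tok.string for tok in
                  tokenize.generate_tokens(io.StringIO(source).readline)
                  if tok.string.strip()]
        counts = Counter(tokens)
        total = len(tokens)
        return -sum((n / total) * math.log2(n / total) for n in counts.values())

Other definitions look at the entropy of how changes are spread across files over time, which arguably says more about maintainability.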




