Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The last point I think is most important: "very subtle and silently introduced mistakes" -- LLMs may be able to complete many tasks as well (or better) than humans, but that doesn't mean they complete them the same way, and that's critically important when considering failure modes.

In particular, code review is one layer of the conventional swiss cheese model of preventing bugs, but code review becomes much less effective when suddenly the categories of errors to look out for change.

When I review a PR with large code moves, it was historically relatively safe to assume that a block of code was moved as-is (sadly only an assumption because GitHub still doesn't have indicators of duplicated/moved code like Phabricator had 10 years ago...), so I can focus my attention on higher level concerns, like does the new API design make sense? But if an LLM did the refactor, I need to scrutinize every character that was touched in the block of code that was "moved" because, as the parent commenter points out, that "moved" code may have actually been ingested, summarized, then rewritten from scratch based on that summary.

For this reason, I'm a big advocate of an "AI use" section in PR description templates; not because I care whether you used AI or not, but because some hints about where or how you used it will help me focus my efforts when reviewing your change, and tune the categories of errors I look out for.



I think we need better code review tools in the age of LLMs - not just sticking another LLM to do a code review on top of the PR

Needs to clearly handle the large diffs they produce - anyone have any ideas


I was about to write my own tool for this but then I discovered:

   git diff --color-moved=dimmed-zebra
That shows a lot of code that was properly moved/copied in gray (even if it's an insertion). So gray stuff exactly matches something that was there before. Can also be enabled by default in the git config.


I would love if GitHub implemented this in their UI! There’s and issue: https://github.com/orgs/community/discussions/9632


I used autochrome[0] for Clojure code to do this. (I also made some improvements to show added/removed comments, top-level form moves, and within-string/within-comment edits the way GitHub does.)

At first I didn't like the color scheme and replaced it with something prettier, but then I discovered it's actually nice to have it kinda ugly, makes it easier to detect the diffs.

[0] https://fazzone.github.io/autochrome.html


That's a great solution and I'm adding it to my fallback. But also, people might be interested in diff-so-fancy[0]. I also like using batcat as a pager.

[0] https://github.com/so-fancy/diff-so-fancy


Perfect. This is why I visit this website


Thanks:)


I personally agree with you. I think that stacked diffs will be more important as a way of dealing with those larger diffs.


Yep, this pattern of LLMs reviewing LLMs is terrifying to me. It's literally the inmates running the asylum.


When using a reasonably smart llm, code moves are usually fine, but you have to pay attention whenever uncommon words (like urls or numbers) are involved.

It kind of forces you to always put such data in external files, which is better for code organization anyway.

If it's not necessary for understanding the code, I'll usually even leave this data out entirely when passing the code over.

In Python code I often see Gemini add a second h to a random header file extension. It always feels like the llm is making sure that I'm still paying attention.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: