Hacker News

This is why I like to use LLMs to write the code to manipulate my data, draw my graphs, etc. That way I end up with the data/graph/etc artifact, but I also have the code that created it tucked away safely in my repo. So I can tweak and improve that code over time, either on my own or with the help of AI.

Here's an example where I recently used aider and GPT-4o to plot a graph.

https://aider.chat/2024/05/13/models-over-time.html

The graph itself is kind of interesting. It shows how LLM code editing skill has been changing over time as new models have been released by OpenAI, Anthropic and others.
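The workflow described above (LLM writes the plotting code, the code lives in the repo so it can be tweaked later) might look something like this minimal sketch. The model names, dates, and scores here are purely illustrative placeholders, not the real benchmark numbers from the linked post:

```python
# Hypothetical sketch of the kind of script you'd keep in the repo:
# collect (model, release_date, score) records, order them by date,
# and plot them as a timeline.
from datetime import date

# Illustrative numbers only -- not the actual benchmark results.
RESULTS = [
    ("gpt-4o", date(2024, 5, 13), 72.0),
    ("gpt-3.5-turbo", date(2023, 3, 1), 58.0),
    ("claude-3-opus", date(2024, 2, 29), 68.0),
]

def timeline(results):
    """Return (dates, scores, labels) sorted by release date."""
    ordered = sorted(results, key=lambda r: r[1])
    labels = [r[0] for r in ordered]
    dates = [r[1] for r in ordered]
    scores = [r[2] for r in ordered]
    return dates, scores, labels

if __name__ == "__main__":
    dates, scores, labels = timeline(RESULTS)
    # Plotting is optional so the data prep stays usable without matplotlib.
    try:
        import matplotlib
        matplotlib.use("Agg")  # headless backend: write a file, no display
        import matplotlib.pyplot as plt
        plt.plot(dates, scores, marker="o")
        for d, s, name in zip(dates, scores, labels):
            plt.annotate(name, (d, s))
        plt.ylabel("code editing benchmark score (%)")
        plt.savefig("models-over-time.png")
    except ImportError:
        pass
```

Because the script is checked in next to the data, regenerating or restyling the graph later is just an edit and a rerun, with or without AI help.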



> This is why I like to use LLMs to write the code to manipulate my data

One of the nice things about human programmers is that you can derive intent and assign responsibility. Sometimes we have to encode that in a commit message or a ticket, but it's there. When you discover that the program has been subtly piping all of your data into /dev/null, you at least have a human you can ask how you got here. The xz debacle is another example of this.

What are you supposed to do when that's a thousand layers deep? And what happens when your next generation of programmers has had its ability to do these things stunted by this approach?


I'm explicitly working on this in my startup's product (a GenAI for code product).

The obvious answers: record the human's intent in the form of their prompt, and record the LLM's raw output (if you use a conversational LLM out of the box, it almost always includes this, even if you explicitly prompt it not to, lol). Of course, depending on your UX this may or may not work; for autocomplete there is no obvious user intent.

There are additional approaches which I'm exploring that require more intentional engineering, but essentially involve forcing "structure" so that more intent gets explicitly specified.
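The "obvious answers" above could be sketched as a provenance record stored alongside the generated file: the prompt (the human's stated intent) and the model's raw response, tied to a hash of what actually landed in the repo. The field names and file layout here are invented for illustration, not any real tool's format:

```python
# Minimal sketch: write <file>.prov.json next to a generated file,
# capturing the prompt, the model, and the model's unedited output.
import hashlib
import json
from pathlib import Path

def record_provenance(target: Path, prompt: str, model: str, raw_output: str) -> Path:
    """Write a provenance sidecar for `target` and return its path."""
    record = {
        "file": target.name,
        # Hash of the committed file, so later edits are detectable.
        "sha256": hashlib.sha256(target.read_bytes()).hexdigest(),
        "model": model,
        "prompt": prompt,          # the human's stated intent
        "raw_output": raw_output,  # the model's unedited response
    }
    prov = target.parent / (target.name + ".prov.json")
    prov.write_text(json.dumps(record, indent=2))
    return prov
```

Checking the sidecar in with the code gives a future reader at least *some* trail back to intent, though, as the reply below argues, a prompt is not the same thing as responsibility for the code.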


To be clear, you're not working on what I pointed out; you're just doing the same thing. The prompt "may" encode intent, but it has no bearing on what code actually gets written, stored, or changed.

Think about this as well: you're creating processes that carry no responsibility. I am 100% responsible for all the code I write, even if I write it wrong, but if the code your tool generates is wrong, it's not my fault; I didn't write it. Multiply that by the hundreds of thousands of times this will happen in a given year, and by every employee.

Frankly, you should re-evaluate if you even want your product in the world. What kind of future hellscape are you enabling?



