
It's got our number... https://translate.kagi.com/?from=en&to=Hacker+News+speak&tex....

"I like burgers." in Hacker News speak:

> I'm curious if anyone has looked into the scalability of burgers. Honestly, I've been DIYing my own patties since I realized the local joints aren't really optimized for flavor-to-cost ratio. Does this even have an API?


There's something viscerally distasteful about a one-liner comment berating the author of a long thoughtful comment for exerting too little effort.


Why would oral verification be needed? Hand-written answers on paper in a proctored classroom should still work fine. That was the way most verification worked when I was in school, and it's still the most common verification method around me.

Homework assignments are harder, but those were always a bit difficult for teachers. It's not like cheating was invented by Gen Z...


Gen Z definitely didn’t invent cheating, but LLMs brought a qualitative difference and a new scale. That changes the properties of the system.

During my university years, most courses had a good mixture of take-home assignments/projects and in-class exams. Yes, people could always cheat, either through plagiarism (usually easily caught) or, at the extreme, by getting someone else to do the work (which I have never personally seen).

Anecdotal data around me shows:

* outright paper/assignment generation via LLM

* using ChatGPT as a “professor”, proofreading and polishing coursework before submission (arguably a good use, but it depends on the personal effort)

* avoiding reading by asking ChatGPT for summaries

* using ChatGPT to help explain various concepts (this is a good example of using LLMs as a source for learning…accepting that occasionally they can lie)

In a small classroom where a good teacher-student interaction happens, I guess it’s easier to catch people cheating. But some universities (maybe most) have massive classes where a professor may never have an actual conversation with some students. That context makes cheating harder to detect.

I accept my outlook on this may be a bit bleaker (hopefully), but saying it’s business as usual is at the other extreme.


My college classes usually had one offline written test per quarter, and about half the classes had an assignment with them. I can see how those would be easier to cheat on now, though they were already hardly cheat-free. (Not just plagiarism, also free-riding on group assignments for example.) The written examinations carried the heaviest load precisely because of that.

Offline written tests solve the issue quite well. They scale well too, at least as well as assignments do.

People saying that oral examinations are the last bastion of cheat-free examinations are really overstating the case.

> But some universities (maybe most) have massive classes where a professor may never have an actual conversation with some students.

Probably most, yeah. At least that was my experience.


I would say that you also don't know the false positive rate. The only person who truly knows is the one who wrote/generated the text. And they have every incentive to say it's not AI-generated, whether or not it truly is.

Personally, when I see the number of accusations thrown around, I very much suspect that the false positive rate is pretty high.


I think that's Gemini trying to personalize the answer specifically for you. It really leans heavily into that to the point of being galling.

You can give it additional instructions in the settings, but you have to be careful with that too. I've put my tech stack and code preferences in there to get better code examples. A while later I asked it about binary executable formats and it started ending every answer with "but the JVM and v8 take care of that for you."

Which is both funny in an "I, Robot" kind of way, and irritating. So I told it to ignore my tech stack. I have a master's in CS and can handle a bit of technical detail.

Turns out, Gemini learned sarcasm. Every following answer in that thread got a paragraph that started with something like "But for your master brain, this means..."


The new memory feature in Gemini got turned on by default and every answer came out like this. It kept working in details from one particularly long thread. Everything was framed in terms of the common elements. Everything. I turned it off immediately.


This seems like a huge risk factor for users who are at risk for schizophrenia - if someone is using the LLM as an "AI companion", the model is likely to reinforce, or even suggest, illusory connections between events or experiences the user has described in their conversations.


How can you turn it off without turning off history ("My Activity") altogether?

I noticed the "memory" too and it's turned Gemini into a useless sycophant for me, but so subtly that I almost didn't spot it.


https://gemini.google.com/saved-info

The toggle by "Your past chats with Gemini"


Even Gemini 2.5 was extremely snarky. I basically disable all guardrails via prompts and instructions, and it started getting snippy at me for apparently acting like a know-it-all.


Yes, and it's a detection loop without feedback. You can never verify that a piece of work in the wild is actually AI. The poster is the only one who really knows, and they'll always say it's not.

This is a problem, because you can easily get stuck in a self-reinforcing loop. You feel strengthened in your convictions that you're good at ferreting out LLM-speak because you've found so much of it. And you find so much of it because you feel confident you're good at it. Nobody ever corrects you when you're wrong.

Combine that with general overconfidence and you get threads where every other post with correct grammar gets "called out" as AI generated. It's pretty boring.

There's a similar effect with contentious subjects. You get reams and reams of posts calling the other side out for being part of a Russian/Israeli/Iranian/Chinese troll network. There's no independent falsification or verification for that, so people just get strengthened in their existing beliefs.


>Yes, and it's a detection loop without feedback. You can never verify that a piece of work in the wild is actually AI. The poster is the only one who really knows, and they'll always say it's not.

Yes. People keep saying, in response to points like this, "oh but you/I can tell pretty easily." But it's not the detection, it's the verification! (see what I did there)

Where I'd push back is the idea that the problem is the boring "call out" discourse that follows each accusation. The problem of verifying human provenance is fundamental to the discussion of trust and argumentation, but the simple "the zone is flooded" problem is also an ecological one. There's terrible air/water/soil quality in the metro area I live in; people have to live with it w/o regard to how invested they are in changing it.


Ever since the sloppification of the internet began, I’ve called out hundreds of LLM slop posts. I’ve gotten about 50 responses back from the author, most of them admitting to LLM usage, with only a single one initially vehemently denying it, but then later admitting it.

I cannot know what this says about my false negative rate, but at the very least I am confident in my false positive rate.


At this point it’s pretty easy to detect unaltered LLM output because it is such bad writing. That will change over time with training, I would hope. At some point I imagine it will be hard to tell.

I honestly don’t know what sites like this will do when that happens and the only ways of detecting LLMs are that they are subtly wrong or post too much; we’d be overrun with them.

Not sure if we should be hopeful or fearful that they will improve to be undetectable, but I suspect they will.


> That will change over time with training I would hope.

There's precious little training material left that isn't generated by LLMs themselves.

Consider this to be model collapse (i.e. we might be at the best SOTA possible with the approach we use today - any further training is going to degrade it).


> There's precious little training material left that isn't generated by LLMs themselves.

Percentage-wise this is quite exaggerated.

> Consider this to be model collapse (i.e. we might be at the best SOTA possible with the approach we use today - any further training is going to degrade it).

You consider this above factor to lead to model collapse? You’ve only mentioned one factor here; this isn’t enough. I’m aware of the GIGO factor, yes. Still there are at least ~5 other key factors needed to make a halfway decent scaling prediction.

It is worth mentioning one outside view here: any one human technology tends to advance as long as there are incentives and/or enthusiasts that push it. I don’t usually bet against motivated humans eventually getting somewhere, provided they aren’t trying to exceed the actual laws of physics. There are bets I find interesting: future scenarios, rates of change, technological interactions, and new discoveries.

Here are two predictions I have high uncertainty about. First, the transformer as an architectural construct will NOT be tossed out within the next five years because something better at the same level is found. Second, SoTA AI performance will probably advance through better fine-tuning methods, hybrid architectures, and agent workflows.


> There's precious little training material left that isn't generated by LLMs themselves.

> Percentage-wise this is quite exaggerated.

How exaggerated?

a) The percentage is not static, but continuously increasing.

b) Even if it were static, you only need a few generations for even a small percentage to matter.

> You consider this above factor to lead to model collapse? You’ve only mentioned one factor here; this isn’t enough. I’m aware of the GIGO factor, yes. Still there are at least ~5 other key factors needed to make a halfway decent scaling prediction.

What are those other factors, and why isn't GIGO sufficient for model collapse?


I wouldn't say it's "bad writing", but rather that the sheer volume of it allows the attentive reader to quickly identify the tropes and get bored of them.

Similar to how you can watch one fantastic western/vampire/zombie/disaster/superhero movie and love it, but once Hollywood has decided that this specific style is what brings in the money, they flood the zone with westerns, or superhero movies or whatever, and then the tropes become obvious and you can't stand watching another one.

If (insert your favorite blogger) had secret access to ChatGPT and was the only person in the world with access to it, you would just assume that it's their writing style now, and be ok with it as long as you liked the content.


It is objectively bad writing:

* Overly focussed on style over content

* Melodrama even when discussing the mundane

* Attention-grabbing tricks like binary opposites overused constantly

* Overuse of adjectives and adverbs in particularly inappropriate places

* Lack of coherence if you’re generating large bits of text

* General dull tone and lack of actual content in spite of the tricks above

Re your assertion at the end: sure, if I didn’t know, I’d think it was a particularly stupid, melodramatic human who never got to the point, and I would probably avoid their writing at all costs.


Sites like this will have to start using bot detection. Captchas, Anubis.


> At this point it’s pretty easy to detect unaltered LLM output because it is such bad writing.

And yet people seem to still be terrible at that. Someone uses an em-dash and there's always a moron calling it out as AI.

> I honestly don’t know what sites like this will do when that happens and the only way of detecting LLMs is that they are subtly wrong or post too much, we’d be overrun with them.

My personal take is that it doesn't really matter. Most posts are already knee-jerk reactions with little value. Speaking just to be talking. If LLMs make stupid posts, it'll be basically the same as now: scroll a bit more. And if they chance upon saying something interesting then that's a net gain.


Never seen this in the wild, but that sounds unfortunate about em-dashes.

Personally, I think it will matter deeply if sites like this are overrun by bots. If you believe your description, why are you here?


> Perhaps if Sweden adopted a different policy it would have an even longer life expectancy!

The policy of being between 55 and 69 N? I'm not sure the world is ready for another viking age.

Joking aside, GP's point was that Sweden has long nights and long days. Based on the studies, you'd expect life expectancy to be worse there than in more southern parts, like most of Canada. It isn't.


So because we're used to it? I know perfectly well how those C numbers will feel. Haven't got a clue about the F numbers.

Anyway, I doubt that that analogy goes for noon. I eat lunch by the clock, not when the sun's highest. I expect most people do. Especially the ones that are cooped up in an office during the daytime.


> Then a brick hits you in the face when it dawns on you that all of our tools are dumping crazy amounts of non-relevant context into stdout thereby polluting your context windows.

Not just context windows. Lots of that crap is completely useless for humans too. It's not a rare occurrence for warnings to be hidden in so much irrelevant output that they're there for years before someone notices.


The old unix philosophy of "print nothing on success" looks crazy until you start trying to build pipes and shell scripts that use multiple tools internally. It also very quickly makes it clear why stdout and stderr are separate.
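A minimal sketch of that convention, as a hypothetical filter script (the names and behaviour are made up for illustration): data goes to stdout so the pipe stays clean, diagnostics go to stderr so a human still sees them.

    import sys

    # Hypothetical filter: pass non-empty lines through, report progress separately.
    # `python filter.py < input.txt | wc -l` counts only the data lines,
    # because the progress message goes to the terminal via stderr, not the pipe.
    kept = 0
    for line in sys.stdin:
        if line.strip():
            sys.stdout.write(line)                  # data: stdout, pipeable
            kept += 1
    print(f"kept {kept} lines", file=sys.stderr)    # diagnostics: stderr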


Also becomes rapidly apparent that most modern tooling takes an extremely liberal view of logging levels. The fact that you’ve successfully processed a file is not INFO, that’s DEBUG.


"Finished conversion of xyz.mp3 to xyz.ogg" is valuable progress information to a regular user, not just to developers, so it belongs in INFO, not DEBUG.


I suppose this is subjective, but I disagree. If I want to know the status of each item, I’d pass -v to the command. A simple summary at the end is sufficient; if I pass -q, I expect it to print nothing, only issuing a return code.


> If I want to know the status of each item, I’d pass -v to the command.

I don't disagree. In my opinion, the default log level for CLI applications should be WARN, showing errors and warnings. -q should turn this OFF (alternatively, -q for ERROR, and -qq for OFF), -v means INFO, -vv DEBUG, -vvv TRACE. For servers and daemons, the default should probably be INFO, but that's debatable.
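A rough sketch of one of the schemes described above (default WARN, -q for ERROR, -v for INFO, -vv for DEBUG), using Python's stdlib logging, which has no TRACE level, so -vvv just stays at DEBUG here. The flag names and defaults are only illustrative.

    import argparse
    import logging

    # ERROR <- -q, WARNING <- default, INFO <- -v, DEBUG <- -vv (and beyond)
    LEVELS = [logging.ERROR, logging.WARNING, logging.INFO, logging.DEBUG]

    parser = argparse.ArgumentParser()
    parser.add_argument("-v", "--verbose", action="count", default=0)
    parser.add_argument("-q", "--quiet", action="count", default=0)
    args = parser.parse_args()

    # Start at WARNING (index 1) and move along the list, clamped at both ends.
    level = LEVELS[max(0, min(len(LEVELS) - 1, 1 + args.verbose - args.quiet))]
    logging.basicConfig(level=level)

    logging.info("processed xyz.mp3")        # only visible with -v or more
    logging.warning("xyz.mp3 has no tags")   # visible by default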


It never felt crazy to me, with the exception that there are many situations where having progress and diagnostic information (usually opt-in) makes sense.

I guess it comes down to a choice of printing out only relevant information. I hate noisy crap, like LaTeX.


Yeah. Maybe we only need:

   BATCH=yes    (default is no)

   --batch   (default is --no-batch)
for the unusual case when you do want the `route print` on a BGP router to actually dump 8 gigabytes of text over the next 2 minutes. Maybe it's fine if the default output for anything generously applies summarization, such as "X, Y, Z ...and 9 thousand+ similar entries".

Having two separate command names (one for human/llm, one for batch) sucks.

Having `-h` for human, like ls or df do, sucks slightly less, but it is still a backward-compatibility hack which leads to `alias` proliferation and makes human lives worse.
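A toy sketch of that default summarization idea, with made-up entries, just to illustrate the shape of the behaviour:

    # Hypothetical default: print a few entries, summarize the rest,
    # unless the caller explicitly asks for the full batch output.
    def print_entries(entries, batch=False, head=3):
        shown = entries if batch else entries[:head]
        for entry in shown:
            print(entry)
        if not batch and len(entries) > head:
            print(f"...and {len(entries) - head} similar entries (use --batch for everything)")

    print_entries([f"10.{i // 256}.{i % 256}.0/24 via 192.0.2.1" for i in range(9000)])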


It's Latin. "de omnibus" on the second line is pretty well recognizable. But holy hell is Gutenberg's font terrible. Look at the first word on the second line: it ends in seven undotted sticks that seem to bleed over into each other a bit. I read that as "mim" before I figured out it was "num".

Anyway, it's the Gutenberg bible. The epistle of St. Jerome, according to the alt text.

