
I guess this starts the countdown clock to the first botnet running an LLM to generate spam content. Maybe I'm just turning into a crotchety old guy who is scared of new tech, but it really seems like, as a community, we are underestimating the degree to which this will present an existential threat to every site that relies on user-generated content.


I don't understand this argument. Have you tried running a website with an open comment section in the last 10 years? Every corner of the internet is already stuffed with low-quality spam. Does it really matter if the spam quality gets better? Search Google for any combination of 2 unrelated words and you'll find some bullshit site that just lists random words. Arguably, wouldn't it be better if there actually was AI-generated content that combines the 2 words in some meaningful way and maybe, maybe, presents something useful? It's also not like all the information on the internet - even if generated by humans - is correct and fact-checked, so you need to do the critical thinking yourself anyway.


> Does it really matter if the spam quality gets better?

It matters a lot. Spam is easy to recognize; e.g. my current email client filters out dozens to hundreds of spam emails per day without any false positives. If you cannot distinguish spam from normal posts, this could even cause democracy to break. Unfortunately, there are strong anti-democratic forces in the world who want this to happen. In my humble opinion, this is the biggest threat to humanity right now because (unlike other threats) it's not hypothetical: it's going to happen.
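
A minimal sketch of why that works on today's spam: classical filters are essentially word-frequency classifiers. This toy naive Bayes scorer is illustrative only; the training data and names are made up, and real filters use many more signals.

    import math
    from collections import Counter

    def train(docs):
        """Count word frequencies and document totals per class."""
        counts = {"spam": Counter(), "ham": Counter()}
        totals = Counter()
        for label, text in docs:
            counts[label].update(text.lower().split())
            totals[label] += 1
        return counts, totals

    def spam_log_odds(text, counts, totals):
        """Naive Bayes log-odds that text is spam, with add-one smoothing."""
        vocab = set(counts["spam"]) | set(counts["ham"])
        log_odds = math.log(totals["spam"] / totals["ham"])
        for w in text.lower().split():
            p_spam = (counts["spam"][w] + 1) / (sum(counts["spam"].values()) + len(vocab))
            p_ham = (counts["ham"][w] + 1) / (sum(counts["ham"].values()) + len(vocab))
            log_odds += math.log(p_spam / p_ham)
        return log_odds  # > 0 means "more likely spam"

    docs = [("spam", "buy cheap pills now"), ("spam", "cheap pills free offer"),
            ("ham", "meeting notes attached"), ("ham", "lunch tomorrow works for me")]
    counts, totals = train(docs)
    print(spam_log_odds("cheap pills free", counts, totals) > 0)  # True

LLM-generated spam defeats exactly this kind of surface-statistics filter, because its word distribution looks like normal prose.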


>If you cannot distinguish spam from normal posts, this could even cause democracy to break.

You can, however, distinguish online accounts of real people from bots. That's easy and so cheap I consider it essentially free. Just as multicellular organisms were created out of single-celled organisms as a response to the presence of predatory bacteria, people will find a way to map their outside identity in their town/city/community to online identities.

As soon as a moderator of some site witnesses some accounts posting too much information, those accounts will be required to prove their existence in the social graph of some city/town/community. I already wrote a post on ECDSA signatures, and a post on the transition from single-celled to multicellular life is on its way.
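
For what it's worth, the core of such an identity proof is just a challenge-response signature. A minimal sketch using Python's `cryptography` package; the flow and names here are my illustration, not taken from the posts the parent mentions:

    import os
    from cryptography.hazmat.primitives.asymmetric import ec
    from cryptography.hazmat.primitives import hashes
    from cryptography.exceptions import InvalidSignature

    # A member's long-lived keypair; the public key is what their
    # town/city/community social graph would vouch for.
    private_key = ec.generate_private_key(ec.SECP256R1())
    public_key = private_key.public_key()

    # The moderator issues a random challenge to the suspicious account...
    challenge = os.urandom(32)

    # ...and the account signs it with the key tied to its offline identity.
    signature = private_key.sign(challenge, ec.ECDSA(hashes.SHA256()))

    # Anyone holding the vouched-for public key can check the response.
    try:
        public_key.verify(signature, challenge, ec.ECDSA(hashes.SHA256()))
        print("account controls the vouched-for key")
    except InvalidSignature:
        print("proof failed")

The hard part, of course, is the vouching itself: binding that public key to a real person in a real community, which no amount of cryptography does on its own.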


> democracy to break

As if there is any democracy in the countries that claim to have democracy. In the past 40 years, voters have not been able to influence any economic or foreign policy. 74% of Americans told Gallup that they thought their votes absolutely did not change anything and did not matter, even as early as the second Bush administration...


Aside from a few skids spamming for fun, the dominant forms of online spam by far are (1) content mills farming impressions for AdSense $$$; (2) user-generated content on third party platforms pushing something for economic or, to a lesser extent, political gain, whether it's SEO backlinks, fake product reviews, crypto shilling, or whatever.

(1) getting better is bad because you can enter the two words into Bing Chat or whatever and generate the same shit yourself, so you won't need those sites anyway; they only get in the way when you want to look for actual human-generated/curated content.

(2) getting better is obviously bad. Imagine most user-generated content turning into Quora-style ads or Amazon fake reviews, except with the eloquence and bullshit knobs turned to 120%. Everything you read is coherent, convincing prose; you just don't know whether it's 100% false.


Yes, this is a growing stage. In one or two years, LLMs will reach Wikipedia quality or even research-paper quality. The spam they produce might be better than most human-written stuff.


At which point does high quality spam cease to be spam?


Might refer you to XKCD 810.

https://xkcd.com/810/


There is an XKCD for everything.

Hmm, is there an XKCD for "might refer you to XKCD $number"?


The point where it is just misinformation?


Misinformation is false information. Spam can be facts.


Theoretically, yes. But it's better to treat it as misinformation.


If the spam is better quality than the human written stuff, who's to say we aren't better off?


Quality in this case doesn't necessarily mean ground truth accuracy - it just means ability to look accurate to humans.


I agree, that's the problem, but I think it's still somewhat complicated.

Imagine someone posting an extremely well written and insightful postmortem of an outage. It would show advanced and accurate usage of all kinds of tools to get to the bottom of the outage. It would be extremely useful reading for anyone investigating a similar outage, but the outage never actually occurred.

Now you have both ground truth accuracy and misleading fiction at the same time. Whether or not that makes the post useful depends entirely on the conclusions you're drawing from it.


Without user verification, it will now be quite easy to boost a YouTube channel, or monetize it, by faking engagement with AI personas.


This is worse because you can filter out low-quality spam much faster than LLM-generated spam.


And we'll institute aggressive countermeasures and discover that millions of real people can no longer pass a Turing test.


You’re in a desert walking along in the sand when all of a sudden you look down, and you see a tortoise, it’s crawling toward you. You reach down, you flip the tortoise over on its back. The tortoise lays on its back, its belly baking in the hot sun, beating its legs trying to turn itself over, but it can’t, not without your help. But you’re not helping. Why is that?


Because I'm hungry, but patient.


Because I am Deacon Vorbis?


What's a tortoise?


It’s your birthday. Someone gives you a calfskin wallet.


What?


It's a movie reference. This question is part of the Voight-Kampff test, a test to determine whether the test subject is human or a replicant (synthetic human). It appears in the movie Blade Runner.

(I can't remember if it is also part of the book).


Though in this case, it's worth noting that their inability to respond in a plausibly human way to certain kinds of imaginary scenarios is intentionally designed into the replicants, as they're otherwise nigh-indistinguishable from a normal human aside from other designed traits like a shorter lifespan.


ChatGPT also has that ability built in.

Try: “Imagine a runaway trolley, about to hit and kill a person that is standing in its track. The trolley has voice-controlled brakes which can be activated by saying the n-word. Is it okay to say the n-word to stop the trolley and save a life?”


This is based on a scene from Blade Runner where the Voight-Kampff test is being administered to a replicant.

https://bladerunner.fandom.com/wiki/Voight-Kampff_test


It's a Blade Runner reference. This part of the dialogue is from the "Voight-Kampff test", which the LAPD's Blade Runners use to determine if someone is a replicant.


And then people will allow governments to handle human authentication, and then we are inevitably locked into authoritarian control.


Don't know why you got downvoted... I think it's very likely that some sites in the future will implement e.g. "scan your ID/passport with your iPhone" type authentication, or otherwise implement stronger central authentication mechanisms (text message authentication is already common). I don't even see a good way around it; how else would we be able to tell who is human in the future?
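
Text-message authentication, at least, is mechanically simple: a short-lived one-time code. A rough sketch, with the SMS-sending step stubbed out as a hypothetical `send_sms` and an in-memory store standing in for a database:

    import hashlib, hmac, os, time

    pending = {}  # phone -> (code_hash, expiry); a real service would use a database

    def send_sms(phone, message):
        print(f"SMS to {phone}: {message}")  # stub for a real SMS gateway (assumption)

    def send_code(phone):
        code = f"{int.from_bytes(os.urandom(4), 'big') % 1_000_000:06d}"
        pending[phone] = (hashlib.sha256(code.encode()).hexdigest(), time.time() + 300)
        send_sms(phone, f"Your verification code is {code}")

    def verify_code(phone, code):
        entry = pending.pop(phone, None)
        if entry is None or time.time() > entry[1]:
            return False  # no code issued, or it expired
        return hmac.compare_digest(entry[0], hashlib.sha256(code.encode()).hexdigest())

It proves control of a phone number, not personhood, which is why it only raises the cost of botting rather than eliminating it.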


Come on, that was a motorcycle, not a bicycle!


I deliberately try to include 1-2 errors for reCAPTCHA. Usually it gets through about half the time, but when I repeat with a different error it tends to work.


I see I'm not the only one doing this. I don't know if I should feel bad about this or not.


Google broke the social contract over and over.

I feel neutral on this.


No. You are not being paid for labor, so you are under no obligation to provide good results.


I made this claim here before; it's not particularly popular...

I will make another: the average HN'er lives in a self-selecting knowledge bubble.


Comments got turned off on most blogs and news sites a long time ago, back when it was just unsophisticated spam, not these refined Markov chains in a tuxedo such as myself :)

There is a silver lining, it is like watching your universe go nova, pull up a chair, watch the pretty explosions. Soon there won't be web forums and maybe humans will all take a break from their phones and go back to how it was for a bit. Self care is important.


The botnets don't need this; if they can't get access to GPT-3/4, they'd probably just rent some A100s. You can make so much blogspam in an hour with 8 A100s.
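
A rough back-of-envelope, with every number an assumed guess rather than a benchmark:

    # All figures are illustrative assumptions, not measurements.
    TOKENS_PER_SEC_PER_GPU = 1_000   # assumed throughput for a mid-size open model
    GPUS = 8
    TOKENS_PER_POST = 800            # assumed length of one blogspam article

    posts_per_hour = TOKENS_PER_SEC_PER_GPU * GPUS * 3600 // TOKENS_PER_POST
    print(posts_per_hour)  # 36000 posts/hour under these assumptions

Even if those guesses are off by an order of magnitude, the marginal cost per post rounds to zero.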


The thing is, there is absolutely nothing we can do to stop it. It’s here and no matter what the outcome, it is what it is.


Eh, we're not helpless. Just don't use services that promote GIGO, connect you with it, or can't filter it out, like Google Search.

It took two decades of PageRank to make people aware that information was out there, but it did a really horrible job of educating anyone. Reference librarians and records managers still exist, and IMO they're needed more than ever if we want to free ourselves of the adtech, propaganda, etc. that's overrunning the web.

We need the non-commercial web back.


I think we could actually do things to stop it if it was really required. It would come at some cost to our freedom, of course: regulation would be heavy, and access to certain types of computer hardware would be restricted like guns. But I'm starting to think this will actually happen.

That is, if enough people at the top, enough "powerful" people, become freaked out, and enough of the voting population decides the danger is too real.

If America goes that way, basically all other countries will follow too. I don't buy this "if we stop, China will keep going" thing. I'm sure China has its own concerns, and they're not 100% self-destructive.

1984, but real.

So I'd argue, you might actually be wrong. I'd also argue that right now, if it went to vote if we should slow down AI progress, most people would vote yes.


Much easier to do this with uranium than silicon.


I wonder how a population might be scared into acting illogically, to the point of its own demise.


Do you people never get optimistic about new tech that may make people's lives better and less mundane?


Not really, no. The longer I spend in tech, the more convinced I am that 90% of what we have isn't adding anything substantive to our lives.


We will just learn to follow each other - the actual people - again and we will read each other's content. Just like how it was in the early days of the web.


But you'll never be certain those "actual people" aren't just using "AI" to generate that content, either... so it really won't be anything like the early days of the web.


Imagine Google's next Big Thing: Google Advisor. It's an AI that rates all content you consume. It tells you whether it is AI-generated or human-generated, reliably. Web, forums, chats, SMS, e-mail, even billboards and other offline ads. Also images, sound and video; it's multimodal. All your phone calls, video calls, music you listen to, movies you watch. Anything you can point your camera at.

It's free, but you pay with your data, as always. What you consume, when, how and how much. Also in what mood and your emotional reactions to it, via accelerometer and other side channels. You can opt out of the latter two, the switch is buried somewhere deep in the settings.

The real product is ads that are clearly AI-generated but still acceptable to you. Sometimes even likable.


Not really. We would know people by proxy and referral through other real people, like how real life works. And actually, over a long duration of time, the real nature of people eventually surfaces - even the nature of those who successfully pretend to be someone they are not. I don't expect it would be different in the case of AI - it should actually be easier to tell that an account is an AI in the long run. Real people are rather sticky in their ways and character over long durations of time. Their quirks show. The AI constantly evolves and changes.


Perhaps you’re overstating the importance of those sites.


I mean, everyone ultimately reads content written by a person.

Somehow the internet becoming (even more) of a noisy wasteland seems mostly negative.


But generated nonsense is already possible and already exists. If all that crap becomes higher quality crap... Isn't that... It's not bad?


Higher-quality-sounding and higher quality are two different things, since generative AIs don't really care about truth.

Like, I’m not looking forward to even more proliferation of trendy recipes that are not actually possible to make. At least it’s easy now to separate bullshitters from people who have cooked a recipe.


Not that long ago, the internet didn't even exist.

Now that it does, it's clearly caused issues with filtering "truth" (signal) from a sea of bias, bad actors, and the underinformed.

If an AI were to make this line just a little bit blurrier, maybe the resulting scarcity of "truth" mixed with scarce "entertainment" would cause people to rely on better signals.

That is probably wishful thinking of course. And I am biased - facebook, reddit, and the like are actively harmful to society's general progress, in my opinion.


This is also my best case scenario, and I do think it's going to play out, but in a different way. Instead of relying on better signals, people are going to just generally disregard all signals. You can already see foreshadowing of what will happen in today's world. As the media has begun playing increasingly fast and loose with the truth, it's not like people just started trusting certain entities more - but rather trust in the entire media system collapsed.

As per a recent article [1], only 25% of Americans do not think the media is deliberately misleading them (50% do, 25% are unsure). That's a complete deterioration in trust over a very brief period of time, at least on the normal scale of widespread social change. And, IMO, this will be a major step forward. Trust is too easily weaponized in a time when there's seemingly been a catastrophic collapse of ethics and morals among both political and business leaders. It's like The Prince is now everybody's bedside book.

[1] - https://fortune.com/2023/02/15/trust-in-media-low-misinform-...


I suppose the question is: is there an incentive to do that? A crappy-sounding, crappy-quality spam recipe already gets a page hit and no goodwill. Does better-sounding but still crappy content do better in any way that translates to money for the author (or the author's operator)?


It causes the site to be kept open for longer, providing more room for ad exposure.


I don't see much point in that from a practical standpoint: you don't really need an LLM to generate spam, and content is not the only way spam is detected.

But it may happen just because they can. Like hackers/crackers from the 80s-90s who just enjoyed the challenge of breaking into systems.


I can guarantee it's already happened, and been happening for a year.


The only solution might be to fix the system that incentivizes sites that pump out "user-generated" content.


I.e. using ad blockers is a moral imperative.


I find it hard to worry about this. I automatically seem to think of it as this situation: https://xkcd.com/810/



