
I get a slightly uncomfortable feeling with this talk about AI safety. Not in the sense that there is anything wrong with it (maybe there is, maybe there isn't), but in the sense that I don't understand what people are talking about when they talk about safety in this context. Could someone explain like I have Asperger (ELIA?) what this is about? What are the "bad actors" possibly going to do? Generate (child) porn / images with violence etc. and sell them? Pollute the training data so that racist images pop up when someone wants an image of a white pussycat? Or produce images that contain vulnerabilities, so that when you open them in your browser you get compromised? Or what?



I'm not part of Stability AI but I can take a stab at this:

> explain like I have ~~Asperger (ELIA?)~~ limited understanding of how the world really works.

The AI is being limited so that it cannot produce any "offensive" content which could end up on the news or go viral and bring negative publicity to Stability AI.

Viral posts containing generated content that brings negative publicity to Stability AI are fine as long as they're not "offensive". For example, the wrong number of fingers is fine.

There is not a comprehensive, definitive list of things that are "offensive". Many of them we are aware of - e.g. nudity, child porn, depictions of Muhammad. But for many things it cannot be known a priori whether the current zeitgeist will find it offensive or not (e.g. certain depictions of current political figures, like Trump).

Perhaps they will use AI to help decide what might be offensive if it does not explicitly appear on the blocklist. They will definitely keep updating the "AI Safety" to cover additional offensive edge cases.

It's important to note that "AI Safety", as defined above (cannot produce any "offensive" content which could end up on the news or go viral and bring negative publicity to Stability AI) is not just about facially offensive content, but also about offensive uses for milquetoast content. Stability AI won't want news articles detailing how they're used by fraudsters, for example. So there will be some guards on generating things that look like scans of official documents, etc.
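
To make that concrete, here's a rough, purely hypothetical sketch of what a blocklist-plus-classifier prompt filter could look like in Python. None of these names (HARD_BLOCKLIST, moderation_score, is_prompt_allowed) come from Stability AI; the scoring function is a toy stand-in for a trained moderation model, and a real pipeline would also filter the generated images, not just the prompts.

    # Hypothetical two-stage prompt filter (not Stability AI's actual code).
    # Stage 1: hard block on an explicit list of known-bad terms.
    # Stage 2: soft block via a score meant to catch edge cases that are not
    #          on the list; a toy heuristic stands in for a trained model.

    HARD_BLOCKLIST = {"blocked_term_a", "blocked_term_b"}  # placeholder entries
    SOFT_TERMS = {"risky_term_a", "risky_term_b"}          # placeholder entries

    def moderation_score(prompt: str) -> float:
        # Fraction of words that are "risky" but not outright blocked.
        words = prompt.lower().split()
        if not words:
            return 0.0
        return sum(w in SOFT_TERMS for w in words) / len(words)

    def is_prompt_allowed(prompt: str, threshold: float = 0.25) -> bool:
        lowered = prompt.lower()
        # Stage 1: refuse anything containing a hard-blocked term.
        if any(term in lowered for term in HARD_BLOCKLIST):
            return False
        # Stage 2: refuse prompts the stand-in classifier scores as risky.
        return moderation_score(prompt) < threshold

    print(is_prompt_allowed("a white pussycat on a windowsill"))  # True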


So it's just fancy words for safety (legal/reputational) for Stability AI, not users?


Yes*. At least for the purposes of understanding what the implementations of "AI safety" are most likely to entail. I think that's a very good cognitive model, which will lead to high-fidelity predictions.

*But to be slightly more charitable, I genuinely think Stability AI / OpenAI / Meta / Google / MidJourney believe that there is significant overlap in the set of protections which are safe for the company, safe for users, and safe for society in a broad sense. But I don't think any released/deployed AI product focuses on the latter two, just the first one.

Examples include:

Society + Company: Depictions of Muhammad could result in small but historically significant moments of civil strife/discord.

Individual + Company: Accidentally generating NSFW content at work could be harmful to a user. Sometimes your prompt won't seem like it would generate NSFW content, but it could be adjacent enough: e.g. "I need some art in the style of a 2000s R&B album cover" (See: Sade - Love Deluxe, Monica - Makings of Me, Rihanna - Unapologetic, Janet Jackson - Damita Jo)

Society + Company: Preventing the product from being used for fraud. e.g. CAPTCHA solving, fraudulent documentation, etc.

Individual + Company: Preventing generation of child porn. In the USA, this would likely be illegal both for the user and for the company.


Their enterprise customers care even more than Stability does.


The bad actor might be the model itself, e.g., returning unwanted pornography or violence. Do you have a problem with Google’s SafeSearch?


> Could someone explain like I have Asperger (ELIA?)

Excuse me?


You sound offended. My apologies. I had no intention whatsoever to offend anyone. Even if I am not diagnosed, I think I am at least borderline somewhere on the spectrum, and thought that would be a good way to ask people to explain without assuming I can read between the lines.


Let's just stick with the widely understood "Explain Like I'm 5" (ELI5). Nobody knows you personally, so this comes off quite poorly.


I think ELI5 means that you simplify a complex issue so that even a small kid understands it. In this case there is no need to simplify anything, just explain what a term actually means without assuming the reader understands the nuances of the terms used. And I still do not quite get how ELIA can be considered hostile, but given the feedback, maybe I'll avoid it in the future.


Saying "explain like I have <specific disability>" is blatantly inappropriate. As a gauge: Would you say this to your coworkers? Giving a presentation? Would you say this in front of (a caretaker for) someone with Autism? Especially since Asperger's hasn't even been used in practice for, what, over a decade?

> In this case there is no need to simplify anything

Then just ask the question itself.


AI isn't a coworker and isn't human, so it's not as awkward to talk about one's disability.


I don't see how this is a response to anything I've said. They're speaking to other humans and the original use of their modified idiom isn't framed as if one were talking about their own, personal disability.



