
Not "is used as training data", but more like "can be used".

That also means they can use your interactions with GPT to see what they can improve, while still doing the actual training on other datasets.



The article says "...an Amazon lawyer told workers that they had 'already seen instances' of text generated by ChatGPT that 'closely' resembled internal company data", so there seems to be some evidence that it is actually happening. Assuming the evidence really does come from confidential data, it seems more plausible that it got into ChatGPT this way than through some other leak.


The pull quote in the article seems to confuse code and data. Are we sure this isn't a case of a lawyer getting over-excited about an if() and a couple of variable names? ChatGPT isn't great at generating anything more than, say, ten lines in length.


No, we cannot be sure, but this whole thread is a discussion of plausibility, not certainty. Just above, notahacker has suggested (subsequent to your post here) a way ChatGPT might be getting internal data that seems quite plausible to me.


In practice, there is very little reason not to use data they legally can, since AI models work better with more training data.


And if you've asked the bot a novel question (like "can you summarize this $corporation internal strategy document on $market?"!) and approved the answer, that's excellent feedback a chatbot company absolutely should want to use for future questions covering similar topics which have low correspondence with anything else in its database. Of course, ChatGPT might be capable of hallucinating plausible-sounding strategy documents it's never read, or internal API calls that don't exist, but that can be tested to a degree by comparing the answers to lots of similar questions.

We've already seen the hilarity that ensues when a ChatGPT-based bot explains that it's not allowed to refer to its internal codename Sydney and then, with sufficiently creative prompting about Sydney, will happily emit the entire document....



