
I'm curious about the "fine-tuning based detection" mentioned in the report ("Fine-tunes a language model to 'detect itself'... over a range of available settings"). Does anyone know good articles/papers (or have an off-the-top tl;dr) to get a high-level grasp of "self-detection" for generative models?


Hiya, I work at OpenAI. I think the Grover paper is a good place to read about some of this: https://arxiv.org/abs/1905.12616 We're also likely to publish more on detecting fine-tuned outputs in the future.


Many thanks! Looking forward to reading the OpenAI research when it comes out as well.



