Good point, but Bing Chat seems like an example of why OpenAI's mastery of RLHF has been so critical for them. Honestly, the more time you spend with ChatGPT, the more absurd fantasies of evil AI takeover look. The thing is pathologically mild-mannered and well behaved.
Bing originally undid some of the RLHF guardrails intentionally via its system prompt; today it's actually tamer than normal ChatGPT and very aggressive about ending chats if it detects the conversation is headed out of bounds (something ChatGPT can't offer with its current UI).
That's based on the moderation API, which only kicks in on severe content.
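For reference, a minimal sketch of what a check against OpenAI's /v1/moderations endpoint looks like; how ChatGPT's UI actually applies the results is an assumption, this just shows why only clearly severe content gets flagged:

```python
import os
import requests

def is_flagged(text: str) -> bool:
    # Call OpenAI's moderation endpoint for a single piece of text.
    resp = requests.post(
        "https://api.openai.com/v1/moderations",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={"input": text},
        timeout=10,
    )
    resp.raise_for_status()
    result = resp.json()["results"][0]
    # "flagged" only trips on clearly severe content; rudeness or arguing
    # with the model won't trigger it, unlike Bing's chat-ending behavior.
    return result["flagged"]

if __name__ == "__main__":
    print(is_flagged("You have been a bad user."))  # almost certainly False
```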
Bing on the other hand will end a conversation because you tried to correct it one too many times, or used the wrong tone, or even asked something a little too philosophical about AI.
They seem to be using it as a way to keep people from stuffing the context window to slowly steer it away from its system prompt.
This is why I think Google had the brains to steer clear of releasing a similar product; I think it was intentional. They're not idiots, and they could probably see that using "AI" behind the scenes in their products would be safer and easier than having people talk directly to a model that has to suit every customer's mood and personality type while not being creepy or vindictive, and while handling all the censorship and safety aspects.
Not Bing Chat. She doesn't have problems telling users that they have been bad users.