But how are system messages given to GPT? Are there any other lower-level prompts? This may be outdated, but last I remember ChatGPT was just GPT with a prompt like:
The following is a chat between an AI and a user:
- AI: How can I help?
- User: ...
At least that's how I simulated chats on the OpenAI playground before ChatGPT.
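For illustration, here is a minimal sketch of that older transcript-style approach, where the whole chat is flattened into one text prompt for a plain completion model. The function name, the stop sequence, and the example turns are assumptions made up for this sketch; nothing here is claimed to be what ChatGPT actually did.

```python
# Hypothetical helper: flattens chat turns into a single completion prompt,
# in the style of the transcript shown above.
HEADER = "The following is a chat between an AI and a user:"

# Assumed stop sequence: cut generation off before the model starts
# writing the user's next turn by itself.
STOP_SEQUENCE = "\n- User:"


def build_transcript_prompt(turns):
    """turns is a list of (speaker, text) pairs; speaker is 'AI' or 'User'."""
    lines = [HEADER]
    for speaker, text in turns:
        lines.append(f"- {speaker}: {text}")
    # End with an open AI turn so the model continues as the AI.
    lines.append("- AI:")
    return "\n".join(lines)


prompt = build_transcript_prompt(
    [("AI", "How can I help?"), ("User", "What is ChatML?")]
)
print(prompt)
```

The resulting string would be sent to a completion endpoint together with the stop sequence, and the model's continuation appended as the next AI turn.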
Is this done differently now? If not, I wonder whether anyone has been able to guess what that prompt says and how the system message gets inserted.
There are no lower-level prompts than the ones described in the link. If you're asking how the model sees the context: the messages are formatted using ChatML [1], a format with special tokens that mark each message and its role (and optional name) in the chat context, so the model can clearly tell messages apart.
To put it more concretely, a conversation with the official ChatGPT frontend might look like this when expressed as API messages:
{"role": "system", "content": "You are ChatGPT, a large language model trained by OpenAI, based on the GPT-4 architecture.\nKnowledge cutoff: 2022-01\nCurrent date: 2023-10-11\nImage input capabilities: Enabled"}
{"role": "user", "content": "Hi!"}
{"role": "assistant", "content": "Hello! How can I assist you today?"}
You can see how it looks in the end with Tiktokenizer [2]: https://i.imgur.com/ZLJctvn.png. And yeah, you can't control the ChatML directly through the ChatCompletion API. I guess they don't allow it because of jailbreak/safety concerns.
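As a rough sketch of what that formatting step might look like, the snippet below renders a list of role/content messages into ChatML-style text using the `<|im_start|>` and `<|im_end|>` markers. The function name and the trailing open assistant turn are assumptions for illustration, optional names are left out, and the exact layout the production models see may differ.

```python
# Special marker strings as described for ChatML; in the real tokenizer
# each of these maps to a single special token.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def render_chatml(messages, add_generation_prompt=True):
    """Hypothetical renderer: turns role/content dicts into ChatML-style text."""
    parts = [
        f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages
    ]
    if add_generation_prompt:
        # Assumption: leave an assistant turn open for the model to complete.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are ChatGPT, a large language model trained by OpenAI."},
    {"role": "user", "content": "Hi!"},
]
print(render_chatml(messages))
```

Because the API accepts only structured messages, the caller never writes these marker tokens directly; the rendering happens on the server side.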
I have suspicions that there is middleware somewhere for the ChatGPT interface to the chat models underneath; something to enclose the normal prompt, to update weights, or to manipulate logits.
Just last night I saw a change in a behavior that I could formerly reproduce 100% of the time: asking it to critically evaluate my instructions, explain why it didn't follow them, and suggest rewrites. Since the beginning of ChatGPT itself, it would reliably answer that every time. As of last night, it flat out refused to, assuring me of its sincere apologies and confidently stating it'll follow my instructions better from now on.
If I understand correctly, the older method you describe has been replaced by exposing a GPT model to some further training (as opposed to "pre-training") with successful conversations. I think this premiered with the InstructGPT paper: https://arxiv.org/pdf/2203.02155.pdf