
How could GPT-3.5 possibly have been a finetune of the 175B model? It doesn't even use the same tokens.


Finetuning might not be the best word; sometimes it's a grey area.

Token embeddings can be trained without changing the other parameters, and a number of models add new tokens as a finetuning step. A recent example is StarCoder adding ChatML-equivalent tokens: https://huggingface.co/blog/starchat-alpha#a-standard-format...
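
For what it's worth, with the Hugging Face transformers API that roughly looks like the sketch below; the model name and the exact chat-control tokens are just illustrative (borrowed from the StarChat recipe), and freezing everything but the embeddings is optional:

    from transformers import AutoTokenizer, AutoModelForCausalLM

    # Load a base model and its tokenizer (bigcode/starcoderbase is only an example;
    # any causal LM works the same way).
    tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoderbase")
    model = AutoModelForCausalLM.from_pretrained("bigcode/starcoderbase")

    # Add new chat-control tokens, then grow the embedding matrix to match.
    new_tokens = ["<|system|>", "<|user|>", "<|assistant|>", "<|end|>"]
    tokenizer.add_special_tokens({"additional_special_tokens": new_tokens})
    model.resize_token_embeddings(len(tokenizer))

    # Optionally freeze everything except the (now larger) embedding layers,
    # so only the token embeddings receive gradients during the finetune.
    for param in model.parameters():
        param.requires_grad = False
    for param in model.get_input_embeddings().parameters():
        param.requires_grad = True
    if model.get_output_embeddings() is not None:
        for param in model.get_output_embeddings().parameters():
            param.requires_grad = True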


Sure, you can add a few tokens, but in this case they changed almost every token.
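
You can see how little the vocabularies overlap with tiktoken, assuming the comparison is r50k_base (the encoding of the original GPT-3 175B models) versus cl100k_base (the encoding of gpt-3.5-turbo):

    import tiktoken

    # The two encodings have different vocabulary sizes and assign different
    # ids to the same text, so the token embeddings can't map one-to-one.
    old = tiktoken.get_encoding("r50k_base")
    new = tiktoken.get_encoding("cl100k_base")

    print(old.n_vocab, new.n_vocab)   # vocabulary sizes differ by roughly 2x
    text = "Token embeddings can be retrained."
    print(old.encode(text))           # different ids for the same text
    print(new.encode(text))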



