With things like this, I'm always curious where people get their training data.

AustinZzx · on Feb 21, 2024

If you are referring to the LLM used in the demo, it's a simple GPT. If you are referring to audio data, there are some public datasets (not a lot), though be careful about each dataset's license. To get more data, you might want to build a studio and record contracted voice actors, or you can purchase data from other sources.