If you are referring to the LLM used in the demo, it's a simple GPT. If you are referring to the audio data, there are some public datasets (though not many); check each dataset's license carefully before using it. To get more data, you could build a studio and record contracted voice actors, or purchase data from other sources.