I'm curious now: how do you transcribe the videos? And how do you align the transcript with the video (in terms of timing)? Are there libraries doing that?