"First of all, what does sample rate has even remotely to do with hearing range?"
The sample rate (along with a few other things, like the quality of the low-pass filter) determines the highest frequency that can be perfectly reconstructed from a sampled signal: anything below half the sample rate can be captured. Since most human beings can't hear frequencies above 20 kHz, you simply need to choose a sample rate of more than twice that, and that's where rates like 44.1 or 48 kHz come from.
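To make the half-the-sample-rate rule concrete, here is a minimal Python sketch. It is only an illustration under the assumption of ideal sampling (no filter effects, no quantization), and the function names are made up for this example. It shows that a 20 kHz tone sits below the limit at 44.1 and 48 kHz, while at a hypothetical 30 kHz sample rate the same tone produces exactly the same samples as a 10 kHz tone, so the two can no longer be told apart.

```python
import math


def max_representable_hz(sample_rate_hz):
    """Half the sample rate: tones below this limit can be captured."""
    return sample_rate_hz / 2.0


def alias_frequency_hz(tone_hz, sample_rate_hz):
    """Frequency a sampled tone appears to have after folding."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


def sample_cosine(tone_hz, sample_rate_hz, count):
    """Take `count` ideal samples of a unit-amplitude cosine."""
    return [
        math.cos(2 * math.pi * tone_hz * n / sample_rate_hz)
        for n in range(count)
    ]


# A 20 kHz tone is below the limit at both common rates, so it is
# reported at its true frequency.
for rate in (44_100, 48_000):
    print(rate, max_representable_hz(rate), alias_frequency_hz(20_000, rate))

# At a 30 kHz sample rate the limit is 15 kHz, so a 20 kHz tone folds
# down to 10 kHz: the two tones yield the same sample values.
undersampled = sample_cosine(20_000, 30_000, 64)
impostor = sample_cosine(10_000, 30_000, 64)
indistinguishable = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(undersampled, impostor)
)
print("20 kHz looks like 10 kHz at 30 kHz sampling:", indistinguishable)
```

Cosines are used rather than sines so the aliased samples match in sign as well as magnitude; with sines the folded tone would come out inverted.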
OK, that does make sense, and it does indeed define the overall distortion. Moreover, what his video shows is an example with a single tone, whereas in reality a single instrument produces a multitude of tones and a typical track employs multiple instruments.