The thing to note is that waitress is not built to be the fastest WSGI server out there.
Its main selling points are that it is pure Python, doesn't rely on any particular libraries or a compiler to build or run, and is a threaded WSGI implementation, meaning it uses Python threads to run a WSGI app.
It works well for what it needs to do, and hopefully it is fairly robust. I've personally run waitress directly facing the internet, but I'll readily admit that in most cases running it behind a load balancer is a good idea, especially since it doesn't support SSL out of the box (yet, I should say; it's on my roadmap).
It won't win any benchmark contests, but it holds its own.
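To make the "threaded WSGI" part concrete, here is a minimal sketch of serving an app with waitress. The `app` and `serve_app` names, the loopback address, and the port are made up for illustration; `threads=4` is just waitress's documented default spelled out. Binding to 127.0.0.1 assumes something else (a load balancer or reverse proxy) sits in front and handles SSL.

```python
import threading


def app(environ, start_response):
    """Hypothetical WSGI app: reports which thread handled the request."""
    body = f"handled by {threading.current_thread().name}\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve_app(host="127.0.0.1", port=8080, threads=4):
    """Hypothetical helper: run `app` under waitress (blocks until stopped)."""
    # Imported here so `app` stays importable even without waitress installed.
    from waitress import serve

    # Assumption: a proxy or load balancer in front terminates SSL,
    # so waitress only listens on the loopback interface.
    serve(app, host=host, port=port, threads=threads)
```

Calling `serve_app()` starts the server. If you hit it with a few concurrent requests, the thread name in the response should vary, since each request is handed to one of the worker threads.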