Found a 2014 python-dev thread where 10-18% of the test harness time went to booting the interpreter for the 13,000 instances the run required. Further down the thread, the numbers showed some 500-700 seconds of test time was pure interpreter startup overhead. The original point of the post was how much worse that overhead was in Python 3 than in Python 2.
https://mail.python.org/pipermail/python-dev/2014-May/134528...
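
For scale, 500-700 seconds spread over 13,000 launches is roughly 38-54 ms per interpreter start. If you want to compare against your own machine, here is a rough sketch (not the method used in the thread). The function names `per_launch_ms` and `measure_startup_ms` are made up for illustration, and it simply times an empty `python -c pass`, which will understate what a real test harness pays for imports.

```python
import subprocess
import sys
import time


def per_launch_ms(total_seconds, launches):
    """Average cost of one interpreter launch, in milliseconds."""
    return total_seconds / launches * 1000.0


def measure_startup_ms(runs=5):
    """Launch the current interpreter `runs` times with an empty program
    and return the mean wall-clock time per launch in milliseconds.
    This includes process-spawn cost, not just interpreter init."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    elapsed = time.perf_counter() - start
    return elapsed / runs * 1000.0


if __name__ == "__main__":
    # Figures quoted above: 500-700 s of overhead across 13,000 instances.
    low = per_launch_ms(500, 13000)
    high = per_launch_ms(700, 13000)
    print(f"thread figures: {low:.1f}-{high:.1f} ms per launch")

    # Local measurement; results vary a lot by machine and Python version.
    local = measure_startup_ms()
    print(f"this machine:   {local:.1f} ms per launch")
    print(f"13,000 launches would cost about {local * 13000 / 1000:.0f} s")
```

Adding `-S` or `-I` to the command list is a quick way to see how much of the startup is site/import processing, since both are standard CPython flags.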