Compare this to the enterprise platform I am dealing with on a current project, which has 4 x EC2 nodes with 8 CPUs and 32 GB RAM.
I can DoS it with a single Java client running 50 threads. If I use 100, the p95 shoots up to 30-40 seconds.
But the kicker is that no one (other than me) really cares. 50 concurrent threads is probably around the peak load it will get in prod, and the various people involved think, why bother trying to fix it?
Oh man, don’t remind me. We have a bunch of GraphQL proxies in ECS that somehow cannot handle more than 5 connections each, so naturally the solution is to just spin up 19 more of them to get to 100 concurrent connections...
Honestly, I think that's what makes working at a large "web-scale" company so attractive to me. When you're running at that sort of scale, you can't afford to be as apathetic about performance, so there's a lot more engineering effort put into efficiency, because it makes financial sense to do so. OTOH, in a lot of enterprise-type companies, you can be nothing more than a "feature monkey".
It's driving me nuts.
Thanks for listening