Thinking this could be useful in a multi-tenant service where you need to fairly allocate job processing capacity across tenants among a number of background workers (e.g. data export API requests, encoding requests, etc.).
That was my first thought as well. However, in a lot of real-world cases what matters is not the frequency of requests but the duration of the jobs. For instance, one client might request a job that takes minutes or hours to complete, while another's requests may only take a couple of seconds. I don't think this library handles such cases.
Lots of heuristics continue to work pretty well as long as the smallest and largest costs are within an order of magnitude of each other. It's one of the reasons why we break stories down to 1-10 business days. Anything bigger and the statistical characteristics begin to break down.
That said, it’s quite easy for a big job to exceed 50x the cost of the smallest job.
Defining a unit of processing (like duration or quantity) and then feeding the algorithm the equivalent units consumed (pre- or post-processing a request) might help.
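To sketch what that could look like: a per-tenant token bucket denominated in units of work rather than requests, with an admission check before the job and a settlement charge after it. This is purely illustrative; the `CostBucket` name, its API, and the choice of seconds as the unit are assumptions, not anything from the library under discussion:

```python
import time
from collections import defaultdict

class CostBucket:
    """Per-tenant token bucket charged in units of work (e.g. seconds of
    processing) rather than in requests. Hypothetical sketch."""

    def __init__(self, capacity_units: float, refill_units_per_sec: float):
        self.capacity = capacity_units
        self.rate = refill_units_per_sec
        self._level = defaultdict(lambda: capacity_units)  # tenant -> budget left
        self._stamp = defaultdict(time.monotonic)          # tenant -> last refill

    def _refill(self, tenant: str) -> None:
        now = time.monotonic()
        elapsed = now - self._stamp[tenant]
        self._level[tenant] = min(self.capacity,
                                  self._level[tenant] + elapsed * self.rate)
        self._stamp[tenant] = now

    def try_admit(self, tenant: str, estimated_units: float) -> bool:
        """Pre-processing: admit only if the tenant's budget covers the estimate."""
        self._refill(tenant)
        return self._level[tenant] >= estimated_units

    def charge(self, tenant: str, actual_units: float) -> None:
        """Post-processing: settle what the job actually cost. The budget may
        go negative, so a long job is repaid via refill before new work runs."""
        self._refill(tenant)
        self._level[tenant] -= actual_units
```

Usage would look something like:

```python
bucket = CostBucket(capacity_units=60.0, refill_units_per_sec=1.0)  # ~1 min of work per min
if bucket.try_admit("tenant-a", estimated_units=5.0):
    started = time.monotonic()
    # ... run the job ...
    bucket.charge("tenant-a", actual_units=time.monotonic() - started)
```

The post-processing charge is what makes the hours-long job and the two-second job comparable: both draw down the same budget, just by very different amounts.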
To mitigate this case you could limit capacity in terms of concurrency instead of request rate. Basically it would be like a fairly-acquired semaphore.
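A rough sketch of what such a fairly-acquired semaphore could look like, assuming a single-process asyncio worker pool (the `FairSemaphore` name and API are made up for illustration, not an existing library):

```python
import asyncio
from collections import deque

class FairSemaphore:
    """Global pool of N slots whose waiters are granted slots round-robin
    across tenants, so one tenant queueing thousands of jobs cannot starve
    the others. Minimal single-process sketch, not production code."""

    def __init__(self, slots: int):
        self._free = slots
        self._waiters = {}        # tenant -> deque of futures awaiting a slot
        self._rotation = deque()  # tenants that currently have waiters

    async def acquire(self, tenant: str) -> None:
        if self._free > 0 and not self._rotation:
            self._free -= 1       # fast path: free slot and nobody queued
            return
        fut = asyncio.get_running_loop().create_future()
        if tenant not in self._waiters:
            self._waiters[tenant] = deque()
            self._rotation.append(tenant)
        self._waiters[tenant].append(fut)
        await fut                 # resolved by release() when it's our turn

    def release(self) -> None:
        while self._rotation:
            tenant = self._rotation.popleft()
            fut = self._waiters[tenant].popleft()
            if self._waiters[tenant]:
                self._rotation.append(tenant)  # still waiting: back of the line
            else:
                del self._waiters[tenant]
            if not fut.cancelled():
                fut.set_result(None)  # hand the slot directly to this waiter
                return
        self._free += 1           # nobody waiting: return the slot to the pool
```

A caller would wrap each job in acquire/release:

```python
sem = FairSemaphore(slots=8)

async def handle(tenant: str, job) -> None:
    await sem.acquire(tenant)
    try:
        await job()
    finally:
        sem.release()
```

The nice property is that a slot is held for the job's actual duration, so long jobs automatically consume more capacity than short ones without anyone having to estimate costs up front.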
I believe nginx+ has a feature that does max-conns by IP address. It's a similar solution to what you describe. Of course, that falls down with respect to fairness when fan-out means a request's cost isn't proportional to its response time.
The article suggests a method for managing GPU or otherwise rate-limited resources across multiple clients. It highlights the problem of spiky workloads, where one client might generate a large burst of events (e.g. from a CSV upload) and starve the others, and it advises against naive solutions like a single FIFO queue, which would disadvantage clients with steady live traffic.
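For contrast with FIFO, the usual fair-queuing baseline is one queue per tenant, drained round-robin. A minimal illustration (class name hypothetical, and simpler than what the article is proposing):

```python
from collections import defaultdict, deque

class RoundRobinQueue:
    """One queue per tenant, dequeued round-robin: a tenant that dumps
    100k events (e.g. a CSV upload) only delays its own backlog."""

    def __init__(self):
        self._queues = defaultdict(deque)  # tenant -> pending jobs
        self._active = deque()             # tenants with pending jobs

    def push(self, tenant, job):
        if not self._queues[tenant]:
            self._active.append(tenant)
        self._queues[tenant].append(job)

    def pop(self):
        if not self._active:
            return None
        tenant = self._active.popleft()
        job = self._queues[tenant].popleft()
        if self._queues[tenant]:
            self._active.append(tenant)    # still has work: back of the rotation
        return tenant, job
```

With this shape, the steady live-traffic client gets a slot on every rotation regardless of how deep any other tenant's backlog is.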
Rate limiters are used to protect servers from overload and to prevent attackers (or even legitimate but unintentionally greedy tenants) from starving other tenants of resources. They are a key component of a resilient distributed system.