
Yeah, but isn't the Redis one just biased?

What might have been interesting would be to test on a range of cores / clusters, and consider the overhead of managing 1VM vs 64VMs etc.



The Dragonfly benchmark runs one Redis instance on a 64-CPU machine and compares it with one Dragonfly instance on the same machine.

But there is nothing stopping you from running 64 Redis instances on one machine if it has 64 cores, which is what Redis did (actually, they ran just 40). That actually seems like a nicer design overall, as it scales "naturally" to multiple machines without any extra effort/code, it keeps the code simpler, you can also have one of these Redis instances segfault without bringing your entire cache down.
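To illustrate the "scales naturally" point: with independent per-core instances, routing can live entirely in the client, e.g. by hashing each key to pick a port. A minimal sketch (the port layout is hypothetical; CRC32 is used for simplicity, whereas Redis Cluster itself uses CRC16 over 16384 hash slots):

```python
import zlib

# Hypothetical layout: one redis-server per core, ports 7000-7063.
PORTS = [7000 + i for i in range(64)]

def port_for_key(key: str) -> int:
    """Deterministically map a key to one instance's port."""
    return PORTS[zlib.crc32(key.encode()) % len(PORTS)]

# Every client using the same function routes a given key to the same
# instance, so the instances need no coordination among themselves.
```

Since the mapping is a pure function of the key, adding machines is just a matter of extending the port/host list (or using consistent hashing to limit resharding).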

Other than that, they seem to have run the same benchmark. YMMV for other types of workloads of course, and perhaps Dragonfly could be configured better in some way.

Either way: it seems the Dragonfly benchmark is not just biased, but highly misleading. And while the Redis benchmark may be biased, it certainly doesn't seem highly misleading.


To me, spinning up multiple copies of the database is cheating. You're comparing a box of apples to a single apple.

Yes, using a Redis cluster is the only way to get Redis to actually use system resources effectively, but it's a relatively complex thing to create and manage compared to just running one server.


I don't think it's cheating at all; it's how it's designed to work.

If you want to say "but this is more difficult": okay, fair enough (although in my experience it's not difficult at all), but then say that instead of posting a misleading benchmark which runs Redis in a way it's not supposed to run. You can place all sorts of artificial "yeah, but I don't want to do it like this" constraints on all sorts of things.


Hmm, and ships are designed to sail, yet you use planes to cross the Atlantic. Nokia was designed as the strongest and most affordable phone, yet you use an iPhone that costs $1000. It's not about how it was designed but whether it addresses your current needs. Developers do not want to manage a cluster of single-CPU processes, not on their laptops and not in production. And it's not just about management complexity. See https://github.com/dragonflydb/dragonfly/issues/1229, and that's just one example. A single CPU is just not enough for today's use cases.


That may all very well be the case – let us assume it is for the sake of the argument, although I have some comments about that as well – but that still means the argument is "Redis is too complex to run on multiple CPUs" and/or "Redis is poor for these workloads" (I didn't investigate that issue in depth), and not "Redis is unable to do much work with this very powerful AWS instance". These two are very different things. There is no nuance anywhere in the benchmark. A reader might very well believe that this is all the performance they're going to get out of Redis on that machine, which is clearly not the case.

> Nokia was designed as strongest and most affordable phone, yet you use Iphone that costs 1000$

Actually I have a Nokia :-)


You are an exception, then :) But I still stand by the claim that fragmenting your stateful workload (i.e. Redis) into a bunch of processes instead of having a single endpoint per instance is an acceptable approach in 2023. When your processes are excessively tiny, their load variability overshadows their average load. This imbalance results in unpredictable pauses, latencies, and out-of-memory (OOM) issues. This primarily occurs due to the absence of resource pooling under a single process. While it's challenging to exhibit this issue via synthetic benchmarks, it's certainly present.


I think you forgot "not" there before "an acceptable approach in 2023".

These are all fair and reasonable opinions to have, and to some degree I even agree with them, but none of that is captured in the rather simplistic benchmark. Everyone understands that even with the best of efforts it's hard to capture everything in a benchmark, but in this case it's just missing a very obvious way to run Redis.

It's like benchmarking PostgreSQL connections and coming to the conclusion there is no way PostgreSQL can handle more than n connections and that OtherSQL is much better. Is this true? Yes. But it's also true that half the world is running pg_bouncer and that this is widely seen as the way to run PostgreSQL if you need loads of connections. Is it a pain you need to run this and something that should be addressed in PostgreSQL? Absolutely. Such a benchmark would be correct in a strict narrow technical sense, but at the same time also misrepresentative of the real-world situation.
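For context on the analogy: putting PgBouncer in front of PostgreSQL is mostly a small config file. A minimal, illustrative `pgbouncer.ini` (database name and limits are made up for the example):

```ini
[databases]
; clients connect to PgBouncer, which pools onto the real server
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
; return server connections to the pool after each transaction
pool_mode = transaction
; many cheap client connections multiplexed onto few server connections
max_client_conn = 10000
default_pool_size = 20
```

Applications then point at port 6432 instead of 5432; nothing else changes.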


I understand what you are saying. How would you suggest presenting it, then? Dragonfly is not faster than Redis when running on a single CPU. It cannot be, simply because it has the overhead of the internal virtualization layer that composes all the operations over multiple shards (in the general case). But Dragonfly can scale vertically with low latency and high throughput, unlike other attempts at making a multi-threaded Redis that used spinlocks or mutexes. So how do we demonstrate the added value?


> But Dragonfly can scale vertically with low latency and high throughput unlike other attempts of making multi-threaded Redis that used spinlocks or mutexes. So how do we demonstrate the added value?

Provide more advanced benchmarks which demonstrate those types of differences better.

The situation is that the differences are complex, both in terms of performance and operationally (e.g. running multiple instances is not a huge obstacle, but it is harder). That's always going to be hard to capture in a single graph or a single tagline; I appreciate this isn't easy.

It's your website; you can do what you want with it. And maybe I'm just a grumpy old curmudgeon who has seen too many hype cycles, but to me it just comes off as "too good to be true" – which it kind of is – and leaves a more negative than positive impression. The same applies to the "The most performant in-memory data store on Earth" tagline, which seems a bit hyperbolic (what is "fastest" depends; as you mentioned, Redis will always be faster on a single core, and some people only need a single core!)

I have the business acumen of a goat, so what do I know? But it seems to me that a lot of people appreciate when products are straightforward about their weaker points as well, and even straight-up say they're not the best fit for all scenarios, and that in the long run this is more beneficial.


> To me, spinning up multiple copies of the database is cheating.

What if the database was designed to be run that way?

> You're comparing a box of Apples to a single apple.

Precisely. Dragonfly is a box of apples. Redis is a single apple that can be put in a box with other apples. If you run a "benchmark" comparing your box of apples against a sole apple, you're being either stupid or dishonest.


At least on AWS it is kind of hard to get 40 tiny VMs with sufficient speed on the infra side. Given that laptops get 40+ vCores these days, I think a single instance of anything should have some multi-threading.


The comment you replied to explicitly said (so you don't even have to read the Redis article, which also clearly says so):

> The Dragonfly benchmark runs one Redis instance on a 64-CPU machine and compares it with one Dragonfly instance on the same machine.

They were not running 40 tiny VMs!


They should have chosen a 1024-core box and really shocked the world.


> and consider the overhead of managing 1VM vs 64VMs

They clearly are not running 64 VMs in the test they are describing.

They compare both databases on one VM of the exact same size, both deployed as their makers recommend to deploy them.



