> Of a high profile and high quality enterprise disk, unlikely to appear in 99% of RAID setups.
> Most RAID setups in the wild consist of low quality enterprise or even desktop platters, high quality setups are rare and expensive
> they're still not very common
More extraordinary claims, requiring extraordinary evidence. Even Barracuda Pro models (which do cost more, but not prohibitively so) quote the same read error rate; I didn't choose that datasheet only because it contained less data overall.
> Esp. considering that lots of consumers have
I call "red herring" on this, since consumers also aren't going to have the kinds of choices we're discussing, nor read these forums, nor the original article (which is the context for this whole discussion).
> the error rate is given in bits here, shaving of 3 or 5 OOM again
I'm not convinced of that, since the tables look identical between the drives. Maybe it's a sneaky marketing ploy, but maybe not. Ultimately, you need real world data, which you consistently haven't provided.
Absent that, it seems as though you're relying on assumptions, and my original conclusion, based on the data that has been published by the likes of Google and Backblaze, stands.
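For what it's worth, here's a rough sketch of why the per-bit versus per-byte versus per-sector reading would matter if the tables did differ. The "1 per 10^15" spec figure and the 512-byte sector size are illustrative assumptions, not taken from either datasheet:

    # How the interpretation of a hypothetical "1 error per 1e15 <units> read" spec
    # shifts the expected error count for reading 1 TB. All numbers are placeholders.
    SPEC_UNITS = 1e15
    TB_BITS = 1e12 * 8                 # bits in one terabyte read

    per_bit    = TB_BITS / SPEC_UNITS              # spec counted per bit:        ~8e-3
    per_byte   = (TB_BITS / 8) / SPEC_UNITS        # per byte: 8x lower,          ~1e-3
    per_sector = (TB_BITS / 8 / 512) / SPEC_UNITS  # per 512 B sector: ~3.6 OOM lower, ~2e-6

    print(per_bit, per_byte, per_sector)

That's the size of the swing being claimed, which is exactly why the units in the published tables matter.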
> Waste is a matter of efficiency in all domains
You seem to have re-defined waste, so I can't really speak to it.
> I don't think I ever met a sysadmin that insistent frequent failover due to failure is considered best practise.
I fear you are, again, misunderstanding. I didn't mention failure as a cause; the term "failover" invites confusion because it contains "fail", but I used it only in the sense of "switchover".
Perhaps I merely misunderstood you originally. You did initially state "A failover is still a failover and can reduce performance", an assertion which, even assuming you meant failover-due-to-failure, is questionable in the context of justifying node-level redundancy on top of RAID-level redundancy when switchover (not due to failure) is engineered to be frequent, or even continuous.
> you can easily quantify risks as you have done using datasheets
I'm neither agreeing nor disagreeing about its ease; I'm asking you to go ahead and perform the calculation whose ease you assert, since that seems to be the basis of your point.
(Earlier, I just made some single-disk calculations based on the spec sheet, not array risks.)
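To be concrete about the shape of that single-disk calculation, here's a sketch with placeholder numbers (a 1-per-10^15-bits spec and a 12 TB drive, neither of which is from the datasheet under discussion):

    import math

    # Probability of at least one unrecoverable read error (URE) in one full read of
    # a single disk. Both inputs are placeholder assumptions, not datasheet values.
    p_bit = 1e-15                       # assumed per-bit URE rate
    bits  = 12e12 * 8                   # bits read in one full pass of a 12 TB drive

    expected_ures  = p_bit * bits                     # ~0.096 expected errors
    p_at_least_one = 1 - math.exp(-expected_ures)     # Poisson approximation, ~9%
    print(expected_ures, p_at_least_one)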
> I need to know if the array is likely to experience one (with likely being >1% or any other >X% during a rebuild or during normal operation). This question is more easily answered and doesn't require more than estimations.
Agreed. As I mentioned, coarse estimates (if based on real data) are plenty good enough. However, even a coarse estimate assigns some number to it.
Given your question above, what's the answer for "likely" being >1%, during a 300-hour rebuild? And how did you arrive at it?
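The kind of answer I'm asking for would extend the single-disk sketch above to the whole array, something like the following. Again, every input is a placeholder; real field data from the likes of Google or Backblaze would replace the datasheet rate:

    import math

    # Rough estimate of P(>= 1 URE) while re-reading the surviving drives during a
    # rebuild. The 300-hour duration only sets the read rate; the read volume is
    # what drives the estimate. All inputs below are assumptions for illustration.
    p_bit_ure   = 1e-15        # assumed per-bit URE rate
    drives_read = 7            # assumed surviving drives read in full during rebuild
    drive_bytes = 12e12        # assumed 12 TB per drive

    bits_read      = drives_read * drive_bytes * 8
    expected_ures  = p_bit_ure * bits_read
    p_at_least_one = 1 - math.exp(-expected_ures)     # Poisson approximation
    print(f"P(>=1 URE during rebuild) ~ {p_at_least_one:.0%}")   # ~49% with these inputs

Whatever inputs you consider defensible, the exercise produces a number that can actually be compared against your >1% threshold, which is all I'm asking for.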