Hacker News

I didn't dig too deep into this blog, but right off the bat I suspect that there is caching going on. Since this is a virtualized environment, that's not surprising: you don't know whether the entire volume you are working with is being cached in the hypervisor, or possibly the disk controller or the disk itself.

You don't get fraction-of-a-millisecond random reads from spinning disks without caching... period. So whatever you think you're measuring, you aren't.

My experience with performance measurements is that the vast majority of people don't measure what they think they are measuring, or don't measure what is actually relevant for the use case (and all related use cases that need to be considered). And if you don't know what the expected result is, your measurements can be equally useless, because you can't tell that your results don't fit reality, or the speed of light, or whatever.

I wish I had a great answer for how you can get started doing performance work, but it starts with understanding the orders of magnitude of various operations. Flash seeks are hundreds of microseconds for crappy flash; spindle seeks are ~4 milliseconds, and that is with fast spindles. If you are working in memory you should know the difference between instructions (branching and stalled execution vs. continuous execution), cache misses and the various kinds of cache misses (different tiers, remote NUMA), as well as what happens when you have contended mutual exclusion or CAS primitives.
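To make those orders of magnitude concrete, here's a rough sketch. The figures are approximate ballpark numbers of the kind quoted above, not measurements from any particular hardware:

```python
# Rough, order-of-magnitude latencies in nanoseconds.
# These are approximate ballpark figures, not measurements
# from any specific machine.
LATENCY_NS = {
    "L1 cache hit":              1,
    "main memory (local NUMA)":  100,
    "main memory (remote NUMA)": 200,
    "flash random read (slow)":  200_000,    # hundreds of microseconds
    "spinning disk seek":        4_000_000,  # ~4 ms, and that's a fast spindle
}

def ratio(a, b):
    """How many times slower operation a is than operation b."""
    return LATENCY_NS[a] / LATENCY_NS[b]

# A spindle seek costs on the order of 20x even a slow flash read...
print(f"spindle vs flash: {ratio('spinning disk seek', 'flash random read (slow)'):.0f}x")
# ...and millions of L1 cache hits.
print(f"spindle vs L1:    {ratio('spinning disk seek', 'L1 cache hit'):.0f}x")
```

If a benchmark reports a number wildly outside these ranges (like sub-millisecond spindle seeks), some cache is answering instead of the device.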



I agree with your general assessment and skepticism of performance tests, particularly on top of virtualization.

Minor point though: Digital Ocean at least advertises SSDs for their servers, not spinning disks. Fraction-of-a-millisecond random reads seem within the realm of possibility.


I came here to say this; glad to see other Hacker News readers picking up on the impossible disk seek times as well.


Looks like -D will make ioping use direct I/O.


Won't disable either the disk cache or the RAID controller cache, though.


If you're using a RAID controller that's worth what you paid for it, then it has disabled the on-board cache on the disks it's managing. Otherwise, the following scenario becomes possible:

  1. OS issues write to RAID HBA, write is stored to NVRAM (or battery-backed RAM on older cards).
  2. RAID HBA issues write command to disk.
  3. Disk accepts write into onboard buffer, acknowledges it as committed.
  4. RAID controller releases cached pages.
  5. Power loss.
...and you've lost data.

EDIT: Notice that the write never actually touches the disk platters in this scenario. Once a disk drive acknowledges a write, the RAID controller releases the "written" data from its cache. Disk writes take milliseconds, while memory writes take microseconds, usually just nanoseconds. That leaves a relatively huge window during which the power can go out before the write has actually been persisted to disk.
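Back-of-the-envelope, the mismatch in the scenario above looks like this (a sketch with assumed round numbers, not figures from any specific drive or controller):

```python
# Assumed round numbers: the drive acknowledges the write out of its
# volatile on-board buffer in microseconds, but physically committing
# the bits to the platter takes milliseconds.
ACK_FROM_BUFFER_US = 10     # ack from on-board cache: ~microseconds
MEDIA_WRITE_US = 4_000      # actually reaching the platter: ~milliseconds

# Window during which the RAID controller has already released its
# NVRAM copy (step 4) but the bits exist only in volatile drive RAM.
vulnerable_window_us = MEDIA_WRITE_US - ACK_FROM_BUFFER_US
print(f"data exists only in volatile drive RAM for ~{vulnerable_window_us} us")
```

Under these assumptions the data is unprotected for nearly the entire duration of the media write, which is why a power loss in that window loses it.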


The on-disk cache can still cache data for reads, and it may actually cache data for writes as well if write barriers are being used.

There is no reason for the RAID controller not to let the on-disk cache and scheduling work while it is doing writeback; it only needs acknowledgement at the end, before it flushes its non-volatile cache.

This could also be something that I don't know about. Maybe in the world of disks, write barriers are weaker than an explicit disable-write-caching command? Can controllers issue writes large enough to make up for the lack of caching plus write barriers? I have no idea how SATA works.

Again, this gets into why it is hard to know what to expect from a disk I/O benchmark: you have to know how the caches are operating, and there are many of them, and their configuration can vary.


RAID controllers will usually have backup batteries for just this reason.


Yes, or NVRAM, exactly as step 1 in my scenario mentions.

The problem is that the disks they're managing don't. (EDIT: barring SSDs with supercaps, but that's an entirely other discussion.)

If a write has been accepted by the disk and acknowledged as written — but in reality has only been stored in the disk's on-board cache — and you suffer a power loss before the write can be flushed to permanent storage (be it spinning rust or NAND cells), then you have lost that data.

This is exactly why a RAID controller worth using will disable a drive's onboard cache. Because disks lie.

Was my first comment somehow unclear?



