What Is Zoned Storage and the Zoned Storage Initiative? (westerndigital.com)
93 points by ingve on Aug 31, 2019 | hide | past | favorite | 34 comments


For anyone wondering "why", this article linked from the source gives some background: https://blog.westerndigital.com/storage-architectures-zettab...

So instead of SMR harddrives and SSDs doing extra work to hide their deficiencies, they're pushing this up into the filesystem. In return you get slightly more storage and slightly faster IO. Perhaps not useful on a desktop, but very useful when dealing with thousands of storage devices in a data centre.

In terms of how it would be exposed to users, this feels like a very good fit for object storage. Unlike file-based storage, each object can only be read or replaced, so any partial update requires the program to read the whole object, then write back the updated object.
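To make the read-or-replace constraint concrete, here is a toy sketch (names and API are illustrative, not any real object store): a "partial write" against whole-object storage has to be implemented as read-modify-write.

```python
# Hypothetical sketch: an object store over zoned media exposes only
# whole-object get and put, so a partial update becomes read-modify-write.
class ObjectStore:
    """Toy whole-object store; purely illustrative."""
    def __init__(self):
        self._objects = {}

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def put(self, key: str, data: bytes) -> None:
        # Objects are replaced wholesale -- no in-place byte updates.
        self._objects[key] = data

def patch(store: ObjectStore, key: str, offset: int, patch_bytes: bytes) -> None:
    """Emulate a partial write: read the whole object, splice, write back."""
    old = store.get(key)
    new = old[:offset] + patch_bytes + old[offset + len(patch_bytes):]
    store.put(key, new)
```

The splice-and-rewrite in `patch` is exactly the extra work the application (rather than the drive firmware) takes on.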


It also fits log-structured and copy-on-write filesystems well.


Yes! That was my first thought. The way SMR and SSDs work actually aligns pretty well with log-structured filesystems.

In this case, the firmware translation layer is only getting in the way.
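A minimal model of why the fit is so natural (this is an illustrative sketch, not a real device interface): a zone only accepts writes at its write pointer, which is exactly the append-only discipline a log-structured filesystem already follows.

```python
# Toy model of a zone: writes must land at the write pointer (append-only),
# and the only way to rewrite a zone is to reset it and start over.
class Zone:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.wp = 0            # write pointer: next writable offset
        self.data = bytearray()

    def write(self, offset: int, buf: bytes) -> None:
        if offset != self.wp:
            # Rewinding or skipping ahead is rejected by the device.
            raise ValueError("write must start at the write pointer")
        if self.wp + len(buf) > self.capacity:
            raise ValueError("zone full")
        self.data += buf
        self.wp += len(buf)

    def reset(self) -> None:
        # Whole-zone reset: discard contents, rewind the write pointer.
        self.wp = 0
        self.data = bytearray()
```

A log-structured filesystem never needed the firmware translation layer's random-write illusion in the first place; it already writes like `Zone.write` demands.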


It's not every day a new disk storage implementation detail makes it this far up the storage stack.


True - the great thing is that we have been optimizing for this type of interface for the last decade, due to the benefits of making writes sequential (for both HDDs and SSDs).

We, the industry, have just been missing the interface to actually align our workloads with the media we store the data on. The zone interface bridges this gap.


A presentation of the SSD benefits is available here:

https://m.youtube.com/watch?v=9yVWb3rbces

(Full disclosure - my talk at OCP 2019)


" In SMR, unlike conventional recording, tracks are written in an overlapping manner. ... once the tracks are overlapped, they cannot be written independently. there are disadvantages for device-side localized management. ... Managing the complexity on the host side is almost a requirement"

The article also links https://www.zonedstorage.io/ for further information.

Sounds good to me; exposure of the actual mechanisms instead of outdated abstractions makes sense. I'd even argue for open access to motor controllers and raw signal buffers but apparently I'm insane.


Host managed also means messing with the file system, and the patch set for btrfs is not small. Plus you have to disable a bunch of features (like preallocation, since you can't move the write pointer backwards), which is going to surprise people in unfun ways. These drives are useless for general-purpose use. If you are going to use them as expensive tape then by all means, but otherwise I have serious doubts about their usefulness.

Edit: I’m talking specifically about the SMR side. The general zoned stuff is interesting, but when you start putting restrictions on how you can write to certain zones you wind up with a lot of weirdness that application developers are going to be surprised by.


> If you are going to use them as expensive tape then by all means, but otherwise I have serious doubts about their usefulness.

Only writing is more complicated. Reading is still simple and fast; and random access reading is fast, unlike tape.


Drive-managed SMR looks OK to me for general use, although I don't have hands-on experience with it. They also have a non-shingled region that can accept random I/O at higher speeds, and the drive deals with moving data to a linear-only region once it fills up, or at garbage collection time.

It's very much like the QLC drives where a portion is treated as SLC/MLC.


It's like the QLC drives with SLC/MLC cache (e.g., Intel 660p) except the performance cliff is far, far worse (and obviously the "fast" mode is far worse as well).


Agreed, drive managed is far better from a usability standpoint.


Some of the fallocate modes don't make much sense on CoW filesystems anyway. ZoL doesn't support preallocation either, you have to use FALLOC_FL_PUNCH_HOLE.

Lack of DIO might be a bigger issue. Then again it's often used by write-heavy workloads which probably don't want to use SMR drives anyway.
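For reference, the FALLOC_FL_PUNCH_HOLE call mentioned above can be driven from Python via ctypes. This is a Linux-only sketch (flag values taken from <linux/falloc.h>); on filesystems without hole-punching support, fallocate(2) fails with EOPNOTSUPP.

```python
import ctypes
import os
import tempfile

FALLOC_FL_KEEP_SIZE = 0x01   # from <linux/falloc.h>
FALLOC_FL_PUNCH_HOLE = 0x02

_libc = ctypes.CDLL("libc.so.6", use_errno=True)
_libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                            ctypes.c_long, ctypes.c_long]

def punch_hole(fd: int, offset: int, length: int) -> None:
    """Deallocate a byte range of an open file; it then reads back as zeros."""
    if _libc.fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                       offset, length) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

with tempfile.TemporaryFile() as f:
    f.write(b"x" * 8192)
    f.flush()
    punch_hole(f.fileno(), 0, 4096)
    f.seek(0)
    recovered = f.read(8192)
```

After the call, the punched range reads back as zeros while the rest of the file is untouched, which is how CoW filesystems like ZFS can offer space deallocation without supporting preallocation.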


I totally agree, how on earth could being given optional access to lower level functionality ever be considered a negative?


It makes it much easier to accidentally break something, and much harder to swap out the implementation for a better but different one?


Expensive but very fast tape with easy random rewind, but still largely sequential write access, is something that spark / hadoop could readily use.


Candidly, the manufacturers haven't shown enough of a price savings to make these drives worthwhile. As others have mentioned, there's a fairly significant cost in development to using these drives efficiently. The only way that makes sense is if there's a significant discount over non-SMR drives. Nice blog post WD, but you're going to have to drop the price to about half of what it currently is for these to make even a little bit of sense.


Half of the world's HDD bits are estimated to be on SMR by 2023 - the gains are significant when deploying at scale.

For SSDs, it gets even more fun, as zones align with the characteristics of the media: you get a significant increase in capacity (20% on a drive with 28% over-provisioning), an order-of-magnitude reduction in DRAM, and the elimination of device-side garbage collection (which commonly causes write amplification between 1x and 5x), improving QoS considerably.

Additionally, one can now run the drives at 100% capacity utilization - conventional drives become slower as device write amplification increases.
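A back-of-the-envelope check of the capacity claim above. The residual over-provisioning a zoned drive would keep is an assumed number here, chosen only to show how "28% OP eliminated" can translate to "roughly 20% more usable capacity".

```python
# Assumed figures, normalized to 1.0 unit of conventional usable capacity.
raw = 1.28          # raw media behind a drive with 28% over-provisioning
conventional = 1.0  # usable capacity exposed by the conventional drive
zoned_op = 0.07     # hypothetical residual OP a zoned drive might still keep

zoned = raw / (1 + zoned_op)          # usable capacity of the zoned drive
gain = zoned / conventional - 1       # relative capacity gain
print(f"usable capacity gain: {gain:.1%}")
```

With these assumptions the gain works out to just under 20%, consistent with the figure quoted in the comment.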


Support for zoned drives is bubbling up through the kernel layers and just reaching filesystems now. Once filesystems paper over the shortcomings well enough any saving is worth it for bulk storage that involves more than 1-2 drives.


Zoned SMR storage has almost no use case outside of tape replacement.

The same drives configured in 100% random I/O mode lose only a few percent of capacity, and the tradeoff for SMR zones is awful: rewrite your entire filesystem stack (thousands of labor hours) and expect awful performance. For most businesses this is not an obvious win.

Meanwhile, big cheap dense SSDs get bigger, cheaper, and denser every day. You don't have to rototill your filesystem to use them effectively. So I don't see a lot of sense in investing in zoned filesystem support when archive-tier bottom-dollar storage can just use the unzoned, marginally smaller spinning rust today, and switch to the unzoned, much-lower-write-endurance flash media tomorrow.


SMR does have a use case outside of tape replacement. It has higher density. Take a look at Dropbox Magic Pocket - they designed their entire storage system around SMR technology.


It has marginally higher density; I didn't forget this benefit. The tradeoff is something like 5% more space vs much worse performance and many man-years rewriting filesystems.


As SSDs get denser, moving to QLC/PLC, the number of writes to the same place on media drops to the low hundreds.

By using zones, the total amount of available host writes increases by 4-5x (when considering non-optimized filesystems), as the device write amplification is reduced from 4-5x to ~1x.
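Illustrative arithmetic for the claim above: if device-side write amplification drops from ~4-5x to ~1x, the host-visible write budget rises by the same factor. The media write limit below is an example figure, not a measurement.

```python
# Example numbers only: a "low hundreds" program/erase budget, as the
# parent comment describes for dense QLC/PLC media.
media_write_limit = 500      # writes the media itself can absorb
waf_conventional = 4.5       # device write amplification, conventional FTL
waf_zoned = 1.0              # near 1x when the host writes zone-sequentially

host_writes_conventional = media_write_limit / waf_conventional
host_writes_zoned = media_write_limit / waf_zoned
improvement = host_writes_zoned / host_writes_conventional
print(f"host write budget improves {improvement:.1f}x")
```

Every byte the FTL rewrites internally during garbage collection is a byte of endurance the host never sees, so eliminating device-side GC returns that budget to the host.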


I want write once storage. For photo backup. No virus or user error can ever delete my photos.


Some companies are already supplying write-once hard drives, restricted at the firmware level: https://en.wikipedia.org/wiki/Write_once_read_many#Current_W...


There's always optical media or SD cards with physical write-protect toggles. (If you leave the optical media in the writer, though, conceivably an absurdly specific virus could cause the device to burn out existing contents.)


The read/write switch on SD cards is 100% up to the reader device to honor or ignore (seriously).


The drive controller could strongly enforce the onceness of writes. Possibly by attempting a read before every write. Or if that's not enough, with independently articulated read and write heads, the write head's movement could be mechanically restricted...


Needs to be accessible real time.


What’s the difference between this and RAID?


Like RAID-0 specifically


On my Firefox Android this link goes into what seems infinite redirects. Um, thanks?


Works fine on Firefox for Android for me.

I did just experience Firefox opening infinite tabs when I opened a PDF last week though, but I tracked that down to a problem I introduced, where I told the OS to open PDFs in Firefox, and Firefox was still set to defer to the OS for PDF handling. Could there be something similar going on here?


I seriously question the value of having this bloat up the kernel and general-purpose filesystems. Tapes work just fine with userspace support, and there isn't any reason for not having custom FUSE filesystems for this as well.

Put another way, everyone suffers the code bloat and extra complexity for the tiny percentage of users (who are large operators and can manage their own systems) that this supports.

Edit:

The SMR drives should just support the SCSI streaming command set and be done with it. That way nothing really needs to change anywhere but in the drive firmware.



