36TB FreeNAS Home Server Build (ramsdenj.github.io)
203 points by sengork on Jan 15, 2016 | 181 comments


There is a lot of cargo cult about this.

Firstly, the claim that FreeBSD has wider testing is utter trash. In terms of TBs installed, ZFS on Linux >> FreeNAS/FreeBSD.

The amount of money behind ZoL is now surprisingly large.

Also, extensive memtesting of ECC RAM is pointless: ECC RAM has checksumming built into the chip, which will tell you if there is a memory error and correct it if possible.

As for most drives failing in the first hours: https://www.backblaze.com/blog/how-long-do-disk-drives-last/ the evidence says otherwise. Infant mortality is a thing, but it's a matter of months, not hours. (Internal QA grabs most of the ones that die within a few hours.)

The best way to combat simultaneous failure is to mix hard drive types; this makes it much less likely that a single fault class will be triggered on all disks.


> Also, extensive memtesting of ECC RAM is pointless: ECC RAM has checksumming built into the chip, which will tell you if there is a memory error and correct it if possible.

Yes, but if the chip is getting lots of memory errors, correctable or uncorrectable, I probably don't want it. I don't want to build my system and put it into production and then find out.


You'd pick this up straight away; ECC RAM detects memory errors in real time. The reason you run memtest is to systematically go through all the bits and flip them in a known pattern.

You can check that known pattern to see if the bit changed properly.

With ECC this is redundant, as each byte (or word, or some unit of memory) has a hardware checksum. So using ECC RAM in normal operation will indicate if there are errors.


Yes, I understand that.

The reason to run memtest on ECC memory is to verify that every bit of memory is good before you finish building the system, installing an OS, and putting it into production. If you have, say, 32GB of memory, there's no guarantee you'll hit those bad bits for ECC to log an error until you're doing something memory intensive two weeks later.

The primary use case of ECC is to handle random bit flips from cosmic rays and whatnot, not to mitigate bad hardware. If you have bad memory, replace it, don't rely on ECC. The only way to test memory is to run something like memtest, where you read and write from every bit in memory.

If you're OK with finding out a month later that ECC is logging a hundred uncorrectable errors an hour whenever you do anything memory intensive, then sure, don't bother memtesting. If you would rather deal with it beforehand, then run memtest.


But you have hardware monitoring linked to an alerting framework, right?


Yes, but again, you may not get that alert and find out that the memory module is bad until months from now. It all depends on where the chip is bad, how much is bad, how much load is usually on the system, etc.

I (and I would guess most people) don't want to find out a month (or more, potentially much more) from now that the memory module is bad; I want to know now so I can RMA it and get it over with, without having to take a running system down and apart.

That's the problem that memtest solves; ECC does not solve the same problem at all.


> Firstly, the claim that FreeBSD has wider testing is utter trash. In terms of TBs installed, ZFS on Linux >> FreeNAS/FreeBSD. The amount of money behind ZoL is now surprisingly large.

In that case OmniOS (or any IllumOS-based distribution) is a much better choice. It has a more full-featured ZFS (features not yet ported to FreeBSD or ZoL) and a more complete set of DTrace probes to analyze problems. Moreover, it's still the OpenZFS upstream for all intents and purposes, though that may change in the future. On top of that, ZFS on Solaris doesn't try to fit into a foreign kernel. I'm not sure, but an equivalent of FreeBSD's GELI or Linux's LUKS may be missing on Illumos (it exists in Oracle Solaris 11), so that can be a disadvantage.

If none of Linux, Illumos or FreeBSD is to your liking, you can try ZFS on NetBSD.


I wish people would stop with the FUD here. Matthew Ahrens has publicly stated that FreeBSD and Solaris derivatives are equal as far as ZFS goes and OSX and Linux are trailing behind.


If that's true then bsdnow.tv should be informed. In a couple discussions and interviews (Bryan?) they talk about feature disparity and that a newer OpenZFS version needs to be imported into FreeBSD trunk.

It's still true that if you want to analyze serious issues, IllumOS DTrace probes will be more complete.

With that being said, I'm glad to hear FreeBSD is not behind IllumOS ZFS. FreeBSD has wider hardware support than IllumOS and thus is preferred by most users.


We're talking about bleeding edge features that were just created, and yes they're often modeled upstream in Illumos. However, there are other features that have been created by companies using FreeBSD first. But they tend to get pushed up to OpenZFS first and then trickle back down to vanilla FreeBSD.

When it comes to stability and performance FreeBSD == Illumos. I wouldn't worry so much about features anymore, because as soon as they're available they get pulled in. Last I looked the awesome ZFS features that are coming still weren't fully accepted upstream yet.


How does it tell which drive has failed or is not working properly? I've never worked with NAS rigs, but I sure plan to set one up for my home.


There are a number of ways.

The most visible way, even if you are not using ZFS, is the SMART output: http://www.techrepublic.com/blog/linux-and-open-source/using... There are a number of metrics that might indicate the health of the drive.

Not all RAID cards support this; however, most consumer devices do. If you have a card that doesn't do this, either it's a high-end jobby with another mechanism, or a pile of shite.

Second, when you actually bump into a hard error, you'll see things in /var/log/messages that say "disk timed out, retrying" (but you're probably fucked by then).

However, with ZFS/BTRFS, data is checksummed when written and read. This means that bit flips and otherwise silent bit rot have a greater chance of being caught early. Performing "scrubs" (consistency checks) often means you're more likely to catch errors early. (This is another reason why you'll want ECC RAM, as it'll tell you when you get bit flips in RAM, which would otherwise undermine the on-disk checksums.)
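
If you want to automate that, here is a minimal sketch of kicking off a scrub and reporting anything unhealthy, assuming a pool named "tank" and the standard zpool CLI on the PATH; wire the output into whatever alerting you already use:

  # zfs_scrub.py - hedged sketch: start a scrub, then report pool health
  import subprocess

  POOL = "tank"  # assumption: substitute your pool's name

  def scrub_and_report():
      # Kick off a scrub; ZFS runs it in the background.
      subprocess.run(["zpool", "scrub", POOL], check=True)
      # "zpool status -x" prints "all pools are healthy" when there is nothing to report.
      result = subprocess.run(["zpool", "status", "-x"],
                              capture_output=True, text=True, check=True)
      if "all pools are healthy" not in result.stdout:
          print(result.stdout)  # feed this to email/Nagios/whatever you already use

  if __name__ == "__main__":
      scrub_and_report()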

Part of the reason why people choose RAID6/Z2 is that it takes a long time to rebuild an array after a disk crash. It involves reading all the data again and computing the parity to recover the lost data.

When you buy a job lot of disks, they tend to be from the same batch, which means they could be prone to the same bug/defect. This generally means that multiple disk failures happen at the same time. Having another parity disk means you can withstand more failures before total loss.

However, none of these mechanisms are a replacement for a backup. Even if you have a 36TB RAID, you still need a backup. There will always be a time when something goes horrifically wrong and you need access to an archived version.

Anyone who fights against this has yet to experience the fun of total failure.


ZFS and BTRFS both provide snapshot capability. This substantially changes the metric for a backup - i.e. if disaster recovery is an absolute priority, then you need it, but your backups will include your snapshot set.


Cool! Disk failures and losing data scares the shite out of me.

> Even if you have a 36TB RAID, you still need a backup

and a backup of that backup too.


It's up to you.

I'm happy with many copies of my core files that I desperately need (photos and/or stuff that gets me money).

I have my ZFS box, which is backed up to a remote disk miles away. My time machine has a few local copies too.

Just beware: snapshots are not the same as a full backup.

They are brilliant, but not a backup.


Yes, multiple backups, kept at remote places. Don't store the backup in the server room! ;-)


At different remote locations for each backup.


I used to build my own stuff, but now I buy Synology. It's really convenient, has great UI and support, a bunch of different packages you can install, etc.

Although I ran into a problem a couple of months after setting it up where the NAS became so slow, it was unusable. I saw that my IOWait was 100%, so I figured it was a disk, but nothing indicated a problem. Eventually I was able to get to a log that showed that one of the drives had some weird error messages, so I bought a new drive from Amazon that came the next day, pulled the drive, and replaced it and instantly everything was okay again.

I would have expected something on Synology's status programs to show a problem, but they were all green, so that was annoying.


  > It's really convenient, ...

  > ... would have expected something on Synology's status
  > programs to show a problem, but they were all green, 
  > so that was annoying.
So perhaps not all that convenient.

I went down the QNap path a couple of years ago, on the assumption it'd be robust, well patched, more convenient, and basic functionality would work / be tested. I was wrong on most counts, and am now basically scared off these low-end SOHO systems entirely.


It's convenient most of the time; meanwhile, building your own stuff is inconvenient most of the time. It's all about trade-offs.

For example, a while back I had to track down an issue on a server with Nagios constantly triggering memory-low errors. Turns out that when you stat a lot of files, XFS will load all of that file information into cache in the kernel's slab. This causes some RAM stats to be skewed. This RAM will be instantly freed if needed, but some system stats count it towards "RAM usage."

My question is: How does one incorporate all such edge-cases into something that is user-friendly? How many system admins even know how to track down such an issue (in the absence of a flashy UI)?

Unless you want to vertically integrate the system from hardware => kernel => UI, then you're bound to have these sorts of issues, where everyone is only paying attention to what's happening in their own little area of concern and multiple decoupled systems can end up interacting poorly (even when their local actions make sense to them).


Re cached memory, isn't that how all file systems work across all the major operating systems? The kernel will cache files to reduce disk IO but will immediately free that memory if a requesting application allocates more memory than what's available as free memory alone. Which is why file servers perform best with oodles of RAM, and why one should always look at both free + cached memory when calculating available memory on more general-purpose systems.

Or was your NAS behaving subtly different from the standard above? (eg how ZFS cache doesn't appear as FS cache in ZoL despite freeing in much the same way as the above)
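
For what it's worth, the free-versus-actually-available distinction is easy to see on Linux by reading /proc/meminfo directly. A minimal sketch (MemAvailable needs a reasonably recent kernel):

  # meminfo.py - show "free" memory vs. memory the kernel can reclaim on demand
  def read_meminfo():
      info = {}
      with open("/proc/meminfo") as f:
          for line in f:
              key, rest = line.split(":", 1)
              info[key] = int(rest.split()[0])  # values are in kB
      return info

  m = read_meminfo()
  # Page cache and reclaimable slab (where XFS keeps its inode caches) are dropped on demand,
  # so a naive "free" number badly understates what is really available.
  free_only = m["MemFree"]
  reclaimable = m.get("Cached", 0) + m.get("SReclaimable", 0)
  print("free only:             %d MiB" % (free_only // 1024))
  print("free + reclaimable:    %d MiB" % ((free_only + reclaimable) // 1024))
  print("kernel's MemAvailable: %d MiB" % (m.get("MemAvailable", 0) // 1024))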


This is a specific XFS behaviour. In this case, the slab was using up 4GB of RAM on an 8GB machine, and all of the normal tools were telling us that we were running out of free space in RAM. No such problems on other file systems.


What they offer is a good form factor with lower power. What would you suggest for a home brew, that would match size and power consumption?


Form factor?

A case like these: http://www.newegg.com/Product/Product.aspx?Item=N82E16811219... http://www.newegg.com/Product/Product.aspx?Item=N82E16811219... http://www.newegg.com/Product/Product.aspx?Item=N82E16811163... http://www.u-nas.com/product/nsc800.html

Low power usage?

Well many Synology units use Intel Atom CPUs. Anyone can buy a small board that has these same Intel Atom CPUs and stick them in those cases.

Or make your server with NUC parts which use laptop CPUs. Like <15W parts.

Even higher-end CPUs these days idle at really low wattage, so your NAS won't use much power when idling.


Well that's disappointing.

I just bought a QNap TS-251+ to replace my Microserver FreeNAS since I was tired of all the sysadmin work, needed something lower profile and wanted features like automatic Google cloud storage sync. It seemed like QNap has a boatload of excellent features, and I was really hoping it works without glitches. But now that I'm stuck with it.. we'll see...


Patch support for SOHO NAS systems is miserable. We bought some Cisco and QNAP NAS systems a few years ago. Every feature I touched had some bug or another, or lacked the features I needed, or remote changes (Google sync, Apple Time Machine) broke a feature and it was never patched, etc.

After six months they were all reconfigured to expose a single iSCSI drive and a Linux box did everything else.


I got a low-end 2-bay qnap for home and quickly ended up putting Debian on it (http://www.cyrius.com/debian/kirkwood/qnap/)

It's very well supported (running kernel 4.3 from testing now, the only thing that doesn't work is the crypto accelerator) and it gives you a lot of flexibility while keeping the advantages of the hardware (low cost, footprint and power consumption).


That would have been an interesting alternative, yes. (The NAS systems are long sold or dead, but the Linux boxes that replaced them are running Debian too.)

The hardware is decent for the price, it's just the software that's problematic.


Very similar to my experience. A multitude of features I would never want to use, partly out of doubt (security, stability, etc).

My primary need was native (robust, sane, GUI-driven) iSCSI presentation ... which it completely failed to provide out of the box. That is, with a single iSCSI target, any significant activity over a GbE link would cause the QNap to crash. QNap support guided me towards a beta release of their software -- this solved the immediate problem, but I've never felt comfy upgrading (a one way process) from that release for fear of breaking iSCSI or other basic features. A less than ideal situation.


I've been using 4-bay QNAP for about 5 years now (don't remember the model off hand). The only thing I'd recommend is that you schedule SMART tests and make sure you have the email functionality configured properly.

During that time I've had to replace two drives (I'm still running 4 x 2TB in RAID5), and rebuilds take me about 10 hours.

It's not a perfect device, but it still fits my needs for now. Eventually I'll probably build my own, likely using FreeNAS, but then I'd probably want to at least go with 8x6TB drives, and I'm not ready to spend that money yet.


Well, I've had a QNAP TS419 for years. It's been great, never had a problem, receive regular updates, has a decent package manager (optware) and decent UI.

My one complaint is that when I use the web-based file browser sometimes it tries to generate thumbnails for all the items in the current directory and ties itself up for a few minutes if there are lots. I've only encountered it a few times because I use SSH.


The prices for the larger ones are ridiculous though. $800 for a 5-bay NAS without HDDs is a lot. You could easily build 2 or more NASes with the same specs for that money.

Also, having used a Synology, I think it's a niche product. It's too powerful for people who simply want to put files and backups on their network. But for people who like to tinker and control, it's too restricted. You can basically only install software that has been ported to Synology. Getting it to do automated tasks the way you want can be annoyingly complicated. I would recommend it to an advanced customer who needs features like ownCloud or VPN without the burden of maintaining a system.


For me it was a matter of wanting basic network backup and storage as well as DVR/storage for a couple of IP security cams I have around the house. I know I could have set up something from scratch but instead I just went with the Synology for ease of setup, relatively affordable price (didn't have enough of the right parts laying around for a franken-build), and as you say, less burden of maintenance.

I wouldn't necessarily want to depend on it for something "mission critical" but it's a convenient solution along with online backup for media storage and security cam management. My only regret is not buying one with a more powerful processor since this one isn't really capable of transcoding media. Instead, I have all of my home media backed up to the Synology and run a Plex server on my primary desktop. The Plex server reads the media from the NAS and can then send to Chromecast or be accessed from Plex or Kodi running on my Android TV.


I was playing with the parts in this article, and it was up near $2500 from Amazon: a $250 motherboard, 4x $60 8GB DIMMs, etc.


I was talking about the 5-bay Synology model.


XPEnology lets you run Synology's OS on commodity hardware. I have a 212j, and run XPEnology on an old OptiPlex I had lying around for Plex with transcoding. Pretty neat.


Also got a Synology: pretty UI, but I keep running into weird limitations (e.g. only two timed snapshots, inability to get standard Linux tools such as rsnapshot, DLNA slow to list files, PhotoStation unusably slow, a weird filesystem layout that does not logically match what is presented in the UI).

And built a multi VM box (with VM in VM support): https://pcpartpicker.com/user/okigan/saved/#view=nmQD4D

Btw, pcpartpicker was useful during my build. The Node 804 was a case I considered, but I wanted an external 5.25" bay.


I like pcpartpicker quite a bit, especially their $/GB search feature for hard drives. I do wish they would include some basic server hardware for enthusiasts though; big ECC DIMMs are getting affordable and it would be nice to do some price comparison.


You can buy a FreeNAS box built by the same company that runs the project on Amazon. A bit pricey, but it's been perfect for me for over a year.


I cannot recommend prosumer-level Synology gear after having had nightmare after nightmare with it. The chips used outside their pro line are just not powerful enough to rebuild large arrays in a reasonable amount of time.

RAID6 rebuilds of a 12x 4TB drive array take nearly a full week on my DS1812+.

I'm in the process of moving everything to an OmniOS server and will not look back.


The Linux kernel can do some really strange things when a drive fails. I had an external drive fail to the point where if I wrote to it, commands like ls would fail and be unkillable until I rebooted the system. I wonder if that's similar to what you were experiencing, masked by the shiny UI


It's not limited to Linux - the problem is that, while you can go around resetting increasingly large parts of the IO stack, lots of devices in the path may or may not actually respond to "off" or "reset" when they're in a bad state, so a drive being in a bad state can cause the entire IO path it's in to go south for the winter if nobody is smart enough in the path.

This is one of the actual reasons that enterprise drives can be better - almost all of them support an equivalent of TLER (time-limited error recovery), which is basically a programmable timeout on read/write errors.

Most parts of the IO stack, hardware and software, on every platform I've used deal a lot better with explicit errors than with a device that hasn't actually vanished but is acting like a black hole.


Out of curiosity what RAID setup would you use for replaceable media (e.g. blu ray movies)?

Is there a Synology solution for checksums to prevent the bitrot which ZFS advocates talk about a lot?


For bluray movies I simply use JBOD and a worldwide distributed network of peers who keep hashed pieces of the files...


Synology are adding Btrfs support in the next version of their OS, for what that's worth.


Except they are layering it on top of mdadm and LVM, so it can only detect corruption; it cannot self-heal like Btrfs can when it manages its own redundant configuration.

https://www.reddit.com/r/synology/comments/3qpezu/btrfs_and_...

https://www.reddit.com/r/synology/comments/402m8d/so_i_was_g...


Can existing synology boxes upgrade for that or would it only be in newer models?


"Btrfs is a modern file system developed by multiple parties and now supported by select Synology NAS models."

The higher-performance models, I would expect. The DS716+ and various rack-mount models are presently listed... I think you would need at least an Intel-based model to support this when DSM 6 is released.


You can run periodic scans to look for bitrot. I think this is the standard way of handling this, my ReadyNAS before this had the same functionality.


How can it look for bitrot without any checksum info?


Instead of running some homemade script, you're better off using something like SnapRAID (http://www.snapraid.it/).


RAID has redundant information so they use that.


This doesn't unambiguously identify the bad copy, though. If it's a mirror, you have two copies of data; which one is bad? All RAID 1 knows is that they're different. If it's RAID 4/5/6, again, all it knows is that there's a mismatch; it doesn't know whether the data strips are wrong or the parity strips are wrong. The way ZFS and Btrfs deal with this is that data is checksummed, and the fs metadata, which includes the parity strips, is checksummed as well. So there's a completely independent way of knowing what's incorrect, rather than merely that there's a difference.
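
A toy illustration of the difference, assuming a two-way mirror: with only the raw copies a mismatch is ambiguous, but with an independent checksum stored in the metadata you know which copy to trust and can repair the other.

  # mirror_checksum.py - why a separate checksum beats "the two copies differ"
  import zlib

  def write_block(data):
      # The filesystem stores the data on both mirror halves plus a checksum in its metadata tree.
      return {"copy_a": bytearray(data), "copy_b": bytearray(data),
              "checksum": zlib.crc32(data)}

  def read_block(block):
      for name in ("copy_a", "copy_b"):
          copy = bytes(block[name])
          if zlib.crc32(copy) == block["checksum"]:
              # Self-heal: overwrite any stale/corrupt copy with the verified one.
              block["copy_a"] = block["copy_b"] = bytearray(copy)
              return copy
      raise IOError("both copies fail the checksum - unrecoverable")

  blk = write_block(b"important bytes")
  blk["copy_b"][0] ^= 0xFF                       # simulate silent bit rot on one mirror half
  assert read_block(blk) == b"important bytes"   # the good copy wins and heals the bad one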


In RAID6 you can find which of the different combinations shows a mismatch and which doesn't. Run through all the combinations and find the one that shows no mismatch; the dropped drive is the one with the bad data. Rewrite it and go on with your life.

This can only detect one bad drive; if you have two, you are toasted.


FWIW Linux software RAID doesn't do that. I think the argument was that differences like this were mostly related to power loss where some disks have the new data and others the old data. And at the block layer, it's impossible to tell which is which and so the code just picks a winner basically at random.

I'm not 100% convinced myself that a 'majority wins' strategy like you described wouldn't be superior, but I can see why they decided otherwise.


Except Synology uses mdraid (Linux kernel RAID), and even in RAID 6 mode it doesn't do this.

It just overwrites the corrupt sector with a new value to make the parity data consistent. It doesn't know which is right or wrong even though technically with RAID6 it would be possible to determine.


Depending on what RAID you are running, you will only know you have bitrot; you can't fix it, since you don't know which hard drive has the correct information.


Theoretically for RAID6, but not for anything less than that. And in any case I'm pretty sure Linux RAID doesn't implement that. Though, as the owner of a Synology NAS I'd love to be wrong about that.


I am not aware that Synology offers this at the moment.


And the word 'watt' is nowhere in the article. I'd be curious about the yearly power costs.

I've been eyeing the Lenovo ThinkServer TS-140: http://amzn.com/B00FE2G79C with a Xeon E3-1225v3.

Some comments state that it has an idle draw of less than 40 watts, which is hard to believe. My dual-core Intel Atom box idles at about 50 watts (of course there is no difference between idle and full-load draw with the Atom, just super slow either way...).


It sounds like something is wrong with how your hardware or OS is doing power management if you don't see a difference between idle and full load. Also, what else is in your system? 50 watts seems pretty high. For just the CPU, the most power-hungry Atom I can find has a TDP of 10 watts.

http://ark.intel.com/m/products/59683/Intel-Atom-Processor-D...


I was somewhat being facetious. The difference between idle and load on the Atom is just a handful of watts; completely nil compared to my (older) desktop, which is about 120 at idle, and about 230 if you do some number crunching on the GPU and CPU.


Agreed. Something isn't right here. An Atom system should be drawing much less power at idle.

I have an older low end AMD system I built with the intention of having it be low power. It draws around 40W at idle.


The dual-core Atom from the 2010 era is a strange beast. True, it idles at less than 4 watts, but the chipset it was paired with draws 15 watts or so.

My estimates for my power usage are:

  Chipset/motherboard: 15 watts
  CPU: 6 watts
  2 hard drives: 10 watts
  really old crap PSU from 2002: 20 watts
Although I didn't measure each component by itself, most of the power requirements and specs are posted online. Except for my really old PSU; I can only assume that is where the power is going. I probably should have bought a new PSU for the build; instead I bought $20 in adapters just to make the PSU work with the board :) But my Kill A Watt meter clearly shows the box drawing 50-55 watts around the clock.

And if you are nosy, this is what the dual core atom box runs: http://shorewall.net/XenMyWay.html

I got tired of having a separate web dev box and firewall box in my house. So I stood up Xen on a Debian OS on the Atom hardware, and now my firewall and web dev box are Xen guests. Makes for fun Dom0 host upgrades when the internet is being provided by a guest that it is itself hosting...


Depends upon the generation of Atom, honestly. Older Atoms had extremely inefficient chipsets paired with them; the CPU itself only drew somewhere between a few mW and several watts, which was typically eclipsed by the rest of the system (the chipsets drew at least 11 W, idle or load, last I remember).

I had one of the older Ion 2 Atom 330 boxes that were popular for media center PCs and it didn't really matter if it was made by nVidia either - power draw was pretty much the same no matter what.


Similar to what sister posts say: I have a reasonably large home fileserver (one of these http://www.newegg.com/Product/Product.aspx?Item=9SIA6ZP3K293... with dual Xeon 5606s and 12 non-green 7500 RPM disks) and don't find its power consumption surprising.

Power bill for the apartment typically runs me ~$80-100 a month (I'm mildly embarrassed that this is the metric I gauge my server by, as opposed to true average wattage); not insane, though higher than I might like. I'll likely shrink to whatever the latest gen of that neat Intel mini-ITX server board they recently released is when the big build starts to go, but I'm not sure how much that will end up helping.


Your $/kWh is important though; $80 in your area could be $200-300 in others.


Is your dual core atom the first generation atom, that shipped with a 945 chipset? IIRC, the cpu was about 10w, but the chipset was 25w :(



If you're interested in the power usage (and not just trying to make a point or comparison): I just recently built a very similar box (same motherboard and CPU, different RAM, and 4TB instead of 6TB HDDs, plus a pair of SSDs for the OS and the ZFS ZIL and L2ARC). I can plug it into my Kill A Watt and measure the power consumption if you'd like.


Yeah that would be cool. I tend to leave my NAS in sleep mode, since I don't keep daily use stuff on it.


Probably not 100% accurate, but the APC UPS on my NAS build of similar design (Xeon E3, 8x3TB Red main array + 3TB scratch/hot spare + 2TB scratch + 500GB SSD for jails) reports 30W when idle, 42W when reading from the array, and 75W when transcoding in Plex.

My energy bills aren't crazy, and I leave it on 24/7


Ten drives at idle alone is 30W (WD Reds are 2.7W each).


I must assume I'm wrong in this calculation, but 40 watts seems pretty cheap in cost. http://www.wolframalpha.com/input/?i=40+watts+in+kwh+per+mon... at a 3.3 cents per kWh rate is just under $1 per month, right? On the top tier of my local utility it would still be under $4 per month. http://www.austinenergy.com/wps/portal/ae/rates/residential-...


1 W draws about 9 kWh/year, so at residential pricing of about 10 c/kWh, you get about $1 per watt-year.
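
A quick sanity check of that rule of thumb (8766 hours in an average year; plug in your own rate):

  # power_cost.py - continuous draw in watts -> rough yearly cost
  HOURS_PER_YEAR = 8766          # average, including leap years
  PRICE_PER_KWH = 0.10           # assumption: substitute your local rate

  def yearly_cost(watts, price=PRICE_PER_KWH):
      kwh = watts * HOURS_PER_YEAR / 1000.0
      return kwh * price

  print(yearly_cost(1))     # ~0.88 -> about $1 per watt-year at 10 c/kWh
  print(yearly_cost(40))    # ~35   -> roughly $3/month for a 40 W box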


I think most Americans pay between $0.20 and $0.40 per kWh.

I am lucky, though, that here it is only $0.06. But in summer months it goes to $0.14 for every kWh over 600. Staying under 1000 kWh a month is hard to do.


Speaking of summer, if you're trying to cool the house with an air conditioner, you can multiply the effective wattage of the computer by 3.


True, that is a good point.


The average price of electricity in the US is $0.13. https://www.eia.gov/electricity/monthly/epm_table_grapher.cf... And you would have to be somewhere really remote to pay $0.40.


$.106 and free weekends.


In California, PG&E has tiered pricing such that even a moderate user in the bay area will probably be paying a marginal rate of ~$0.35/kWh.


In the U.S. $0.20 per kWh is a more common rate.


Those prices are an Alaska/Hawaii and New England thing. Everywhere else it's pretty cheap.

http://www.eia.gov/electricity/state/


I have a TS-140 with four disks in it running FreeNAS. I'll hook it up to a killawatt tomorrow and let you know what it draws.


Yes, please do so! Is it the one with the Xeon? Also what/how much ram...


There's a very active subreddit[0] that discusses a lot of stuff like this. Worth checking out if you've considered having a server in your home.

[0] https://reddit.com/r/homelab



Oh, the irony. I have an HP N54L MicroServer that I used to boot BSD from the exact same USB key (SanDisk Cruzer Fit). One day the boot hung badly, a cold reboot fubared some data, and I had to make a spare boot key, which didn't have the ZFS disk config. I failed to reimport the ZFS pool properly, wiping the root nodes off the drive. 3TB of mostly inaccessible data sleeping. I still hope to find the time and brain resources to write a program to reconstruct the metadata from the fs nodes still on disk. Analysis of the ZFS sources gave some hints about magic numbers and other patterns that could help scan and infer node positions.

Anyway, as always, be cautious. And, when too many things are down, don't fiddle.


Also /r/datahoarder


I'm regretting building my NAS. It's expensive, the storage is small (16TB of drives gave me ~7.9TB of usable space) and any question or doubt about your configuration prompts responses like: http://blog.ociru.net/2013/04/05/zfs-ssd-usage#comment-17223...


> the storage is small

What did you expect here? 50% is typical trade off these days in disk arrays.


You go from 16 terabytes to 14.5 tebibytes right out of the gate. And then if you use RAID-Z2 (RAID 6), which uses 2 disks for parity, on a four-disk set, half of your drives will be used for parity. This was of course the right call, because with the size of current drives RAID-Z1 (RAID 5) is really risky.
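
The arithmetic, if you want to run it for your own drive count; a rough sketch that ignores ZFS's own metadata overhead, which shaves off a bit more in practice:

  # raidz_capacity.py - rough usable space for a single RAIDZ vdev
  TB = 10**12
  TIB = 2**40

  def usable_tib(num_disks, disk_size_tb, parity):
      data_disks = num_disks - parity           # RAIDZ1 -> 1, RAIDZ2 -> 2, RAIDZ3 -> 3
      return data_disks * disk_size_tb * TB / TIB

  print(round(16 * TB / TIB, 2))                # 14.55 TiB raw from "16 TB" of drives
  print(round(usable_tib(4, 4, parity=2), 2))   # ~7.28 TiB usable from 4x4TB in RAIDZ2
  print(round(usable_tib(6, 6, parity=2), 2))   # ~21.83 TiB usable for the article's 6x6TB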

Personally, on my ZFS-based home-built server I use mirroring, due to the well-publicized difficulty of increasing the size of ZFS's vdevs. That has the same 50% usable-space reduction.


Yep, I've gone to a pair of vdevs too. I'm calming down now, even if I had to format an external drive to ZFS just to get it to mount (yes, yes, I know, USB is terrible and I'm going to lose all my data). I would not recommend FreeNAS for a home user.


Why not? I've been extremely happy with FreeNAS. I have two mirrored pairs of disks in mine. Besides performing NAS duties with aplomb, I also have a half dozen jails running various HTPC tasks such as hosting Plex.


You are making me very glad that I went with Ubuntu rather than a NAS oriented OS like FreeNAS.


My first NAS build used FreeNAS. For a simple Samba box, it's just about perfect. However, I wanted my NAS to run a few more services than the ones that were easily available as pre-built "blessed" plugins.

I had a low-spec CPU in the NAS and went through hell and back to build/install/jail a few packages from source (along with their dependencies). It was a steep learning curve and wasn't terribly fun.

When it came time to add a second NAS, I chose Ubuntu and ZFS on Linux. It's been running like a champ for well over a year without a single hiccup. I don't think I've even built a single package ever. Best of both worlds, in my opinion.


Yeah, you're right. But I'd already blown my budget and then I hear "oh you should have used 6x3TB or 6x4TB in RAIDZ2". ...it's a 4 bay NAS, for home use.


I don't want to be That Guy, but the number of drives that fits best with each RAIDZ level is documented and is the reason why I decided to finish out with six drives before I dropped a couple grand on my setup. (6x4TB, 14TB usable space, also for home use. I did start with a 2x4TB mirror and expanded several months later.) And a mobo and case with six SATA ports is not at all hard to find and IIRC is exactly what the "recommended FreeNAS hardware" thing comes with.

One thing that is definitely true about FreeNAS is that you should read all the documentation and advice on the forums before you order a single part.


Is this because of hot swaps? I would be pretty disappointed by a 50% usable for RAID6. But I don't include the hotswap(s) in this equation.


Right I was talking about RAID10, the only RAID that is sensible to use these days...

Even with ZFS as I understand it, you should just set up your storage pool then mirror it and be done....boom 50%


Yeah I ended up going with mirrored vdevs for the ability to expand to 6TB drives later... then I was told (in IRC) that a single bit-flip would cause a complete pool failure.


Uhh if they're mirrored, a bit flip is not only detected by ZFS but also corrected, whether it happens passively during a normal file read, or scrubbing, or presumably a resilver.


I'm curious how that is?


If you are into this type of thing, check out the storage forum on [H]ardForum: http://hardforum.com/forumdisplay.php?f=29

Specifically, the showoff thread: http://hardforum.com/showthread.php?t=1847026


This post covers everything except the cost. The cost will vary from region to region and country to country, but it would have been nice to get a ballpark figure for what this cost.


I just put together a list from this post on pcpartpicker: http://pcpartpicker.com/p/9f3s99

It sums up to around $2600.


And power consumption!


I have 3 NASes but my largest is a similar build with 7 drives (6x6TB WD Red + an old WD Black for the OS). It pulls about 70 watts.

Full specs here: https://pcpartpicker.com/user/Viper007Bond/saved/xwP9TW


The author probably bought all of those hard drives at once, from the same vendor. They're very likely in the same batch.

What if something goes bad with a drive? Well, ZFS to the rescue. Maybe even two.

What if the whole batch was bad?

I've built my NAS with not quite as much storage but more drives using drives from different vendors and different batches from different manufacturers.


It is good to be concerned about batching disk drives, as you imply.

However, given that the problems that arise with a bad batch are physical ones, properly burning in the drives does, I think, alleviate those concerns.

We burn ours in for 5-6 days[1] before we put them into production and history shows this weeds out the bad ones. If there was an entirely bad batch, we would catch it that way.

In my opinion, you're far, far more likely to find yourself with fake-new drives than with an actual "bad batch". We see that all the time from Amazon sellers that claim brand-new drives, but SMART says otherwise...

[1] with a zero tolerance policy for even the tiniest deviation from normal in the SMART output. Even a blip and that drive is out.
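
For what it's worth, a minimal sketch of that zero-tolerance check, assuming smartmontools is installed and your drives report the usual attribute names (names and columns vary by vendor and firmware, so adjust to taste):

  # smart_check.py - flag any drive whose critical SMART raw values are non-zero
  import subprocess, sys

  CRITICAL = ("Reallocated_Sector_Ct", "Current_Pending_Sector",
              "Offline_Uncorrectable", "Reported_Uncorrect", "UDMA_CRC_Error_Count")

  def check(device):
      # smartctl's exit code encodes status bits, so don't treat non-zero as fatal here.
      out = subprocess.run(["smartctl", "-A", device],
                           capture_output=True, text=True).stdout
      bad = []
      for line in out.splitlines():
          fields = line.split()
          if len(fields) >= 10 and fields[1] in CRITICAL and fields[9] != "0":
              bad.append((fields[1], fields[9]))
      return bad

  for dev in sys.argv[1:]:                  # e.g. python smart_check.py /dev/ada0 /dev/ada1
      problems = check(dev)
      if problems:
          print(dev, "FAIL:", problems)     # zero tolerance: pull the drive
      else:
          print(dev, "ok")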


> If there was an entirely bad batch, we would catch it that way.

You're a very optimistic guy. That test would not have caught the IBM Deathstars or the more recent Seagate 3TB Barracuda failures.

I lost 6 of the 8 drives in my NAS due to that last one.. luckily not all at the same time.


I personally think SilverStone's DS380 would make for a better case for something like this. 8x hot-swap 3.5in bays + 4x 2.5in bays in M-ITX form factor.

http://www.silverstonetek.com/product.php?pid=452

It's what I'm using right now for my server and I love it. I have it filled up with 6 drives and haven't had any issues with heat so far. Can't say the same about the ASUS P9A-I motherboard I'm using it with though...


I second the DS380; had an HD failure the other day and it was easy to hot-swap. +1 for the case, -1 for Seagate Barracudas.


$2,870.15 from Amazon right now which isn't as bad as I expected. That is a great build you've put together. After losing a 3TB drive of thankfully replaceable data I have been eyeing a similar setup but not as intense.

https://amzn.com/w/MHNNS9EDAORX

Side note, I would love to have a list or something on Amazon because the wish list isn't right. A purchase list perhaps? It doesn't include the quantity by default when clicking Add to Cart. I had thought about adding in the Amazon Associates code but I've never actually had that make any money.


Looking forward to FreeNAS 10 when it is available. Thinking about rebuilding my HP N54L microserver currently running Windows Server 2012 R2 with a 'virtual' NAS, Ubuntu + ZFS, running under Hyper-V (yes, this is unnecessarily complicated).

It would be great if whatever virtualisation is built into FreeNAS supports the AMD Turion II the N54L uses but support for AMD virtualisation sometimes seems a bit spotty (not supported in SmartOS for example).


I found the FreeNAS community to be a bit difficult to work in - less than 32GB of ECC RAM? Using something other than RAIDZ2 with 6+ high performance drives? You're terrible, and you'll lose all your data!


The FreeNAS community is definitely not very friendly and they'll jump on you if you have any problem whatsoever and you didn't follow one of their recommendations, but let's not misquote them. The FreeNAS forum folks say 8GB minimum ECC memory and don't suggest "high performance" drives, merely drives with firmware hypothetically optimized for NAS applications.

RAIDZ-2 is recommended on lots of drives for the same reason RAID6 is, so that's not something unique to the FreeNAS community.

https://forums.freenas.org/index.php?threads/hardware-recomm...


I was put off by this part of the FreeNAS community too. Eventually I found NAS4Free (a fork of FreeNAS), which seems to work rather well on the "old" dual-core AMD computer that I wanted to use.

http://www.nas4free.org/


FWIW, I run 10x 4TB WD Red disks in RAIDZ2 using ZFS on Linux on Ubuntu 14.04 with only 4GB of RAM (it's an old motherboard, doesn't support any more). No problems and perfectly acceptable performance for mostly backup and streaming workloads.


> Looking forward to FreeNAS 10 when it is available

I'd like to know when that will be. The forums on freenas.org don't seem to discuss specifics like that. FreeBSD 10.2 has been out for about 5 months, but FreeNAS 10 doesn't seem close to being released.

In another comment I mentioned that I'd like to know what, other than a GUI, FreeNAS adds to the base FreeBSD. It must be extensive since it takes quite a few months. But the whole activity seems quite cryptic.


I wonder if the author has run into many problems with FreeNAS, or the terrible community. I've seen posts on the forum where the problem was with Samba not authing correctly but the first response is "You don't have enough RAM".

At least they're patching the SSH CVE from today, but it's not just a pkg upgrade, it's a tarball that upgrades the whole root drive.


When I built my home NAS, I started with FreeNAS, but it was a pain. The insistence on ECC RAM, the lack of USB support[*], a community that seemed focused exclusively on office solutions, and a hermetically sealed distribution were all killers. I switched to the Linux-based OpenMediaVault[1] and all my problems went away. It uses the same UI as FreeNAS, but defaults to EXT4, supports USB backups, and lets me do anything I want on it. It's great.

[*] Multiple independent requests about backing up the NAS to a USB enclosure were met with a refrain of "USB drives are crap, so you're stupid for using them. You should back up your NAS to another NAS, that you never move." Fuck you. I know the limits of my failure model.

[1] http://www.openmediavault.org/


I know that ZFS is cool and LVM isn't, but I literally just finished repairing my LVM-based home NAS, and it left me with a good feeling about LVM. Overall, a stack of md, LVM and XFS is a lot more complicated than ZFS, but each piece is more understandable in isolation.


ZFS is a filesystem which does waaaay more than LVM does as a container. It's not just that it's "cool". Rapid filesystem snapshots, checksums, dedupe, do some research and you'll see why it's recommended.


LVM does snapshots & checksums. dedupe and compression require huge amounts of memory and cause problems with databases. I've used ZFS before, and I switched back.


Since when does LVM do block-level checksums and recovery?

Dedup uses a ton of memory, and has a lot of "please don't do this unless you really know what you're doing" flags, but the compression is basically free.


Since forever. A parity stripe is a checksum. To use it for data integrity, perform a regular scrub, just like recommended practice for ZFS.


I have a ~8TB NAS on an Atom C2758 system. I've loaded less than 4TB so far and FreeNAS is really getting on my nerves. Am I "protected" from a single drive failure with LVM? Can I grow the storage size by swapping out one drive at a time? If so, I might just go to Debian and be done with it, although FreeBSD 10's bhyve hypervisor looks really good.


No, LVM doesn't provide any protection; you need to layer mdadm below LVM to provide any sort of protection.

Even with the use of mdadm, it doesn't provide near the sort of protection that ZFS does. Due to pervasive checksumming of data, ZFS handles the bit rot and corruption from a dying disk much better than the traditional RAID that mdadm provides. For example, if you have your disks mirrored or RAIDed and the disk doesn't report a read error, mdadm will pass the data back to the OS. Since the data isn't checksummed, there is no way for it to know whether it needs to read from the mirrored disk or the parity drives instead.


- As of RHEL 6.3, LVM supports raid4/5/6 without mdadm. It has supported raid1 (mirroring) and raid0 (striping) for much longer.

- Any LVM or mdadm mode with parity contains a functional checksum. To use it for data integrity, do a regular scrub (a sketch of triggering one on md follows below). You should be doing a regular scrub with ZFS anyway, so ZFS's checksum on read doesn't add much except for slowing things down.
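
On Linux md that scrub is just a sysfs write; a minimal sketch, assuming an array at /dev/md0 and root privileges:

  # md_scrub.py - trigger a check pass on an md array and report mismatches
  import time

  MD = "/sys/block/md0/md"          # assumption: adjust for your array

  with open(MD + "/sync_action", "w") as f:
      f.write("check\n")            # equivalent of 'echo check > .../sync_action'

  # Wait for the pass to finish (sync_action returns to "idle").
  while open(MD + "/sync_action").read().strip() != "idle":
      time.sleep(60)

  mismatches = int(open(MD + "/mismatch_cnt").read())
  print("mismatch_cnt:", mismatches)  # >0 means the copies/parity disagree somewhere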


> any LVM or mdadm mode with parity contains a functional checksum. To use it for data integrity, do a regular scrub.

That doesn't work. Scrubbing the RAID can detect errors, but when they occur, the block layer has no idea which copy is the correct one. I haven't verified for LVM, but at least for mdraid, Linux explicitly does not make any attempts at recovering a 'correct' block even in cases where there is more than one copy. It just randomly picks a winner and overwrites the other versions. You still want to scrub for the error detection.


I've seen it work. It knows which block has the error because the disk reported the error. It then rewrites the sector with the correct data, the disk remaps the sector, and you see "read error detected, corrected" or some such in your kernel logs.


Thanks. I'm hanging out for FreeNAS 10, which seems like it will solve a lot of things.


What's the problem?


The FreeBSD 9 base makes it really hard to mount ext4 drives (no FUSE until 10), the security patching regime relies on upgrading the whole rootfs (see the SSH client vuln today) and I'm in unfamiliar territory on a BSD having used Linux for the last decade.


I am running ZFS on Linux and it is working great for me. ZFS on Linux is considered production ready[0]. A lot of the showstoppers, like being able to boot from ZFS have been worked out.

[0] - https://clusterhq.com/2014/09/11/state-zfs-on-linux/


Hear, hear for ZFS on Linux as well. My home server is built in one of these http://www.u-nas.com/product/nsc800.html with a C2550D4I motherboard and 8x 4TB drives in RAIDZ2. Runs great, and I've got the drives configured to spin down when not in use to conserve power.

It's the successor to the 20-disk system I set up while I was still at my parents' house, though that one's a lot noisier (but fortunately lives in the basement service room). The downside is it's all based on 1.5 and 2TB disks; the upside is RAIDZ3 is really nice to have.


In my experience dedup in ZFS performs horribly on hard drives.


I thought the problem was that it required an epic amount of system RAM to be efficient? It seemed a bit counterproductive to me to try to save some cheap disk space by buying hundreds of dollars of RAM.


I gave it much more than the already-high recommended amount of RAM. At one point I tested something like 6GB of RAM for 100GB of data.

As far as I know, dedup scatters small data chunks by hash across the disk. Absolutely awful performance when your seek times are non-zero. I was looking at speeds in the single-digit megabytes per second, compared to saturating things just fine with dedup off.

I gave it a chunk of SSD for L2ARC and that didn't help either, and it never wrote more than a few hundred megabytes of data to it.

Currently I'm doing out-of-band deduplication on btrfs and it works great. Dedup uses the same copy on write as snapshots do, and causes zero problems.


FWIW, DragonflyBSD HAMMER1's dedup runs well on small machines (2GB) without hiccups.


ECC RAM should be the minimum requirement if using ZFS.


>If your system supports it and your budget allows for it, install ECC RAM.

Some threads overstate the necessity of ECC RAM, or the opposite, but the above is the best advice. It might be wiser to invest your money in backup resources than in more expensive RAM.


The aforementioned quote should be: "If you value your data, you should use ECC RAM. Especially so, if you use a checksumming filesystem such as btrfs or zfs."

There is no use investing money in offsite backups if the onsite backup is corrupted already from faulty RAM.


ECC RAM should be the minimum requirement on any serious file server.


It should be the min req on any server, really.


I want it on workstations and laptops, too. I've seen bit flips in the office.


Pfft. Outside of a few specialist areas, most people's data isn't backed up, and if it is, it's on SD cards, thumbdrives, iCloud or the occasional portable USB hard disk tossed in a backpack.

The most difficult thing about setting up a home NAS was swallowing the cost and reading about all the tradeoffs I'd inadvertently made.


We are discussing the OP, which is a 36TB storage system, and that warrants things such as ECC RAM + ZFS. A home NAS, for the typical end user, will not reach 36TB. Storing a small percentage of that on AWS or Tarsnap will, over time, cost more than the ECC RAM.


What I would like is more discussion of choosing FreeBSD vs FreeNAS.

The author was inexperienced and so chose FreeNAS for "ease of use". But what, other than a GUI, does FreeNAS really provide? I've never read a detailed explanation. The forums on freenas.org don't seem to address this fundamental question. Everything seems to be predicated on the choice already having been made, nothing helps people make the choice in the first place.

Perhaps FreeNAS is more aggressive than FreeBSD about patching storage related bugs?

Can anyone point to a detailed discussion about choosing vanilla FreeBSD vs FreeNAS?


FreeNAS is distributed and supported primarily as an all-in-one appliance. The boot process is generally expected to run off of a USB flash drive because, similar to how VMware ESXi works, it is typically expected to run on systems where installation to a local disk is not only a waste of space but potentially dangerous (mounting your USB device read-write wears it out faster than if it were mounted read-only with RAM disks). I generally build my file servers so that each disk is part of the data pool and the boot device is a USB flash drive that either shipped with the computer from the OEM (HP, Dell, Cisco, etc.) or one I imaged myself and put onto the available USB port on the motherboard (not the rear or front ports; typical in server hardware, at least for security reasons).

It comes with a lot of features in the web interface that any decent FreeBSD admin could install and manage, and many out of the box settings are optimized for situations that are common for SOHO file servers. This buys a bit of time and makes it easier for others to maintain that may not be FreeBSD gurus necessarily.

There are a few tunings (sysctl stuff) and customized options specific to ZFS servers that FreeNAS offers as well. For example, most ZFS users don't have an encrypted scratch partition created on each drive in their ZFS vdevs, but FreeNAS creates them for you by default as a strong recommendation unless you explicitly turn it off with a slightly obscure setting in the web GUI.


Thank you for your comments. That helps.

I'm putting together a home NAS and am leaning toward FreeNAS. Someone else here mentioned waiting for FreeNAS 10, but it probably won't have any "gotta have it" features above what FreeNAS 9.3 already has.

The FreeNAS Mini (not mentioned in the article) seems a bit underpowered (Atom processor) but it's a turnkey solution for $1000 plus disks. I might go that way rather than trying to screw together a box by myself.


I think the main advantage of FreeNAS is the GUI and the support tooling around this web interface. This means creating storage pools and datasets and managing those, creating and managing jails, as well as access control and sharing. It really makes managing the NAS a lot easier.


I'd rather have a rack-mount than that case. In my opinion, it would be a bit easier to replace faulty hard drives.

Otherwise, it seems quite neat!


As someone with a similar build (FreeNAS, 8x3TB RAIDZ2 main array, 3TB scratch/hot spare, 2TB scratch, and 500GB SSD for jails):

I would love a rackmount for drive accessibility, but I can't justify 10x the case cost and the additional engineering in making it as quiet and clean as my Fractal Design R4. Rackmount stuff just isn't designed for either of those: high static pressure fans and an expectation of pre-filtered air. (That said, I'd totally be willing to pay ~$500 if such a case existed.)


10x the cost?

I picked up a 24-bay Supermicro case on eBay for $265. This came with 24 hot-swap bays and even a SAS2 expander backplane, so I only have to connect one cable from the motherboard SAS controller to the case, and all 24 disks just work.

As far as making it quiet: well, that consisted of buying an $11 fan wall, installing three 120mm Noctua PWM fans in it, and writing a simple script to govern the fan speeds based on the HDD temperatures, keeping them in check with the minimum fan speeds necessary.
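
The script really is that simple. A rough sketch of the idea, reading drive temperatures with smartctl and mapping the hottest one onto a duty cycle; how you actually set the fan speed is board-specific, so set_fan_duty below is a stand-in for your own ipmitool or hwmon incantation:

  # fan_curve.py - sketch: map the hottest drive temperature to a fan duty cycle
  import subprocess, time

  DRIVES = ["/dev/sda", "/dev/sdb"]         # assumption: list your data disks here

  def drive_temp(dev):
      out = subprocess.run(["smartctl", "-A", dev],
                           capture_output=True, text=True).stdout
      for line in out.splitlines():
          if "Temperature_Celsius" in line: # some drives use Airflow_Temperature_Cel instead
              return int(line.split()[9])   # RAW_VALUE column
      return 0

  def duty_for(temp_c):
      # Simple curve: 30% below 35C, ramping to 100% at 45C.
      if temp_c <= 35:
          return 30
      if temp_c >= 45:
          return 100
      return 30 + (temp_c - 35) * 7

  def set_fan_duty(percent):
      # Stand-in: replace with your board's mechanism (ipmitool raw bytes, hwmon pwm files, ...).
      print("would set fans to", percent, "%")

  while True:
      set_fan_duty(duty_for(max(drive_temp(d) for d in DRIVES)))
      time.sleep(60)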

It's sitting in my bedroom next to my bed and I can hardly hear it.

I will concede the dust issue, but I have a normal air filter in my room and don't normally have dust issues with my computers. Maybe I clean them out once a year in the spring if they need it.


I'd also prefer 19" racks, but for a home server I'd advise against it, due to the fact that most rack cases are unbearably noisy thanks to higher-RPM fans and the way the airflow is designed in the case. I'd like to put in quiet, slow 120mm fans, but I've not seen a 19" rack case that could easily accommodate them.


This is all great info. The only thing I shudder at is that there is one huge single point of failure. I've learned one thing when building huge dumb storage devices: build two and mirror them. I've got 32TB of storage mirrored, so if one hits the fan I've got an exact copy.


I would recommend you also do snapshots; mirroring will just replicate a bad fuckup.

As for OP, I share your concerns. I'm skeptical of the reliability of consumer gear in an application like this, especially in the absence of an actual backup solution (maybe he doesn't care).

He calls it a backup server, but it's actively serving files.


Yes, I agree, mirroring will help when a disk goes bad, but it won't help if you accidentally delete data. This is the one that people tend to forget a lot, which is ironically the original reason for having backups.


I have two sets of drives on my home server and keep them mirrored using a nightly rsync. It's infrequent enough that I can recover from accidents.


This is why people should use ZFS (or Btrfs): snapshots. My home server runs minutely, hourly, daily and monthly snapshot jobs. Cryptolocker could have a field day, and the most likely scenario is that at worst I'd lose a day of data; more likely maybe 15 minutes' worth.
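
A minimal sketch of one such job, assuming a dataset named tank/data and the standard zfs CLI; run it from cron at each granularity you want and it keeps a rolling window:

  # snapshot_rotate.py - take a timestamped snapshot, then prune old ones with the same prefix
  import subprocess
  from datetime import datetime

  DATASET = "tank/data"   # assumption: your dataset
  PREFIX = "auto-15min"   # one prefix per schedule (minutely/hourly/daily/...)
  KEEP = 96               # e.g. 96 x 15 minutes = one day of history

  name = "%s@%s-%s" % (DATASET, PREFIX, datetime.now().strftime("%Y%m%d-%H%M"))
  subprocess.run(["zfs", "snapshot", name], check=True)

  # List this dataset's snapshots, oldest first, and destroy the surplus ones.
  out = subprocess.run(["zfs", "list", "-H", "-t", "snapshot", "-o", "name",
                        "-s", "creation", "-r", DATASET],
                       capture_output=True, text=True, check=True).stdout
  ours = [s for s in out.splitlines() if "@" + PREFIX + "-" in s]
  for old in ours[:-KEEP]:
      subprocess.run(["zfs", "destroy", old], check=True)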


I'd love to, but my OS prevents me from running ZFS.


What I hate about FreeNAS is that it is not permissive about what kind of disks you put in. I bought a Drobo 5N just because I was able to slap in any drives I wanted, in any size or configuration, and it would just work.

When FreeNAS can handle that, automatically and on the fly, I'll switch to that.


Interesting article. A few more points which may be of interest:

- In addition to RAID, it's worth having automated off-site backup. The best solution I could find is duplicity, as it's encrypted and supports a bunch of backends (rough sketch after this list).

- FreeBSD supports full-disk encryption using GELI. With some work it's possible to make it boot (only) from a USB key, giving some protection if the server is stolen. I believe newer versions of the Intel Atom support hardware AES acceleration, so this isn't a large overhead.

- If the memory requirements of ZFS are too large (which, to be honest, for a SOHO application they are!), then you can use UFS together with FreeBSD software RAID1 (gmirror).
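
A rough sketch of the duplicity part, driven from a nightly cron job; the target URL, passphrase handling and retention flags here are assumptions to adapt (check the duplicity man page for your version):

  # offsite_backup.py - hedged sketch of driving duplicity for automated off-site backups
  import os, subprocess

  SOURCE = "/tank/important"                                        # assumption: what to back up
  TARGET = "sftp://backupuser@offsite.example//backups/important"   # any duplicity backend URL

  env = dict(os.environ, PASSPHRASE="use-a-real-secret-store")      # duplicity encrypts with GnuPG

  # Incremental most nights, forced full backup once a month.
  subprocess.run(["duplicity", "--full-if-older-than", "1M", SOURCE, TARGET],
                 env=env, check=True)
  # Keep only the last three full chains around.
  subprocess.run(["duplicity", "remove-all-but-n-full", "3", "--force", TARGET],
                 env=env, check=True)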


But why?

I'll assume the media mentioned in the article that's being stored is illegal. For the cost of that home server you could very likely legally watch everything and actually help finance the creation of new stuff. Even if that's not true, how many of the movies you watched do you watch more than once? And 24 TB? How do you find time to watch that much stuff?


> you could very likely legally watch everything

this might be true if you live in the US, but in some other countries it is still hard to actually buy movies or tv shows.

as for the legality itself: in some countries downloading media itself for personal use is not illegal.

for example in switzerland, there is a media tax/fee included in the price of every device that potentially could store pirated materials. the income from this tax is then distributed to media producers and artists. in return, copying copyrighted material is tolerated for private use and consumption... this excludes redistribution and uploading.


> this might be true if you live in the US, but in some other countries it is still hard to actually buy movies or tv shows.

I actually live in Germany, and you also can't get everything here, so this is true for me as well. Sometimes you simply have to accept that something is not available. It's not like there's not enough media out there.

> in some countries downloading media itself for personal use is not illegal.

But offering the download you use is still illegal. But that's all just semantics and doesn't matter that much. What I find more important is the moral issue.

> for example in switzerland, there is a media tax/fee included in the price of every device that potentially could store pirated materials. the income from this tax is then distributed to media producers and artists. in return, copying copyrighted material is tolerated for private use and consumption...

We have a very similar fee in Germany and it's there to allow normal copying in personal use (something that would be called fair use in the US). I guess the swiss fee is there for the same reason and is not there to allow people to get the majority of their media for no additional money. The fee most likely is not enough to finance the media producers and artists. Why spend it on hardware when you can give it to the people that produce what you enjoy?


Given the author talks about "workplace", I would more expect something like storing his own videos and photos (4k and RAW are huge), instead of digital hoarding of pirated media.


Yep. I have a more modest setup but the amount of media I've ripped from owned CDs/DVDs and the number of raw photos far outweighs the handful of pirated movies I've picked up over the years.

Mainly I use my NAS as a bit of protection against losing those files that would be difficult or impossible to recover if my local storage in the workstation failed. Online backup is good too but for quicker or more frequent access, a NAS fills the gap nicely.

The other big storage hog is security cam video. There are occasional reports of burglaries in my neighborhood and sometimes I just like to know if a package was ever dropped off or someone bumped the car while parallel parking. So I picked up a couple of inexpensive IP cameras and rather than shelling out monthly for some unreliable and potentially insecure "cloud" storage plan, I use Synology's Surveillance Station IP cam software to manage recording, playback, and storage of camera footage. The amount of space on the NAS means I can easily keep a week or more worth of recordings from both cameras and with the actual NAS being stashed away out of easy view, it's unlikely to be stolen in the event of a burglary. Granted I could include those files in online storage but currently I don't have it set up that way.

Either way, the point is that many modern homes have plenty of sources of large files outside of pirated movies that can make a NAS useful.


> how many of the movies you watched do you watch more than once

If I don't watch it more than once, why would I keep it at all, or need a media server in the first place?

All your statements contradict each other.


Didn't seem to list the total price.


Wow what on earth are you doing (at home) that you need that much space backed up with that level of reliability?


Having fun.


Two words: "kids" "videos".

Between me and my wife we have our phones, a SLR, and then there are other people's phones. Whenever they fill up I dump them onto my file server and delete from the phone; it's amazing how fast the terabytes get filled up. Nice problem to have I guess.


Same here, and shooting pictures of kids in raw format needs space :-)


The OP bought 6x6TB drives. I truly hope that they didn't configure it to be a 36TB zpool. That should be RAIDZ2 at the very least. Heck, I have 4x2TB drives and I am running RAIDZ2. A 2TB drive took 24 hours to rebuild the data onto the replacement disk. It would probably scale linearly, so 72 hours to rebuild a 6TB drive, during which time the other drives are doing tons of reads.


The article says they're using RAIDZ2


I stand corrected; I see that now. The HN submission is in error, then.


Why is it in error? It's 6x6=36TB. Parity is still storing data. Who says the data storage total number has to be for only user data?

My ZFS pool is 12x4TB in RAIDZ2, and when I query zpool list I get:

  NAME   SIZE   ALLOC  FREE
  tank   43.5T  25.3T  18.2T

The size reported by ZFS itself is 43.5TiB which is close to the 48TB that 12x4TB is.


Eh. I suppose. When any of my friends discuss NAS size, we quote it in "usable space". So you have a 40TB usable NAS. I have a 4TB usable NAS with 4x2TB in RAIDZ2.



