with a side of crazy
So the zfs experiment continues. Upon the release of b129 I set off into the unknown on a voyage of dedupe, which at first held the promise of lower disk usage, faster IO speeds, and a warm fuzzy feeling deep down that you only get from awesome ideas becoming reality. Ahem.
Most sources say you need more RAM, and that is true; what they don't say is how much RAM for what size of data set, which would be more useful to home users like me. My boxes have 2GB of RAM each, and that is not enough for dedupe, nowhere near enough, not if you have 6TB of random-ish data. I might retry when I get to 8GB of RAM, but not before. You see, if ZFS can't keep the whole dedupe table in RAM all the time, any write to a dedupe-enabled volume results in reads, or at least seeks, to fetch the rest of the table. So what I saw was a gradual slowdown while writing to the volume. I was determined to let it finish, to see what savings I would make, and then scrap it due to performance, but after waiting 16 days for the copy, I cancelled it.
The only way I found to even see the contents/size of the dedupe table (DDT) is: zdb -DD
DDT-sha256-zap-duplicate: 416471 entries, size 402 on disk, 160 in core
DDT-sha256-zap-unique: 47986855 entries, size 388 on disk, 170 in core

DDT histogram (aggregated over all DDTs):

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1    45.8M   5.69T   5.66T   5.66T    45.8M   5.69T   5.66T   5.66T
     2     394K   43.0G   40.3G   40.3G     821K   89.0G   83.0G   83.1G
     4    9.90K    527M    397M    402M    47.0K   2.35G   1.76G   1.79G
     8    2.06K    125M   82.4M   83.4M    21.1K   1.20G    795M    806M
    16      391   13.7M   8.54M   8.76M    7.26K    272M    162M    166M
    32       69   1.17M    776K    822K    3.08K   51.3M   32.7M   34.8M
    64       17    522K    355K    368K    1.43K   36.9M   25.1M   26.2M
   128        6    130K      7K   11.2K    1.07K   31.3M   1.50M   2.23M
   256        2      1K      1K   2.48K      833    416K    416K   1.01M
   512        4      2K      2K   4.47K    2.88K   1.44M   1.44M   3.32M
    2K        1     512     512   1.24K    2.79K   1.39M   1.39M   3.46M
 Total    46.2M   5.73T   5.70T   5.70T    46.7M   5.78T   5.74T   5.74T

dedup = 1.01, compress = 1.01, copies = 1.00, dedup * compress / copies = 1.01
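A rough back-of-the-envelope on that output: each unique entry costs about 170 bytes "in core", so with ~48 million unique entries the DDT alone wants around 7.6GB of RAM, which explains why my 2GB boxes never stood a chance. Something like (numbers pulled from my zdb output above; the per-entry cost will vary per pool):

```shell
# Estimate the DDT's RAM footprint: unique entries * in-core bytes per entry
entries=47986855   # from the "DDT-sha256-zap-unique" line above
core=170           # "in core" bytes per entry, same line
bytes=$((entries * core))
echo "$bytes bytes"                        # 8157765350 bytes
echo "$((bytes / 1024 / 1024 / 1024)) GB"  # i.e. well over 7GB
```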
Savings of around 80GB with dedupe and compression (this is a backup box, so there's no real-world performance requirement) are just not worth needing 3-n times the RAM, and possibly an SSD for the L2ARC cache to speed things up. Yes, the suggestion, and the observed behaviour, was to hook up a cheap small (30GB) SSD as cache to accelerate it. I wouldn't mind that so much for a primary box, but this is my backup/2nd-copy box, so it's not really ideal. Certainly not for 80GB of savings, or at current prices around $5 of disk.
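For reference, hooking up that SSD as an L2ARC cache device is a one-liner; the pool and device names here are placeholders, not my actual setup:

```shell
# Add a (hypothetical) 30GB SSD at c5t2d0 as an L2ARC cache device
zpool add tank cache c5t2d0

# It should then appear under a "cache" section in the pool status
zpool status tank
```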
My second attempt is now underway. This time I've sliced my data sets into more volumes, which means a smaller average size, around 2TB max per volume, which from experience at work I've learned is a good rule of thumb. Now I can enable compression+dedupe on only specific bits, hopefully where the most savings are to be made, and the rest is just stored raw. This way the savings might be similar, but without the major write-speed penalty. I've also realised that for the production box, if I want screaming performance, I'll throw an SSD on there, but that means more SATA ports, which means a major change. I need to work on power management too.
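For what it's worth, the per-volume slicing looks roughly like this; pool and dataset names are made up for illustration:

```shell
# Dedupe+compression only on the datasets where it should pay off...
zfs create -o dedup=on -o compression=on tank/backups

# ...while the rest is stored raw
zfs create tank/media

# Properties can also be flipped on an existing dataset;
# they only apply to newly written data
zfs set compression=on tank/docs
```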
One thing that has gone right this time is that I'm now using CF->IDE adaptors and booting off those. This way the OS thinks it's on a 2GB HDD, so booting doesn't have the complexity of USB boot; it also uses less power and doesn't take up a SATA port. Of course new boards don't have PATA anymore, so I might need a CF->SATA one in future.
Another thing that must be said, Solaris's CIFS server is fast.