Archive for January, 2010

Zfs experiment continued

posted by robert
Jan 18

So the zfs experiment continues. Upon the release of b129 I set off into the unknown on a voyage of dedupe, which at first held the promise of lower disk usage, faster IO speeds and a warm fuzzy feeling deep down that you only get from awesome ideas becoming reality. Ahem.

Most sources say you need more ram, and that is true. What they don’t say is how much ram for what size data set, which might be more useful to home users like me. My boxes have 2gb of ram each, and that is not enough for dedupe, nowhere near, not if you have 6 TB of randomish data. I might retry when I get to 8gb ram but not before. You see, if it can’t keep the whole of the dedupe table in ram ALL the time, any write to a dedupe enabled volume will result in reads for the rest of the table, or at least seeks. So what I saw was a gradual slowdown while writing to the volume. I was determined to let it finish, to see what savings I would make, and then scrap it due to performance, but after waiting 16 days for the copy, I cancelled it.

The only way I found to even see the contents/size of the dedupe table (DDT) is: zdb -DD <poolname> which results in an output like this:

DDT-sha256-zap-duplicate: 416471 entries, size 402 on disk, 160 in core
DDT-sha256-zap-unique: 47986855 entries, size 388 on disk, 170 in core

DDT histogram (aggregated over all DDTs):

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1    45.8M   5.69T   5.66T   5.66T    45.8M   5.69T   5.66T   5.66T
     2     394K   43.0G   40.3G   40.3G     821K   89.0G   83.0G   83.1G
     4    9.90K    527M    397M    402M    47.0K   2.35G   1.76G   1.79G
     8    2.06K    125M   82.4M   83.4M    21.1K   1.20G    795M    806M
    16      391   13.7M   8.54M   8.76M    7.26K    272M    162M    166M
    32       69   1.17M    776K    822K    3.08K   51.3M   32.7M   34.8M
    64       17    522K    355K    368K    1.43K   36.9M   25.1M   26.2M
   128        6    130K      7K   11.2K    1.07K   31.3M   1.50M   2.23M
   256        2      1K      1K   2.48K      833    416K    416K   1.01M
   512        4      2K      2K   4.47K    2.88K   1.44M   1.44M   3.32M
    2K        1     512     512   1.24K    2.79K   1.39M   1.39M   3.46M
 Total    46.2M   5.73T   5.70T   5.70T    46.7M   5.78T   5.74T   5.74T

dedup = 1.01, compress = 1.01, copies = 1.00, dedup * compress / copies = 1.01
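
As a rough sanity check on the ram question, the two DDT summary lines can be turned into a size estimate. This is only a sketch: it assumes the "size N on disk, M in core" figures are average bytes per entry, so entries times size gives the table footprint. The function name ddt_footprint is made up for illustration and is not part of any zfs tool.

```python
import re

# The two DDT summary lines copied from the zdb -DD output above.
ZDB_LINES = """\
DDT-sha256-zap-duplicate: 416471 entries, size 402 on disk, 160 in core
DDT-sha256-zap-unique: 47986855 entries, size 388 on disk, 170 in core
"""

# Assumption: the "on disk" / "in core" numbers are average bytes per entry.
PATTERN = re.compile(
    r"^(?P<name>DDT-\S+): (?P<entries>\d+) entries, "
    r"size (?P<disk>\d+) on disk, (?P<core>\d+) in core$"
)


def ddt_footprint(text):
    """Return (bytes_on_disk, bytes_in_core) summed over all DDT lines."""
    disk = core = 0
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if not match:
            continue  # skip anything that is not a DDT summary line
        entries = int(match["entries"])
        disk += entries * int(match["disk"])
        core += entries * int(match["core"])
    return disk, core


disk_bytes, core_bytes = ddt_footprint(ZDB_LINES)
print(f"DDT on disk: {disk_bytes / 2**30:.2f} GiB")
print(f"DDT in core: {core_bytes / 2**30:.2f} GiB")
```

If that assumption holds, the table for this pool wants somewhere between 7 and 8 GiB in core, before counting anything else the ARC has to hold. That would explain why 2gb got nowhere, and suggests 8gb would still be tight for this data set.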

Savings of around 80gb with dedupe and compression (backup box so no real world performance requirement) are just not worth the need for 3-n times the ram and possibly an ssd for the l2arc cache to speed things up. Yep, the suggestion and observed behaviour was to hook up a cheap small (30gb) SSD for cache to accelerate it. I don’t mind that so much for a primary but this is my backup/2nd copy box so it’s not really ideal. Certainly not for 80gb of savings, or at current prices around $5 of disk.
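
For anyone wanting to reproduce the 80gb figure, one plausible reading is the gap between the referenced LSIZE and the allocated DSIZE in the Total row of the histogram. The sketch below assumes that reading, and assumes zdb's "T" means TiB; the inputs are already rounded to three digits, so the result is only good to a few gb either way. The names are illustrative only.

```python
# Total row of the DDT histogram, as printed by zdb (rounded, assumed TiB).
REFERENCED_LSIZE_T = 5.78  # logical size of everything written to the pool
ALLOCATED_DSIZE_T = 5.70   # space actually allocated after dedupe + compression


def savings_gib(referenced_t, allocated_t):
    """Space saved, in GiB, given logical and allocated sizes in TiB."""
    return (referenced_t - allocated_t) * 1024


def combined_ratio(referenced_t, allocated_t):
    """Logical bytes stored per byte allocated (dedupe and compression together)."""
    return referenced_t / allocated_t


saved = savings_gib(REFERENCED_LSIZE_T, ALLOCATED_DSIZE_T)
ratio = combined_ratio(REFERENCED_LSIZE_T, ALLOCATED_DSIZE_T)
print(f"saved roughly {saved:.0f} GiB, combined ratio {ratio:.2f}")
```

The ratio rounds to the same 1.01 that zdb prints on its summary line, which is a decent hint that the reading is about right.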

My second attempt is now underway. This time I’ve sliced up my data sets into more volumes, which means a smaller average size: around 2TB max per volume, which from experience at work I’ve learned is a good rule of thumb. So now I can enable compress+dedupe on only specific bits, hopefully where the most savings are to be made, and the rest is just stored raw. This way the savings might be similar, but without the major write speed penalty. I’ve also realised that if I want screaming performance on the production box, I’ll throw an ssd on there, but that means more sata ports, which means a major change. I also need to work on power management.

One thing that has gone right this time is that I’m now using CF->IDE adaptors and booting off that. This way the OS thinks it’s on a 2gb hdd, so booting doesn’t have the complexity of usb boot, and it also uses less power and doesn’t take up a sata port. Of course new boards don’t have pata anymore so I might need to get a CF->sata one in future.

Another thing that must be said: Solaris’s CIFS server is fast.


A year in reflection

posted by robert
Jan 11

I started thinking about this blog post half way through December but I didn’t start writing it until a week into January.

2009 was yet another year where I was busy yet feeling like I achieved very little.

In March I bought a racing bike, and by the end of the year I had managed to ride 8000km on it, 9000km in total for the year.

I rode in 5 of the CycloSportif rides (all except the first one) and that gave me both a team activity and an area of personal aspiration for improvement.

In October I went to Melbourne for a community bike ride, the Around the Bay in a Day ride. It was my first time in Melbourne and it was great. I managed to complete the 220km (210km event) ride in 9 hours elapsed (7 hours 22 mins riding, 30.2km/hr average).

Then at the end of October I participated in the first ever Tour de Freedom WA ride from Esperance to Perth (not the short way). I managed 850km over 5 days. That was a massive challenge and ultimately a personal victory (not without issue) which I think was both the high point of the year for me and related to the low point too. Since then I haven’t done a whole lot of riding, which is bad, but I just haven’t been in the mood for it.

After years of sort of trying, I managed to stop my web hosting business and cancel my colo server. Part of that exercise was to move my web stuff onto a different service, which ended up being a VPS. The move went smoothly enough and at least the server setup is reproducible (I scripted it) and the backup of the box is easy, fast and complete.

In July I got a grown up couch which has both been a blessing and a curse for various reasons. The bean bag still gets plenty of use.

I managed to watch a lot of TV this year. I picked up several new shows; some got canned in season one, others were old and still going, and some even finished. Towards the end of the year I put in a huge effort to finish off Seinfeld, which I bought on dvd the year before. One particular clip from The Chronicle (the clip show right before The Finale) featured a montage with the acoustic version of Green Day’s Time of Your Life playing. That reminded me of my high school graduation, where this song was played right at the end before leaving. The relevance of this is that Seinfeld ended in May of the same year.

Notable shows that I watched this year: Breaking Bad (2 seasons), Entourage (6 seasons), Two and a Half Men (7 seasons), Chuck (season 3 just started), The Middle (season 1 still airing), Defying Gravity (cancelled in season 1), The Inbetweeners (2 seasons), Stargate Universe (please get good and don’t get cancelled), Dollhouse (pity it got cancelled), The Big Bang Theory (season 3 still airing), True Blood and many more which I’d started previously. Both Dexter and Weeds finished on a high, and both seasons of Top Gear (UK) were, as usual, awesome. There are plenty of good TV shows out there, it’s just a pity the networks seem to air crap rather than quality AND pack too many ads into it. I even have a DVR now (well, if you could call PlayTV a DVR). I hardly ever watched live TV anyway, and now I never have to.

I ended up buying Rock Band 2 from the UK, as it’s still not out locally, despite being released in October 2008 overseas. I’m yet to really finish it, as I sort of lost interest. It’s still fun, but it’s just not the same as it used to be.

I moved my storage systems from Linux (ext3) to FreeBSD (zfs, as a tactical solution) and now (finally) to OpenSolaris (zfs dedupe). OpenSolaris was the original goal, even from October 2008 when I started playing with it. I just wasn’t able to boot it consistently on my hardware, and now that I’m using a CF-IDE adaptor it’s far more reliable than trying to get usb booting to work 100%.

I also got my air con repaired. This was a mission and a half. It took 10 months from logging the fault to having it maybe fixed. The service agent for my area is totally incompetent and I doubt qualified to service anything. I think my letter of complaint to LG about the service agent’s lack of service was what finally got it fixed, though I never got any response from LG and it was only my continued calls to the service agent which resulted in me thinking LG did anything. I’ve been meaning to send them a follow up but haven’t bothered. The unit is still not right and if I have any further issues with it, it’s going to be removed and replaced by a unit from a company that isn’t serviced by the useless agent. Unfortunately this will cost me money, but hopefully save my sanity.

The government managed to get their internet filter legislation through the lower house. I’ve already started writing my thoughts on this whole thing, and that might be my next blog post, once it’s polished up and balanced and not too rambling.

All of that happened while I was busy at work. Ok, so the workload ebbs and flows, but on the whole I’ve been very busy all year. We’ve had 2 drilling rigs on the go, planning for a large dual facility shutdown, and the whole Asia Pacific LNG thing came along.

I guess in hindsight I did achieve a lot. Let’s see what 2010 can bring for me.