Storage and/or Cloud

I’ve been looking at mixing it all up - completely - post 1 of 2.

For a storage box I’ve been running OmniOS for a while now (previously OpenSolaris), however OmniOS is now at a critical juncture in the community. The corporate sponsor has decided to no longer support it, a move intended to force the community to step up and participate more. They took a gamble on the project dying or regrouping and moving on; so far we’re yet to see the outcome. There is a new group trying to continue on as OmniOSCE which I’ll keep an eye on.

I have just upgraded to the latest (and last) release of the original, so I’ll be OK for a while, and I build off a local repo anyway; it just means no more upstream updates until the fork takes off. In the meantime, though, I’ve been looking at switching distros for it. I don’t run any add-ons, and just have a basic (ansible’d) setup applied to a PXE-boot installed system. This does mean a few VMs - a package server and a test NAS box VM - just to support the two physical boxes which are “production” in comparison.

Originally, running OpenSolaris was about getting the latest ZFS release possible, initially in a VM due to hardware compatibility, and then natively on bare metal. Since then the ZFS landscape has changed a lot. OpenZFS is a thing now, and ZFS on Linux is packaged for most if not all distros and considered stable. It’s also running the current version that the other platforms run. FreeBSD has the latest too, which flows into FreeNAS. So now I have a fair few potential NAS OSes to select from for my modest requirements (NFS and CIFS are all I use now, but I have used LUNs in the past). Supporting one fewer bespoke distro would help my ansible problem - this is a big plus for running CentOS on the NAS box.

One thing that hasn’t changed is the ZFS reshaping problem. If you want to expand your raidz2 stripe size you’re still stuck with a dump-and-reload operation. I’ve solved this by keeping an offline second copy, so reshaping means updating the mirror, destroying the pool, recreating it, and syncing back. Now, however, the mirror is approaching the capacity limit where consuming any more space will destroy performance - it’s nearly too full, so it will need reshaping. But that means new drives, and the chassis is full, so it’s a painful problem that needs out-of-the-box thinking. There are some tricks people use around smaller raid groups and in-place size upgrades, but these consume more parity disks, which actually makes the problem worse, not better.
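To see why splitting a pool into smaller raidz groups makes the capacity problem worse rather than better, here’s a quick back-of-the-envelope sketch (the disk counts are illustrative, not my actual layout):

```python
# Parity overhead for raidz2 layouts - illustrative numbers only.

def usable_fraction(disks_per_group: int, groups: int, parity: int = 2) -> float:
    """Fraction of raw capacity left after parity, given identical raidz groups."""
    data_disks = (disks_per_group - parity) * groups
    total_disks = disks_per_group * groups
    return data_disks / total_disks

# One wide 12-disk raidz2 group: 2 parity disks total.
print(usable_fraction(12, 1))  # 10/12, about 0.83 usable

# Three 4-disk raidz2 groups from the same 12 disks: 6 parity disks total.
print(usable_fraction(4, 3))   # 6/12, only 0.5 usable
```

Same dozen drives, but the smaller groups hand half the raw capacity over to parity.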

Out-of-the-box thinking starts with rationalising (deleting/cleaning up) and ends with wanting to stash it all in the cloud with a local cache. The data access pattern is quite predictable. Some areas are hot (new stuff), some are cold with predictable warming (rewatching an old show, for example), and it’s mostly write once, read sometimes. You could say the data is fairly cold in nature. The idea of leveraging the cloud would be for one of the copies, most likely the online one, with just an offline second copy kept locally just in case.

Ages ago I looked at and played with S3QL, storing data in S3 and GCS. At the time I had latency issues (even from a cloud instance), and the author didn’t want to even consider supporting Backblaze B2 (which was heaps cheaper) purely because they only had one datacenter (now they have two). It looks like someone else has since completed the B2 code, but it has yet to be merged. I’ll have to look at it again - though I noticed the RPMs have fallen out of the repo due to neglect. I might have to see if I can clean that up.
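For anyone curious what the S3QL setup looks like: credentials live in an `authinfo2` ini file, something like the sketch below (bucket name, prefix and keys are placeholders, and the file must be chmod 0600 or S3QL refuses to use it):

```ini
# ~/.s3ql/authinfo2 - sketch only, all values are placeholders
[s3]
storage-url: s3://my-bucket/s3ql-prefix
backend-login: ACCESS-KEY-ID
backend-password: SECRET-KEY
fs-passphrase: long-random-passphrase
```

The `fs-passphrase` is what drives S3QL’s client-side encryption - the provider only ever sees encrypted, chunked, deduped blocks, which is exactly the property Plex Cloud can’t work with.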

Playing with this idea was good timing, as Plex Cloud happened, and it works quite well. The only gotcha is that you have to store the data unmodified - raw, in cleartext, in the cloud. Not encrypted and chunked. That’s a risk, to be honest. It’s also limited in the types of cloud it supports: Dropbox/Drive rather than object stores. So the pricing model is different, and the account is more likely to be closed for storing excessive data, rather than just being charged more as bucket storage does with its utility pricing. Ignoring that, I tested with some files hosted in Google Drive and was very impressed with the Plex Cloud server streaming back down, even over 4G to a phone. This could work, so it needs further research.

I have already mentioned NetApp AltaVault in a previous post. I’m quite happy with this product as long as I have VMware running, except for the memory consumption of the 40TiB model. It’s a commercial solution to the same problem S3QL tries to solve - a big filesystem to write stuff into, which it dedupes and encrypts. It does work with Backblaze B2, but not directly, so I had to use s3proxy to interface with it. With this setup I had terrible performance which I never got to the bottom of; it was either a threading issue or ISP throttling. Upstream was fine, just downstream was unusable.
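The s3proxy shim itself is just a properties file: it exposes an S3-compatible endpoint locally and translates to a jclouds backend, B2 in this case. A sketch of the idea follows - the identities and keys are placeholders, and the exact property names may vary between s3proxy versions:

```ini
# s3proxy.conf - sketch; all credentials are placeholders
s3proxy.endpoint=http://127.0.0.1:8080
s3proxy.authorization=aws-v2-or-v4
s3proxy.identity=local-access-key
s3proxy.credential=local-secret-key

jclouds.provider=b2
jclouds.identity=B2-ACCOUNT-ID
jclouds.credential=B2-APPLICATION-KEY
```

AltaVault then just points at `127.0.0.1:8080` as if it were S3. Every object transits the proxy, so a threading or connection-pool limit there is a plausible suspect for the one-directional throughput problem.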

For the second copy, I’ve considered using SnapRAID, as it would work acceptably with my irregularly synced, infrequently accessed second copy. No reason not to run it on CentOS too. This also solves the drive upgrade issue, as it doesn’t require all drives to be the same size (the parity drives need to be the largest; that’s the only restriction). It would be possible to add a few cheap 10TB archive drives as parity drives and gain some capacity that way.
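The mixed-drive flexibility falls straight out of SnapRAID’s config: each data drive is just a mount point, and parity is a file per parity level. A sketch (paths and drive count invented for illustration):

```ini
# /etc/snapraid.conf - sketch; paths are hypothetical
# Parity files live on the largest drives (e.g. new 10TB archive disks).
parity   /mnt/parity1/snapraid.parity
2-parity /mnt/parity2/snapraid.parity

# Content files (the metadata index), kept in multiple places.
content /var/snapraid/snapraid.content
content /mnt/data1/snapraid.content

# Data drives can all be different sizes.
data d1 /mnt/data1/
data d2 /mnt/data2/
data d3 /mnt/data3/
```

With two parity files this tolerates two drive failures, like raidz2, but a periodic `snapraid sync` replaces real-time parity - which suits an irregularly synced offline copy just fine.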

This is just part of the problem - it’s a big part, but still just a part.

You’re part of the problem


Virtual Complexity Insanity

Over the years my environment has grown in leaps and bounds for various reasons. Many years ago everything just ran off one Linux box: NAS, downloading and backups. Everything. Over time this has swelled up and is now beyond a joke.

There was a time when I ran a single ESX host with a passthrough PCIe card to an OpenSolaris VM for NAS, and a Linux VM for everything else. Maybe a Windows VM for the vSphere client, and that was it.

Now I’m at a stage where two decently specced hosts are overloaded (always RAM) and a collection of supporting VMs is eating up a substantial amount of those resources. Part of the reason is to keep my skills current and ahead of my workplace - since I don’t get adequate time to learn at work, and the environments available there aren’t suited to some of the experimenting that’s really needed. Also, anything cloud related is impossible at work due to network security and network performance.

However, I have labbed up VMware vSAN and learned a heap over the 13 months I’ve been running it - yeah, it’s been that long. It’s a 2-node ROBO deployment with a witness appliance (on a third host). This has improved in leaps and bounds from the 6.1 vSAN release I started on, up to the 6.6 I’m on today. It’s not without issues, of course. I’ve corrupted VMDKs, and in at least one instance lost a VMDK entirely. I would NOT recommend running the 2-node ROBO design on a business site. Compared to a standalone host it’s probably still worth a go, but be aware of the limits, stick to the HCL, and watch the patch releases closely - many have been for data corruption issues. Fortunately patching is simple now that the vCenter Server Appliance (VCSA) has Update Manager built in. For now, though, the vSAN UI is entirely in the old Flash UI, not the new HTML5 UI. vSphere 6.5 is a great improvement in every way on the versions before it.

I’ve also labbed up OnCommand Insight, which is an amazing product. Its only issue is that it’s way too expensive. The product has a front-end real-time UI and a back-end data warehouse for scheduled or ad hoc reports. I’ve only labbed up the front end, as it’s great for identifying issues in the VMware stack and just generally poking around at where your resources have gone. For home, though, the VMs eat heaps of resources - 24GB RAM and 8 cores for the main server, and 24GB RAM and 2 cores for the anomaly detection engine (I should see if I can lower that RAM usage).

OCI vSAN

vRealize Log Insight is similar to Splunk but free-ish from VMware (depending on your licensing). This eats up a lot of resources at home too - 20% CPU all the time (with 2 cores assigned). Its default sizing keeps nearly 12 months of logs, which is way more than I could ever need.

Other NetApp bits I’ve labbed up are the NetApp simulator and associated pieces - OnCommand Unified Manager and Workflow Automation. That’s a handful more VMs, and I’ve got two versions of the simulator too, for testing upgrades and compatibility. I just don’t run both at once except when I need to test something specific.

NetApp AltaVault is another one I’ve been playing with. It gives you a CIFS/NFS target locally with a cache and stashes it all in the cloud (S3 bucket-style storage). For a while I was keen to use this for a cloud backup of my data, however the VM is pretty heavy (24GB RAM, and a minimum 2TiB cache disk, with 8TiB recommended for the 40TiB model) and the egress pricing out of the cloud is still higher than I’d like. Still, it’s a great product and works fine.

At one stage I had labbed up VMware NSX too, but due to some issues (which I now believe have been addressed) I had to remove it. I haven’t since returned for another go.

Obviously not all of this needs to run all the time, but in many ways it’s less useful to have them not running constantly: gaps appear in the data, and there’s the time needed to start the environment up again before testing and shutting it down. Or daily tasks within the tools that won’t run if they’re not left running. Yeah yeah, another automation problem.

Automation Problem

OK, so far that’s just a numbers game. Too many VMs, too much disk. Trying to do too much at once. Can’t fault that logic.

The downside is that this situation only occurred because I had the capacity for it. If I didn’t have two decent ESX hosts and a few TB spare for VMs, it would never have happened. The ongoing challenge now is to rationalise the VMs down to their minimum size and keep everything up to date (more pets again, by the looks).

Or do I just toss it all in the bin and go back to basics in the interests of costs, time and overheads?

Dumpster Option

