random complexity

with a side of crazy


Containers or hell?

I've been looking at mixing it all up - completely - post 2 of 2.

Mixing things up

The other part to look at is the VMs and app hosting. If I end up running Linux on a box to serve it all, I can host many/all of the basic apps I use directly on there. But should they be isolated, and to what degree?

All the hipsters are into containers today, but with them I see a common problem of software security/patching. On the operations side we're trading a full-blown VM, with a guest OS we support and patch, for a black-box container running whatever the developer put in it. People also push/pull these container images from a global repository all the time; at least there is a reputation system, but we know how those can be gamed. I'm just concerned about the contents of the image, as it's possible the container maintainer is not the same team that wrote the software you actually want. The container could contain any code, which will then run on your network. You're putting trust in an additional party - or taking on the container packaging role yourself.

Hipster containers

If you take on the role yourself, then you need to ask what you're gaining or protecting yourself from anyway. I run a few python-based webapps, each as a service out of systemd on a CentOS VM. One VM, several services, each as a separate user. This VM only has read-only NFS access, except for a folder where each app's config/database resides (or where they need to drop or write files). This level of isolation isn't too dissimilar to containers within docker - with one exception: with docker you create a container per app. It is true application isolation.

This led me to wonder how far you should take it. I run Observium to monitor everything, and it uses mysql (mariadb) for some things. Should this database engine live within the container (a self-contained single-app container), or should the database be separate, using linked containers so the app server can find its DB server if it ever moves to a separate host (see the sketch below)? The usual googling turned up a few answers, but none that made it totally clear one way or the other. It always depended on x, y or z.
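Sticking with python, since that's what my own webapps are written in, here's a minimal sketch of how an app could resolve its DB location from environment variables (which docker links or user-defined networks can inject), so the same code works whether mariadb sits in the same container, a linked one, or on another host. The PyMySQL driver, variable names and defaults here are my assumptions for illustration, not anything a particular app actually ships with.

```python
import os

import pymysql  # assuming a MySQL/MariaDB driver such as PyMySQL

# Resolve the database location from the environment instead of hard-coding
# localhost. Docker can inject these via --link, a user-defined network or
# compose, so the app container doesn't care where the DB actually runs.
DB_HOST = os.environ.get("DB_HOST", "127.0.0.1")
DB_PORT = int(os.environ.get("DB_PORT", "3306"))


def get_connection():
    # Placeholder credentials - in practice these would come from secrets,
    # not defaults baked into the code.
    return pymysql.connect(
        host=DB_HOST,
        port=DB_PORT,
        user=os.environ.get("DB_USER", "webapp"),
        password=os.environ.get("DB_PASSWORD", ""),
        database=os.environ.get("DB_NAME", "webapp"),
    )
```

If the DB ever moves to a separate container or host, only the environment changes; the app image stays the same.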

If it's all self-contained, then the external attack footprint is smaller (fewer ports opened), but you lose the ability to scale the app server separately from the DB server, or even run them on separate hosts. Not a huge issue for me to be honest - but let's do things properly and over-engineer.

Putting the database in a container of its own has similar shortcomings to the integrated one. The database files need to be external to the container, which is fine - we want clear separation of data from code, so that's ok. Then what about backups? Is there a scheduled task that connects to the container and runs the backup (or does the job live inside the container), again writing outside the container (a sketch of one way this could look is below)? How it connects would vary between an integrated and a separate container, due to which ports are opened.

In that case, do we share this DB container with another application which might also need the same database engine? Suggestions say no, due to version dependencies possibly changing between the applications. Yikes. Now we're running multiple DB instances on potentially the same hardware.

It's also not clear to what degree memory deduplication works on docker - if at all. This quote sealed the deal for me: "If you have a docker host running on top of a hypervisor, then it should be possible for either the docker host or the hypervisor to do memory deduplication and compression." So we're back to running a hypervisor to make up for a kernel feature which exists but doesn't work with docker due to process isolation (cgroups), oops.
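For what it's worth, here's a minimal sketch of the "scheduled task outside the container" backup option mentioned above. The container name, credentials and destination path are all made up for illustration, and it assumes the standard docker CLI and mysqldump are available:

```python
#!/usr/bin/env python3
"""Nightly dump of a MariaDB container to a path outside the container.

Container name, credentials and destination are illustrative only.
"""
import datetime
import pathlib
import subprocess

CONTAINER = "observium-db"                       # hypothetical container name
BACKUP_DIR = pathlib.Path("/srv/backups/mysql")  # on the host/NFS, not in the container


def backup():
    BACKUP_DIR.mkdir(parents=True, exist_ok=True)
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    dump_file = BACKUP_DIR / f"all-databases-{stamp}.sql"
    # mysqldump runs inside the container; its stdout is captured on the host,
    # so the dump never touches the container's writable layer.
    with dump_file.open("wb") as out:
        subprocess.run(
            ["docker", "exec", CONTAINER,
             "mysqldump", "--all-databases", "--single-transaction",
             "-uroot", "-psecret"],               # placeholder credentials
            stdout=out,
            check=True,
        )


if __name__ == "__main__":
    backup()
```

If the database ran as a separate container with its port published, the same job could just as easily connect over TCP instead of using docker exec.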

Docker also seems to go against my ansible quest. Since the docker way of updating is to throw the instance away and start a new one, the data you need to keep is not touched, as it sits outside of the container (a rough sketch of that flow is below). I do like this bit, but I've already done that by having the apps sit on an NFS export. The approach does have merit, as the dockerfile is a top-down script on how to build the container content. Being focused on a single goal, some I've looked at are quite concise, while others are hugely complicated. YMMV.
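As a rough sketch of that throw-away-and-replace update flow, using the docker SDK for Python (the CLI would do the same job) - the image name, data path and port mapping are placeholders, not a real deployment:

```python
import docker  # docker SDK for Python (docker-py); assumed installed
from docker import errors

client = docker.from_env()

IMAGE = "example/webapp"  # hypothetical image name
TAG = "latest"
NAME = "webapp"
# Persistent data lives on the host (or NFS), mounted into the container.
VOLUMES = {"/srv/webapp-data": {"bind": "/data", "mode": "rw"}}

# "Update" = pull a fresh image, throw away the old container, start a new one.
client.images.pull(IMAGE, tag=TAG)

try:
    old = client.containers.get(NAME)
    old.stop()
    old.remove()
except errors.NotFound:
    pass  # first run, nothing to replace

client.containers.run(
    f"{IMAGE}:{TAG}",
    name=NAME,
    detach=True,
    volumes=VOLUMES,
    ports={"8080/tcp": 8080},                  # placeholder port mapping
    restart_policy={"Name": "unless-stopped"},
)
```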

Oh, and then don't forget you can run containers on ESX now with vSphere Integrated Containers.

I've said many times before, the plot chickens.

The Plot Chickens

Virtual Complexity Insanity

Over the years my environment has grown in leaps and bounds for various reasons. Many years ago everything just ran off one Linux box: NAS, downloading and backups. Everything. Over time this has swelled up and is now beyond a joke.

There was a time when I ran a single ESX host with a passthrough PCIe card to an OpenSolaris VM for NAS, and a Linux VM for everything else. Maybe a Windows VM for the vSphere client, and that was it.

Now I'm at a stage where two decently specced hosts are overloaded (always RAM) and a collection of supporting VMs is eating up a substantial amount of those resources. Part of the reason is to keep my skills current and ahead of my workplace - since I don't get adequate time to learn at work, and the environments available there aren't suited to some of the experimenting that's really needed. Also, anything cloud related is impossible due to network security and network performance.

However, I have labbed up VMware vSAN and learned a heap over the 13 months I've been running it - yeah, it's been that long. It's a 2-node ROBO deployment with a witness appliance (on a third host). This has improved in leaps and bounds from the 6.1 vSAN release I started on, up to the 6.6 I'm on today. It's not without issues of course: I've corrupted VMDKs and in at least one instance lost a VMDK entirely. I would NOT recommend running the 2-node ROBO design on a business site. Compared to a standalone host it's probably still worth a go, but be aware of the limits, stick to the HCL and watch the patch releases closely - many have been for data corruption issues. Fortunately patching is simple now that the vCenter Server Appliance (VCSA) has Update Manager built in. For now though, the vSAN UI is entirely in the old Flash client, not the new HTML5 UI. vSphere 6.5 is a great improvement in every way over the versions before it.

I've also labbed up OnCommand Insight, which is an amazing product. Its only issue is that it's way too expensive. The product has a front-end real-time UI and a back-end data warehouse for scheduled or ad hoc reports. I've only labbed up the front end, as it's great for identifying issues in the VMware stack and just generally poking around at where your resources have gone. For home though, the VM does eat heaps of resources - 24GB RAM and 8 cores for the main server, and 24GB RAM and 2 cores for the anomaly detection engine (I should see if I can lower that RAM usage).

OCI vsan

vRealize Log Insight is similar to Splunk but free(ish) from VMware (depending on your licensing). This eats up lots of resources at home too - 20% CPU all the time (2 cores assigned). Its default sizing holds nearly 12 months of logs, which is way more than I could ever need.

Other NetApp bits I've labbed up are the NetApp simulator and associated pieces - OnCommand Unified Manager and Workflow Automation. That's a handful more VMs, and I've got two versions of the simulator too, for testing upgrades and compatibility. I just don't run both at once, except when I need to test something specific.

NetApp AltaVault is also one I've been playing with. It gives you a CIFS/NFS target locally with a cache and stashes it all in the cloud (S3 bucket style storage). For a while I was keen to use this for a cloud backup of my data, however the VM is pretty heavy (24GB RAM, and a minimum 2TiB cache disk, with 8TiB recommended for the 40TiB model) and the egress pricing out of the cloud is still higher than I'd like. Still, it's a great product and works fine.

At one stage I had labbed up VMware NSX too, but due to some issues (which I now believe have been addressed) I had to remove it. Since then I haven't returned for another go.

Obviously not all of this needs to run all the time, but in many ways it's less useful to have things not running constantly, due to gaps in the data, the time needed to spin the environment up again before testing, or daily tasks within the tools which won't run if they're not left running. Yeah yeah, another automation problem.

Automation Problem

Ok, so far that's just a numbers game. Too many VMs, too much disk. Trying to do too much at once. Can't fault that logic.

The downside is that this situation has occurred only because I had the capacity for it. If I didn't have two decent ESX hosts or a few TB spare for VMs, this would never have happened. The ongoing challenge now is to rationalise the VMs down to their minimum size and keep things up to date (more pets again, by the looks).

Or do I just toss it all in the bin and go back to basics in the interests of costs, time and overheads?

Dumpster Option
