Pets vs Cattle vs complexity

So way back in November 2014 I started on an experiment around configuration management. It might have started earlier, but that’s the first commit date in the git repo. Basically I was motivated (somehow) by the realisation that the pets vs cattle analogy worked really well for me. My handful of machines were more bespoke and unique (pets) than they needed to be, and it would be a good idea to make them more throwaway (cattle).

At the time I was already using PXE-booted kickstart scripts - with a fairly complete base build coming out of the kickstart process. My media PC was entirely configured this way, and could be rebuilt - on demand - in about 15 minutes elapsed time. So if anything went bad with a package update, it was already cattle and not a pet. Other machines (desktop and VMs) were built with kickstart but were less cattle and more pet-like. So the pets vs cattle thing had room for improvement, and the other thing this methodology needed was a configuration management tool. Kickstart scripts were not it, as they didn’t work for cloud, where you build from a cloned image.

After some reading around and talking to people, I found that some people loved Puppet, others liked Chef, and a newcomer (at the time), Salt, was gaining some interest. All of these needed agents installed on the destination, and I think all needed a server (application) to drive them. This is ignoring that one was written in Java, one in Ruby (and Erlang) and one in Python, so they also needed their base language installed on the destination to function. To me this meant I couldn’t escape the kickstart script completely, as it would need to install some software beyond the minimum, plus the agent software too.

Then I found Ansible, which Red Hat was sponsoring and Fedora was using. Ansible only needed ssh on the destination to work - no agent at all. However, it did benefit from having Python on the destination for most of its functionality.

The methodology of each of these tools varied a bit.

  • Puppet worked on a model approach and tried to make the destination realize the model. Scripts were written in a custom language and called manifests.
  • Chef used the model idea too and applied the recipe to fit the target to the model. Recipes were written in a custom Ruby-style language.
  • Salt I think was the same again, so I didn’t look too closely.
  • Ansible was pretty much a top down script of custom modules. The modules (mostly) had checks so they could flag whether they needed to do anything, and track success - idempotent scripts were the key. Your stuff is written in YAML documents called plays, and they are arranged into playbooks.

So I started off with Ansible, trying to translate my kickstart scripts into Ansible roles and playbooks. I split out the common bits which apply to all machines into a common role, which even worked across software versions and distributions (various releases of CentOS and Fedora, and later Debian). Each system type then had several roles assigned, which apply the steps in the playbook in a top down fashion. My kickstart script shrank to a totally minimal CentOS/Fedora install which adds a user and ssh key only. From there Ansible could connect and run the playbooks to turn a machine into any system type.
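As a rough sketch of that layout, a top-level playbook might map each system type to the shared common role plus its own roles. The file name, group names and role names below (other than common) are made up for illustration and are not taken from the actual repo.

```yaml
# site.yml - hypothetical top-level playbook; group and role names are illustrative
- hosts: all
  roles:
    - common        # bits shared by every machine, whatever the distribution

- hosts: mediapc
  roles:
    - mediapc       # roles specific to this system type, applied top down

- hosts: desktop
  roles:
    - desktop
```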

Early teething issues annoyed me, like not being able to have multiple things done in a task step. So you end up with heaps of tasks in a playbook, each doing one thing - the exception was anything that could be done repetitively from a list (so multiple calls to the same module could be parameterised from a list/dict of items). Playbooks could be included and passed variables, so some high level automation was possible. Ultimately it was a very verbose way of doing things.
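For reference, the list-driven exception and the include-with-variables trick look roughly like this in the Ansible syntax of the time. The package names, the task file name and the variable are placeholders chosen for the example.

```yaml
# One task, one module, repeated once per list item (package names are examples only)
- name: install base packages
  yum: name={{ item }} state=present
  with_items:
    - vim-enhanced
    - tmux
    - rsync

# Reuse a task file with a different variable each time
# (add_user.yml and username are hypothetical names)
- include: add_user.yml username=alice
- include: add_user.yml username=bob
```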

You end up having to do this

- name: setup privoxy
  lineinfile: dest=/etc/privoxy/config state=present regexp="^listen-address" line="listen-address {{ ansible_default_ipv4.address }}:8118"

- name: insert firewalld rule for privoxy
  firewalld: service=privoxy  permanent=yes state=enabled immediate=yes

- name: enable privoxy
  service: name=privoxy state=started enabled=yes

rather than what made more sense

- name: setup privoxy
  lineinfile: dest=/etc/privoxy/config state=present regexp="^listen-address" line="listen-address {{ ansible_default_ipv4.address }}:8118"
  firewalld: service=privoxy  permanent=yes state=enabled immediate=yes
  service: name=privoxy state=started enabled=yes

though there’s a new keyword since 2.x, block, which I need to look at. It might let me do this.
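A sketch of how block could be applied to the privoxy example follows. Note that block groups ordinary tasks under one heading (so a condition or privilege setting can be shared), rather than letting a single task call several modules, so each module still gets its own entry. Putting a name on the block itself is assumed to need Ansible 2.3 or later; the inner task names are invented here.

```yaml
- name: setup privoxy
  block:
    - name: set listen address
      lineinfile:
        dest: /etc/privoxy/config
        state: present
        regexp: "^listen-address"
        line: "listen-address {{ ansible_default_ipv4.address }}:8118"

    - name: insert firewalld rule
      firewalld:
        service: privoxy
        permanent: yes
        state: enabled
        immediate: yes

    - name: enable and start the service
      service:
        name: privoxy
        state: started
        enabled: yes
```

It is not shorter than the three separate tasks, but it does keep the related steps together as one unit.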

As time went on (I think I started with Ansible 1.6), I hit issues where modules lacked the one ability I needed, or changed behaviour. Then other system things changed - yum to dnf, iptables to firewalld. These necessitated conditions on tasks to check the distribution or release version (which was easy, but meant doubling up tasks, one for each way of doing it). It seemed OK, and I plodded on. Each release of Ansible got better: 1.9 was good, 2.0 was a big improvement and now I’m on 2.3. With each iteration more modules have been added, issues have been fixed and it’s become more powerful, which is great.
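The doubling up looks something like the following, using the standard ansible_distribution fact. The choice of privoxy as the package is only to match the earlier example.

```yaml
# Same intent, two tasks: one per package manager
- name: install privoxy (CentOS uses yum)
  yum: name=privoxy state=present
  when: ansible_distribution == "CentOS"

- name: install privoxy (Fedora uses dnf)
  dnf: name=privoxy state=present
  when: ansible_distribution == "Fedora"
```

Only one of the two runs on any given host; the other is reported as skipped.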

I expanded my playbooks to include my OmniOS host server and the package repositories on there. I created a parameterised play which was given the release name and a TCP port, and it would create the source repo, populate it and start the service for it. Rerunning the playbook would update the repo. Happy days.
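Calling such a parameterised task file might look like this. The file name pkg_repo.yml, the variable names, the release names and the port numbers are all assumptions for the sake of the example.

```yaml
# pkg_repo.yml (hypothetical) would hold the create / populate / start-service
# tasks and read the `release` and `port` variables passed in here.
- include: pkg_repo.yml release=r151020 port=10020
- include: pkg_repo.yml release=r151022 port=10022
```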

iPXE worked really well. Simply include another playbook and the iPXE boot menu was updated. Change a variable for which Fedora release I wanted and it would download the pxeboot files (kernel+initrd) to the appropriate web server (iPXE rocks by booting from HTTP) and update the menu. Easy. Except it wouldn’t clean up the old files unless you wrote a task to do that - disk is cheap anyway.
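A minimal sketch of that kind of playbook, including the cleanup task that has to be written by hand, could look like this. Every variable (pxe_web_root, fedora_mirror, fedora_release, old_fedora_release), the directory layout, the mirror path and the template name are assumptions, not the real ones.

```yaml
- name: create a directory for this release's boot files
  file:
    path: "{{ pxe_web_root }}/fedora-{{ fedora_release }}"
    state: directory

- name: download kernel and initrd
  get_url:
    url: "{{ fedora_mirror }}/releases/{{ fedora_release }}/Everything/x86_64/os/images/pxeboot/{{ item }}"
    dest: "{{ pxe_web_root }}/fedora-{{ fedora_release }}/{{ item }}"
  with_items:
    - vmlinuz
    - initrd.img

- name: regenerate the iPXE menu from a template
  template:
    src: menu.ipxe.j2
    dest: "{{ pxe_web_root }}/menu.ipxe"

# Nothing removes the previous release unless a task like this exists
- name: remove boot files for a retired release
  file:
    path: "{{ pxe_web_root }}/fedora-{{ old_fedora_release }}"
    state: absent
```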

Then I tried my router - a VyOS VM. I had a templated config for this, so I looked at applying a script by playbook. Some initial success spurred me on, but eventually I hit an issue with changes. The script just couldn’t be applied in an idempotent way. To handle deletes or changes, the router needed to delete ALL firewall rules and run the script inside one transaction. This meant EVERY run would dump and reload the firewall, even if no change was present. So I stopped there and carried on elsewhere.

Ansible modules had changed over this time (2 years), so I could clean up some old hacks that were there. I’d marked them so they were easy to find. Firewalld now didn’t need the service reloaded; the change was immediate. Clean up here and there. I was no longer using “old” CentOS, so I could dump some old hacks I had for CentOS 6 now that everything worked on CentOS 7. This still left CentOS on yum and Fedora on dnf for packages. The “unified” package module didn’t exist yet.

More apps came along and they were easy to automate. The OnCommand Insight install was easy to automate without interaction on CentOS. I even got the playbook to hit the API to install the license key.

Sounds like a great success. Except now I have a mess of playbooks written in YAML which need testing regularly to ensure upstream changes don’t break them - changes both in Ansible modules and in distribution packages. So I set up a good way to test them on VMware: clone a base image and apply the playbook, over and over. This way I didn’t need to PXE boot the VM manually to test. I never got to the point where I felt comfortable that rerunning the playbook carried no risk of damaging/trashing the proper machine, so testing was required. I’m not sure how close I got either; it might have been just one more round of cleanup and happy days, or it could have been heaps - I just didn’t have any datapoints to draw a conclusion from.

And I still netboot and kickstart ESX host builds.

I’d failed. I’d automated my pets again.

Automatic cat feeder


Another fork in the road

It’s been ages and ages since I’ve really felt like writing a post and I’m sure the neglect here is obvious. Plenty has gone on in the interim, well, on and off actually. I have a few things I feel the need to write about, and due to the varied topics I will try to keep them in separate posts.

I’ll start off with the one that recurs periodically over the years, and that is the persistent feeling of being stuck at a fork in the road. To stay or to go. If stay, how to fix what’s wrong; and if go, where to and why. The key bit is the need for change. This has come up often enough that I know it’s a temporary thing, and the resulting action is just a distraction, an aberration from the mean. It’s a combination of a motivation/reward thing and a happiness thing.

The other thing is it’s always on multiple fronts concurrently. If it were just one thing it would usually be easy to figure out a way forward and just plod on. When it’s more than one it’s difficult to pick what to work on first or even spot the interdependencies between them. Correlation doesn’t imply causality; it could just be dumb luck.

Previously I’ve tried to escape it by trying new things and simplifying. Don’t take on too much at once. Focus on what matters. That sort of thing. The things you own end up owning you. The only snag with all of that is it can be difficult to find the motivation to adjust course. It’s like life and routine are a large ship, and trying to influence its direction takes sustained, directed effort. Simplifying takes dedication to work through something, but it’s frequently worth it and is an ongoing process. Clutter feeds itself both physically and logically. Likewise, changes to routine can be beneficial in many ways and also take dedication and motivation.

The snag I’ve currently hit is mostly a lack of direction and motivation. Sticking in a holding pattern doesn’t achieve anything and just seems to fuel the downward spiral. It’s been same old, same old for too long and something’s going to give.

Fork in the road (from the muppets)

