Here’s a summary/review of an InfoQ-hosted DevOps meetup called “Infrastructure as code”.
Four speakers presented their ideas on running a company’s IT infrastructure in a way that follows/enhances/mirrors the practices and paradigms used in software engineering.
10 years ago people would laugh at you if you asked for 10,000 boxes. Now Amazon can provide you with that many in a minute.
Cloud is sort of awesome :)
I don’t want any human near my sudo.
If a configuration change process for a production server takes 16 hours (including all of the bureaucracy), 30 minutes saved by a tool won’t matter that much.
Infrastructure people have a long tool generation cycle (compared to devs). The tool lifecycle for rubyists, for example, is 6 days long (that was a joke).
Some ideas on testing in operations:
Pushing software engineering paradigms to ops doesn’t work because ops has a simple job: everything should be up all the time. You can never exceed expectations. New paradigms only introduce risk.
Pushing ops paradigms into software engineering, on the other hand, is largely missing. Ops paradigms:
You want your boxes and your software fully monitorable. All of the problems should be traceable to their roots. Take for example DTrace, originally developed by Sun. We are still waiting for tools of this quality to be available on Linux.
Software applications, as I know them, are lacking observability. JVM applications (which I’m familiar with) are kind of observable at a lower level (jconsole/visualvm), but I assume the speakers were talking about a much deeper integration between the tools used to monitor generic systems and the ability to monitor generic applications.
Some more chat on configurability and layers of configuration followed. All presenters agreed that there is no end to the abstraction layers, just the same as in software engineering. Current systems (Chef/Puppet) provide a uniform configuration deployment mechanism, but the custom recipes still vary wildly between users, which means that we’re only at the beginning of real operations reuse.
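To make the "layers of configuration" idea a bit more concrete, here is a minimal sketch of stacking configuration layers, where more specific layers override more general ones. This is only my own illustration: the function name `merge_layers`, the layer names and all the values are made up, and this is not how Chef or Puppet actually resolve their attributes.

```python
import copy


def merge_layers(*layers):
    """Merge dicts left to right; later (more specific) layers win.

    Nested dicts are merged key by key; any other value is replaced.
    """
    result = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(result.get(key), dict):
                # Both sides are dicts: merge them instead of overwriting.
                result[key] = merge_layers(result[key], value)
            else:
                # Copy so the merged result never shares state with a layer.
                result[key] = copy.deepcopy(value)
    return result


# Hypothetical layers, from most general to most specific.
defaults = {"ntp": {"server": "pool.ntp.org", "enabled": True}, "ssh_port": 22}
role_web = {"packages": ["nginx"], "ssh_port": 2222}
host_override = {"ntp": {"server": "ntp.internal.example"}}

merged = merge_layers(defaults, role_web, host_override)
```

With these made-up layers, the host keeps the default `enabled` flag for NTP, takes its NTP server from the host layer, and takes its SSH port and package list from the role layer. Every extra layer (datacenter, environment, team) is just one more argument, which is probably why there is no end to them.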
Writing Cucumber acceptance tests for infrastructure before the development and hooking them to the monitoring software (Nagios/Zabbix). This way business people run the tests against infrastructure themselves.
I can’t even imagine what this would look like in practice. Certainly, an idea worth pursuing.
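For what it’s worth, here is my rough guess at the general shape, sketched in Python rather than Cucumber: an expectation written in plain language, paired with a check, and reported using the exit-code convention of Nagios plugins (0 for OK, 2 for CRITICAL). The speakers did not show any code; the function names, the host and the port below are all my own assumptions.

```python
import socket

# Nagios plugin convention: exit code 0 means OK, 2 means CRITICAL.
OK, CRITICAL = 0, 2


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_acceptance_check(description, predicate):
    """Evaluate one plain-language expectation.

    Returns a (status, message) pair that a monitoring plugin could
    print and use as its exit code.
    """
    try:
        passed = predicate()
    except Exception as exc:
        # A check that blows up is reported as a failure, not hidden.
        return CRITICAL, f"CRITICAL - {description} ({exc})"
    if passed:
        return OK, f"OK - {description}"
    return CRITICAL, f"CRITICAL - {description}"


def main():
    """Hypothetical plugin entry point; host and port are placeholders."""
    status, message = run_acceptance_check(
        "the web server accepts connections on port 80",
        lambda: port_is_open("localhost", 80),
    )
    print(message)
    return status
```

A real plugin would end with something like `sys.exit(main())`, so the monitoring system picks up the status. The point is that the description is readable by a non-technical person, while the monitoring system keeps re-running the same expectation after the infrastructure is built.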
All in all, an interesting talk. It was especially interesting to hear about advanced operations, as I’ve had minimal exposure to systems administration and its evolution during the past decade. Gonna have to read up on that and finally try out Puppet.