
Test Driven Deployment – mcollective, puppet, cucumber

With the recent release of mcollective I’ve been able to spend some time on a deploy problem I’ve had at a client. By combining mcollective, cucumber and the open source mcollective plugins I was able to build up the following.

The cucumber exploration is of course a result of @auxesis‘s brilliant cucumber talk at Devops Days recently.

Note: I’ve updated this from the initial posting to show how I do filtering with mcollective discovery and to put it all into one scenario.

Feature: Update the production systems
 
    Background:
        Given the load balancer has ip address 192.168.1.1
        And I want to update hosts with class roles::dev_server
        And I want to update hosts with fact country=de 
        And I want to pre-discover how many hosts to update
 
    Scenario: Update the website
        When I block the load balancer
        Then traffic from the load balancer should be blocked
 
        When I update the package mywebapp
        Then the package version for mywebapp should be 4.2.6-3.el5
 
        When I unblock the load balancer
        Then traffic from the load balancer should be unblocked

This works just like any other test driven, scenario based system: if blocking the firewall fails, the deploy bails out; if updating the package fails, it bails; and only if both of those steps worked will it unblock the firewall.

Thanks to mcollective this is distributed and parallel over large numbers of machines. I can also apply filters to update just certain clusters using mcollective’s discovery features.
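
To give an idea of how thin the glue is, step definitions for the Background and package steps could look something like the sketch below. The @filters bookkeeping and the run_mc helper are purely illustrative names and not part of mcollective; the real steps would drive the mcollective client libraries or the mc-* command line tools from the plugins project.

Given /^I want to update hosts with class (.+)$/ do |klass|
    @filters ||= []
    @filters << ["class", klass]
end
 
Given /^I want to update hosts with fact (.+)=(.+)$/ do |fact, value|
    @filters ||= []
    @filters << ["fact", "#{fact}=#{value}"]
end
 
Given /^I want to pre-discover how many hosts to update$/ do
    # run_mc is a made up helper that wraps the mcollective client,
    # applies the stored discovery filters and returns per host results
    @hosts = run_mc("discover", @filters)
    @hosts.size.should > 0
end
 
When /^I update the package (.+)$/ do |package|
    @results = run_mc("package update #{package}", @filters)
end
 
Then /^the package version for (.+) should be (.+)$/ do |package, version|
    # every host found during discovery must answer and report the right version
    @results.size.should == @hosts.size
    @results.each { |host, version_seen| version_seen.should == version }
end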

The outcome of every step is tested, and cucumber will only show the all clear when everything worked consistently on all machines.

This is made possible in part because the mcollective plugins use the Puppet providers under the hood, so package and service actions are completely idempotent and repeatable; I can rerun this script 100 times and it will do the same thing.

I have other steps not included here to keep things simple, but in the real world I would restart the webserver after the update and then call NRPE plugins on all the nodes to make sure their load averages are in acceptable ranges before the firewall gets opened and the load balancer is let back in.
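
In Gherkin those extra steps might read something like this; the step wording is invented here purely to illustrate how the restart and NRPE checks described above would slot into the scenario:

        When I restart the service httpd
        Then the service httpd should be running
 
        When I run the NRPE command check_load
        Then the NRPE command check_load should return OK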

This opens up a whole lot of interesting ideas, kudos to @auxesis and his great talk at devopsdays!

The Marionette Collective

Some time ago I posted a blog post and screencast about my middleware solution to concurrent systems administration. There was a lot of interest in it, and in the end I relented and released the thing as Open Source. You should go watch the screencast to really grasp what is going on and what this achieves.

I asked around on Twitter for name suggestions and got many great ones; in the end I settled on mcollective, short for The Marionette Collective. Yeah, it sounds like an Emo band, I don’t care. It’ll be easy to google and I got a matching .org domain, which helps compared to certain other ungoogleable project names!

The code for the core service is released under the Apache License version 2 and is available at google code: http://marionette-collective.org/

There’s a wiki getting started guide and also a quick guide on writing agents and CLI tools to talk to it.

By default the code includes the following plugins:

  • Security by means of a Pre Shared Key
  • Communications with Stomp compatible servers
  • Facts about machines via a simple YAML file (see the example after this list)
  • Discovery
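
The YAML fact source needs nothing more than a flat file of key/value pairs on each node; the path and values below are only an example of what it could contain:

# /etc/mcollective/facts.yaml - path and contents are illustrative
---
country: de
environment: production
cluster: web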

You can drop in replacement agents for all of the above, and perhaps write something to use full SSL encryption etc. No actual agents that do any real work are included in the core Apache Licensed code.

A second project was started where I’m dropping a bunch of my own agents. The projects are separate because the agents might have different licensing from the core app server; for example there’s an agent that uses Facter from Reductive Labs for facts, but this means that code has to be GPL.

The agents available right now let you use facter for facts, manage services using the puppet providers and do distributed url benchmarks; check out the plugins project: http://code.google.com/p/mcollective-plugins/

I’d encourage others to put their agents on GitHub or wherever, announce them on the users list or just drop me a mail and I’ll link to them from the project wiki – you can also join one of the projects and commit your code yourself.

It’s still early days – I was accused of being too much of a perfectionist about how I release code, so this is very much a release early, release often approach! The entire code base is about 3 weeks old and was mostly hacked together in free time, so there’s still much room for improvement. I do however use a version of it in production and find it very stable and reliable so far.

I am looking for early testers to give me feedback about the code, the structure of the project and so on. If you’re stuck, grab me on freenode, my nick is Volcane, and I’ll see if I can help you get it going.

Using Ruby Net::IMAP with plain auth

I’ve had to help a client pull out sender addresses from a folder on a Zimbra server. Ruby supports IMAP, but only with the LOGIN and CRAM-MD5 methods, while Zimbra wants PLAIN. Net::IMAP supports adding a new authenticator, so it was pretty simple in the end:

require 'net/imap'
 
# SASL PLAIN sends the authorization id, authentication id and password
# as a single NUL separated string
class ImapPlainAuthenticator
  def initialize(user, password)
    @user = user
    @password = password
  end
 
  def process(data)
    return "#{@user}\0#{@user}\0#{@password}"
  end
end
 
Net::IMAP::add_authenticator('PLAIN', ImapPlainAuthenticator)

And using it to retrieve the list of senders:

imap = Net::IMAP.new('imap.your.com')
 
imap.authenticate('PLAIN', 'username', 'pass')
imap.examine('INBOX/subfolder')
imap.search(["ALL"]).each do |message_id|
    envelope = imap.fetch(message_id, "ENVELOPE")[0].attr["ENVELOPE"]
    puts "#{envelope.from[0].mailbox}"
end
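
The envelope address struct also carries the host part, so if you need full, de-duplicated addresses a small variation like this does it (collecting into a senders array is just one way to do it):

senders = []
 
imap.search(["ALL"]).each do |message_id|
    envelope = imap.fetch(message_id, "ENVELOPE")[0].attr["ENVELOPE"]
    from = envelope.from[0]
 
    # join the mailbox and host parts into a normal looking address
    senders << "#{from.mailbox}@#{from.host}"
end
 
puts senders.uniq.sort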

Very simple, took about 5 minutes and saved days of manual pain and suffering, result!

Reusing Puppet Providers

Last night I was thinking about writing scripts to manage services, packages etc in a heterogeneous environment. This is hard because the operating systems all behave differently.

This is of course a problem that Puppet solves with its providers system; after some chatting with Luke on IRC I now have a pretty nice solution.

Assuming you don’t mind shell scripting in Ruby, here’s a pretty decent bit of Ruby to manage a service across many different environments.

require 'puppet'
 
# usage: service.rb <service> <action> [hasstatus]
service = ARGV.shift
action = ARGV.shift
 
# any third argument means the init script supports the status action
hasstatus = ARGV.length > 0
 
begin
    # let Puppet pick the right provider for this platform
    svc = Puppet::Type.type(:service).new(:name => service,
        :hasstatus => hasstatus).provider
 
    svc.send(action)
 
    puts("#{service} #{action} status: #{svc.status}")
rescue Exception => e
    puts("Could not #{action} service #{service}: #{e}")
end
# service.rb httpd stop true
httpd stop status: stopped
 
# service.rb httpd start true
httpd start status: running

You’d probably want to put in support for the pattern parameter to keep certain broken Operating Systems happy, but this is pretty nice and platform independent.
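
For those systems the service type accepts a pattern to match against the process table when there is no usable status action; roughly like the sketch below, with httpd only as an example:

# when the init script has no status action, fall back to matching
# the process table with :pattern instead
svc = Puppet::Type.type(:service).new(:name => "httpd",
    :hasstatus => false,
    :pattern => "httpd").provider
 
puts svc.status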

Middleware for Systems Administration

I spoke a bit on the puppet IRC channel about my middleware based systems administration tool, and I made a video to demo it below.

The concept is that I use publish / subscribe middleware – ActiveMQ with Stomp in my case – to do one-off administration. Unlike Capistrano and similar tools I do not need lists of machines, nor do I need to visit each machine with a request, because the network supports discovery and a single message to the middleware results in 10s, 100s or 1000s of machines getting the message.
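
Stripped of everything my tool adds, the underlying pattern is just topics on a message broker. A rough sketch with the Ruby stomp gem – broker details and topic name are made up for illustration – shows why one request can fan out to every machine at once:

require 'rubygems'
require 'stomp'
 
# broker address, credentials and topic are illustrative only
client = Stomp::Client.new("guest", "guest", "stomp.example.net", 61613)
 
# every machine subscribes to the same topic...
client.subscribe("/topic/sysadmin.echo") do |msg|
    puts "received: #{msg.body}"
end
 
# ...so a single published message reaches all of them in one go
client.publish("/topic/sysadmin.echo", "hello world")
 
# wait on the listener thread (Ctrl-C to stop)
client.join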

This means any task I ask to be done happens in parallel on any number of machines; typically I see ~100 machines finish the task in the same time as 1 machine would, with no need for SSH or anything like that.

The app server and client libs I wrote take away all the complexities of the middleware and take care of crypto signing requests, only responding to requests that have been signed properly, serializing and deserializing data and so on.
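
The pre-shared key approach to signing is conceptually very simple. This is not the actual plugin code, but the idea boils down to hashing the serialized request together with a secret every node knows, and ignoring anything where the hash does not check out:

require 'digest/md5'
 
# illustrative only: the sender serializes the request body and signs
# the serialized form with the shared secret
psk = "some shared secret"
body = Marshal.dump({:agent => "echo", :body => "hello"})
hash = Digest::MD5.hexdigest(body + psk)
 
# a receiving node recomputes the hash over the same serialized body
# and simply discards any request where the two do not match
puts "request is trusted" if Digest::MD5.hexdigest(body + psk) == hash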

Discovery is built in and it supports puppet classes, facts and a few other criteria I use for my own systems, so there is no need to build any kind of inventory that keeps track of which machines I have and what operating system versions they run. As long as a machine is on the middleware I can find it.

The bulk of the ping app you see in the demo – timeout handling and so forth removed – can be seen here. The client:

client = Stomphost::Client.new(config)
client.sendreq(Time.now.to_f, "echo")
 
loop do
    # each node echoes the original timestamp back, so the round trip
    # time in milliseconds is simply now minus that timestamp
    resp = client.receive
    elapsed = (Time.now.to_f - resp[:body]) * 1000
end

And the agent is just this:

module Stomphost
    module Agent
        class Echo
            # return the request body unchanged so the client can compare
            # the timestamp it sent with the time the reply arrives
            def handlemsg(msg)
                msg[:body]
            end
        end
    end
end

You can see that even data types like the float flow cleanly through from end to end.

Watch the video; I mention my use cases, which include distributed Exim administration, package updates, service restarts, iptables management and much more.

UPDATE: This code has now been released as an Open Source, Apache 2 licensed project at marionette-collective.org