mcollective | R.I.Pienaar

MCollective Security with ActiveMQ

by R.I. Pienaar | Nov 18, 2009 | Code

As part of rolling out mcollective you need to think about security. The various examples in the quick start guide and on this blog have allowed every node to talk to every agent on every other node. The problem with this approach is that if you have untrusted users on a node, they can install the client applications, read the username and password from the server config file, and so control your entire architecture.

Since revision 71 of trunk the structure of messages has changed to be compatible with the ActiveMQ authorization structure. I’ve also made the structure of the message targets configurable. The new default format is compatible with ActiveMQ wildcard patterns, so we can now apply fine grained controls over who can speak to what.

General information about ActiveMQ Security can be found on their wiki.

The default message targets look like this:

/topic/mcollective.agentname.command
/topic/mcollective.agentname.reply

The nodes only need read access to the command topics and write access to the reply topics. The examples below also give them admin access so these topics can be created dynamically. For simplicity we’ll wildcard the agent names; you could go further and limit certain nodes to only run certain agents. Adding these controls means anyone who gets onto your node will not be able to write to the command topics, and so will not be able to send commands to the rest of the collective.

There’s one special case, and that’s the registration topic. If you want to enable the registration feature you should give the nodes access to write on the command channel for the registration agent. Nothing should reply on the registration topic, so you can limit that in the ActiveMQ config.

We’ll let mcollective log in as the mcollective user and create a group called mcollectiveusers. We’ll then give the mcollectiveusers group the access needed to run as a typical registration enabled mcollective node.

The rip user is an mcollective admin and can create commands and receive replies.

First we’ll create users and the groups.

<simpleAuthenticationPlugin>
 <users>
  <authenticationUser username="mcollective" password="pI1SkjRi" groups="mcollectiveusers,everyone"/>
  <authenticationUser username="rip" password="foobarbaz" groups="admins,everyone"/>
 </users>
</simpleAuthenticationPlugin>

Now we’ll create the access rights:

<authorizationPlugin>
  <map>
    <authorizationMap>
      <authorizationEntries>
        <authorizationEntry queue="mcollective.>" write="admins" read="admins" admin="admins" />
        <authorizationEntry topic="mcollective.>" write="admins" read="admins" admin="admins" />
        <authorizationEntry topic="mcollective.*.reply" write="mcollectiveusers" admin="mcollectiveusers" />
        <authorizationEntry topic="mcollective.registration.command" write="mcollectiveusers" read="mcollectiveusers" admin="mcollectiveusers" />
        <authorizationEntry topic="mcollective.*.command" read="mcollectiveusers" admin="mcollectiveusers" />
        <authorizationEntry topic="ActiveMQ.Advisory.>" read="everyone,all" write="everyone,all" admin="everyone,all"/>
      </authorizationEntries>
    </authorizationMap>
  </map>
</authorizationPlugin>

You could give just the specific node that runs the registration agent access to mcollective.registration.command to ensure the secrecy of your node registration.
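The effect of these rules can be sanity-checked with a small model. The sketch below is not ActiveMQ's code: it assumes that `.` separates name elements, that `*` matches exactly one element, that `>` matches everything from that point on, and that the permissions of every matching entry are combined. Only the topic entries for the mcollective namespace are modelled, and the helper names `matches` and `allowed` are made up for illustration.

```python
# Sketch only: a model of the topic authorizationEntry rules, not ActiveMQ itself.
# Assumptions: '.' separates elements, '*' matches one element, '>' matches the
# rest of the name, and permissions from all matching entries are combined.

def matches(pattern, destination):
    """Return True if an ActiveMQ-style wildcard pattern covers a destination."""
    want = pattern.split(".")
    have = destination.split(".")
    for i, element in enumerate(want):
        if element == ">":
            return True            # '>' swallows the rest of the name
        if i >= len(have):
            return False           # pattern is longer than the destination
        if element != "*" and element != have[i]:
            return False
    return len(want) == len(have)


# The topic entries from the authorizationPlugin config (admin rights omitted).
TOPIC_RULES = [
    ("mcollective.>", {"read": {"admins"}, "write": {"admins"}}),
    ("mcollective.*.reply", {"write": {"mcollectiveusers"}}),
    ("mcollective.registration.command",
     {"read": {"mcollectiveusers"}, "write": {"mcollectiveusers"}}),
    ("mcollective.*.command", {"read": {"mcollectiveusers"}}),
]


def allowed(groups, action, topic):
    """True if any of the user's groups may perform the action on the topic."""
    for pattern, acl in TOPIC_RULES:
        if matches(pattern, topic) and acl.get(action, set()) & set(groups):
            return True
    return False


node = ["mcollectiveusers", "everyone"]   # the mcollective user
admin = ["admins", "everyone"]            # the rip user
```

Under those assumptions, `allowed(node, "write", "mcollective.package.command")` is false while `allowed(admin, "write", "mcollective.package.command")` is true, which is the property the config is meant to give you. The agent name `package` is only an example.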

Finally the nodes need to be configured. The server.cfg should have at least the following:

topicprefix = /topic/mcollective
topicsep = .
plugin.stomp.user = mcollective
plugin.stomp.password = pI1SkjRi
plugin.psk = aBieveenshedeineeceezaeheer

For my clients I can use the ability to configure the user details in my shell environment:

export STOMP_USER=rip
export STOMP_PASSWORD=foobarbaz
export STOMP_SERVER=stomp1
export MCOLLECTIVE_PSK=aBieveenshedeineeceezaeheer

And finally the rip user, when logged into a shell with these variables, has full access to the various commands. You can now give different users access to the entire collective, or go further and give a certain admin user access to only run certain agents by limiting the command topics they have access to. Keeping the user and password settings in the shell environment means they’re not kept in any config file in /etc/, for example.
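To illustrate the idea of taking client credentials from the shell environment rather than from a file, here is a minimal sketch. It is not mcollective's own client code; the function name `stomp_settings` and the returned dictionary layout are assumptions, and only the four variable names come from the exports above.

```python
import os


def stomp_settings(environ=None):
    """Collect client connection settings from environment variables."""
    environ = os.environ if environ is None else environ
    names = {
        "user": "STOMP_USER",
        "password": "STOMP_PASSWORD",
        "server": "STOMP_SERVER",
        "psk": "MCOLLECTIVE_PSK",
    }
    settings = {key: environ.get(var) for key, var in names.items()}
    # Fail loudly rather than fall back to a shared config file.
    missing = sorted(var for key, var in names.items() if settings[key] is None)
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return settings
```

Refusing to start when a variable is missing keeps a client from silently picking up credentials from somewhere else.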

Registration in MCollective

by R.I. Pienaar | Nov 15, 2009 | Uncategorized

Since rolling out mcollective to more and more machines I sometimes noticed one or two weren’t checking in, and found it hard to figure out which ones they were. One person evaluating it also expressed interest in some form of registration ability so that they can build up an inventory of what is out there using mcollective.

At first it seemed a bit against what I set out to do (no central database, use discovery instead), but I think the two complement each other well. I still use discovery to actually interact with the network; registration is there to assist in building web interfaces or other inventories.

I added the ability to call a configurable plugin at a configurable interval. Basically, whatever data your plugin returns will be sent to the collective, directed at an agent called ‘registration’. A sample plugin is provided; it simply returns a list of agents as an array, and you can see how trivial it is to write your own.

Using the registration system I wrote a plugin that simply keeps a file in a directory for each member. A simple nagios check will then report if there are any files older than registration interval + 30. It’s quite simple but works well: the moment one of my machines goes silent the monitor goes red.
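The monitoring side of that idea can be sketched in a few lines. This is not the author's script: the function names are invented, and the sketch assumes one file per member whose modification time is refreshed on every registration, with the interval and the extra 30 both taken as seconds.

```python
import os
import time


def stale_members(directory, interval, grace=30, now=None):
    """Return names of member files not updated within interval + grace seconds."""
    now = time.time() if now is None else now
    limit = interval + grace
    stale = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # A file that was last touched too long ago means a silent member.
        if os.path.isfile(path) and now - os.path.getmtime(path) > limit:
            stale.append(name)
    return stale


def nagios_status(stale):
    """Map the result to a Nagios-style (exit code, message) pair."""
    if stale:
        return 2, "CRITICAL: no registration from " + ", ".join(stale)
    return 0, "OK: all members registered recently"
```

A wrapper would print the message and exit with the code, 0 for OK and 2 for CRITICAL, which is what Nagios plugins are expected to do.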

You can grab the agent and monitor script here.

Note that whatever work your registration agent does needs to be fast. You’ll be getting a large number of registration messages from all over your network, so if you take many seconds to process each you’ll run into problems. You can get some more details about registration on the wiki page.

ActiveMQ Clustering

by R.I. Pienaar | Nov 10, 2009 | Code

As part of deploying MCollective + ActiveMQ instead of my old Spread based system I need to figure out a multi location setup. The documentation says it’s possible, so I thought I’d better get down and figure it out.

In my case I will have per-country ActiveMQ instances. I’ve had the same with Spread in the past and it’s proven reliable enough for my needs. Each ActiveMQ will carry 30 or so nodes.

[Image: ActiveMQ Cluster]

The above image shows a possible setup. You can go much more complex: typical hub-and-spoke setups, a fully meshed setup, or maybe a local broker in your NOC. ActiveMQ is clever enough not to create message loops or storms when the topology contains loops, so you can build lots of resilient routes.

ActiveMQ calls this a Network of Brokers and the minimal docs can be found here. They also have docs on using SSL for connections, which you can use to encrypt the inter-DC traffic.

I’ll show a sample config for one ActiveMQ node below; the other would be identical except for the IP of its partner. The sample uses authentication between links, as I think you really should be using auth everywhere.

   <broker xmlns="http://activemq.org/config/1.0" brokerName="your-host" useJmx="true"
      dataDirectory="${activemq.base}/data">
 
      <transportConnectors>
         <transportConnector name="openwire" uri="tcp://0.0.0.0:6166"/>
         <transportConnector name="stomp"   uri="stomp://0.0.0.0:6163"/>
      </transportConnectors>

These are basically your listeners; we want to accept Stomp and OpenWire connections.

Now comes the connection to the other ActiveMQ server:

<networkConnectors>
   <networkConnector name="amq1-amq2" uri="static:(tcp://192.168.1.10:6166)" userName="amq" password="Afuphohxoh"/>
</networkConnectors>

This sets up a connection to the remote server at 192.168.1.10 using username amq and password Afuphohxoh. You can also designate failover and backup links; see the docs for samples. If you’re building lots of servers talking to each other you should give every link on every server a unique name. Here I called it amq1-amq2 for comms from a server called amq1 to amq2, a simple naming scheme that ensures things are unique.
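With more than two brokers, the local-remote naming scheme is easy to generate rather than type by hand. The following is only a sketch: the function name is invented, the address given for amq1 is made up, and the port is the OpenWire port from the transportConnector above.

```python
from itertools import permutations


def network_connectors(brokers, port=6166):
    """Yield (local broker, connector name, uri) for every directed link in a full mesh."""
    for local, remote in permutations(sorted(brokers), 2):
        name = "%s-%s" % (local, remote)   # e.g. amq1-amq2, unique per direction
        uri = "static:(tcp://%s:%d)" % (brokers[remote], port)
        yield local, name, uri


# amq2's address comes from the sample config; amq1's address is hypothetical.
brokers = {"amq1": "192.168.1.20", "amq2": "192.168.1.10"}
links = list(network_connectors(brokers))
```

Each tuple maps onto one networkConnector element in the config of the broker named first, so a full mesh of n brokers yields n * (n - 1) uniquely named links.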

Next come the authentication and authorization bits. This sets up the amq user and an mcollective user that can use the topics under /topic/mcollective. More about ActiveMQ’s security model can be found here.

    <plugins>
      <simpleAuthenticationPlugin>
        <users>
          <authenticationUser username="amq" password="Afuphohxoh" groups="admins,everyone"/>
          <authenticationUser username="mcollective" password="pI1jweRV" groups="mcollectiveusers,everyone"/>
        </users>
      </simpleAuthenticationPlugin>
      <authorizationPlugin>
        <map>
          <authorizationMap>
            <authorizationEntries>
              <authorizationEntry queue=">" write="admins" read="admins" admin="admins" />
              <authorizationEntry topic=">" write="admins" read="admins" admin="admins" />
              <authorizationEntry topic="mcollective.>" write="mcollectiveusers" read="mcollectiveusers" admin="mcollectiveusers" />
              <authorizationEntry topic="ActiveMQ.Advisory.>" read="everyone,all" write="everyone,all" admin="everyone,all"/>
            </authorizationEntries>
          </authorizationMap>
        </map>
      </authorizationPlugin>
    </plugins>
  </broker>

If you set up the other node with a config connecting back to this one you will have bi-directional messages working correctly.

You can now connect your MCollective clients to either one of the servers and everything will work as if you had only one server. ActiveMQ servers will attempt reconnects regularly if the connection breaks.

You can also test using my generic stomp client that I posted in the past.
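If you want to poke at the brokers by hand, the STOMP wire format is simple enough to build yourself: a command line, header lines, a blank line, the body and a terminating NUL byte. The sketch below only builds the frames and is not the generic client mentioned above. The topic name `/topic/mcollective.test` is a made-up example, and the credentials are the mcollective user from the sample config.

```python
def stomp_frame(command, headers=None, body=""):
    """Build a STOMP 1.0 frame: command, header lines, blank line, body, NUL byte."""
    lines = [command]
    for key, value in (headers or {}).items():
        lines.append("%s:%s" % (key, value))
    return ("\n".join(lines) + "\n\n" + body).encode("utf-8") + b"\x00"


# Frames a test client might write to the Stomp listener on port 6163.
connect = stomp_frame("CONNECT", {"login": "mcollective", "passcode": "pI1jweRV"})
subscribe = stomp_frame("SUBSCRIBE", {"destination": "/topic/mcollective.test"})
send = stomp_frame("SEND", {"destination": "/topic/mcollective.test"}, "hello")
```

Subscribing on one broker and sending on the other is a quick way to see whether messages cross the link in both directions.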

Test Driven Deployment – mcollective, puppet, cucumber

by R.I. Pienaar | Nov 6, 2009 | Code

With the release of mcollective recently I’ve been able to work a bit on a deploy problem I’ve had at a client. I was able to build up the following by combining mcollective, cucumber and the open source mcollective plugins.

The cucumber exploring is of course a result of @auxesis’s brilliant cucumber talk at Devops Days recently.

Note: I’ve updated this from the initial posting, showing how I do filtering with mcollective discovery and put it all into one scenario.

Feature: Update the production systems
 
    Background:
        Given the load balancer has ip address 192.168.1.1
        And I want to update hosts with class roles::dev_server
        And I want to update hosts with fact country=de 
        And I want to pre-discover how many hosts to update
 
    Scenario: Update the website
        When I block the load balancer
        Then traffic from the load balancer should be blocked
 
        When I update the package mywebapp
        Then the package version for mywebapp should be 4.2.6-3.el5
 
        When I unblock the load balancer
        Then traffic from the load balancer should be unblocked

This works just like any other test driven, scenario based system. If it fails to block the firewall, the deploy will bail out. If it fails to update the package it will bail, and only if both of those worked will it unblock the firewall.
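The bail-out behaviour can be pictured as a list of action and verification pairs that stops at the first failed verification. The sketch below is plain Python rather than cucumber step definitions, and everything in it is illustrative: the `run_deploy` helper, the in-memory `state` dictionary standing in for the real systems, and the starting package version.

```python
class DeployAborted(Exception):
    """Raised when a verification fails and the remaining steps are skipped."""


def run_deploy(steps):
    """Run (description, action, check) triples in order; stop at the first failed check."""
    done = []
    for description, action, check in steps:
        action()
        if not check():
            raise DeployAborted("failed: " + description)
        done.append(description)
    return done


# Stand-in for the real load balancer and package state (made-up values).
state = {"blocked": False, "version": "4.2.5-1.el5"}

steps = [
    ("block the load balancer",
     lambda: state.update(blocked=True),
     lambda: state["blocked"]),
    ("update mywebapp",
     lambda: state.update(version="4.2.6-3.el5"),
     lambda: state["version"] == "4.2.6-3.el5"),
    ("unblock the load balancer",
     lambda: state.update(blocked=False),
     lambda: not state["blocked"]),
]
```

Because an exception ends the loop, a failed update leaves the load balancer blocked instead of sending traffic to a half-deployed host.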

Thanks to mcollective this is distributed and parallel over large numbers of machines. I can also apply filters to update just certain clusters using mcollective’s discovery features.

Every outcome is tested, and cucumber will only show the all clear when everything worked on all machines in a consistent way.

This is made possible in part because the mcollective plugins use the Puppet providers underneath the hood, so package and service actions are completely idempotent and repeatable. I can rerun this script 100 times and it will do the same thing.

I have other steps not included here to keep things simple, but in the real world I would restart the webserver after the update and then call NRPE plugins on all the nodes to make sure their load average is in an acceptable range before the firewall gets opened, letting the load balancer in.

This opens up a whole lot of interesting ideas, kudos to @auxesis and his great talk at devopsdays!

The Marionette Collective

by R.I. Pienaar | Nov 4, 2009 | Uncategorized

Some time ago I posted a blog post and screencast about my middleware solution to concurrent systems administration. I had a lot of interest in this and in the end relented and released the thing as Open Source. You should go watch the screencast to really grasp what is going on and what this achieves.

I asked around on Twitter for name suggestions and got many great ones. In the end I settled on mcollective, short for The Marionette Collective. Yeah, it sounds like an Emo band, I don’t care. It’ll be easy to google and I got a matching .org domain, which helps over certain other ungoogleable project names!

The code for the core service is released under the Apache License version 2 and is available at Google Code: http://marionette-collective.org/

There’s a wiki getting started guide and also a quick guide on writing agents and CLI tools to talk to it.

By default the code includes the following plugins:

Security by means of a Pre Shared Key
Communications with Stomp compatible servers
Facts about machines via a simple YAML file
Discovery

You can drop in replacements for all of the above and perhaps write something to use full SSL encryption etc. No actual agents to do any real work are included in the core Apache Licensed code.
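As a general illustration of what pre-shared-key security means, the sketch below signs a message body with a key that both ends already know and rejects anything whose signature does not match. It uses HMAC-SHA256 from the Python standard library purely as an example; this is an assumption for illustration and not a description of how the bundled plugin computes its digest.

```python
import hashlib
import hmac


def sign(body, psk):
    """Return a hex digest tying the message body to the shared key."""
    return hmac.new(psk.encode("utf-8"), body.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(body, digest, psk):
    """True only if the digest was produced from this body with the same key."""
    return hmac.compare_digest(sign(body, psk), digest)
```

A scheme like this authenticates messages but does not encrypt them, which is why a drop-in replacement using full SSL might be worth writing.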

A second project was started where I’m dropping a bunch of my own agents. The projects are separate because the agents might have different licensing from the core app server. For example, there’s an agent to use Facter from Reductive Labs for facts, but this means the code has to be GPL.

The agents available right now let you use facter for facts, manage services using the puppet providers and also do distributed url benchmarks. Check out the plugins project: http://code.google.com/p/mcollective-plugins/

I’d encourage others to put agents on GitHub or wherever and to announce them on the users list, or just drop me a mail and I’ll link to them from the project wiki. You can also join one of the projects and just commit your code yourself.

It’s still early days. I was accused of being too perfectionist in how I like to release code, so this is very much an early and often approach to releasing! The entire code base is about 3 weeks old and I mostly spent free time hacking it up, so there’s much improvement to be made still. I do however use a version of it in production and find it very stable and reliable so far.

I am looking for early testers to give me feedback about the code, structure of the project etc. If you’re stuck, grab me on freenode, where my nick is Volcane, and I’ll see if I can help you get it going.
