by R.I. Pienaar | May 7, 2010 | Uncategorized
I’ve had quite a lot of contributions to my Puppet Concat module and after some testing by various people I’m ready to do a new release.
Thanks to Paul Elliot, Chad Netzer and David Schmitt for patches and assistance.
For background on what this is about, please see my earlier post: Building files from fragments with Puppet
You can download the release here. Please pay special attention to the upgrade instructions below.
Changes in this release
- Several robustness improvements to the helper shell script.
- Removed all hard coded paths in the helper script to improve portability.
- We now use file{} to copy the combined file to its location. This means you can now change the ownership of a file by just changing the owner/group in concat{}.
- You can specify ensure => "/some/other/file" in concat::fragment to include the contents of another file in the fragment, even files not managed by Puppet.
- The code is now hosted on Github and we’ll accept patches there.
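To illustrate the ownership and ensure features mentioned above, here is a minimal sketch of how a manifest might look. The resource titles, paths, and parameter values are hypothetical, so check the module's own documentation for the exact interface:

```puppet
# Hypothetical example: ownership of the combined file is now set via
# concat{} itself, and a fragment can pull in an unmanaged file by
# pointing ensure at its path.
concat{"/etc/my.cnf":
  owner => "root",
  group => "mysql",
  mode  => "0644",
}

concat::fragment{"my_cnf_local_overrides":
  target => "/etc/my.cnf",
  order  => 10,
  ensure => "/etc/mysql/local-overrides.cnf",
}
```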
Upgrading
When upgrading to this version you need to take particular care. All the fragments are now owned by root, the shell script runs as root and we use file{} to copy the resulting file out.
This means you’ll see the diff of not just the fragments but also the final file when running puppetd --test. Unfortunately it also means that the first time you run Puppet with the new code it will fire off all the notifies you have on your concat{} resources. You’ll also see a lot of changes to resources in the fragments directory on the first run. This is normal and expected behavior.
So if, say, you’re using concat to create my.cnf and notify the service to restart automatically, then simply upgrading this module will result in MySQL restarting. This is a one-off notify that happens only the first time; from then on it will behave as normal. So I’d suggest disabling those notifies when upgrading, and putting them back once this upgrade is running everywhere.
by R.I. Pienaar | Apr 14, 2010 | Uncategorized
I have finalized the speakers for the next London DevOps get-together. I sent the mail below to the list; looking forward to seeing everyone there!
Hello,
I am glad to announce speakers for our first meet hosted by The Guardian.
We will meet at their shiny new offices in Kings Cross to start at 7pm; those who went to Scale Camp will know the venue.
We have two talks of roughly 30 minutes each lined up:
We will have some time for a few lightning talks if there’s any interest, before retiring to a nearby pub. If anyone has pub suggestions, please send them along.
Map and details can be found as usual at http://londondevops.org/meetings/
Thanks again to The Guardian for the venue. If anyone out there wants to sponsor some sodas or something at the venue, please get in contact.
I will try to set up some RSVP system, if you mention the meet on twitter please use the #ldndevops hashtag!
by R.I. Pienaar | Apr 14, 2010 | Code
I retweeted this on Twitter, but it’s just too good not to show. Over at rottenbytes.com Nicolas is showing some proof of concept code he wrote with MCollective that monitors the load on his dom0 machines and initiates live migrations of virtual machines to less loaded servers.
This is the kind of crazy functionality I wanted to enable with MCollective and it makes me very glad to see this kind of thing. The server-side and client code combined is only 230 lines, which is very impressive.
This is part of what VMware DRS does. Nico has some ideas to add other sexy features as well, since this was just a proof of concept; the logic for what to base migrations on will, for example, be driven by a small DSL.
I asked him how long it took to knock this together: the time to get acquainted with MCollective plus the time to write the agent and client was only 2 days, which is very impressive. He already knew Ruby well though, and has a Ruby gem to integrate with Xen.
I’m copying the output from his code below, but absolutely head over to his blog to check it out; he has the source up there too:
[mordor:~] ./mc-xen-balancer
[+] hypervisor2 : 0.0 load and 0 slice(s) running
[+] init/reset load counter for hypervisor2
[+] hypervisor2 has no slices consuming CPU time
[+] hypervisor3 : 1.11 load and 3 slice(s) running
[+] added test1 on hypervisor3 with 0 CPU time (registered 18.4 as a reference)
[+] added test2 on hypervisor3 with 0 CPU time (registered 19.4 as a reference)
[+] added test3 on hypervisor3 with 0 CPU time (registered 18.3 as a reference)
[+] sleeping for 30 seconds
[+] hypervisor2 : 0.0 load and 0 slice(s) running
[+] init/reset load counter for hypervisor2
[+] hypervisor2 has no slices consuming CPU time
[+] hypervisor3 : 1.33 load and 3 slice(s) running
[+] updated test1 on hypervisor3 with 0.0 CPU time eaten (registered 18.4 as a reference)
[+] updated test2 on hypervisor3 with 0.0 CPU time eaten (registered 19.4 as a reference)
[+] updated test3 on hypervisor3 with 1.5 CPU time eaten (registered 19.8 as a reference)
[+] sleeping for 30 seconds
[+] hypervisor2 : 0.16 load and 0 slice(s) running
[+] init/reset load counter for hypervisor2
[+] hypervisor2 has no slices consuming CPU time
[+] hypervisor3 : 1.33 load and 3 slice(s) running
[+] updated test1 on hypervisor3 with 0.0 CPU time eaten (registered 18.4 as a reference)
[+] updated test2 on hypervisor3 with 0.0 CPU time eaten (registered 19.4 as a reference)
[+] updated test3 on hypervisor3 with 1.7 CPU time eaten (registered 21.5 as a reference)
[+] hypervisor3 has 3 threshold overload
[+] Time to see if we can migrate a VM from hypervisor3
[+] VM key : hypervisor3-test3
[+] Time consumed in a run (interval is 30s) : 1.7
[+] hypervisor2 is a candidate for being a host (step 1 : max VMs)
[+] hypervisor2 is a candidate for being a host (step 2 : max load)
trying to migrate test3 from hypervisor3 to hypervisor2 (10.0.0.2)
Successfully migrated test3 !
by R.I. Pienaar | Apr 11, 2010 | Code
Until now The Marionette Collective has relied on your middleware to provide all authorization and authentication for requests. You’re able to restrict certain middleware users from certain agents, but nothing more fine-grained.
In many cases you want much finer-grained control over who can do what; some cases could be:
- A certain user can only request service restarts on machines with a fact customer=acme
- A user can do any service restart but only on machines that have a certain configuration management class
- You want to deny all users except root the ability to stop services, while others can still start and restart them
This kind of thing is required for large infrastructures with lots of admins, each working in their own group of machines, while perhaps a central NOC needs to be able to work on all the machines: you need fine-grained control over who can do what, and until now we did not have this. It would also be needed if you wanted to give clients control over their own servers but not others.
Version 0.4.5 will have support for this kind of scheme for SimpleRPC agents. We won’t provide an authorization plugin out of the box in the core distribution, but I’ve written one that will be available as a plugin.
So how would you write an auth plugin? First, a typical agent would look like this:
module MCollective
  module Agent
    class Service < RPC::Agent
      authorized_by :action_policy

      # ....
    end
  end
end
The new authorized_by keyword tells MCollective to use the class MCollective::Util::ActionPolicy to do any authorization on this agent.
The ActionPolicy class can be pretty simple, if it raises any kind of exception the action will be denied.
module MCollective
  module Util
    class ActionPolicy
      def self.authorize(request)
        unless request.caller == "uid=500"
          raise("You are not allowed access to #{request.agent}::#{request.action}")
        end
      end
    end
  end
end
This simple check will deny all requests from anyone but Unix user id 500.
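Extending this idea toward the use cases listed earlier, an authorize method could also consult facts on the node. Here is a hypothetical, standalone sketch: the Request struct and the hard-coded callers, agent names, and facts are stand-ins for MCollective’s real request object, not its actual API.

```ruby
# Hypothetical sketch of a fact-aware authorization check, modeled on the
# ActionPolicy idea above. Request stands in for MCollective's request object.
Request = Struct.new(:caller, :agent, :action, :facts)

def authorize!(request)
  # root (uid=0) may do anything
  return if request.caller == "uid=0"

  # uid=600 may use the service agent, but only on customer=acme nodes
  if request.caller == "uid=600" && request.agent == "service" &&
     request.facts["customer"] == "acme"
    return
  end

  # anything else is denied by raising, as in the plugin above
  raise("You are not allowed access to #{request.agent}::#{request.action}")
end

req = Request.new("uid=600", "service", "restart", {"customer" => "acme"})
authorize!(req) # passes without raising
```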
It’s pretty simple to come up with your own schemes. I wrote one that allows you to make policy files like the one below for the service agent:
policy default deny
allow uid=500 * * *
allow uid=502 status * *
allow uid=600 * customer=acme acme::devserver
This will allow user 500 to do everything with the service agent. User 502 can get the status of any service on any node. User 600 will be able to do any action on machines with the fact customer=acme that also have the configuration management class acme::devserver on them. Everything else will be denied.
You can do multiple facts and multiple classes in a simple space-separated list. The entire plugin to implement such policy controls was only 120 lines of heavily commented code.
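To make the matching semantics concrete, here is a hypothetical miniature evaluator for policy text in this format. It is simplified to one fact and one class per rule, and it is an illustration of the matching logic, not the actual plugin’s code:

```ruby
# Hypothetical, simplified evaluator for policy lines of the form:
#   policy default deny|allow
#   allow <caller> <action|*> <fact=value|*> <class|*>
# Not the real plugin; just illustrates how each allow line is matched.
def allowed?(policy_text, caller, action, facts, classes)
  default = "deny"

  policy_text.each_line do |line|
    parts = line.split
    next if parts.empty?

    if parts[0] == "policy" && parts[1] == "default"
      default = parts[2]
    elsif parts[0] == "allow"
      _, pol_caller, pol_action, pol_fact, pol_class = parts

      next unless pol_caller == caller
      next unless pol_action == "*" || pol_action == action
      unless pol_fact == "*"
        key, value = pol_fact.split("=")
        next unless facts[key] == value
      end
      next unless pol_class == "*" || classes.include?(pol_class)

      return true
    end
  end

  default == "allow"
end
```

With the policy above, a status request from uid=502 matches the second allow line, while any other action from that user falls through every rule and hits the default deny.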
I think this is an elegant and easy-to-use layer that provides a lot of functionality. We might in future pass more information about the caller to the nodes. There are some limitations, specifically that the caller information is essentially user-provided, so you need to keep that in mind.
As mentioned this will be in MCollective 0.4.5.
by R.I. Pienaar | Apr 3, 2010 | Uncategorized
I just released version 0.4.4 of The Marionette Collective. This release is primarily a bug fix release addressing issues with log files and general code cleanups.
The biggest change in this release is that controlling the daemon has become better: you can ask it to reload one agent or all agents, and a few other bits. Read all about it on the wiki.
Please see the Release Notes, Changelog and Download List for full details.
For background information about the MCollective project please see the project website.