by R.I. Pienaar | Dec 5, 2010 | Uncategorized
I’ve previously blogged about IPSec on RedHat and mentioned how great the ifcfg scripts are to get IPSec VPNs going.
In that post I used a pre-shared key to bring up the VPNs; that was fine then, but now I need something a bit better. IPSec supports the standard PKI infrastructure and the RedHat scripts support it too, but its use isn't well documented, so here is what I found through investigation.
First you'll need a CA. The CA is used to sign your certificates, and every node needs a certificate matching its Common Name. You also need the CRL and the CA certificate on all the machines. How you go about making a CA is a bit out of scope for this post; there are many options out there, like TinyCA.
The complexity comes in installing these certificates into the Racoon directory, as Racoon depends on very specific file names.
Given the RedHat interface config script below that can be saved in /etc/sysconfig/network-scripts/ifcfg-ipsec.remote.host.net:
DST=1.2.3.4
TYPE=IPSEC
ONBOOT=yes
IKE_CERTFILE=/etc/racoon/certs/host.cert
You need to have the following files installed:
- /etc/racoon/certs/host.cert.private – The private key part of your certificate – without a passphrase so it needs secure permissions
- /etc/racoon/certs/host.cert.public – The public part of the host certificate
- /etc/racoon/certs/a63b58d3.0 – The Certificate Authority certificate – more on the name below
- /etc/racoon/certs/a63b58d3.r0 – The CRL from the CA – more on the name below
The first two are simple: you can replace host.cert with anything, as long as it matches what is in the interface config script. The .private and .public suffixes should not be changed.
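Since the private key has no passphrase, it is worth locking its permissions down. A minimal sketch, assuming root owns the Racoon certs directory:

# chown root:root /etc/racoon/certs/host.cert.private
# chmod 600 /etc/racoon/certs/host.cert.private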
The last two are a bit more tricky. You'll get the CA certificate and CRL from your CA; you then need to calculate the hash of the CA certificate:
# openssl x509 -hash -noout -in ca.pem
a63b58d3
Use the hash you obtain from that command to name both your CA certificate and the CRL, as shown above.
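Putting that together, a minimal sketch of installing the CA files, assuming your CA certificate and CRL were saved as ca.pem and crl.pem (placeholder names):

# CA_HASH=$(openssl x509 -hash -noout -in ca.pem)
# cp ca.pem /etc/racoon/certs/${CA_HASH}.0
# cp crl.pem /etc/racoon/certs/${CA_HASH}.r0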
When the VPN gets brought up it will validate the certificates on both ends against the CA and the CRL. So you can easily revoke access by adding a certificate to the CRL, and you know only certs signed by your own CA can connect to the IPSec server.
Just like certificates, the CRL has a validity period. You should monitor this, since if your CRL is no longer valid no VPNs will be established. I have published a Nagios check that I use to monitor both CRLs and certificates here.
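For a quick manual check of when a CRL expires, openssl can print its next update time; a small example, assuming the CRL is in crl.pem:

# openssl crl -noout -nextupdate -in crl.pem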
You still need to be pretty careful about who has access to your certs, since the simple scripts give you no way to limit which Common Names can connect to the server. You should also still firewall your ISAKMP port (udp/500) so that only your trusted networks can communicate with the server.
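As a sketch of such a firewall rule, assuming 192.0.2.0/24 stands in for your trusted network:

# iptables -A INPUT -p udp --dport 500 -s 192.0.2.0/24 -j ACCEPT
# iptables -A INPUT -p udp --dport 500 -j DROP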
by R.I. Pienaar | Nov 24, 2010 | Uncategorized
Every team seems to have its own status boards, yet there are very few generic solutions to this problem.
I asked my coworker to write a framework for building status boards so that people can focus on just the problem of displaying their data. The result is a great Ruby on Rails based system that provides a WYSIWYG editor for doing your layouts and a simple plugin system that lets you build almost any kind of status board.
The framework takes care of all the hard things like managing the display and layout, refreshing your boards, saving layouts and editing them later. We've built it as open source in the hope that there will be community contributed plugins.
You can see how the board looks in action below:
The green dots are Nagios service groups and their current status; also shown are some internal user metrics, a Cacti graph and a CPU graph read from a JSON file.
You can see a video of the editor in action and a bit about the plugin infrastructure here; there is also a wiki page, and you can grab the source on GitHub. We still have some rough edges to smooth over, so we'd appreciate any feedback.
Lastly if you like it and think you might use it one day please vote for us at the Ultimate Wall Board competition!
by R.I. Pienaar | Nov 18, 2010 | Uncategorized
Rake is the Ruby make system; it lets you write tasks using the Ruby language and is used in most Ruby projects.
I wanted to automate some development tasks and had to figure out a few patterns, so I thought I'd blog them for my own future reference and hopefully they'll be useful to someone else.
In general people pass variables on the command line to rake, something like:
$ ENVIRONMENT=production rake db:migrate
This is OK if you know all the variables you can pass, but sometimes it's nice to prompt for anything that isn't given:
def ask(prompt, env, default="")
  return ENV[env] if ENV.include?(env)
  print "#{prompt} (#{default}): "
  resp = STDIN.gets.chomp
  resp.empty? ? default : resp
end

task :do_something do
  what = ask("What should I do", "WHAT", "something")
  puts "Doing #{what}"
end
This will support running like:
$ WHAT="foo" rake do_something
It will, though, also prompt for the information with a default if the environment variable isn't set. The only real trick here is that you have to use STDIN.gets and not just gets. For added niceness you could use Highline to build the prompts, but that's an extra dependency (a sketch of that follows the output below).
Here is what it does if you don't specify an environment variable:
$ rake do_something
(in /home/rip/temp)
What should I do (something): test
Doing test
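For reference, a HighLine based version of the prompt might look something like this; it is only a sketch, it assumes the highline gem is installed and it replaces the hand-rolled ask helper above:

require 'highline/import'

task :do_something do
  # only prompt when the WHAT environment variable is not set
  what = ENV["WHAT"] || ask("What should I do? ") { |q| q.default = "something" }
  puts "Doing #{what}"
end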
The second tip is about rendering templates. I am using the ask method above to gather a bunch of information from the user and then building a few templates based on that, so I am calling ERB a few times and wanted to write a helper.
require 'erb'

def render_template(template, output, scope)
  tmpl = File.read(template)
  erb = ERB.new(tmpl, 0, "<>")

  File.open(output, "w") do |f|
    f.puts erb.result(scope)
  end
end

task :do_something do
  what = ask("What do you want to do", "WHAT", "something")
  render_template("simple.erb", "/tmp/output.txt", binding)
end
The template itself would just have a line referencing the variable.
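For example, a minimal sketch of simple.erb, assuming the local variable is named what as in the task above:

Doing: <%= what %>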
This is pretty standard stuff; the only odd thing is the binding. It takes the current scope of the do_something task and passes it into the render_template method, which then passes it along to ERB. This way your local variable – what in this case – is available as a local variable in the template.
by R.I. Pienaar | Nov 16, 2010 | Uncategorized
I am working toward releasing The Marionette Collective 1.0.0 and finishing off the last few features. I'll maintain 1.0.0 as a long-term supported stable version while 1.1.x will be where new development happens.
One of the last features that will go into 1.0.0 is the ability to do a discovery but then only target a subset of hosts.
I have a Nagios notification system called Angelia and I have three instances of it deployed on my network. Any machine that needs to send a message can do so via the angelianotify agent. I need the Angelia service to be highly available, so that even if I do maintenance on one instance the others transparently keep sending my alerts.
The problem is that until now I had to filter on enough state to uniquely identify a single instance of Angelia, else I would receive multiple alerts; this new feature solves that problem.
$ mc-rpc -1 angelianotify sendmsg msg="hello world" recipient="clickatell://0044xxx"
With the new -1 flag a discovery will be done; it will find 3 instances and by default it will pick the node that was quickest to respond. This is a poor man's nearest-node detection. You can also configure it to send the request to a random node from the results.
The -1 is shorthand for another option; the full option is --limit-nodes 1, so you can specify any arbitrary number of nodes, but you can also specify something like --limit-nodes 10% to only hit a subset of your machines.
The percentage targeting is useful for initiating deploys against only a subset of your nodes, so your monitoring system can detect errors before you do the full deploy to all your servers.
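As an illustration only, with a hypothetical deploy agent and action (not something that ships with MCollective), a canary run against 10% of the discovered nodes could look like:

$ mc-rpc --limit-nodes 10% deploy release version=1.2.3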
This was a simple thing to implement but I think it's a pretty big deal: it helps us step further away from DNS as an addressing system and towards addressing machines by their metadata.
It also means that if you're exposing services over SimpleRPC you can now easily create a very reliable network of these services using the communication layer provided by the middleware. You can build a fully meshed network of ActiveMQ servers that routes around network outages, maintenance windows and other failures, without costly load balancers and without relying on locally redundant hosted instances of these services.
With my Angelia example above, should my monitoring server in the US want to send an alert while I am doing maintenance on the nearest Angelia instance, it will simply use the next nearest one, probably the one in the UK. This effectively and cheaply realizes my need for an HA system built from an arbitrarily simple bit of software like Angelia that doesn't need to concern itself with being HA aware.
by R.I. Pienaar | Nov 14, 2010 | Uncategorized
Puppet compiles its manifests into a catalog. The catalog is derived from your code and is something that can be executed on your node.
This model is very different from other configuration management systems, which tend to execute top down and just run through the instructions in a very traditional manner.
Having a compiled artifact has many advantages, most of which aren't really exposed to users today. I have a lot of ideas on how I would like to use the catalog – and the graph it contains. The first idea is to be able to compare catalogs and identify changes between versions of your code.
For this discussion I’ll start with the code below:
class one {
  file{"/tmp/test": content => "foo" }
}

class two {
  include one

  file{"/tmp/test1": content => "foo";
       "/tmp/test2": content => "foo";
  }
}

include two
When I run it I get 3 files:
-rw-r--r-- 1 root root 3 Nov 14 11:32 /tmp/test
-rw-r--r-- 1 root root 3 Nov 14 11:32 /tmp/test1
-rw-r--r-- 1 root root 3 Nov 14 11:31 /tmp/test2
Being able to diff catalogs has a lot of potential. Often when you look at a diff of code it's hard to know what the end result would be, especially if you use inheritance heavily or if your code relies on external data such as extlookup. Since the puppet master now supports compiling catalogs and spitting them out to STDOUT, you also have the possibility to compile machine catalogs on a staging master and compare them against the production catalogs without any risk.
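For example, on Puppet 2.6 you can compile a node's catalog on a staging master with something along these lines (the node name is a placeholder, and older releases use puppetmasterd --compile instead):

# puppet master --compile node.example.com > node.example.com.catalog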
The other use case could be during major version upgrades where you wish to validate the next release of Puppet will behave the same way as the old one. We’ve had problems in the past where 0.24.x would evaluate templates differently from later versions and you get unexpected changes being rolled out to your machines.
Let's make a change to our code above; here's the diff of our change:
--- test.pp 2010-11-14 11:35:57.000000000 +0000
+++ test2.pp 2010-11-14 11:36:06.000000000 +0000
@@ -5,6 +5,8 @@
class two {
include one
+ File{ mode => 400 }
+
file{"/tmp/test1": content => "foo";
"/tmp/test2": content => "foo";
This is the kind of thing you'll see in mail if you have your SCM set up to mail diffs, or while sitting in a change control meeting. The change looks simple enough: you want to change the mode of /tmp/test1 and /tmp/test2 to 400 rather than the default.
When you run this code though, you'll see that /tmp/test also changes! This is because resource defaults apply to included classes too, and this is exactly the kind of situation that is very hard to pick up from diffs when trying to guess the full impact of a change.
My diff tool would have shown you this (format slightly edited):
Resource counts:
Old: 516
New: 516
Catalogs contain the same resources by resource title
Individual Resource differences:
Old Resource:
file{"/tmp/test": content => acbd18db4cc2f85cedef654fccc4a4d8 }
New Resource:
file{"/tmp/test": mode => 400, content => acbd18db4cc2f85cedef654fccc4a4d8 }
Old Resource:
file{"/tmp/test1": content => acbd18db4cc2f85cedef654fccc4a4d8 }
New Resource:
file{"/tmp/test1": mode => 400, content => acbd18db4cc2f85cedef654fccc4a4d8 }
Old Resource:
file{"/tmp/test2": content => acbd18db4cc2f85cedef654fccc4a4d8 }
New Resource:
file{"/tmp/test2": mode => 400, content => acbd18db4cc2f85cedef654fccc4a4d8 }
Here you can clearly see that all three files will be changed, not just two. With this information you'd be much better off in your change control meeting than before.
The diff tool works in a bit of a roundabout manner and I hope to improve the usage a bit in the near future. First you dump the catalogs into a format unique to this tool set, and then you diff this intermediate format. The reason for this is that you can compare catalogs produced by different versions of Puppet, so you need to go via an intermediate format.
There's one thing worth noting. I initially wrote it to help with a migration from 0.24.8 to 0.25.x or even 2.6.x, and in my initial tests this seemed fine, but on more extensive testing with bigger catalogs I noticed a number of strange things in the 0.24.x catalog format. First, it doesn't contain all the properties for Defined Types; second, it sets a whole lot of extra properties on resources, filling in blanks left by the user.
What this means is that if you diff a 0.24.x catalog against the same code on newer versions, you'll likely see it complain that all your defined type resources are missing from the 0.24 catalog, and you might also get some false positives on resource diffs. I can't do much about the missing resources, but what I can do is clear up the false positives. I already handle the ones in my manifests, but there are no doubt more; if you let me know of them I'll see about working around them too.
The code for this can be found in my GitHub account. It's still a bit of a work in progress as I haven't actually done my migration yet, so subscribe to the repo; there are likely to be frequent changes still.