Usually when I describe mcollective to someone they think it’s nice and all, but the infrastructure you need to install is quite a lot, so parallel SSH tools like cap seem a better choice. They like the discovery and such, but the benefit isn’t all that clear to them.
I have a different end-game in mind than just restarting services, and I’ve made a video to show just how I manage a cluster of Exim servers using mcollective. This video should give you some ideas about the possibilities that the architecture I chose brings to the table and just what it can enable.
While watching the video, note how quick and interactive everything is, and keep the following in mind while you’re seeing the dialog-driven app:
I am logged in via SSH from the UK into a little VM in Germany
The mcollective client talks to a Germany-based ActiveMQ
The 4 mail servers in the 2nd part of the demo are located 2 x US, 1 x UK and 1 x DE
I have ActiveMQ instances in each of the above countries, clustered together using the technique previously documented here.
Here’s the video then; as before I suggest you hit the full screen link and watch it that way to see what’s going on.
This is the end game: I want a framework to enable this kind of tool on the Unix CLI – complete with pipes as you’d expect – things like the dialog interface you see here, on the web, in general shell scripts and in Nagios checks like with cucumber-nagios, all sharing an API and all talking to a collective of servers as if they were one. I want to make building these apps easy, quick and fun.
I eventually killed it after 2 days of not finishing. The problem, obviously, is that sed does not seek to the position, it reads the whole file. So pulling out the last line of a 150GB file requires reading 150GB of data, and if you have 120 tables this is a huge problem.
The code below is a new take on it: I am just reading the file with Ruby and spitting out the resulting files in a single read pass; start to finish on the same data took less than an hour. When run it gives you nice output like this:
Found a new table: sms_queue_out_status
writing line: 1954 2001049770 bytes in 91 seconds 21989557 bytes/sec
Found a new table: sms_scheduling
writing line: 725 729256250 bytes in 33 seconds 22098674 bytes/sec
The new code below:
#!/usr/bin/ruby
# Splits a mysqldump file into one .sql file per table in a single pass.

if ARGV.length == 1
  dumpfile = ARGV.shift
else
  puts("Please specify a dumpfile to process")
  exit 1
end

STDOUT.sync = true

if File.exist?(dumpfile)
  d = File.new(dumpfile, "r")

  outfile = false
  table = ""
  linecount = tablecount = starttime = 0

  while (line = d.gets)
    # Each table in the dump starts with a "-- Table structure" comment
    if line =~ /^-- Table structure for table .(.+)./
      table = $1
      linecount = 0
      tablecount += 1

      puts("\n\n") if outfile

      puts("Found a new table: #{table}")
      starttime = Time.now
      outfile = File.new("#{table}.sql", "w")
    end

    # Write the current line to the current table's file and report progress
    if table != "" && outfile
      outfile.syswrite line
      linecount += 1
      elapsed = Time.now.to_i - starttime.to_i + 1
      print(" writing line: #{linecount} #{outfile.stat.size} bytes in #{elapsed} seconds #{outfile.stat.size / elapsed} bytes/sec\r")
    end
  end
end

puts
I often need to split large MySQL dumps into smaller files so I can do selective imports, for example from live to dev where you might not want all the data. Each time I seem to rescript some solution to the problem, so here’s my current one: a simple Ruby script, you give it the path to a mysqldump and it outputs a series of echo and sed commands to do the work.
UPDATE: Please do not use this code, it’s too slow and inefficient; new code can be found here.
Just pipe its output to a file and run it via the shell when you’re ready to do the splitting. At the end you’ll have a file per table in your cwd.
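For a sense of what the generated commands look like, and why this approach is so slow, here is a rough sketch of such a generator; it is reconstructed from the description above rather than being the actual script. It finds the line range of each table in one pass and prints an echo and a sed command per table, and every one of those sed commands re-reads the dump from the first line.

#!/usr/bin/ruby
# Rough reconstruction of the sed-based splitter, for illustration only.
# It scans the dump once for the "-- Table structure" markers and prints
# echo and sed commands; each generated sed command still re-reads the
# whole dump, which is what makes this approach so slow on huge files.

dumpfile = ARGV.shift or abort("Please specify a dumpfile to process")

markers = []
File.open(dumpfile, "r") do |f|
  f.each_with_index do |line, idx|
    markers << [idx + 1, $1] if line =~ /^-- Table structure for table .(.+)./
  end
  markers << [f.lineno + 1, nil]  # sentinel marking the end of the file
end

markers.each_cons(2) do |(start, table), (stop, _)|
  puts "echo 'splitting #{table}'"
  puts "sed -n '#{start},#{stop - 1}p' #{dumpfile} > #{table}.sql"
end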
Till now people wanting to test this had to pull from SVN directly; I put off doing a release till I had most of the major tick boxes in my mind ticked and till I knew I wouldn’t be making any major changes to the various plugins and such. This release is 0.2.x since 0.1.x was the release number I used locally for my own testing.
This being the first release I fully anticipate some problems and weirdness, please send any concerns to the mailing list or ticketing system.
I am keen to get feedback from some testers, and specifically to hear thoughts around these points:
How do the client tools behave on 100s of nodes? I suspect the output format might be useless if it just scrolls and scrolls; I have some ideas about this but need feedback.
On large numbers of hosts, or when doing lots of requests in quick succession, do you notice any replies going missing?
Feedback about the general design principles, especially how you find the plugin system and what else you might want pluggable. I for example want to make it much easier to add new discovery methods.
Anything else you can think of
I’ll be putting tickets into the issue system for future features and fixes I am adding, so you can follow along there to get a feel for the milestones toward 0.3.x.
Thanks go to the countless people I spoke to in person, on IRC and on Twitter; thanks for all the retweets and general good wishes. Special thanks also to Chris Read who made the Debian package code and fixed up the RC script to be LSB compliant.
Most of the applications I write in Ruby are some kind of framework: ruby-pdns takes plugins, mcollective takes plugins, my Nagios notification bot takes plugins and so on, yet I have not figured out a decent approach to handling plugins.
Google suggests many options; the most common one is something along these lines:
class Plugin
  def self.inherited(klass)
    PluginManager << klass.new
  end
end

class FooPlugin < Plugin
end
Here PluginManager is some class or module that stores plugins and later allows retrieval; when the FooPlugin class gets created it triggers the inherited hook in the base class.
This works OK, almost perfectly, except that at the time the hook fires the FooPlugin class is not yet complete, so your constructor will not be called, which is quite a pain. From what I can tell it ends up calling the constructor on Class or Object instead.
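A small standalone snippet makes the timing visible; the names here are purely illustrative:

# Illustration of the timing problem: the inherited hook fires on the
# "class FooPlugin < Plugin" line itself, before the body of FooPlugin
# has been evaluated, so klass.new in the hook falls back to the default
# Object#initialize rather than the one defined in the class body.
class Plugin
  def self.inherited(klass)
    puts "hook fired for #{klass}"
    klass.new                     # FooPlugin#initialize does not exist yet
  end
end

class FooPlugin < Plugin
  def initialize
    puts "FooPlugin#initialize"   # only defined after the hook already ran
  end
end

FooPlugin.new                     # now the real constructor runs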
I ended up tweaking the pattern a bit and now have something that works well: if you pass a String to the PluginManager it just stores it as a class name and later creates an instance of that class for you; if it’s not a String it saves it as a fully realized object, assuming you knew what you were doing.
The full class is part of mcollective and you can see the source here, but below is the short version:
I am quite annoyed that including a module does not also bring in class methods in Ruby; it’s quite a big missing feature in my view and there are discussions about changing that behavior. I had hoped to write something simple where I could just do include Pluggable and it would set up all the various bits, create the inherited hook and so on, but it’s proven to be a pain and would be littered with nasty evals.
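The usual workaround is to hook included and extend the including class from there. Below is a hypothetical sketch of that include Pluggable idea; it assumes the PluginManager shown further down and is not the code mcollective ended up with, which follows after:

# Hypothetical sketch of the "include Pluggable" idea, not the code
# mcollective uses. include only mixes in instance methods, so the
# included hook extends the including class to get the inherited
# callback onto the class side. Assumes the PluginManager module below.
module Pluggable
  module ClassMethods
    def inherited(klass)
      PluginManager << {:type => "facts_plugin", :class => klass.to_s}
      super
    end
  end

  def self.included(base)
    base.extend(ClassMethods)
  end
end

class Plugin
  include Pluggable
end

class FooPlugin < Plugin   # registered via the inherited hook added above
end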
module PluginManager
  @plugins = {}

  def self.<<(plugin)
    type = plugin[:type]
    klass = plugin[:class]

    raise("Plugin #{type} already loaded") if @plugins.include?(type)

    if klass.is_a?(String)
      @plugins[type] = {:loadtime => Time.now, :class => klass, :instance => nil}
    else
      @plugins[type] = {:loadtime => Time.now, :class => klass.class, :instance => klass}
    end
  end

  def self.[](plugin)
    raise("No plugin #{plugin} defined") unless @plugins.include?(plugin)

    # Create an instance of the class if one hasn't been done before
    if @plugins[plugin][:instance] == nil
      begin
        klass = @plugins[plugin][:class]
        @plugins[plugin][:instance] = eval("#{klass}.new")
      rescue Exception => e
        raise("Could not create instance of plugin #{plugin}: #{e}")
      end
    end

    @plugins[plugin][:instance]
  end
end

class Plugin
  def self.inherited(klass)
    PluginManager << {:type => "facts_plugin", :class => klass.to_s}
  end
end

class FooPlugin < Plugin
end
For mcollective I only ever allow one of a specific type of plugin so the code is a bit specific in that regard.
I think creating the plugin instances late is quite an improvement too, since often you’re loading plugins that you just don’t need; client apps, for example, probably don’t need some of the things I load in, so creating instances for them is just a waste.
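As a rough usage sketch of that lazy behaviour, with the class and plugin type purely illustrative:

# Defining the subclass only registers the name "FooPlugin" with the
# PluginManager above; no instance exists yet at this point.
class FooPlugin < Plugin
end

# The first lookup creates the instance via the stored class name,
# later lookups return the same object.
plugin = PluginManager["facts_plugin"]
plugin.class   # => FooPlugin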
I am not 100% sold on this approach as the right one; I’ll probably refine it more and would love to hear what other people have done.
It has, though, removed a whole chunk of grim code from mcollective, since I now store all plugins and agents in here and just fetch them as needed. So it’s already an improvement on what I had before, and it should be easier to refactor for further improvements now.