The Unix pgrep utility is great: it lets you grep through your process list and find interesting things. I wanted to do something similar for my entire server group, so I built something quick on top of MCollective.
I am using the Ruby sys-proctable gem to do the hard work. It returns a massive amount of information about each process, and I have written a simple agent on top of it.
The agent supports grepping the process tree and also supports kill and pgrep+kill, though I have not yet implemented more than the basic grep on the command line. Frankly the grep+kill combination scares me and I might remove it. A simple grep slip-up and you will kill all processes on all your machines :) Sometimes too much power is too much and should just be avoided.
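To make the grep part a bit more concrete, here is a minimal sketch of the kind of filtering involved. It is not the agent's actual code: the `pgrep` helper, the hash keys (`:pid`, `:state`, `:cmdline`) and the sample records are all made up for illustration, and plain hashes stand in for the per-process records that sys-proctable would provide. The zombie switch is modelled on the -z option of mc-pgrep, and assumes zombies show up with a state of "Z", as they do on Linux.

```ruby
# Hypothetical helper: return the processes whose command line matches
# the given regular expression. With zombies_only set, only processes
# whose state is "Z" are considered.
def pgrep(processes, pattern, zombies_only: false)
  regex = Regexp.new(pattern)
  processes.select do |process|
    next false if zombies_only && process[:state] != "Z"
    regex.match?(process[:cmdline].to_s)
  end
end

# Made-up process records, standing in for what the agent would collect.
sample = [
  { pid: 9833, state: "S", cmdline: "ruby /usr/sbin/mcollectived --pid=/var/run/mcollectived.pid" },
  { pid: 4021, state: "Z", cmdline: "ruby worker.rb" },
  { pid: 1,    state: "S", cmdline: "/sbin/init" }
]

pgrep(sample, "ruby").each do |process|
  puts "#{process[:pid]} #{process[:cmdline]}"
end
```

Because the pattern is a regular expression, something as short as "." matches every process with a non-empty command line, which is exactly why pairing this with kill is so dangerous.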
At the moment mc-pgrep outputs a set format, but I intend to make that configurable on the command line. Here’s a sample:
    % mc-pgrep -C /dev_server/ ruby

     * [ ============================================================> ] 4 / 4

    dev1.my.com
         root  9833   ruby /usr/sbin/mcollectived --pid=/var/run/mcollectived.pid
         root  21608  /usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_pass

    dev2.my.com
         root  14568  /usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_pass
         root  31595  ruby /usr/sbin/mcollectived --pid=/var/run/mcollectived.pid

    dev3.my.com
         root  1620   /usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_pass
         root  14093  ruby /usr/sbin/mcollectived --pid=/var/run/mcollectived.pid

    dev4.my.com
         root  3231   /usr/lib/ruby/gems/1.8/gems/passenger-2.2.2/lib/phusion_pass
         root  20557  ruby /usr/sbin/mcollectived --pid=/var/run/mcollectived.pid

       ---- process list stats ----
            Matched hosts: 4
        Matched processes: 8
            Resident Size: 37.264KB
             Virtual Size: 629.578MB
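The stats block at the end of that output is a simple roll-up of the replies. As a rough sketch of how such a summary could be computed, and not the real mc-pgrep code, the example below assumes each host replies with an array of process hashes carrying `:rss` and `:vsize` in bytes. The `summarize` and `human_bytes` helpers, the key names, the byte units and the sample numbers are all assumptions made for illustration.

```ruby
# Hypothetical helper: format a byte count with three decimals,
# using 1024-based units.
def human_bytes(bytes)
  units = %w[B KB MB GB TB]
  value = bytes.to_f
  index = 0
  while value >= 1024 && index < units.size - 1
    value /= 1024
    index += 1
  end
  format("%.3f%s", value, units[index])
end

# Hypothetical helper: replies maps a hostname to the array of matching
# process hashes from that host. Hosts with no matches are not counted.
def summarize(replies)
  matched = replies.reject { |_host, processes| processes.empty? }
  all = matched.values.flatten
  {
    hosts: matched.size,
    processes: all.size,
    resident: all.sum { |process| process[:rss] },
    virtual: all.sum { |process| process[:vsize] }
  }
end

# Made-up replies from three hosts, one of which matched nothing.
replies = {
  "dev1.my.com" => [{ rss: 2048, vsize: 1_048_576 }, { rss: 1024, vsize: 2_097_152 }],
  "dev2.my.com" => [{ rss: 1024, vsize: 1_048_576 }],
  "dev3.my.com" => []
}

stats = summarize(replies)
puts "    Matched hosts: #{stats[:hosts]}"
puts "Matched processes: #{stats[:processes]}"
puts "    Resident Size: #{human_bytes(stats[:resident])}"
puts "     Virtual Size: #{human_bytes(stats[:virtual])}"
```

Summing on the client keeps the agent itself dumb: each host just returns its matches and all the presentation logic lives in one place.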
You can also limit it to only find zombies with the -z option.
This has been quite interesting for me. If I limit the pgrep to “.” (the pattern is a regex), every machine will send back a Sys::ProcTable hash for all its processes. This is a 50 to 70 KByte payload per server. I’ve so far seen no problem getting this much traffic through ActiveMQ + MCollective and processing it all in a very short time:
    % time mc-pgrep -F "country=/uk|us/" .

       ---- process list stats ----
            Matched hosts: 20
        Matched processes: 1958
            Resident Size: 1.777MB
             Virtual Size: 60.072GB

    mc-pgrep -F "country=/uk|us/" .  0.19s user 0.06s system 7% cpu 3.420 total
That 3.4 seconds includes a 2 second discovery overhead, with the client machine in Germany and the filter matching UK and US machines, all the way to the West Coast. My biggest delay here is the network and not MC or ActiveMQ.
The code can be found at my GitHub account and is still a bit of a work in progress. Wiki pages will follow once I am happy with it.
And as an aside, I am slowly migrating at least my code to GitHub if not wiki and ticketing. So far my Plugins have moved, MC will move soon too.