
Layeredtech’s thanks to old customers

I have been a customer of Layeredtech for years; at present I have only 2 machines there, but at times I've had 7 or 8. One of my machines is pretty old, I think I got it circa 2002 or so, and it's been doing well on the same hardware all along.

Yesterday I received the following email from them:

Layered Tech is committed to being the leader of the Hosted Infrastructure market by providing our customers with the best products backed by the best service. In an effort to improve our customer experience, we have determined that a small number of existing servers will need to be relocated from their current data center. As you are receiving this message, we have identified that you have one or more servers in the area of the Savvis facility that will need to be moved. It is our intention to minimize any interruption in service and we will do our best to work within predetermined time frames that are convenient to you.

Due to the form factor (chassis type) of this server, we will need to migrate your data to a new server. We will work with you so that the impact is as minimal as possible.

Below are the servers that are affected by this migration. Please respond to this message acknowledging the need to relocate your server(s). At that point, we will move this ticket to our Operations Department where we will work with you on a migration schedule.

From reading this you might assume they will assist you with the migration, and that this is notice of an impending change perhaps a month or two from now.

In reality the situation is that no, they will not help you migrate your data. They want you to take out a contract for a new machine and then migrate your data yourself, something which even at best will take 5 to 10 hours on oldish machines like this.

They do not offer any compensation, and when pressed on that point only offer 1 month. The cherry on the cake is that all of this has to be done within 18 days from now; in effect they are terminating your old machine, forcing you to take a new one, and doing it with less than the agreed 30 days notice. Like it or not.

The salesperson who has been coordinating this from their side is incredibly unhelpful and frankly useless; only after much pushing back did I even get a hint that anything other than do-it-yourself migration is an option, and at this point I am still waiting for details.

This kind of disregard for customers is typical of large hosting centres: they have thousands of customers, and their heavy-handed treatment is acceptable to them because at worst they'll lose a fraction of a percent of their customers. Being unhelpful really does pay off for them, since most people will probably just take this crap.

This is shockingly poor service. If you value your data, avoid Layeredtech.

Adventures with Ruby

Some more about my continuing experiences with Ruby. In my last post I said:

the language does what you'd expect and as you'll see in my next post spending a week with it on and off is enough to write a capable multi threaded socket server.

As it turns out I quickly lived to regret saying that.  Soon after I hit publish I started running into some problems with the very same socket server.

A bit of background: Adobe has changed how things work, moving away from their previous crossdomain.xml file served over HTTP for cross-domain authorization to a new model that requires you to run a special TCP server on port 843 serving up XML over a special protocol. I won't go into how brain dead I think this is; suffice to say I needed to run one of these for a client. Adobe does of course provide a server for this, but it has some issues. I chose the simplest of their examples, Perl under xinetd, and quickly discovered that it has no concept of timeouts, or of anything that doesn't speak its protocol. The end result is an ever-growing number of Perl processes hanging around for ages.

I took this as a challenge to write something real in Ruby, using it as a learning experience as well, so I set out to write a multi-threaded server for this. At first glance it looks almost laughably trivial: the Ruby standard library includes GServer, a very nice class that does the hard work of thread management for you. You just inherit from it, supply the logic for your protocol and let it do the rest. Awesome.
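To show how little code that takes, here is a minimal sketch of a GServer-based policy server along the lines described above; the policy XML, timeout and connection limit are illustrative, not the actual code I ended up shipping:

require 'gserver'
require 'timeout'

class PolicyServer < GServer
  POLICY = %{<?xml version="1.0"?>
<cross-domain-policy>
  <allow-access-from domain="*" to-ports="*"/>
</cross-domain-policy>\0}

  # GServer runs serve() in its own thread for every accepted client
  def serve(io)
    # Flash sends "<policy-file-request/>\0"; answer with the policy,
    # and drop clients that never complete a request rather than
    # letting them hang around forever
    Timeout.timeout(10) do
      request = io.gets("\0")
      io.print(POLICY) if request == "<policy-file-request/>\0"
    end
  rescue Timeout::Error
    # bogus client; GServer closes the socket when serve() returns
  end
end

server = PolicyServer.new(843, '0.0.0.0', 100) # port 843 needs root
server.start
server.join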

I wrote this, put in logging, options parsing and all the various bits I needed, and tested it locally: 10 concurrent workers doing 200,000 requests, served in no time at all with limited CPU impact. I then wrote RC scripts, config files and all that, and deployed it at my client.

Real soon after deploying it the wheels came off a bit. Out of curiosity I had put in some regular logging that would print lines like:

Jun 23 08:23:37 xmpp1 flashpolicyd[7610]: Had 10042 clients of which 285 were bogus. Uptime 0 days 14 hours 2 min. 23 client(s) connected now.

Note how that line claims to have 23 clients connected at present? That's complete b/s. I added the ability to dump the actual created threads, and there just weren't enough threads for 23 clients; the TCP stack agreed. Turns out GServer has issues handling bad network connections (my clients are on GPRS, modems and all sorts) and it seems threads die without GServer removing them from its list of active connections.

This would be a small problem except that GServer uses the connection count to figure out whether you've hit its max connections setting. So while I could just set that to some huge figure, it does indicate there's a memory leak: the array grows forever. Not to mention it left me with a bad taste in my mouth over the quality of my new and improved solution.

Naturally I gave up on GServer. I didn't feel like installing all sorts of gems on the servers, so I figured I'd just write my own thread handling. While it's not trivial, it's by far not the most complex thing I've ever done, and in this case I was happy with a bit of wheel reinventing for the sake of learning.

I chose the Ruby standard library Logger for logging and even added the ability to alter the log level on the fly by sending signals to the process. Very nice, and I was able to re-use much of the option parsing code and so on from my previous attempt, so this only took a few hours.
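That on-the-fly level switching can be as simple as trapping a signal. A minimal sketch, assuming a file-backed Logger; the choice of USR1/USR2 here is mine, not necessarily what the real daemon uses:

require 'logger'

logger = Logger.new('/var/log/flashpolicyd.log')
logger.level = Logger::INFO

trap('USR1') { logger.level = Logger::DEBUG } # kill -USR1 <pid> for debug
trap('USR2') { logger.level = Logger::INFO  } # kill -USR2 <pid> to revert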

I did the development on my Mac using TextMate, the really kick-arse Mac text editor that has nice Ruby support; the Mac is on Ruby 1.8.6. I intended to run this on RHEL 4 and 5, which have Ruby 1.8.1 and 1.8.5 respectively, so I was really setting myself up for problems all of my own making.

Turns out Logger has a bug, fixed in revision 6262 without any useful svn log message, that only bit me on the RHEL 4 machine. It would open the initial log correctly with line buffering enabled, but once it rotated the log, the new log and subsequent ones wouldn't have line buffering. In my case that meant log lines showing up once every 5 hours!

This sucks a lot, and it's unlikely that RedHat will backport such a small little thing. Since RedHat 4 will be around till 2012, I guess I'll just have to patch it myself or move to RedHat 5 on this server, something I planned to do anyway.
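In the meantime, one workaround sketch, assuming you let an external tool handle rotation: skip Logger's built-in rotation entirely and hand it an IO that is already in sync mode, so the buggy reopen path is never hit:

require 'logger'

# Open the device ourselves and force line buffering; Logger never
# reopens an IO it was given, so the sync setting sticks. Rotation
# would be handled by logrotate (with copytruncate, or by restarting
# the daemon after rotation).
dev = File.open('/var/log/flashpolicyd.log',
                File::WRONLY | File::APPEND | File::CREAT)
dev.sync = true
logger = Logger.new(dev)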

So something that should have been fairly trivial turned into a bit of a pain. It's not really Ruby's fault that I am using 1.8.1 when much newer versions are out, but it's not nice regardless. At the end of it all my flash server is working really well and handling clients perfectly, with no leaking or anything bad:

I, [2008-06-26T23:02:36.607920 #22532]  INFO -- : -604398464: Had 15611 clients of which 423 were bogus. Uptime 0 days 13 hours 41 min. 0 connection(s) in use now.

Those bogus clients are ones that time out or otherwise never complete a request; these are the ones that would trip up GServer in the past.

Once I'm done documenting it I'll be releasing the flash server here.

On working from home

I've not been posting much here; work has been incredibly manic the last while, and I still need to finish off my SSO series with the last installment. For now though, some thoughts on my work arrangement and freelancing.
 
I have been working from home since around December (self-employed for longer than that), and as before, after many months of home working it has started to get to me.

In the past I generally had an office to go to but didn't go unless I had to, working from home when I could: home is nearer to the data center than the office, and generally home has better workstations.

Now that I am self-employed I do not have an office to go to, and I've also realised that most of the reading I did in the past was on the train while commuting. That is something I cannot let slip, but I just do not get around to reading while at home.

So I now have an office in London's Soho. It is just a desk in a shared office full of other freelancers and startups, but for the last 2 weeks I have been pretty happy: I have learned Ruby (more about that in my next post) and I find I really enjoy the ritual of going to an office again. But it is an office where there aren't project managers and other people waiting to harass you, and I think that is the main ingredient.

I will probably stay at this office till the clocks change for winter time, then I'll be working from home again till the summer. The main thing is that I have the freedom to choose now, and that more than anything is what I love about self-employment.

Nasty PHP Authentication Handling

Sometimes you come across things that just make you wonder what is going on in people's minds.

For years, everyone writing applications compatible with standard HTTP authentication has used the REMOTE_USER server variable, as set by Apache, to check which username the webserver authenticated. This has worked well for everyone; CGIs and the like would just grab it there and everyone would be happy.

Along comes PHP and they make a great big mess of it. PHP suggests that we use $_SERVER['PHP_AUTH_USER'] instead, and they give some good reasons for this too, except they have severely crippled it for anything but Basic and Digest authentication, as the following code from main/main.c shows:

        /* Credentials are only imported when the Authorization header
           carries Basic auth; anything else is ignored */
        if (auth && auth[0] != '\0' && strncmp(auth, "Basic ", 6) == 0) {
                char *pass;
                char *user;

                user = php_base64_decode(auth + 6, strlen(auth) - 6, NULL);
                if (user) {
                        pass = strchr(user, ':');
                        if (pass) {
                                /* split "user:pass" and hand both to PHP */
                                *pass++ = '\0';
                                SG(request_info).auth_user = user;
                                SG(request_info).auth_password = estrdup(pass);
                                ret = 0;
                        } else {
                                efree(user);
                        }
                }
        }

As you can see above, they only import the user and password if the AuthType is Basic, which makes no sense at all. Why not just check with Apache: if it set the username, import it? Surely Apache knows if a user has authenticated? Ditto for the password. It is so broken, in fact, that PHP in CGI mode doesn't work either, since those headers don't get set there. Countless comments and nasty hacks can be found in the PHP user contributed notes about this, but it is all just silliness.
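The workaround most people settle on is a fallback chain. A minimal sketch; the helper name is mine, and REDIRECT_REMOTE_USER covers setups where Apache prefixes the variable during internal redirects, as happens under CGI:

<?php
// Prefer PHP_AUTH_USER when PHP populated it, otherwise fall back to
// whatever the webserver authenticated.
function authenticated_user() {
    foreach (array('PHP_AUTH_USER', 'REMOTE_USER', 'REDIRECT_REMOTE_USER') as $var) {
        if (!empty($_SERVER[$var])) {
            return $_SERVER[$var];
        }
    }
    return null;
}
?>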

The reason this is annoying me is that I have written a Single Sign-on system in PHP: you can host an identity server on any domain and hook any site on any other domain into the SSO system. It's a bit like TypeKey.

Of course it's nice to have an easy-to-use SSO system in PHP, but what is the point if you can't make legacy apps like Nagios, Cacti, RT and so on play along with it? To solve this I extended Apache::AuthCookie with a new mod_perl module that plugs into Apache and does authentication using my SSO, plus a small bit of glue that you put on your RT/Cacti/Nagios box.

All's great: I have SSO to Nagios, RT and countless other things working flawlessly, except of course Cacti, because it's written along the lines of the PHP manual and uses PHP_AUTH_USER instead of REMOTE_USER, so my fancy new AuthType in Apache does not work with it. As it turns out it's a quick 2-line fix in the Cacti code, but you would think PHP would be a bit more generic in this regard, since as it stands I suspect a lot of people who want to do SSO with hardware tokens and the like have issues with PHP being silly.

Detailed Apache Stats

Apache has its native mod_status status page that many people use to pull stats into tools such as Cacti and other RRDTool-based stats packages. This works well but does not always provide enough detail; questions such as these remain unanswered:

  • How many of my requests are GET and how many are POST?
  • How many 404 errors and 5xx errors do I get on my site as a whole and for script.php specifically?
  • What is the average response time for the whole server, and for script.php?
  • How many Closed, Keep Alive and Aborted connections do I have?

To answer these I wrote a script that keeps running stats on your Apache server; it has many fine-grained controls that let you tune exactly what to keep stats on. I got the initial idea from an old ONLamp article titled Profiling LAMP Applications with Apache's Blackbox Logs.

The article proposes a custom log format that provides the equivalent of an airplane's blackbox: a flight recorder that records more detail per request than the usual common log formats do. I suggest you read the article for background information. The article stops short of a full data parser though, so I wrote one for a client who kindly agreed that I can open source it.
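To give a flavour of the idea, here is a blackbox-style format of my own using standard mod_log_config and mod_logio specifiers; it is an illustration, not necessarily the article's exact format:

# %a client IP, %m method, %U path, %>s final status, %X connection
# status (- closed, + keep alive, X aborted), %I/%O bytes in/out
# (needs mod_logio), %D request duration in microseconds
LogFormat "%a %m %U %>s %X %I %O %D" blackbox
CustomLog logs/blackbox_log blackbox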

Using this and some glue in my Cacti I now have graphs showing a profile of the requests the whole site receives, and since you can apply fine-grained controls to select exactly what you'll see, you could get per-server overview stats as well as details for just a specific script's performance and statuses.

At a regular interval the script creates a file containing the performance data, presented as variable=value pairs. I will soon provide Cacti and Nagios plugins that parse this output, to ease integration into those tools.

The performance data includes values such as:

  • Total number of requests
  • Total size of requests, separated into in and out bytes
  • Average response time
  • Total processing time
  • Counts of connections in the Close, Keep Alive and Aborted states
  • Counts for each valid HTTP status code, plus aggregates for 1xx, 2xx, 3xx, 4xx and 5xx
  • The number of GET and POST requests
  • Detail for each and every unique request the server serves

See the Sample Stats for a good example; the variables are pretty self-explanatory. To keep the data set small and manageable, two selectors exist: one chooses which requests to keep details for, the other which to keep stats for. These can be combined with standard Apache directives such as Location to provide very fine-grained stats for all or a subset of your site.

You would need some glue to plug this into Cacti and Nagios; I will provide a script for this as soon as I have time to write up some docs for it.
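The glue can be tiny though. Here is an illustrative Ruby sketch that reads the variable=value file and emits a single value for a poller; the file path and variable name are hypothetical, see the Sample Stats for the real ones:

#!/usr/bin/env ruby

# Parse the variable=value pairs into a hash and print one value,
# which Cacti or Nagios can then consume.
stats = {}
File.readlines('/var/tmp/apache_blackbox.stats').each do |line|
  key, value = line.chomp.split('=', 2)
  stats[key] = value if key && value
end

puts stats['total_requests']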

An install guide and so on can be found on my GitHub; there are also extensive Perldoc docs in the script, and the GitHub page has links for downloading it.

Passport

Today my UK passport finally arrived. I did have to go for an interview where they re-established that I am who I say I am.

The interview was quite interesting. When I went to sit my Britishness test there were only 4 or so other people there, and I took that as an indication that perhaps, as usual, there is more hype in the media about immigration than is really warranted.

They arrange the interviews in blocks of 45 minutes and only allow you in 10 minutes before your block starts, so you know that everyone in there is applying for their first passport. I went to the interview centre in Elephant & Castle, a fairly big facility with almost 30 interview cubicles; in my 45-minute block there were about 80 people waiting for interviews.

This brought it home a bit more that yes, really, there is a massive immigration problem. The interview centres are open 6 days a week, and if they are usually anywhere near as busy as when I went (noon on a Friday) then I'd say the rate of immigration is totally unsustainable. It is easier to visualise the problem in a setting like this than to read some big figure in a newspaper.

The interview itself was very professionally done; the woman who interviewed me was friendly and thorough, and I think the whole thing was actually pretty enjoyable apart from the obvious inconvenience involved. They mostly asked me to confirm what I had already filled in on my application forms, but also some extra things like my car registration and what bank accounts I have and when they were opened. It was all simple stuff and went really quickly.