I am under contract as systems administrator at a social media company called Handmade Mobile. Their prime product is Flirtomatic (you might want to wait till you’re home from work to visit the site), a site for adults who just want to chat and meet fun people. It’s not a dating site in the normal sense; it’s more like MSN with some profile searching, pictures, etc. added.

I got involved with them about 2 or 2.5 years ago, when they were having some issues and needed a re-do of their servers, network, etc. I designed a simple, easy-to-manage network and put in servers built the way I like them using Open Source technologies, which I still maintain and administer today.

They’ve been making some good waves lately, and 2 items are worth mentioning:

Back in September they set an in-house record for monthly WAP impressions, with more than 100 million in the month. One report on this mentioned that in March the number of searches done from phones across all the major search engines combined was around 20 million, so this puts the 100 million WAP impressions in an extremely good light.

Today I got news of another impressive stat about them. A company called WatchMouse, which specializes in monitoring web site response and stability, tested 104 social media sites for performance (time to load the whole page, etc.) and penalized sites for any failed loads.

Out of the lot Facebook was the worst. Flirtomatic came fourth, which I have to say I am very impressed by, as Flirtomatic has a huge number of photos on its front page compared to, for example, number 1, Faceparty. Faceparty’s front page weighs in at a very light 44KB (1KB for the HTML) while Flirtomatic’s is 630KB (17KB for the HTML), so I think being 4th fastest given its bulk is great.

Anyway, kudos to them :)

Detailed Apache Stats

Apache has its native mod_status status page that many people use to pull stats into tools such as Cacti and other RRDTool-based stats packages. This works well but does not always provide enough detail. Questions such as these remain unanswered:

  • How many of my requests are GET and how many are POST?
  • How many 404 errors and 5xx errors do I get on my site as a whole and for script.php specifically?
  • What is the average response time for the whole server, and for script.php?
  • How many Closed, Keep Alive and Aborted connections do I have?

To answer these I wrote a script that keeps running track of your Apache process. It has many fine-grained controls that let you tune exactly what to keep stats on. I got the initial idea from an old ONLamp article titled Profiling LAMP Applications with Apache’s Blackbox Logs.

The article proposes a custom log format that provides the equivalent of an airplane’s black box: a flight recorder that records more detail per request than the usual common log formats do. I suggest you read the article for background information. The article stops short of a full data parser though, so I wrote one for a client who kindly agreed that I could open source it.

Using this and some glue in my Cacti, I now have graphs showing a profile of the requests I receive for the whole site. Because you can apply fine-grained controls to select exactly what you see, you could get per-server overview stats as well as details for just a specific script’s performance and statuses.

At a regular interval the script writes a file containing the performance data, presented as variable=value pairs. I will soon provide a Cacti and Nagios plugin to parse this output and ease integration into these tools.

The performance data includes values such as:

  • Total number of requests
  • Total size of requests, separated into in and out bytes
  • Average response time
  • Total processing time
  • Counts of connections in Close, Keep Alive and Aborted states
  • Counts for each valid HTTP status code, and aggregates for 1xx, 2xx, 3xx, 4xx and 5xx
  • The number of GET and POST requests
  • Detail for each and every unique request the server serves

See the Sample Stats for a good example; the variables are pretty self-explanatory. To keep the data set small and manageable, 2 selectors exist: one to choose which requests to keep details for and one to choose which to keep stats for. These can be combined with standard Apache directives such as Location to provide very fine-grained stats for all or a subset of your site.

You would need some glue to plug this into Cacti and Nagios. I will provide a script for this as soon as I have time to write up some docs for it.
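In the meantime, here is a minimal sketch of what such glue could look like for Cacti, which reads space-separated name:value pairs from a data input script. The file name stats.txt, the to_cacti function and all the variable names in the sample are made up for illustration; the real names are whatever the script writes, so check the Sample Stats.

```shell
# Hypothetical sample of the stats file; the real variable names
# come from the script's own output, not from this sketch.
cat > stats.txt <<'EOF'
requests=1200
get_requests=1100
post_requests=100
avg_response_time=0.042
EOF

# to_cacti FILE: turn variable=value lines into a single line of
# space-separated name:value pairs. Lines that do not split into
# exactly two fields on "=" are skipped.
to_cacti() {
    awk -F= 'NF == 2 { printf "%s%s:%s", sep, $1, $2; sep = " " }
             END     { print "" }' "$1"
}

to_cacti stats.txt
```

A Nagios check would need different output (a status line plus an exit code), so this only covers the Cacti side.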

The install guide etc. can be found on my Wiki, and there is also extensive Perldoc documentation in the script. The Wiki also has links for downloading the script; the latest is always available here

Useful Xen Utilities
Today on freshmeat I noticed 2 useful utilities for anyone running Xen Servers.

The first is called Virt-top. It is a top-like tool that is easier to read than xm top and shows the memory and CPU usage of all virtual machines in a nice display, including totals etc.


The other is Virt-P2V, a CD image that you can boot a physical machine with, which will then convert it to a virtual machine for you. After asking you some questions, it will scp the drive image to a destination of your choice and create a config file to boot it. I intend to use this to move a VMware virtual machine to Xen soon; I will post here about how it goes.

Both of these come from a Red Hat employee, so with some luck we’ll see them included in Red Hat Linux soon.

British Citizenship
I previously mentioned that I got a letter confirming all went well with my application for naturalisation. The whole process is now more or less done.

I had the ceremony last Thursday, and around 11:24 in the morning the Mayor of Greenwich handed me my certificate, so I am now all done with that and a full citizen of the United Kingdom. I arrived here on the 2nd of Feb 2002 and became a citizen on the 7th of Feb 2008. I could have applied in March last year already and would probably have been done with it all around September, but I was procrastinating, and eventually the noise about the reforms in the immigration laws gave me the kick I needed to complete it.

The biggest advantage I’ll see immediately is of course the passport. Traveling as a South African - or in fact being a South African outside South Africa - is such a liability that your whole life is just tough: a massive headache of visas, immigration time wasting, etc., endless hassle. In tourist visas alone I spent about GBP500 in the last few years, never mind all the time wasted in getting those and even just in queuing in the non-EU citizen lines at airports. All gone now! I’ve also had to struggle quite a lot with tenancy agreements for flats that I rent, as I was never sure if I’d even be in the country for the year they want you to sign for, so I always had to get 6-month break clauses put in.

This is a part of the certificate I received during the ceremony:

Today I’ll apply for my first UK passport. It should come through in about 2 weeks, unfortunately just too late to attend FOSDEM.

The process for applying for citizenship is all hyped up to be this fantastic experience for applicants, a great introduction to the country and its people. This is done through the test you need to pass and a formal ceremony that even includes singing God Save The Queen.

Overall I’d say the whole thing just left me cold; personally I see little point to most of the hoops I had to jump through. I have to say though that the test has some value - it tests that you have a grasp of English, and in that function it’s a success, so I’d keep it for that reason. The ceremonies though? A waste of time and money in my eyes.

RedHat 5.1 tunable kernel ticks per second
For some time the default clock rate on RedHat machines (and probably others) has been 1000Hz. This is great for keeping your mouse moving smoothly while something big is happening in the background, but not so great for hosting 10 virtual machines on one poor physical machine, as it will have to try and satisfy 10000 ticks per second.

I’ve been using a guest kernel repository by one of the VMware users that rebuilds the standard CentOS/RedHat kernels with HZ=100 and it’s been great; it chopped massive amounts off my CPU usage on the host.

Now with RedHat 5.1 this is not needed anymore; see this post for a bit of a graph on the impact and the background. The short of it is: simply append divider=10 to your guest kernel boot parameters and enjoy a much happier host. I found that time keeping also becomes more predictable in the guest.
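To put numbers on it, here is a rough back-of-the-envelope sketch using the figures from this post. The ticks_per_second function is just something I made up for the sum, and the kernel line in the comment uses placeholders rather than a real version or root device.

```shell
# In the guest's grub.conf the parameter goes at the end of the kernel
# line, roughly like this (placeholders, not a real entry):
#   kernel /vmlinuz-<version> ro root=<root device> divider=10

# ticks_per_second GUESTS HZ DIVIDER: total timer ticks per second the
# host has to service for all guests combined.
ticks_per_second() {
    echo $(( $1 * $2 / $3 ))
}

echo "default kernels: $(ticks_per_second 10 1000 1) ticks/s"
echo "with divider=10: $(ticks_per_second 10 1000 10) ticks/s"
```

So each guest effectively runs at 100Hz, the same rate the rebuilt HZ=100 kernels gave, without needing a custom kernel.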

Extracting only certain lines from a file
This is probably old news to most people but I need to remember this so I figured I may as well blog it.

I made a mysqldump that just puts all databases into a single file. Already I want to kick myself, because I know that if I ever need to import it there will be trouble, as the target database will already have the mysql database etc.

Really I should have used MySQL Parallel Dump, which makes a file per table and is much faster, but it didn’t exist at the time.

So how to pull lines 8596 to 9613 from this big file? It’s trivial with sed:

Here is a sample file:

$ cat > file.txt
line 1
line 2
line 3
line 4
line 5
^D

$ sed -n '2,4p;4q' file.txt
line 2
line 3
line 4

The sed command prints only the lines from the start to the end line (-n suppresses everything else) and quits processing when it hits the end line, so it never reads the rest of the file. Really kewl.
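Applied to the range I actually needed, lines 8596 to 9613, it would look something like the sketch below. The file names big.txt and extract.txt and the extract_lines function are made up for the example, and the generated file just stands in for the real dump.

```shell
# Hypothetical stand-in for the big dump: 10000 numbered lines.
seq 1 10000 | sed 's/^/line /' > big.txt

# extract_lines FILE START END: print lines START through END, then
# quit at END so sed does not read the rest of the file.
extract_lines() {
    sed -n "${2},${3}p;${3}q" "$1"
}

extract_lines big.txt 8596 9613 > extract.txt

# Sanity check: first and last line of what was pulled out.
head -n 1 extract.txt
tail -n 1 extract.txt
```

The quit matters on a big dump: without it sed keeps reading to the end of the file even though it has nothing left to print.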

NetNewsWire is set free
I just noticed that the folks over at NewsGator have set pretty much their whole product suite free today: NewsGator for Windows, NetNewsWire for the Mac, the online version of NewsGator and NewsGator Go! for your phone are all free.

This is pretty huge news, as all the products I just mentioned sync with each other seamlessly and have great UIs. NetNewsWire has been my reader of choice for ages.

For the paranoid out there though, there’s this little tidbit in the new features list:

Sort by attention: NetNewsWire now tracks more information about what you do and can tell which feeds are most important to you.

So you probably want to find out exactly what that’s all about first.

British Citizenship
I just received the following in the post:

Thank you for submitting your application for British citizenship. I am pleased to say that the application has been successful and you will shortly receive a letter inviting you to attend a citizenship ceremony.

Hooray.

Library of Congress and Flickr
Flickr and The Library of Congress announced a joint project to put a whole load of the Library’s photos up and to ask the public to create metadata for these photos - tags, notes, etc.

This is a phenomenal achievement for Flickr in my mind. Looking through these photos, there are some absolutely amazing shots showing American life in the early 1900s, the depression years, the war, etc.

I spent some time over lunch browsing some of these: the machinery, clothes, culture, cars, architecture. It is all just amazing, and I wish there were such a good record of the UK available to the public.

Some of the images are just great to look at, like the one below from 1911. That’s a handheld large format camera, amazing.


Others show the truly amazing work that photographers did in those days and frankly make me wish I could come even close to this kind of shot on my digital cameras.


Click on these images and look at them. They are phenomenally well done; the richness and dynamic range of color in those shots far outpaces the results I tend to see on digital. I wish developing color slides myself wasn’t such a pain, else I’d start doing medium format color transparencies right away.