Thunderbird Freezes When Deleting or Moving Email

I recently updated to the latest Thunderbird (v11.0) and was disappointed to discover that suddenly whenever I was deleting an email or moving it into a different folder, the entire application would freeze for 1-2 seconds while it processed that command.

I am fastidious about email and spend probably more time than I should ensuring everything is filed into appropriate folders (or deleted if I’m never going to look at it again). When you’re getting hundreds of emails a day, deleting and moving needs to be an operation that consumes near zero time, otherwise you’re suddenly spending way more time “doing email” than you should be. As a result, these freezes were massively irritating and caused no end of problems.

I reinstalled Thunderbird, which seemed to fix it temporarily – but before I knew it, it was happening again. I tried rebuilding and compacting folders – all for naught. I searched the Thunderbird Bugzilla for similar reports, but I couldn’t see anyone else having the problem.

I put up with this for a while trying various things, but eventually gave up and fired up the incredibly handy FileMon utility from the SysInternals guys to see if anything obvious was happening on the disk side of things that would account for this freeze.

Immediate pay dirt; this chunk of output in FileMon shows the main part of what happened when I tried to move an email into a subfolder of the Inbox:

You can see there the operation started at 4:11:37pm and then the next activity was at 4:11:39pm – two seconds was roughly how long I was seeing Thunderbird freeze for.

Next step was looking at what MsMpEng.exe was – Microsoft Security Essentials. Turns out MSE was installed on my PC as part of a general system policy update at around the same time I upgraded to Thunderbird v11.0.

I tried changing the settings to see if that was indeed the cause – in MSE you just look for the Settings tab, select Real-time protection, and uncheck the ‘Turn on real-time protection’ box. Immediately Thunderbird started behaving normally with no more freezes.

Fortunately there’s an ‘Excluded processes’ option in Microsoft Security Essentials so you can add Thunderbird.exe to the list of processes to skip. This completely fixed the problem for me and now I’m back to moving and deleting emails as fast as ever.

Location-based Advertising Goes Wrong; Clues about Dodgy Advertising

If you’re an astute observer, you might have noticed that some elements on overseas web sites – advertising, for example, or some other content – sometimes refer to the city you’re living in.

It might seem like an astonishing coincidence that an article on the Toronto Times or the South Xihuan Observer just happens to have something like this on their website at the exact same time you just happened to click through from Google… but it isn’t. It is the result of location-based advertising – detecting some information about you from your web browser and figuring out where you are. Usually this is done by your IP address and it is a simple look-up in some database that maintains a list of how geographical locations map to certain IP ranges (colloquially referred to as “GeoIP”).
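To make the look-up idea concrete, here is a minimal sketch of a GeoIP-style table in Python. The table, the locations and the function name are all made up for illustration, and the addresses come from ranges reserved for documentation; a real GeoIP database is far larger and is not structured exactly like this.

```python
import ipaddress

# Hypothetical GeoIP table: (network, location) pairs. Real databases hold
# a huge number of ranges; these use reserved documentation addresses.
GEOIP_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "Brisbane, AU"),
    (ipaddress.ip_network("198.51.100.0/24"), "Toronto, CA"),
]


def lookup_location(ip):
    """Return the location whose range contains ip, or None if none does."""
    address = ipaddress.ip_address(ip)
    for network, location in GEOIP_TABLE:
        if address in network:
            return location
    return None


print(lookup_location("192.0.2.44"))   # Brisbane, AU
print(lookup_location("203.0.113.9"))  # None
```

An address that matches no range returns None, which whatever is calling the look-up has to handle sensibly.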

This is not an exact science, and as this screengrab shows, sometimes things can go wrong:

This is probably just a simple programming error – the “REGION” tag should have been replaced with my actual region.

This is mostly a fascinatingly boring example of a web site bug.

The only interesting thing is that it clearly highlights that the module with that error is engaging in deception to try to trick you into clicking on it. Clearly, this is not a “new trick in your region” – it is some bullshit generic factoid, presumably about car insurance, that they’re trying to bait you into clicking by implying that it is related to where you live.

There are, of course, other location-based clues in this (rather poor) ad – it has what is pretty clearly a US police department patrol car, and the text of the ad refers to “miles per day” – so hopefully even the casual Australian Internet user would start hearing alarm bells.

While it almost certainly isn’t a scam and probably poses no real “danger”, it’s important for people to be alert for little tricks like this that attempt to change your behaviour by appealing to you by “hitting you at home”, so to speak.

Sogou Search Engine Spider Smashing Websites

I was keeping an eye on CPU usage on a newly provisioned VPS, to which part of AusGamers was recently transferred, and noticed a big, unusual spike:

Correlating this with another graph indicated it was something hitting our news or forum pages pretty hard, so I nabbed the Apache logs and quickly determined what it was – the “Sogou web spider”, hitting our front page twice a second, over and over again:

- - [13/Sep/2011:10:52:16 +1000] "GET / HTTP/1.0" 301 233 "" "Sogou web spider/4.0(+http://"
- - [13/Sep/2011:10:52:16 +1000] "GET / HTTP/1.0" 301 233 "" "Sogou web spider/4.0(+"
- - [13/Sep/2011:10:52:17 +1000] "GET / HTTP/1.0" 301 233 "" "Sogou web spider/4.0(+"
- - [13/Sep/2011:10:52:17 +1000] "GET / HTTP/1.0" 301 233 "" "Sogou web spider/4.0(+"
- - [13/Sep/2011:10:52:17 +1000] "GET / HTTP/1.0" 301 233 "" "Sogou web spider/4.0(+"
- - [13/Sep/2011:10:52:18 +1000] "GET / HTTP/1.0" 301 233 "" "Sogou web spider/4.0(+"

… and so on, for a total of 18,763 requests. Eventually it moved on to different pages, but I stopped counting.
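If you want to do this kind of counting without eyeballing the log, something like the following Python sketch would work. It assumes Apache's "combined" log format, where the user agent is the last quoted field; the sample lines, the address and the bot name are invented for the example.

```python
import re
from collections import Counter

# Captures the bracketed timestamp and the final quoted field (the user
# agent) of an Apache "combined" format log line.
LOG_LINE = re.compile(r'\[(?P<time>[^\]]+)\].*"(?P<agent>[^"]*)"\s*$')


def requests_per_second(lines):
    """Count requests for each (user agent, timestamp) pair."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match:
            counts[(match.group("agent"), match.group("time"))] += 1
    return counts


# Hypothetical sample lines; the address is from a documentation range.
sample = [
    '192.0.2.10 - - [13/Sep/2011:10:52:16 +1000] "GET / HTTP/1.0" 301 233 "-" "ExampleBot/1.0"',
    '192.0.2.10 - - [13/Sep/2011:10:52:16 +1000] "GET / HTTP/1.0" 301 233 "-" "ExampleBot/1.0"',
    '192.0.2.10 - - [13/Sep/2011:10:52:17 +1000] "GET / HTTP/1.0" 301 233 "-" "ExampleBot/1.0"',
]

counts = requests_per_second(sample)
for (agent, second), hits in counts.most_common():
    print(agent, second, hits)
```

Timestamps in the log only have one-second resolution, so grouping on the raw timestamp string is enough to see the request rate per second.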

The URL in our logs directs you to a Chinese language FAQ, which, when run through the awesome translate feature in Chrome, points you to a form through which you can submit a complaint about “crawling too fast”. I did that (in English) and will be fascinated to see if I get a response.

In the meantime, we just blocked the IP address.

Fixing Double Encoded Characters in MySQL

If you’re working on any old PHP/MySQL sites, chances are at some point you’re going to need to get into the murky, painful world of character encoding – presumably to convert everything to UTF-8 from whatever original setup you have. It is not fun, but fortunately many people have gone through it before and paved the way with a collection of useful information and scripts.

One problem which struck us recently when migrating our database server was certain characters being “double encoded”. This appears to be relatively common. For us, the cause was exporting our data – all UTF-8 data but stored in tables that were latin1 – via mysqldump and then importing again as if it was UTF-8. Roughly speaking, each byte of a multibyte UTF-8 character gets treated as a separate latin1 character and is then encoded to UTF-8 a second time – so you end up with double encoded characters that show up as squiggly gibberish (“Ã©” where “é” should be, for example) in all your web pages.
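The effect is easy to reproduce outside MySQL. This Python sketch simulates it for a single character; it uses Python's latin-1 codec as a stand-in for MySQL's latin1 (which is closer to cp1252, so a few byte values behave differently), and the function name is just for illustration.

```python
def double_encode(text):
    """Simulate the bug: UTF-8 bytes misread as latin1, then encoded again."""
    utf8_bytes = text.encode("utf-8")       # what was really stored
    misread = utf8_bytes.decode("latin-1")  # each byte becomes one character
    return misread.encode("utf-8")          # encoded a second time


original = "é"
print(original.encode("utf-8"))                 # b'\xc3\xa9'
print(double_encode(original))                  # b'\xc3\x83\xc2\xa9'
print(double_encode(original).decode("utf-8"))  # Ã©
```

Two bytes become four, and a browser expecting UTF-8 dutifully renders them as two junk characters.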

Nathan over at the Blue Box Group has written an extremely comprehensive guide to problems like this. It explains the root cause of these problems, the common symptoms, and – of course, most importantly – precise details on how to safely fix them. If you’re doing anything at all involved in changing character encoding then it is worth a read even before you have problems, just so you can get a better handle on how to fix things and what your end game should be.

There’s a few other ways to fix it, of course. The Blue Box solution is comprehensive and reliable but it requires quite a bit of work to get it going, and you also need to know which database table fields you want to work on specifically – so it can be time consuming unless you’re prepared to really sit down and work on it, either to process everything manually or write a script to do it all for you.

Fortunately there’s an easier way, as described here – basically, all you need to do is export your current dataset with mysqldump, forcing it to latin1, and then re-import it as UTF-8:

mysqldump -h DB_HOST -u DB_USER -p --opt --quote-names --skip-set-charset --default-character-set=latin1 DB_NAME > DB_NAME-dump.sql

mysql -h DB_HOST -u DB_USER -p --default-character-set=utf8 DB_NAME < DB_NAME-dump.sql

We did this and it worked perfectly – the only caveat you need to be aware of is that it will mess up UTF-8 characters that are already properly encoded. For us this wasn’t a big deal as we were able to clearly identify them and fix them manually.
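If you would rather fix strings one at a time in a script, a guarded round trip avoids most of that caveat. This is a rough Python sketch and not what we actually ran; it again uses Python's latin-1 codec as an approximation of MySQL's latin1, and `repair` is a made-up name.

```python
def repair(text):
    """Undo one layer of double encoding, or return text unchanged if the
    round trip fails (which suggests it was not double encoded)."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


print(repair("cafÃ©"))  # café
print(repair("café"))   # café
print(repair("hello"))  # hello
```

A properly encoded “é” becomes the lone byte 0xE9, which is not valid UTF-8 on its own, so the decode fails and the string is left alone. It can still guess wrong on text that genuinely contains a sequence like “Ã©”, so check the results before trusting it.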

StackOverflow has yet another approach which might be suitable if you’re dealing with only one or two tables and just want to fix it from the MySQL console or phpMyAdmin or whatever – changing the table character sets on the fly:

ALTER TABLE [tableName] MODIFY [columnName] [columnType] CHARACTER SET latin1;
ALTER TABLE [tableName] MODIFY [columnName] [columnType] CHARACTER SET binary;
ALTER TABLE [tableName] MODIFY [columnName] [columnType] CHARACTER SET utf8;

This method worked fine for me in a test capacity on a single table but we didn’t end up using it everywhere.
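If you have more than a couple of columns, typing those statements out gets tedious. A small helper along these lines could generate them; the function, table and column names here are hypothetical, and it does no escaping, so only feed it names you trust.

```python
def charset_fix_statements(table, column, column_type):
    """Build the latin1 -> binary -> utf8 ALTER statements for one column."""
    template = "ALTER TABLE `{t}` MODIFY `{c}` {ty} CHARACTER SET {cs};"
    return [
        template.format(t=table, c=column, ty=column_type, cs=charset)
        for charset in ("latin1", "binary", "utf8")
    ]


# Hypothetical table and column, purely for illustration.
for statement in charset_fix_statements("posts", "body", "TEXT"):
    print(statement)
```

Review the output before running it, and try it on a copy of the table first.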

Trials and Tribulations of Updating PGP Desktop

I somehow missed the news in April last year that Symantec would be acquiring PGP. Symantec doesn’t exactly have a stellar reputation amongst technical people (my Dell laptop still has some mystical, seemingly uninstallable software components from a Symantec product that was on there when I bought it that I could never get rid of), so I’m sure if I had known about it, it would have filled me with dread.

I found out about it today when I loaded PGP Desktop and realised I hadn’t checked for updates for a while. Normally I haven’t needed to – PGP were pretty good about emailing me about updates. So I opened the application and hit Help->Update. After a split second of thinking, I’m greeted with a dialog telling me: “Product manifest from the PGP Corporation update server fails the integrity check. Please try again later.” I tried again later, same thing, so I did the next step anyone would try when troubleshooting and Googled the error message.

I was directed to this thread on the Symantec forums (never a good sign when the first hits aren’t in some support knowledge base). Fortunately, it had a reply from a Symantec tech support person, so that was good news.

The reply advised users experiencing the problem to download this PDF. Another bad sign. Why isn’t this just linked on a website? Load the PDF and you’re greeted with something that looks like this:

Really? You can’t even get the slashes the right way around in your hyperlinks? Dread level increasing.

Anyway, I tried the process. Went to the URL in point 1 and was told I need to sign up for an account. No worries, makes sense after reading the rest of the document – you get access to a license management section in the Symantec website, so an account seems like a reasonable thing. A relatively painless process; didn’t even need to activate. Tried to log in – more dread:

Augh. Stuck.

I realise that Symantec probably have a bit of work to do as part of the changeover – they say as much in the forum post. But getting software updates seems like enough of a Big Deal to warrant a bit more effort – not to say attention to detail – if they expect corporate customers to want to keep coming back. If I wanted to go to all this effort with desktop encryption software and keeping it up to date, I’d be using GPG.

Setting Up Infobox Templates in MediaWiki

Note: This guide has been updated as of 2014-09-22 for MediaWiki v1.23. If you’re using this version (or later) please see the Infoboxes in MediaWiki v1.23 post.

** Click here for the updated post. **

If you’ve ever been to any of the more structured Wikipedia pages you have probably seen the neat “infoboxes” on the right hand side. They’re a convenient way to get some of the core metainfo from an article.

If you have your own MediaWiki instance, you’ve probably thought they’d be a nice thing to have, so maybe you copied and pasted the code from Wikipedia and then were surprised when it didn’t just magically work. Turns out that the infobox stuff is part of MediaWiki’s extensive Templating system, so first of all you need the templates. Sounds easy, right?

Well, no. You don’t just flip a switch or download a file, and when you do a search you might find this article which details a process that it says might take 60-90 minutes.

I started looking into it and quickly got lost; you basically need to create a billion different Templates and do all sorts of weird stuff to get it to work. Fortunately I stumbled across this discussion which contained a clue that greatly simplifies the process.

I was able to distill the steps down to a process that I was able to reproduce on a new MediaWiki install in about five minutes. Before we start, I’ll throw in the warning that I have not read the documentation and I don’t understand at a low level what is happening with the templating. I just wanted a working, simple infobox.

  1. Download the MediaWiki extension ParserFunctions and add it to your LocalSettings.php as referred to there.
  2. Copy the CSS required to support the infobox from Wikipedia to your Wiki. The CSS is available in Common.css. You’ll probably need to create the stylesheet – it will be at http://your_wiki/wiki/index.php?title=MediaWiki:Common.css&action=edit – and then you can just copy/paste the contents in there. (I copied the whole file; you can probably just copy the infobox parts.)
  3. Export the infobox Template from Wikipedia:
    1. Go to Wikipedia’s Special:Export page
    2. Leave the field for ‘Add pages from category’ empty
    3. In the big text area field, just put in “Template:Infobox”.
    4. Make sure the three options – “Include only the current revision, not the full history”, “Include templates”, and “Save as file” – are all checked
    5. Hit the ‘Export’ button; it will think for a second then spit out an XML file containing all the Wikipedia Templates for the infobox for you to save to your PC.
  4. Now that you have the Templates, you need to import them into your MediaWiki instance. Simply go to your Import page – http://your_wiki/wiki/index.php/Special:Import – select the file and then hit ‘Upload file’. NOTE: see update at the bottom of the page before doing this.
  5. With the Templates and styles added you should now be able to add a simple infobox. Pick a page and add something like this to the top:
    {{Infobox
    |title = Infobox Title
    |header1 = Infobox Header
    |label2 = Created by
    |data2 = David
    |label3 = External reference
    |data3 = []
    }}

The full infobox Template docs are available here – there’s a lot of stuff in there, but if you just want a really basic infobox then this is the simplest way I found to get them working.

I tested this on two separate MediaWiki installs – one running v1.12.1 and one on v1.15.1 – and it worked on both of them, but as always YMMV.

Update 2013-07-27

As many people have noticed, the guide no longer works. Thanks to commenters jh and chojin, it looks like you also need to do the following:

  • Install the Scribunto extension and add it to your LocalSettings.php as usual. It looks like this extension is now required for the InfoBox templates (in fact, it looks like it replaces ParserFunctions entirely, but I’m still testing that).
  • The XML file that is output in step 3 appears to erroneously (?) use text/plain as the format type. If you edit this XML file in your text editor and replace all instances of ‘text/plain’ with ‘CONTENT_FORMAT_TEXT’ (I only found two), the import will be successful and the infobox tags look like they work.
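The find-and-replace in the second point can also be scripted. Here is a rough Python sketch; the function name and the sample fragment are invented (the real export file will look different), and it does nothing cleverer than a plain string replacement.

```python
def patch_export(xml_text):
    """Swap 'text/plain' for 'CONTENT_FORMAT_TEXT'; return text and count."""
    count = xml_text.count("text/plain")
    return xml_text.replace("text/plain", "CONTENT_FORMAT_TEXT"), count


# Hypothetical fragment standing in for the contents of the exported file.
sample = "<format>text/plain</format>\n<format>text/plain</format>\n"
patched, replaced = patch_export(sample)
print(replaced)  # 2
print(patched)
```

For a real export you would read the file's text in, pass it through the function and write the result back out, keeping a copy of the original in case the import still fails.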

If someone else can confirm this for me as a working solution I’ll revise the original post so it takes these steps into account.

WackGet v1.2.4 – Now Works on Windows 7 (and Vista)!

After a far too long delay, we’ve finally got a new version of WackGet which works on Windows 7 and Windows Vista!

Big thanks to Andrew at Mammoth Media for throwing his time into getting this updated and working. It turns out there were a couple of bugs that were present in previous versions that just happened to work on previous Windows versions, but the more recent Windows editions wouldn’t put up with that sort of crap and barfed.

Here’s the change log:

1.2.4 – Bugfixes
Corrected a bug related to invalid memory access that caused a crash on Windows Vista/7
Corrected a bug where a download may appear stuck for 10-15 seconds before starting

You can download v1.2.4 here. We’re just packaging up the source code and will have it available shortly.

Update 2012-10-08: A user has reported a bug in this version where pausing and resuming files after around the 6GB mark does not work. I have reproduced this and have confirmed it is a problem in the included version of wget.exe. Unfortunately the wget executable can not just be replaced so it is a bit of work to fix this problem.

I’ll see if I can get someone to fix it but in the meantime if someone is willing to have a crack at it that’d be great (source code is available here).

Should I Gzip Content Before Putting it in MySQL?

The answer for us was “yes”, although there’s a lot more to it than that. I just wrote about doing this on AusGamers for a table that was causing us a lot of grief with really slow DELETEs due to the huge volume of data in there.

I found that gzip’ing the content before putting it into the database made a massive difference to performance – queries that would usually take minutes to run because they were removing up to gigabytes of data suddenly were dealing with 10x fewer bytes, which had a huge impact on the execution time.

The results were obvious – you can see in the graphs below the impact that was made.

This change might not be useful in all circumstances – obviously at some point the CPU overhead of gzip’ing might cause more problems than it’s worth. But if you’re dealing with multi-megabyte chunks of text that MySQL only needs to pull in and out (i.e., you don’t need to sort by the contents or do anything else with that data from within MySQL), it’s probably worth trying.
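The general shape of the idea, sketched in Python: compress before the INSERT, decompress after the SELECT. This is not the AusGamers code. sqlite3 is used only as a stand-in so the example runs without a MySQL server, and the table and column names are made up.

```python
import gzip
import sqlite3

# Repetitive text compresses very well; real content will vary.
content = ("A long block of forum text that repeats itself. " * 2000).encode("utf-8")
compressed = gzip.compress(content)

# sqlite3 is only a stand-in here; the table and column names are made up.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE page_cache (id INTEGER PRIMARY KEY, body BLOB)")
db.execute("INSERT INTO page_cache (id, body) VALUES (?, ?)", (1, compressed))

# The database only ever sees the compressed bytes.
stored = db.execute("SELECT body FROM page_cache WHERE id = 1").fetchone()[0]
restored = gzip.decompress(stored)

print(len(content), len(compressed))
```

The database moves far fewer bytes around, at the cost of not being able to search or sort on the column's contents from SQL.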

Securing WordPress Using a Separate, Privileged Apache Vhost

Something I’ve been meaning to check out for a while – locking down WordPress to make it really secure. It’s always freaked me out a bit having web server-writable directories, but it just makes WordPress so powerful and, frankly, easy to use.

I checked out the hardening guide on the official WordPress site. It has a bunch of tips about how to set file system permissions, but at the end of the day you basically need to keep certain directories world-writable if you want to have that handy functionality that lets you do things like install plugins, edit themes, and automatically update.

However, after reading about a new zero-day exploit in a particular file that is packaged with many WordPress themes (not one that I happened to have installed), it drove me to action, along with the realisation that basically none of those simple hardening steps is going to be much use if your site is set up with web-writable directories. If there’s an exploit in code – whether it’s core WP code or some random thing you’ve added in a plugin or theme – chances are you’ll be vulnerable.

So I have decided to try something else.

1) I’ve chowned all the files in my WordPress directory to be a non-web user, but left o+rx, which means the web process can happily read everything and serve my files – but it can no longer write to the directory. This of course means all that functionality I mentioned above no longer works.

2) I’ve created a new Apache vhost on my VPS on a separate port. As I am running MPM-ITK – a module for Apache that allows me to specify what uid/gid the Apache process will run as on a per-vhost basis – I can tell this vhost to run as the same username as the non-web user that owns all the files.

3) I’ve made a tiny change to my wp-config.php file so that it lets me access this WordPress instance on the vhost without rewriting the URLs and forwarding me back to the main vhost. I just did something like this:

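// Port that the privileged admin vhost listens on.
$t_port = 8958;
// Base URL of the site, without a trailing slash.
$t_servername = '';
// Only append the port when the request came in via the admin vhost.
if ($_SERVER['SERVER_PORT'] == $t_port) {
    $t_servername .= ":$t_port";
}
define('WP_SITEURL', $t_servername);
define('WP_HOME', $t_servername);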

4) Now, when I want to perform administrative tasks in WordPress, I just need to remember to access my /wp-admin directory via the vhost.

5) Throw some extra security on this new vhost. I just whapped on a .htaccess in the vhost configuration, but you can do whatever you want – IP restrictions, or whatever.

After doing some basic testing to confirm it was all working as expected, I then went to write this post. I hit ‘save draft’ and was promptly greeted with a bizarre error from my WPSearch plugin (“Fatal error: Call to a member function find() on a non-object in [..]/wp-content/plugins/wpsearch/WPSearch/Drivers/Search/Phplucene.php”). This was mysterious! What had I done wrong?

So I looked through the WPSearch code, trying to figure out what was going on. Eventually I realised – I’d tried writing this post from my non-privileged vhost. WPSearch must need to write to the disk somewhere as the web user – presumably to update the search index – and it was failing with that error because it wasn’t expecting to suddenly lose the ability to write to the disk (presumably when installing WPSearch it tells you if your file permissions are incorrect for usage).

After that I jumped back in to my privileged vhost and rewrote the post – and so far, so good. I’ll test this for a bit longer but to me it seems like an obvious way of running a more secure instance of WordPress, albeit with a bit more messing around.

Important notes:

Any plugin that you’re running that needs to interact by writing to the disk as part of its usual process will probably fail.

WP Super Cache is one that I’m using that will simply not work with this method – cache requests fail silently from the public interface and the cache simply will not function.

To fix this you need to find out what it needs to write to and give it full permission (which somewhat obviates the point of this exercise, but I’d much rather have only the cache directory world-writable) – in this case, ‘chmod o+w ./wp-content/cache’ fixes up WP Super Cache.
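Once you start making exceptions like that, it is worth checking now and then which paths have ended up world-writable. A minimal Python sketch of such an audit follows; the function name is made up, and the demo builds a throwaway directory tree rather than touching a real WordPress install.

```python
import os
import stat
import tempfile


def world_writable(root):
    """Return paths under root that any user on the system can write to."""
    found = []
    for directory, subdirs, files in os.walk(root):
        for name in subdirs + files:
            path = os.path.join(directory, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # a symlink's own permission bits are not meaningful
            if mode & stat.S_IWOTH:
                found.append(path)
    return sorted(found)


# Throwaway tree standing in for wp-content: one open directory, one locked.
root = tempfile.mkdtemp()
cache_dir = os.path.join(root, "cache")
themes_dir = os.path.join(root, "themes")
os.mkdir(cache_dir)
os.mkdir(themes_dir)
os.chmod(cache_dir, 0o777)   # same effect as adding o+w
os.chmod(themes_dir, 0o755)  # readable by the web user, not writable

result = world_writable(root)
print(result)
```

Pointed at a real WordPress directory, the list should contain only the paths you deliberately opened up, such as the cache directory.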

I’ll add more as I discover more.

Updated 2011-08-03: Added WP_HOME into step 3; it is required for various reasons – things like WP Super Cache and the permalinks menu break without it.

Updated 2011-08-15: A new problem – adding images into a post while you’re using the ‘admin port’ means that they’ll get referenced with this port. Not sure how to work around that one.

Differences in Requesting gzip’ed Content using curl in PHP

There are some slight differences in the way curl requests are handled when you’re requesting gzip’ed content from a web server. I found these slightly non-obvious, although it’s really pretty clear, but in the interests of trying to clarify I thought I’d write this down.

If you want to use curl to retrieve gzip’ed content from a webserver, you simply do something like this:

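$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
// Return the response as a string instead of printing it directly.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_ENCODING, "gzip,deflate");
$data = curl_exec($ch);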

What I found weird was that when I did something like ‘strlen($data)’ after that call, the result clearly indicated that the retrieved data was not compressed – strlen() was reporting 100 kbytes, but when I wget’ed the same page gzip’ed, I could see that it was only around 10 kbytes.

I added the header option to the curl request so I could see what was going on, so the code became:

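$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
// Return the response as a string instead of printing it directly.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_ENCODING, "gzip,deflate");
// Include the response headers at the start of the returned string.
curl_setopt($ch, CURLOPT_HEADER, true);
$data = curl_exec($ch);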

This yielded something like:

HTTP/1.1 200 OK
Date: Thu, 28 Jul 2011 23:03:42 GMT
Server: Apache/2.2.3 (CentOS)
X-Powered-By: Mono
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 11091
Connection: close
Content-Type: text/html; charset=UTF-8

So the web server was clearly returning a compressed document, as it matched the ~10 kbyte figure I was seeing with wget, but the actual size of the $data variable was out of whack with this.

As it turns out, CURLOPT_ENCODING actually also controls whether the curl request decodes the response from the webserver. So in addition to setting the required header for the request, it also transparently decompresses it so you can deal directly with the uncompressed content. Upon reflection, this is a little obvious if you just read the manual page.

Basically, the problem was that I was expecting (and wanting) to get a binary chunk of compressed data. This was not the case, but what curl was doing worked out fine for me anyway.

However, I did figure out how to get the binary chunk that I was initially wanting. Basically instead of using the CURLOPT_ENCODING option, you just add a header to the request and set binary transfer mode on, so the code simply becomes:

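// Ask for gzip ourselves so curl does not decode the response for us.
$headers = array("Accept-Encoding: gzip");
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
// Return the response as a string instead of printing it directly.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Per the PHP manual this has had no effect since PHP 5.1.3.
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
$data = curl_exec($ch);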

This will return the gzip’ed chunk of binary gibberish to $data (which, of course, will be much smaller when you run strlen() on it).
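The size difference between the two approaches is easy to see without a web server. This Python sketch is only an analogue of what happens, not curl itself: the page content is invented, and gzip.compress stands in for the server compressing the response.

```python
import gzip

# Stand-in for a page body; a real response would come from the web server.
page = ("<p>Some repetitive markup</p>\n" * 500).encode("utf-8")

# Roughly what arrives on the wire with Content-Encoding: gzip.
raw_body = gzip.compress(page)

# What you end up with once the response has been transparently decoded.
decoded_body = gzip.decompress(raw_body)

print(len(raw_body), len(decoded_body))
```

The first number corresponds to the Content-Length the server reports, and the second to what strlen() sees once the response has been decoded for you.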
