exim4 and apache2-mpm-itk

After migrating from a very old Debian install to a newer VPS running a more recent version of Apache and the mpm-itk module, I was having problems sending mail via the standard PHP mail() call (first noticed when the WordPress contact form I was using started throwing a “Sorry, email message could not be delivered” error).

The exim4 log reported the following:

unable to set gid=33 or uid=0 (euid=0): forcing real = effective

This thread contained a post indicating the problem was the LimitGIDRange/LimitUIDRange options; it seems that if these are not specified, the default limits are too restrictive (perhaps very low values, or perhaps no range is allowed at all) and need to be overridden.

Defining these values in the global Apache configuration fixes it.
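For reference, the directives look something like this in the global Apache configuration (a sketch – the file location and exact ranges are assumptions; widen or narrow them to cover the uid/gid values from your error):

# e.g. in /etc/apache2/apache2.conf (location is an example)
<IfModule mpm_itk_module>
    # Allow setuid/setgid across the full non-root range,
    # covering gid=33 (www-data) from the exim4 error above
    LimitUIDRange 0 65534
    LimitGIDRange 0 65534
</IfModule>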

PHPMailer and Gmail API Mysterious ‘Could not connect’ Error on Windows

I’ve had PHPMailer happily sending email through the Gmail API (as part of a G Suite subscription) for a while now and it mysteriously stopped working yesterday (29th Sep, 2017), throwing the following output with debug enabled:

2017-09-30 11:24:52 SERVER -> CLIENT: 220 smtp.gmail.com ESMTP v2sm1805443wmf.8 - gsmtp
2017-09-30 11:24:52 CLIENT -> SERVER: EHLO trog-pc
2017-09-30 11:24:52 SERVER -> CLIENT: 250-smtp.gmail.com at your service, [86.170.8.39]
250-SIZE 35882577
250-8BITMIME
250-STARTTLS
250-ENHANCEDSTATUSCODES
250-PIPELINING
250-CHUNKING
250 SMTPUTF8
2017-09-30 11:24:52 CLIENT -> SERVER: STARTTLS
2017-09-30 11:24:52 SERVER -> CLIENT: 220 2.0.0 Ready to start TLS
2017-09-30 11:24:52 SMTP Error: Could not connect to SMTP host.
2017-09-30 11:24:52 CLIENT -> SERVER: QUIT
2017-09-30 11:24:52 SERVER -> CLIENT: M I A ��] P *g�� 87� �*��h�!T��
[multiple line binary gibberish removed]
2017-09-30 11:24:52 SMTP ERROR: QUIT command failed: M I A ��] P *g�� 87� �*��h�!T�� [multiple line binary gibberish removed]
2017-09-30 11:24:52 SMTP connect() failed. https://github.com/PHPMailer/PHPMailer/wiki/Troubleshooting
Mailer Error: SMTP connect() failed. https://github.com/PHPMailer/PHPMailer/wiki/Troubleshooting

It looked pretty clearly like a crypto error, and the step in the Troubleshooting guide (helpfully linked in the error message!) relating to the OpenSSL check confirmed that this was the problem.

The OpenSSL test result looked like this:

C:\files\Apps\OpenSSL>openssl s_client -starttls smtp -crlf -connect smtp.gmail.com:587
CONNECTED(0000019C)
depth=1 C = US, O = Google Trust Services, CN = Google Internet Authority G3
verify error:num=20:unable to get local issuer certificate
---
Certificate chain
0 s:/C=US/ST=California/L=Mountain View/O=Google Inc/CN=smtp.gmail.com
i:/C=US/O=Google Trust Services/CN=Google Internet Authority G3
1 s:/C=US/O=Google Trust Services/CN=Google Internet Authority G3
i:/OU=GlobalSign Root CA - R2/O=GlobalSign/CN=GlobalSign
---
Server certificate
-----BEGIN CERTIFICATE-----
MIIEjDCCA3SgAwIBAgIIa79pDvQYxx0wDQYJKoZIhvcNAQELBQAwVDELMAkGA1UE
BhMCVVMxHjAcBgNVBAoTFUdvb2dsZSBUcnVzdCBTZXJ2aWNlczElMCMGA1UEAxMc
R29vZ2xlIEludGVybmV0IEF1dGhvcml0eSBHMzAeFw0xNzA5MTMxNzUyMjVaFw0x
NzEyMDYxNzExMDBaMGgxCzAJBgNVBAYTAlVTMRMwEQYDVQQIDApDYWxpZm9ybmlh
MRYwFAYDVQQHDA1Nb3VudGFpbiBWaWV3MRMwEQYDVQQKDApHb29nbGUgSW5jMRcw
FQYDVQQDDA5zbXRwLmdtYWlsLmNvbTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCC
AQoCggEBAJBSmxMU1SnUDEN7W9D97mjMOMhbrDPqocU1iVgdUaUuLDDvabwb1PjE
SoSpWZpva+13ZcWnHTvw+oYA+y8yjKALGQopDXjSA9OKu2TdK4gLg05PgCvFHgXd
dOJUkkyctAfthO4oll1/NhFa5w+Juv0p5UI7aSLklFxjH7B4Iv8C85vGa5YPftkt
VOCj0rWHOXeZF14qFsWYcH8azRZKU4ih9S8IoeSgfnMhxJRnjexwVNA5MJ/ig3zT
d0q7pK8usd+rbABshztIKA2AAB0g6NueOVGI4GCT/r8MAKbezi6I5U2Kw5Ja0RJl
e0HTq+dCSc9z5Yb86/+yOEDLy3h2BV8CAwEAAaOCAUwwggFIMB0GA1UdJQQWMBQG
CCsGAQUFBwMBBggrBgEFBQcDAjAZBgNVHREEEjAQgg5zbXRwLmdtYWlsLmNvbTBo
BggrBgEFBQcBAQRcMFowLQYIKwYBBQUHMAKGIWh0dHA6Ly9wa2kuZ29vZy9nc3Iy
L0dUU0dJQUczLmNydDApBggrBgEFBQcwAYYdaHR0cDovL29jc3AucGtpLmdvb2cv
R1RTR0lBRzMwHQYDVR0OBBYEFD3FXTx/EAB0m7BLwZC8B7Bh/+TUMAwGA1UdEwEB
/wQCMAAwHwYDVR0jBBgwFoAUd8K4UJpndnaxLcKG0IOgfqZ+ukswIQYDVR0gBBow
GDAMBgorBgEEAdZ5AgUDMAgGBmeBDAECAjAxBgNVHR8EKjAoMCagJKAihiBodHRw
Oi8vY3JsLnBraS5nb29nL0dUU0dJQUczLmNybDANBgkqhkiG9w0BAQsFAAOCAQEA
DWhdK0rwWV7Q2lBk2oukJBgmptwWsPHtkYjnhjqRXzAwAg6iqXJrf6BUmdgK4Vvp
rj0qeE9kcTudvZwMPVxS7gcjk66v79n2NvE2QBKGZnlUC/4S93jgQVuDZMwmKF3n
ArjDQ0zE2o6wfM3I3yNkzT+/ZxXtrYzhPmRmbVKWTgMSJvWwN6H2T7An+1JXl11A
7Tf5VeiPSI/kvCByw7sezFDRHbnj2uXZz23DymT75zgF/V3Nbzmg3htdlVnyB2Xp
kMKl5swPBrBuui2+et9ZN7vYjZRdHy0jg/PB9lfpdA2CQnIHcq/vIYBzmi+TSms1
vNaIty8ekNsvigjIzn13eg==
-----END CERTIFICATE-----
subject=/C=US/ST=California/L=Mountain View/O=Google Inc/CN=smtp.gmail.com
issuer=/C=US/O=Google Trust Services/CN=Google Internet Authority G3
---
No client certificate CA names sent
Peer signing digest: SHA256
Server Temp Key: ECDH, P-256, 256 bits
---
SSL handshake has read 3246 bytes and written 468 bytes
---
New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES128-GCM-SHA256
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
Protocol : TLSv1.2
Cipher : ECDHE-RSA-AES128-GCM-SHA256
Session-ID: 927522CACEB8BB3D0FB305E197235C64D147A4CC26643AB60EB5F110E787FA98
Session-ID-ctx:
Master-Key: 8D698AF5A7790DC4836430F2FA6157B310CF0DDA684B5160BEC643B966E9CCC41598D34D03DA0579893A6CAFB62C2B33
Key-Arg : None
PSK identity: None
PSK identity hint: None
SRP username: None
TLS session ticket lifetime hint: 100800 (seconds)
TLS session ticket:
0000 - 00 33 97 a3 0b be 7f 8d-47 f3 97 6c 18 bb 43 83 .3......G..l..C.
0010 - 27 a4 f7 01 2c d1 a8 0e-55 a9 3c c3 b3 6f 30 58 '...,...U.<..o0X
0020 - 89 22 e3 29 50 42 18 8e-29 ca be 27 57 f9 bc 6e .".)PB..)..'W..n
0030 - 25 f9 ed 68 6a ba 30 97-60 0b 32 fc 19 ab 83 10 %..hj.0.`.2.....
0040 - 00 d1 91 e7 1d 72 d9 2f-3f 27 ac 06 83 23 78 94 .....r./?'...#x.
0050 - 4d 59 38 7f 5d 70 2e ec-d9 d4 b3 31 c9 34 04 25 MY8.]p.....1.4.%
0060 - 79 a8 2f 49 66 ce c7 e3-67 de 46 58 43 b9 42 36 y./If...g.FXC.B6
0070 - 54 49 33 94 99 1e 7d 0b-87 4c da c5 a4 72 b1 05 TI3...}..L...r..
0080 - 5d 47 3b cf 33 13 69 41-f8 1d e4 a0 81 26 1c e5 ]G;.3.iA.....&..
0090 - a7 6b 9b 09 c8 db 1d 8f-6b 5e 54 eb d7 ed 9e 6c .k......k^T....l
00a0 - fc 1f f9 f8 3a d4 3a df-05 c7 0b a3 0b 66 c1 4e ....:.:......f.N
00b0 - 66 27 3c 64 03 60 81 1d-44 bb f0 a4 08 d0 96 dd f'<d.`..D.......
00c0 - 14 31 95 fd 23 7f 13 82-ed 15 fa fb 6a f5 ec 69 .1..#.......j..i
00d0 - c9 b1 d3 e9 fc .....

Start Time: 1506770618
Timeout : 300 (sec)
Verify return code: 20 (unable to get local issuer certificate)
---
250 SMTPUTF8

At first glance the Troubleshooting guide implies that the ‘unable to get local issuer certificate’ error is safe to ignore – but it is only referring to the first instance of the error, at the top of the output. If you’re also seeing it in the Verify return code at the bottom, you have the same problem I had.

The easy fix is to set verify_peer to false, as described in the Troubleshooting guide. But (as it also notes) this is dodgy, and the right fix is to repair the local certificate store. So don’t do this.
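For completeness, the workaround the guide describes looks something like this in PHPMailer – shown only so you can recognise it, not as a recommendation:

// Not recommended: this disables TLS certificate verification entirely.
$mail->SMTPOptions = array(
    'ssl' => array(
        'verify_peer'       => false,
        'verify_peer_name'  => false,
        'allow_self_signed' => true,
    )
);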

After messing around a bit (including testing identical code on a Linux VM and seeing that it worked), I gritted my teeth and dove into the OpenSSL configuration (something which I’ve studiously avoided for years because everything has magically Just Worked for me).

It looks like OpenSSL didn’t have a local certificate store at all on Windows and needs one to be explicitly configured. I have no idea how it ever worked at all – maybe it was using some sort of embedded certificate that just expired? Or maybe I had changed some other option somewhere without realising (unlikely, but I hate blaming gremlins).

Anyway, the fix is simple:

1) Download the latest cacert.pem file from the curl website

2) Plonk it somewhere on your local machine where PHP can get to it.

3) Update your php.ini’s openssl.cafile directive to point to this new file.
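In php.ini that looks something like this (the path is an example – point it at wherever you saved the file):

openssl.cafile = "C:\files\certs\cacert.pem"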

PHP’s OpenSSL should now have the local certificates. The OpenSSL test in the PHPMailer Troubleshooting guide should now “pass” and that final Verify return code: 20 (unable to get local issuer certificate) message should be replaced with Verify return code: 0 (ok). PHPMailer should also happily work again.

Setting Up Infobox Templates in MediaWiki v1.23

This article explains how to add the “Infobox” template to your MediaWiki installation. It is primarily intended for people who have installed v1.23 from source.

This is an updated version of this older post about setting up Infobox on earlier versions of MediaWiki. It is basically the same but has been modified to be suitable for the current (at the time of writing) version of MediaWiki, v1.23. Please see the older post for more info and background as well as helpful commentary from other users in different circumstances.

Here are the basic steps necessary to add working Infoboxes to a freshly installed version of MediaWiki. Note that the original steps required the install of ParserFunctions; this is no longer required as it ships with recent versions of MediaWiki by default.

  1. Download the Scribunto extension into your extensions folder and add it to your LocalSettings.php as described in its ‘Installation’ section (see the sketch after this list).
  2. Copy the CSS required to support the infobox from Wikipedia.org to your Wiki. The CSS is available in Common.css. You’ll probably need to create the stylesheet – it will be at http://your_wiki/wiki/index.php?title=MediaWiki:Common.css&action=edit – and then you can just copy/paste the contents in there. (I copied the whole file; you can probably just copy the infobox parts.)
  3. Export the Infobox Template from Wikipedia.org:
    1. Go to Wikipedia’s Special:Export page
    2. Leave the field for ‘Add pages from category’ empty
    3. In the big text area field, just put in “Template:Infobox”.
    4. Make sure the three options – “Include only the current revision, not the full history”, “Include templates”, and “Save as file” – are all checked
    5. Hit the ‘Export’ button; it will think for a second then spit out an XML file containing all the Wikipedia Templates for the infobox for you to save to your PC.
  4. Now you have the Templates, you need to integrate them into your MediaWiki instance. Simply go to your Import page – http://your_wiki/wiki/index.php/Special:Import – select the file and then hit ‘Upload file’.
  5. With the Templates and styles added, you should now be able to add a simple infobox. Pick a page and add something like this to the top:
    {{Infobox
    |title = An amazing Infobox
    |header1 = It works!
    |label2 = Configured by
    |data2 = trog
    |label3 = Web
    |data3 = [http://trog.qgl.org trog.qgl.org]
    }}
  6. Save, and you should end up with a neat rendered infobox at the top of the page.
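For reference, the LocalSettings.php addition from step 1 looks something like this on MediaWiki 1.23 (a sketch based on the Scribunto installation docs of the time – check the extension’s page for the current form):

# LocalSettings.php
require_once "$IP/extensions/Scribunto/Scribunto.php";
$wgScribuntoDefaultEngine = 'luastandalone';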

Image Data Only Hashing of JPEG Files

As part of a small project to verify backups, I came across a case where I had two photos that looked identical but with different EXIF data.

The backup verification system (correctly) flagged these as two different files – as the SHA1 file hashes were different. However, the actual photo was – as far as I could tell – absolutely identical, so I started looking to see if there was a way to verify JPEG files based on the image data alone (instead of the entire file, which would include meta stuff like the EXIF data).

A quick look around revealed that ImageMagick has a “signature hash” function as part of ‘identify’, which sounded perfect. You can test it like so:

identify.exe -verbose -format "%#" test.jpg

At first glance this solved the problem, but testing on a few systems showed that I was getting different hashes for the same file – it looked like different versions of ImageMagick return different hashes. I asked about this on their forum and was told that the signature algorithm has changed a few times – which makes it sort of useless if compatibility across platforms is required.

After looking around a bit more for alternatives I found the (possibly Australian-made?) PHP JPEG Metadata Toolkit, which (amongst many other things) includes a get_jpeg_image_data() function that (so far) seems to work reliably across systems. Pulling the image data out and running it through SHA1 gives a simple, usable way to hash only the image data in a JPEG file.
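Putting that together, a minimal sketch in PHP (assuming the toolkit’s get_jpeg_image_data() takes a filename and returns the raw image data, per its documentation – the include path and file names are made up):

<?php
// Hash only the compressed image data, ignoring EXIF and other metadata.
// Include path is an example -- adjust to wherever the toolkit lives.
require_once 'JPEG_Metadata_Toolkit/JPEG.php';

function image_data_sha1($filename)
{
    $data = get_jpeg_image_data($filename); // scan data only, no metadata
    return ($data === false) ? false : sha1($data);
}

// Two files that differ only in their EXIF data should hash identically:
var_dump(image_data_sha1('photo_a.jpg') === image_data_sha1('photo_b.jpg'));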

Location-based Advertising Goes Wrong; Clues about Dodgy Advertising

If you’re an astute observer, you might have noticed that some elements – for example, advertising or other content – on overseas web sites sometimes refer to the city in which you live.

It might seem like an astonishing coincidence that an article on the Toronto Times or the South Xihuan Observer just happens to have something like this on their website at the exact same time you happened to click through from Google… but it isn’t. It is the result of location-based advertising – detecting some information about you from your web browser and figuring out where you are. Usually this is done using your IP address: a simple look-up in a database that maps IP ranges to geographical locations (colloquially referred to as “GeoIP”).

This is not an exact science, and sometimes things can go wrong. A screengrab I took from msnbc.com showed an ad with a literal “REGION” placeholder in its text – probably just a simple programming error, where the “REGION” tag should have been replaced with my actual region.

This is mostly a fascinatingly boring example of a web site bug.

The only interesting thing is that it clearly highlights that the module with that error is engaging in deception to try to trick you into clicking on it. Clearly, this is not a “new trick in your region” – it is some bullshit generic factoid, presumably about car insurance, that they’re trying to bait you into clicking by implying that it is related to where you live.

There are, of course, other location-based clues in this (rather poor) ad – it has what is pretty clearly a US police department patrol car, and the text of the ad refers to “miles per day” – so hopefully even the casual Australian Internet user would start hearing alarm bells.

While it almost certainly isn’t a scam and probably poses no real “danger”, it’s important to be alert to little tricks like this that try to change your behaviour by “hitting you at home”, so to speak.

Sogou Search Engine Spider Smashing Websites

Was keeping an eye on our CPU usage on a newly provisioned VPS that part of AusGamers had recently been transferred to, and noticed a big, unusual spike in CPU usage.

Correlating this with another graph indicated it was something hitting our news or forum pages pretty hard, so I nabbed the Apache logs and quickly determined what it was – the “Sogou web spider”, hitting our front page twice a second, over and over again:

199.119.201.102 - - [13/Sep/2011:10:52:16 +1000] "GET / HTTP/1.0" 301 233 "http://www.ausgamers.com" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
199.119.201.102 - - [13/Sep/2011:10:52:16 +1000] "GET / HTTP/1.0" 301 233 "http://www.ausgamers.com" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
199.119.201.102 - - [13/Sep/2011:10:52:17 +1000] "GET / HTTP/1.0" 301 233 "http://www.ausgamers.com" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
199.119.201.102 - - [13/Sep/2011:10:52:17 +1000] "GET / HTTP/1.0" 301 233 "http://www.ausgamers.com" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
199.119.201.102 - - [13/Sep/2011:10:52:17 +1000] "GET / HTTP/1.0" 301 233 "http://www.ausgamers.com" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
199.119.201.102 - - [13/Sep/2011:10:52:18 +1000] "GET / HTTP/1.0" 301 233 "http://www.ausgamers.com" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"

… and so on, for a total of 18,763 requests. Eventually it moved on to other pages, but I stopped counting.

The URL in our logs leads to a Chinese-language FAQ which, when run through the awesome translate feature in Chrome, points you to a form where you can submit a complaint about “crawling too fast”. I did that (in English) and will be fascinated to see if I get a response.

In the meantime, we just blocked the IP address.
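Blocking a single IP like that is a one-liner with iptables (a sketch – any firewall or web server deny rule does the job just as well):

iptables -A INPUT -s 199.119.201.102 -j DROP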

Fixing Double Encoded Characters in MySQL

If you’re working on any old PHP/MySQL sites, chances are at some point you’re going to need to get into the murky, painful world of character encoding – presumably to convert everything to UTF-8 from whatever original setup you have. It is not fun, but fortunately many people have gone through it before and paved the way with a collection of useful information and scripts.

One problem which struck us recently when migrating our database server was certain characters being “double encoded”. This appears to be relatively common. For us, the cause was exporting our data – all UTF-8, but stored in tables declared as latin1 – via mysqldump and then importing it again as if it were UTF-8. The multibyte UTF-8 sequences were treated as latin1 and re-encoded into UTF-8 a second time, so you end up with double encoded characters that appear as squiggly gibberish in all your web pages.

Nathan over at the Blue Box Group has written an extremely comprehensive guide to problems like this. It explains the root cause of these problems, the common symptoms, and – of course, most importantly – precise details on how to safely fix them. If you’re doing anything at all involved in changing character encoding then it is worth a read even before you have problems, just so you can get a better handle on how to fix things and what your end game should be.

There’s a few other ways to fix it, of course. The Blue Box solution is comprehensive and reliable but it requires quite a bit of work to get it going, and you also need to know which database table fields you want to work on specifically – so it can be time consuming unless you’re prepared to really sit down and work on it, either to process everything manually or write a script to do it all for you.

Fortunately there’s an easier way, as described here – basically, all you need to do is export your current dataset with mysqldump, forcing it to latin1, and then re-import it as UTF-8:

mysqldump -h DB_HOST -u DB_USER -p --opt --quote-names --skip-set-charset --default-character-set=latin1 DB_NAME > DB_NAME-dump.sql

mysql -h DB_HOST -u DB_USER -p --default-character-set=utf8 DB_NAME < DB_NAME-dump.sql

We did this for AusGamers.com and it worked perfectly – the only caveat you need to be aware of is that it will mess up any UTF-8 characters that are already properly encoded. For us this wasn’t a big deal, as we were able to clearly identify them and fix them manually.

StackOverflow has yet another approach which might be suitable if you’re dealing with only one or two tables and just want to fix it from the MySQL console or phpMyAdmin or whatever – changing the table character sets on the fly:

ALTER TABLE [tableName] MODIFY [columnName] [columnType] CHARACTER SET latin1;
ALTER TABLE [tableName] MODIFY [columnName] [columnType] CHARACTER SET binary;
ALTER TABLE [tableName] MODIFY [columnName] [columnType] CHARACTER SET utf8;
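To make that concrete with a hypothetical news table that has a TEXT column called body (names are made up), the trip through binary makes MySQL reinterpret the raw bytes rather than converting them:

ALTER TABLE news MODIFY body TEXT CHARACTER SET latin1;
ALTER TABLE news MODIFY body TEXT CHARACTER SET binary;
ALTER TABLE news MODIFY body TEXT CHARACTER SET utf8;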

This method worked fine for me in a test capacity on a single table but we didn’t end up using it everywhere.

Setting Up Infobox Templates in MediaWiki

Note: This guide has been updated as of 2014-09-22 for MediaWiki v1.23. If you’re using this version (or later) please see the Infoboxes in MediaWiki v1.23 post.

If you’ve ever been to any of the more structured Wikipedia pages you’ve probably seen the neat “infoboxes” they have on the right hand side. They’re a convenient way to present some of the core meta-information from an article.

If you have your own MediaWiki instance, you’ve probably thought they’d be a nice thing to have, so maybe you copied and pasted the code from Wikipedia and were then surprised when it didn’t just magically work. It turns out that the infobox stuff is part of MediaWiki’s extensive templating system, so first of all you need the templates. Sounds easy, right?

Well, no. You don’t just flip a switch or download a file, and when you do a search you might find this article, which details a process it says might take 60-90 minutes.

I started looking into it and quickly got lost; you basically need to create a billion different Templates and do all sorts of weird stuff to get it to work. Fortunately I stumbled across this discussion which contained a clue that greatly simplifies the process.

I distilled the steps down to a process I was able to reproduce on a new MediaWiki install in about five minutes. Before we start, I’ll throw in the warning that I have not read the documentation and I don’t understand at a low level what is happening with the templating. I just wanted a working, simple infobox.

  1. Download the MediaWiki extension ParserFunctions and add it to your LocalSettings.php as described in its installation instructions.
  2. Copy the CSS required to support the infobox from Wikipedia to your Wiki. The CSS is available in Common.css. You’ll probably need to create the stylesheet – it will be at http://your_wiki/wiki/index.php?title=MediaWiki:Common.css&action=edit – and then you can just copy/paste the contents in there. (I copied the whole file; you can probably just copy the infobox parts.)
  3. Export the infobox Template from Wikipedia:
    1. Go to Wikipedia’s Special:Export page
    2. Leave the field for ‘Add pages from category’ empty
    3. In the big text area field, just put in “Template:Infobox”.
    4. Make sure the three options – “Include only the current revision, not the full history”, “Include templates”, and “Save as file” – are all checked
    5. Hit the ‘Export’ button; it will think for a second then spit out an XML file containing all the Wikipedia Templates for the infobox for you to save to your PC.
  4. Now you have the Template, you need to integrate them into your MediaWiki instance. Simply go to your Import page – http://your_wiki/wiki/index.php/Special:Import – select the file and then hit ‘Upload file’. NOTE: see update at the bottom of the page before doing this.
  5. With the Templates and styles added, you should now be able to add a simple infobox. Pick a page and add something like this to the top:
    {{Infobox
    |title = Infobox Title
    |header1 = Infobox Header
    |label2 = Created by
    |data2 = David
    |label3 = External reference
    |data3 = [http://trog.qgl.org trog.qgl.org]
    }}

The full infobox Template docs are available here – there’s a lot of stuff in there, but if you just want a really basic infobox then this is the simplest way I found to get them working.

I tested this on two separate MediaWiki installs – one running v1.12.1 and one on v1.15.1 – and it worked on both of them, but as always YMMV.

Update 2013-07-27

As many people have noticed, the guide no longer works. Thanks to commenters jh and chojin, it looks like you also need to do the following:

  • Install the Scribunto extension and add it to your LocalSettings.php as usual. It looks like this extension is now required for the Infobox templates (in fact, it looks like it replaces ParserFunctions entirely, but I’m still testing that).
  • The XML file that is output in step 3 appears to erroneously (?) use text/plain as the format type. If you edit this XML file in your text editor and replace all instances of ‘text/plain’ with ‘CONTENT_FORMAT_TEXT’ (I only found two), the import will be successful and the infobox tags look like they work; a one-liner sketch for this follows the list.
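If you have sed handy, that replacement is a one-liner (the file name is just an example – use whatever Special:Export gave you):

sed -i 's|text/plain|CONTENT_FORMAT_TEXT|g' Template-Infobox.xml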

If someone else can confirm this for me as a working solution I’ll revise the original post so it takes these steps into account.

Should I Gzip Content Before Putting it in MySQL?

The answer for us was “yes”, although there’s a lot more to it than that. I just wrote about doing this on AusGamers for a table that was causing us a lot of grief with really slow DELETEs due to the huge volume of data in there.

I found that gzip’ing the content before putting it into the database made a massive difference to performance – queries that would usually take minutes to run because they were removing up to gigabytes of data were suddenly dealing with roughly a tenth of the bytes, which had a huge impact on execution time.

The results were obvious – the impact was immediately visible in our monitoring graphs.

This change might not be useful in all circumstances – at some point the CPU overhead of gzip’ing might cause more problems than it’s worth. But if you’re dealing with multi-megabyte chunks of text that MySQL only needs to pull in and out (ie, you don’t need to sort by the contents or do anything else with that data from within MySQL), it’s probably worth trying.
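For what it’s worth, a minimal sketch of the pattern in PHP (the table and column names are made up; assumes a BLOB-type column so the compressed bytes survive intact):

<?php
// Sketch: gzip large content before storing, gunzip after fetching.
// Table `pages` and column `body_gz` are hypothetical names.
$pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8', 'user', 'pass');
$html = str_repeat('some multi-megabyte chunk of page text ', 100000);

// Write: compress the content before it hits MySQL.
$stmt = $pdo->prepare('INSERT INTO pages (url, body_gz) VALUES (?, ?)');
$stmt->execute(array('/news/1', gzcompress($html, 6)));

// Read: decompress on the way out.
$stmt = $pdo->prepare('SELECT body_gz FROM pages WHERE url = ?');
$stmt->execute(array('/news/1'));
$html = gzuncompress($stmt->fetchColumn());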

Securing WordPress Using a Separate, Privileged Apache Vhost

Something I’ve been meaning to check out for a while – locking down WordPress to make it really secure. Having web server-writable directories has always freaked me out a bit, but that writability is exactly what makes WordPress so powerful and, frankly, easy to use.

I checked out the hardening guide on the official WordPress site. It has a bunch of tips about how to set file system permissions, but at the end of the day you basically need to keep certain directories world-writable if you want to have that handy functionality that lets you do things like install plugins, edit themes, and automatically update.

However, reading about a new zero-day exploit in a file packaged with many WordPress themes (not one I happened to have installed) drove me to action – along with the realisation that basically none of those simple hardening steps will help if your site is set up with web-writable directories. If there’s an exploit in code – whether it’s core WP code or some random thing you’ve added in a plugin or theme – chances are you’ll be vulnerable.

So I have decided to try something else.

1) I’ve chowned all the files in my WordPress directory to a non-web user, but left o+rx, which means the web process can happily read and serve my files – but it can no longer write to the directory. This of course means all that functionality I mentioned above no longer works.

2) I’ve created a new Apache vhost on my VPS on a separate port. As I am running the ITK MPM – a module for Apache that allows me to specify the uid/gid the Apache process runs as on a per-vhost basis – I can tell this vhost to run as the same username as the non-web user that owns all the files.
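To make that concrete, the second vhost might look something like this (a sketch – the port, user and paths are examples):

# Privileged admin vhost, run as the file owner via mpm-itk
Listen 8958
<VirtualHost *:8958>
    ServerName trog.qgl.org
    DocumentRoot /home/trog/public_html
    <IfModule mpm_itk_module>
        # Run requests for this vhost as the non-web user that owns the files
        AssignUserId trog trog
    </IfModule>
</VirtualHost>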

3) I’ve made a tiny change to my wp-config.php file so that it lets me access this WordPress instance on the vhost without rewriting the URLs and forwarding me back to the main vhost. I just did something like this:


// wp-config.php: if the site is being accessed via the admin port, keep
// that port in the URLs so WordPress doesn't redirect back to the main vhost.
$t_port = 8958;
$t_servername = 'http://trog.qgl.org';
if ($_SERVER['SERVER_PORT'] == $t_port) {
    $t_servername .= ":$t_port";
}
define('WP_SITEURL', $t_servername);
define('WP_HOME', $t_servername);

4) Now, when I want to perform administrative tasks in WordPress, I just need to remember to access my /wp-admin directory via the http://trog.qgl.org:8958/ vhost.

5) Throw some extra security on this new vhost. I just whapped on a .htaccess in the vhost configuration, but you can do whatever you want – IP restrictions, or whatever.
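For example, an IP restriction in the new vhost might look something like this (Apache 2.2-era syntax; the allowed address is made up):

<Location />
    Order deny,allow
    Deny from all
    Allow from 203.0.113.7
</Location>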

After doing some basic testing to confirm it was all working as expected, I then went to write this post. I hit ‘save draft’ and was promptly greeted with a bizarre error from my WPSearch plugin (“Fatal error: Call to a member function find() on a non-object in [..]/wp-content/plugins/wpsearch/WPSearch/Drivers/Search/Phplucene.php”). This was mysterious! What had I done wrong?

So I looked through the WPSearch code, trying to figure out what was going on. Eventually I realised – I’d tried writing this post from my non-privileged vhost. WPSearch must need to write to the disk somewhere as the web user – presumably to update the search index – and it was failing with that error because it wasn’t expecting to suddenly lose the ability to write to the disk (presumably when installing WPSearch it warns you if your file permissions are incorrect for usage).

After that I jumped back in to my privileged vhost and rewrote the post – and so far, so good. I’ll test this for a bit longer but to me it seems like an obvious way of running a more secure instance of WordPress, albeit with a bit more messing around.

Important notes:

Any plugin that needs to write to the disk as part of its usual operation will probably fail.

WP Super Cache is one that I’m using which simply will not work with this method – cache requests fail silently from the public interface and the cache just does not function.

To fix this you need to find out where it needs to write and give it permission (which somewhat defeats the point of this exercise, but I’d much rather have only the cache directory world-writable) – in this case, ‘chmod o+w ./wp-content/cache’ fixes up WP Super Cache.

I’ll add more as I discover more.

Updated 2011-08-03: Added WP_HOME into step 3; it is required for various reasons – things like WP Super Cache and the permalinks menu break without it.

Updated 2011-08-15: A new problem – adding images into a post while you’re using the ‘admin port’ means that they’ll get referenced with this port. Not sure how to work around that one.