FlashGet Sucks, and Should be Blocked

Over on AusGamers, we run a moderately popular download service for files. We push out around, oh, 30 terabytes a month of data (this is a lot).

Our file servers work pretty hard, but we’d prefer the work they do be limited to reading files off disk and throwing them down the wire at users. Unfortunately they sometimes have to do other things – like deal with bad requests from really terrible download software.

In this case, FlashGet is the bad download software. It is really annoying. Here’s a few reasons why:

  • If you give it a URL that 404s or 403s (i.e., a URL that doesn’t exist or is forbidden), FlashGet inexplicably keeps retrying that URL, over and over, every two seconds.
  • It incorrectly identifies itself as an IE5-based browser. This is just rude at best, and flat-out lying at worst.

I have written about this earlier, but now that I’ve seen the following data from a single month of usage on our file servers, I think the time has come to do something more:


The top entry here is FlashGet, with over 16 million hits to our server. The vast majority of these hits are 403 or 404 errors from repeatedly retrying files that are no longer there or that it no longer has access to.

At this stage the plan is to block FlashGet users. This is harder than it sounds, because FlashGet is so stupid that it ignores 403s and 404s and just keeps retrying. What I’m thinking we’ll do is detect FlashGet via its User-Agent string and redirect it to a different file: a little video that explains why the download failed.
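For what it’s worth, something like the following mod_rewrite sketch would do the redirect part (the video filename is hypothetical, and this isn’t our actual config – just the general shape of the idea):

```apache
# Sketch: redirect FlashGet's default User-Agent to an explanatory video.
# The filename /why-your-download-failed.avi is a made-up placeholder.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} "^Mozilla/4\.0 \(compatible; MSIE 5\.00; Windows 98\)$"
# Redirect everything except the video itself (to avoid a redirect loop).
RewriteRule !^/why-your-download-failed\.avi$ /why-your-download-failed.avi [R=302,L]
```

The catch, of course, is that this only works while FlashGet keeps shipping that default User-Agent string.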

Downloading Ubuntu Metalinks with aria2c

The Ubuntu 9.04 release happened recently and, as always, I found myself battling to get the occasional ISO that wouldn’t come down cleanly via BitTorrent.

I thought I’d give the metalink versions a try with aria2c. Unfortunately, the Ubuntu metalinks default to a ‘maxconnections’ value of one – which, as far as I can tell from the metalink spec, means you’ll only ever make one connection to any given server (and that will probably end up just being the torrent anyway, as it has the highest priority).

If you naughtily download the metalink file you can, of course, edit the resources section in the XML to whatever maxconnections value you want. I feel justified in doing this because a) I don’t think it will unduly burden the servers and b) I’m doing it to reduce the overall load by providing another mirror alternative – but morally I still feel a bit squeamish about it.

Anyway, my download speed went from a few hundred kilobytes a second via BitTorrent alone to the following:

[#2 SIZE:409.6MiB/695.8MiB(58%) CN:113 SPD:5017.71KiB/s UP:18.86KiB/s(800.0KiB) ETA:58s]

So, it made 113 connections (a big chunk of them obviously BitTorrent ones), and I ended up getting the file at around 5 Mbytes/sec. Nice!

Spammy User-Agent “Mozilla/4.0 (compatible; MSIE 5.00; Windows 98)” is Probably FlashGet

If you’ve ever run a remotely popular Apache web server you might have used the mod_limitipconn module, which stops people from making too many simultaneous connections from the same IP address. With this module active, anyone trying to make too many connections will get an HTTP 503 Service Temporarily Unavailable error message.

Now, over the years watching log files for downloads from AusGamers, we’ve seen a lot of weird shit. One of the more common problems has been a constant stream of spammy requests from many users, all identifying as “Mozilla/4.0 (compatible; MSIE 5.00; Windows 98)”. I looked at this ages ago and couldn’t figure it out, but after getting annoyed with it again I spent a bit more time on it today, and it turns out this is the default User-Agent of the popular download manager FlashGet.
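If you want to check your own logs for this, a quick sketch along these lines works – assuming the standard Apache “combined” log format, where the User-Agent is the last double-quoted field on each line:

```python
# Count requests per User-Agent in an Apache "combined" access log.
# A sketch -- assumes the standard combined format, where the referer
# and User-Agent are the final two double-quoted fields on each line.
import re
from collections import Counter

QUOTED = re.compile(r'"([^"]*)"')

def top_user_agents(lines, n=5):
    counts = Counter()
    for line in lines:
        fields = QUOTED.findall(line)
        if fields:
            counts[fields[-1]] += 1  # last quoted field = User-Agent
    return counts.most_common(n)
```

Run your access log through that and see what floats to the top.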

Why it chooses to identify, by default, as IE running on Windows 98 is a bit beyond me, but I find it annoying, because it made this harder to diagnose than it should have been.

The real problem, though, is how FlashGet handles 503 errors. The RFC (2616) seems to imply that, without a Retry-After header, a client should treat a 503 as it would a regular 500 error (“server encountered an unexpected condition which prevented it from fulfilling the request”).

What FlashGet does instead is retry the download every 3 seconds – whether or not a Retry-After header is present! Retrying every 3 seconds seems unnecessarily aggressive to me, but failing to respect the Retry-After header is the real problem here, as it leaves server administrators with nothing they can do to reduce the excess attempts (short of blocking this User-Agent completely – which, realistically, probably isn’t that bad an idea).
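For contrast, here’s a sketch of what a well-behaved client’s retry logic might look like. The 60-second fallback is my own “bare minimum” suggestion from below, not a spec-mandated value; the header parsing follows RFC 2616, which allows Retry-After to be either delta-seconds or an HTTP-date:

```python
# How a polite download client could pick a retry delay for a 503.
# A sketch: honour Retry-After when present, otherwise back off
# rather than hammering the server every 3 seconds.
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

DEFAULT_BACKOFF = 60  # seconds; my suggested "bare minimum", not a spec value

def retry_delay(retry_after, now=None):
    """Return seconds to wait before retrying after a 503.

    retry_after is the raw Retry-After header value (or None).
    Per RFC 2616 it may be delta-seconds or an HTTP-date.
    """
    if retry_after is None:
        return DEFAULT_BACKOFF
    retry_after = retry_after.strip()
    if retry_after.isdigit():
        return int(retry_after)
    # Otherwise treat it as an HTTP-date.
    when = parsedate_to_datetime(retry_after)
    now = now or datetime.now(timezone.utc)
    return max(0, int((when - now).total_seconds()))
```

Nothing fancy – the whole point is that the server gets to say how long to wait, and the client listens.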

This means that, if you’re downloading with FlashGet on its default settings from a server that uses 503s to stop you from making too many connections, you’re also spamming that server with requests every 3 seconds for the entire duration of the download. The default, at least in the version I tested, is to try to create five connections at once.

As a user, you probably don’t give a shit, but it’s a real pain in the ass for people running servers, as it means log files quickly fill up with thousands upon thousands of these attempts over the course of a single download from a single user. Start adding in thousands of users and you quickly end up with a really annoying situation.

What FlashGet should do:

Assuming that I’m right about the above (which I’m relatively confident of after some testing, though not 100%; it’s certainly possible I screwed up or missed something), here are some changes I’d like to see, in increasing order of importance:

1) Change the default User-Agent to identify as FlashGet.
2) Change the default behaviour on 503s to wait longer than 3 seconds. I think 60 seconds is a reasonable “bare minimum”, though I would say the longer the better.
3) Make it respect the Retry-After header. This is super-important.

I have posted this as a suggestion on the official FlashGet forum.