Because I hate doing things the easy way, I often try to set up what I think are basic Linux services on boxes with very little memory. This almost always ends up in (my) tears.
In this case, I set up an EC2 t3.nano instance to serve as a legacy FTP server for an integration with a client ERP system. The server has been working perfectly in every single way – except for some reason, yum-cron wouldn’t run.
In /var/log/cron I’d see things like this:
anacron[16639]: Job `cron.daily' locked by another anacron - skipping
There were no other errors I could find anywhere, and /var/log/yum just wasn’t getting any new entries.
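For reference, the poking around amounted to something like this (the log paths are the standard Amazon Linux ones and may differ on other distros):

# Look for anacron/yum-cron entries in the cron log
sudo grep -E 'anacron|yum' /var/log/cron | tail -n 20

# Check whether yum has written anything recently at all
sudo ls -l /var/log/yum*
sudo tail -n 20 /var/log/yum.log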
I spent a couple of days wondering whether the cron/anacron setup was correct. I couldn’t see any obvious problems, so I eventually just tried invoking yum-cron from the console.
Running /usr/sbin/yum-cron manually, it just sat there for a few seconds before reporting ‘Killed’. What was killing it?!
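As an aside, the shell’s exit status tells the same story: bash reports a child killed by a signal as 128 plus the signal number, so a quick check looks something like this:

sudo /usr/sbin/yum-cron
echo $?   # 137 = 128 + SIGKILL(9), i.e. the process was killed rather than failing normally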
Obviously (in retrospect) it was the OOM killer. Looking in dmesg revealed the following:
[90980.699432] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=yum-cron,pid=7002,uid=0
[90980.708766] Out of memory: Killed process 7002 (yum-cron) total-vm:756616kB, anon-rss:326396kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:1160kB oom_score_adj:0
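If you want to go looking for this yourself, the kernel log is where the evidence lives; either of these should surface recent OOM kills:

# Human-readable timestamps from the kernel ring buffer
sudo dmesg -T | grep -iE 'out of memory|oom-kill'

# Or via the journal, if systemd-journald is running
sudo journalctl -k | grep -i 'killed process'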
So it turns out that yum-cron, which from what I can tell is a relatively simple Python wrapper script around yum, simply exhausts the memory on a t3.nano, with its paltry 512MB of RAM, and cannot complete before the OOM killer wipes it out.
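The numbers from the OOM report make sense given the host: roughly 320MB of anonymous memory for yum-cron alone, on a box with 512MB total and, by default, no swap configured at all. A quick sanity check (output will obviously vary):

# How much memory is actually available, and is any swap configured?
free -m
swapon --show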
Adding a simple 1GB swapfile on the server fixes the problem, although the AWS docs recommend not putting swap on EBS storage directly and instead using ‘ephemeral storage instance store volumes’. This comment suggests that swapping to EBS may be slower and/or more expensive due to the extra IO, but it seems that t3 instances don’t get free access to the instance store anyway.
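For completeness, setting one up is roughly the usual swapfile recipe (1GB here to match; dd rather than fallocate, since swapon can be picky about fallocate-created files on some filesystems):

# Create a 1GB swapfile (1024 x 1MB), restrict permissions, format and enable it
sudo dd if=/dev/zero of=/swapfile bs=1M count=1024
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# Persist it across reboots
echo '/swapfile swap swap defaults 0 0' | sudo tee -a /etc/fstab

# Verify
swapon --show
free -m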
Given the additional complexity of doing swap ‘properly’ on AWS, just upgrading to a bigger instance starts to look pretty compelling, even though it basically doubles the cost.