I want to keep track of the spam scores that show up in my /var/log/mail.log on my Debian Etch machine. My idea was to run a shell script grep’ing the wanted info out of the mail.log right before that file is to be rotated. logrotate provides configuration options for how and when log files should be rotated. This includes a possibility to run a script right before and after the rotation is done. So, I created a configuration /etc/logrotate.d/mail like
/var/log/mail.log /var/log/mail.info {
...
sharedscripts
prerotate
/usr/local/sbin/extract-spamscores
endscript
postrotate
/etc/init.d/sysklogd reload-or-restart
endscript
}
sysklogd is Debian’s package containing syslogd and klogd. The syslog itself already cares for rotating the basic log files, and this includes /var/log/mail.{log,info,warn,err}. sysklogd contains the cron-jobs /etc/cron.daily/sysklogd and /etc/cron.weekly/sysklogd, both of which contain logic for rotating the system’s log files. To obtain the list of the log files, the command syslogd-listfiles is issued in both of the scripts—the weekly script uses the option --weekly. When I issue syslogd-listfiles, I only get /var/log/syslog as answer, whereas with --weekly I get:
# syslogd-listfiles --weekly
/var/log/mail.warn
/var/log/uucp.log
/var/log/user.log
/var/log/daemon.log
/var/log/messages
/var/log/debug
/var/log/auth.log
/var/log/mail.err
/var/log/mail.log
/var/log/kern.log
/var/log/lpr.log
/var/log/mail.info
So I have to use the -s switch to exclude mail.log and mail.info such that logrotate can take care of them. As I also have a separate config for /var/log/messages in /etc/logrotate.d/messages, I modified the call of syslogd-listfiles --weekly in /etc/cron.weekly/sysklogd to
syslogd-listfiles --weekly -s "(messages|mail.(log|info))"
However, I wondered why the prospective file /var/log/spamscores never showed up. The mail.log got rotated every day at the time where the daily cron-jobs were running, but the prerotate script didn’t seem to run—the spam scores weren’t extracted.
An odyssey of research began. What was rotating my mail.log? Was it syslog, or logrotate? If it was logrotate, why didn’t it run the prerotate script? A dry-run of logrotate using the option -d for debugging showed me that mail.log and mail.info didn’t meet all conditions for rotation. Sure, they were too small yet, and I configured to only rotate them if they exceeded a size of 4MB. I changed the call of logrotate in /etc/cron.daily/logrotate to include the -v flag for verbose output. Next morning cron sent me a mail with the output, and logrotate was again claiming that those two files didn’t need rotation: They were only ~1MB, just like every morning. Therefore the prerotate and postrotate scripts weren’t called, too. I looked into /var/log and had to see that mail.log and mail.info had been rotated yet another time. What the hell?? The previous file had been renamed to mail.log.0 and the older gzip’ed file-numbers also got incremented. The oldest one was mail.log.6.gz. How could that be? In /etc/logrotate.d/mail I didn’t configure to keep 7 copies, but only 5.
I compared everything with my Apache log rotation config in /etc/logrotate.d/apache2 which works perfectly. It considers the size limit and runs the prerotate and postrotate scripts just as it should. The previous access.log gets renamed to access.log.1 and the oldest one is access.log.5.gz. Hey, wait a minute: I already saw that syslog starts counting at 0, wheres logrotate starts at 1. Is syslog rotating my mail.log? But why? Well, I excluded mail.log from the weekly cron job only, but the mail.log got rotated daily. However, the call of syslogd-listfiles only lists /var/log/syslog as output, so why the hell should logrotate think of rotating /var/log/mail.log?
I took a closer look at /usr/sbin/syslogd-listfiles. It is a perl script, and, of course, it has some logic what files to list: Sure, given options like --weekly, --auth or --all, it should provide a different list of log files. Then my eyes got hold on the line
$opt_large = 1024*1024;
and I recognized there was a logic that excludes files which are smaller than 1MB. No wonder why a call of syslogd-listfiles during the day didn’t list all other log files: They were still too small! For verification, I changed that line temporarily to
$opt_large = 1*1024;
and suddenly all log files were listed when I issued syslogd-listfiles!
The final solution then was to exclude /var/log/messages, /var/log/mail.log and /var/log/mail.info also from /etc/cron.daily/sysklogd. Now, syslog doesn’t handle them anymore and so logrotate finally attends to rotate my mail.log and to call the script which extracts the spam scores.
For the sake of completeness, here’s my simple extract-spamscores script; you have to replace gauss by your own hostname:
#!/bin/bash
# $1 contains the date, and $2 the actual score
grep SPAM, /var/log/mail.log | perl -ne \
’if ($_ =~ /(.*?)gauss.*Hits: (.*?), /) { print “$1$2\n” }’ \
>> /var/log/spamscores