Setup
I'm running Apache on an Ubuntu server. I've created a fail2ban rule that bans an IP when it requests too many pages too quickly.
# Fail2ban Rule
failregex = ^.*?(:80|:443) <HOST> - .* "(GET|POST|HEAD).*$
ignoreregex =.*(.ico|.jpg|.png|.gif|.js|.css|.woff|.mp4)
findtime = 30
maxretry = 10
Goal:
I would like to run an old Apache log against this new fail2ban rule so I can see whether it would have banned any legitimate requests.
Attempt #1
I thought I might be able to use fail2ban-regex to get a list of potentially banned users, but it doesn't have that functionality.
Attempt #2
I thought echoing the historical logs into the log that fail2ban is currently watching would get them parsed. After fixing a small hang-up where log lines with old dates were ignored (fixed by adding years to them), fail2ban started parsing them and banning IPs based on them. However, I only had to look at the first banned IP to see that it was wrong. The IP in question had made only 10 requests in total, and they weren't anywhere close to each other in time. I can only assume that fail2ban isn't using the log line's timestamp to determine validity, which makes this testing method a bust.
# echo example
zcat other_vhosts_access.log.8.gz | sed -n 's/\/2022:/\/2032:/p' >> /var/log/apache2/fail2ban_test.log
Conclusion
With both of my previous attempts failing, I can't think of a sane way to approach this problem. Can somebody recommend a way to achieve what I'm after, or offer insight into why my second attempt isn't working?
Best Answer
Regarding attempt #1: strictly speaking it indeed doesn't, but newer versions of fail2ban-regex support output parameters, so you could have it print the IP of every matching line.
That would only be suitable if you wanted to find every IP that produced a failure, regardless of count or time. In your case it would be pointless, at least without some extra preprocessing.
Regarding attempt #2: it would not work because fail2ban would not evaluate the time of the message correctly. Either the time would be too old (if logged unmodified) or it would be wrong (if the time of logging were taken as the time of failure, because maxretry and findtime have to be considered as in real usage). Not to mention that fail2ban seeks to now - findtime at start (earlier messages are of no interest to it, since they are obsolete), see https://github.com/fail2ban/fail2ban/issues/2909#issuecomment-758036512.
Anyway, at the moment this is hardly possible with stock fail2ban tools out of the box (at least until the "rescan" facility from the RFE above is implemented and released).
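In the meantime, the maxretry/findtime logic can be approximated offline by replaying the old log and using each line's own timestamp. The following is only a rough sketch in plain Python, not fail2ban's actual implementation: it reuses the two regexes from the question, substitutes a simple capture group for <HOST> (an assumption; fail2ban's real <HOST> pattern is stricter), assumes the vhost_combined log format with timestamps like [10/Oct/2022:13:55:36 +0000], an English/C locale for month names, and a log in chronological order. The function name would_ban is made up for illustration.

```python
import re
from collections import defaultdict, deque
from datetime import datetime

FINDTIME = 30   # seconds, as in the question
MAXRETRY = 10   # failures within FINDTIME that trigger a ban

# Regexes from the question; <HOST> is replaced by a simple capture group
# (assumption: fail2ban's real <HOST> expansion is more precise).
FAILREGEX = re.compile(r'^.*?(:80|:443) (?P<host>\S+) - .* "(GET|POST|HEAD).*$')
IGNOREREGEX = re.compile(r'.*(.ico|.jpg|.png|.gif|.js|.css|.woff|.mp4)')
# Apache timestamp, e.g. [10/Oct/2022:13:55:36 +0000]
TIMESTAMP = re.compile(r'\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]')


def would_ban(lines, findtime=FINDTIME, maxretry=MAXRETRY):
    """Return (ip, time) pairs at which a ban would roughly have triggered."""
    recent = defaultdict(deque)  # ip -> timestamps of failures in the window
    bans = []
    for line in lines:
        if IGNOREREGEX.search(line):
            continue  # ignored lines never count as failures
        match = FAILREGEX.search(line)
        stamp = TIMESTAMP.search(line)
        if not match or not stamp:
            continue
        when = datetime.strptime(stamp.group(1), "%d/%b/%Y:%H:%M:%S %z")
        window = recent[match.group("host")]
        window.append(when)
        # Drop failures that fell out of the findtime window.
        while (when - window[0]).total_seconds() > findtime:
            window.popleft()
        if len(window) >= maxretry:
            bans.append((match.group("host"), when))
            window.clear()  # start counting afresh after a ban
    return bans
```

A compressed log could be fed in with something like would_ban(gzip.open("other_vhosts_access.log.8.gz", "rt")) after importing gzip. The result is only an estimate, since it ignores bantime and any differences in how fail2ban detects dates and hosts.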
But since fail2ban (as well as fail2ban-regex) is written in Python, it would be possible to drive the filter from Python, writing bans to some log or sending them directly to the main fail2ban instance; see https://github.com/fail2ban/fail2ban/issues/2909#issuecomment-1039267423 for an example of such a script.
Also note that your filter is extremely vulnerable and slow; it would be better to rewrite it to be as precise as possible.
And last but not least, why do you need this at all? If the jail with this filter is active and such crawlers come back, they will be banned as soon as they make maxretry failures within findtime, as configured for the jail. Preventive banning is not really needed and would just burden your net-filter subsystem with a lot of IPs (which would probably never come back anyway).