How to block Googlebot quickly

apache-2.2

Googlebot is crawling my site right now and it's killing my server. It's only crawling one or two pages a second, but those pages are really CPU intensive. I have already added those CPU-intensive files to the robots.txt file, but Googlebot hasn't detected those changes yet. I want to block Googlebot at the apache.conf level so my site can be back up right now. How can I do this? This one Apache instance is hosting a few PHP sites and a Django-powered site, so I can't use .htaccess files. The server is running Ubuntu 10.04.

Best Answer

If you know the googlebot's IP address, you could set a DROP rule in iptables, but that's a real hack.

iptables -I INPUT -s [source ip] -j DROP

where [source ip] is the googlebot's IP.
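
If you don't know the address yet, one way to find candidates is to look in the Apache access log for requests whose User-Agent mentions Googlebot. This is only a rough sketch: `googlebot_ips` is a made-up helper name, it assumes the combined log format (client IP in the first field), and the User-Agent string can be spoofed, so treat the result as a list of candidates rather than proof.

```shell
# Hypothetical helper: print "IP request_count" for every client whose log
# line mentions Googlebot, busiest client first.
# Assumes the Apache combined log format, where the client IP is field 1.
googlebot_ips() {
    grep -i 'googlebot' "$1" \
        | awk '{print $1}' \
        | sort \
        | uniq -c \
        | sort -rn \
        | awk '{print $2, $1}'
}
```

On a stock Ubuntu install you would probably call it as `googlebot_ips /var/log/apache2/access.log`, though the log path depends on how your virtual hosts are set up.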

This will definitely stop them instantly, but it's a bit low-level.

To unblock:

iptables -D INPUT -s [source ip] -j DROP