Strange GET requests on Apache and a lot of 404 errors every day

apache-2.2 logging

For about a month now, my website has been receiving strange requests. It's self-hosted on a VPS, and some of those requests are picked up by Google Analytics too.

I have tried to figure out what they are, and I'm worried about being attacked, since I only host my personal blog.

Sample:

my-blog.com:80 173.245.62.105 - - [28/Jun/2014:22:58:35 +0000] "GET /d4/h/A5/static/js/app/component/trendingtopics.js HTTP/1.1" 404 4236 "indulgy.net" "Mozilla/5.0 (Linux; U; Android 4.1.2; zh-cn; HUAWEI C8815 Build/HuaweiC8815) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30"
my-blog.com:80 173.245.62.151 - - [28/Jun/2014:22:58:35 +0000] "GET /d4/h/A5/static/js/app/component/trendingpages.js HTTP/1.1" 404 4237 "indulgy.net" "Mozilla/5.0 (Linux; U; Android 4.1.2; zh-cn; HUAWEI C8815 Build/HuaweiC8815) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30"
my-blog.com:80 173.245.62.151 - - [28/Jun/2014:22:58:35 +0000] "GET /d4/h/A5/static/js/app/controller/hashtagsctrl.js HTTP/1.1" 404 4237 "indulgy.net" "Mozilla/5.0 (Linux; U; Android 4.1.2; zh-cn; HUAWEI C8815 Build/HuaweiC8815) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30"
my-blog.com:80 173.245.62.105 - - [28/Jun/2014:22:58:35 +0000] "GET /d4/h/A5/static/js/app/component/trendingtopics.js HTTP/1.1" 404 4236 "indulgy.net" "Mozilla/5.0 (Linux; U; Android 4.1.2; zh-cn; HUAWEI C8815 Build/HuaweiC8815) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30"
my-blog.com:80 173.245.62.151 - - [28/Jun/2014:22:58:35 +0000] "GET /d4/h/A5/static/js/app/component/trendingpages.js HTTP/1.1" 404 4237 "indulgy.net" "Mozilla/5.0 (Linux; U; Android 4.1.2; zh-cn; HUAWEI C8815 Build/HuaweiC8815) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30"
my-blog.com:80 173.245.62.151 - - [28/Jun/2014:22:58:35 +0000] "GET /d4/h/A5/static/js/app/controller/hashtagsctrl.js HTTP/1.1" 404 4237 "indulgy.net" "Mozilla/5.0 (Linux; U; Android 4.1.2; zh-cn; HUAWEI C8815 Build/HuaweiC8815) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30"
my-blog.com:80 173.245.54.211 - - [28/Jun/2014:22:59:08 +0000] "GET /QB/42/65/91f0cb1971ef5b1da1cdde3271456dc0.jpg HTTP/1.1" 404 4236 "-" "msnbot-media/1.1 (+http://search.msn.com/msnbot.htm)"
my-blog.com:80 173.245.54.204 - - [28/Jun/2014:22:59:33 +0000] "GET /a6/RA/uC/37225134389464162UJrrTmrkc.jpg HTTP/1.1" 404 4235 "http://www.bing.com/images/search?q=antique+pewter+benjamin+moore&FORM=HDRSC2" "Mozilla/5.0 (Linux; U; en-us; KFTHWI Build/JDQ39) AppleWebKit/535.19 (KHTML, like Gecko) Silk/3.19 Safari/535.19 Silk-Accelerated=true"
my-blog.com:80 173.245.54.204 - - [28/Jun/2014:22:59:43 +0000] "GET /a6/RA/uC/37225134389464162UJrrTmrkc.jpg HTTP/1.1" 404 4236 "http://www.bing.com/images/search?q=antique+pewter+benjamin+moore&FORM=HDRSC2" "Mozilla/5.0 (Linux; U; en-us; KFTHWI Build/JDQ39) AppleWebKit/535.19 (KHTML, like Gecko) Silk/3.19 Safari/535.19 Silk-Accelerated=true"
my-blog.com:80 108.162.218.130 - - [28/Jun/2014:23:00:25 +0000] "GET /G6/To/2x/2327798744591902060paNYnMPc.jpg HTTP/1.1" 404 4236 "http://universe-marvel.forumactif.org/t547-lien-de-jean" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:30.0) Gecko/20100101 Firefox/30.0"

I get about 16k of these a day. Some are odd 404s, and others are regular requests for CSS/JS files that return 200. A lot of the requests come from "indulgy.net", and I don't know what that is.

The server runs ISPConfig on Debian and everything is up to date. Some e-mail ports are blocked by my provider, as I use Google Apps.

Can anyone help me understand these requests and protect myself from them?

Best Answer

There's always a lot of "background noise" generated by bots scanning random IP ranges and probing for known vulnerabilities and exploits. It's a fact of modern digital life, and you see the evidence in your log files.
Keep your systems patched and up to date, and learn to live with it.

When a lot of traffic is for valid-looking URLs that aren't, and never were, part of your website, you're most likely dealing with one of the following:

  • a recycled domain. Your domain was registered and active in the past, and a lot of links still point to content from the previous site owner. Live with it.

  • a recycled IP address. The IP address you've been assigned was used by a previous tenant, and when they terminated their hosting plan they didn't update or remove their DNS records. In other words, their traffic still comes to your server.

  • similar to the above, but one of your IP neighbours made a typo in their DNS configuration.

The latter two are roughly the same: you don't own or operate www.example.org, but it's still registered and pointing to your IP address, so you see its traffic arriving at your server.
It's much like moving to a new address and receiving mail for the previous owner, or mail for Mrs Adams at number 13 being delivered to your number 30 instead.
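
To get a feel for which of these cases applies, it can help to tally the 404s by referrer and see where the stray links come from. Below is a minimal Python sketch, assuming the log layout shown in the question (virtual host, client IP, timestamp, request, status, size, referrer, user agent). The regular expression and the function name count_404_referrers are my own illustration, not part of Apache or any tool.

```python
import re
from collections import Counter

# Assumed layout, taken from the sample lines in the question:
# vhost:port client - user [time] "request" status bytes "referrer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<vhost>\S+) (?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def count_404_referrers(lines):
    """Return a Counter mapping referrer -> number of 404 responses."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        # Lines that do not fit the assumed layout are skipped silently.
        if match and match.group("status") == "404":
            counts[match.group("referrer")] += 1
    return counts
```

Passing an open log file to count_404_referrers and printing the result of .most_common(10) lists the top referrers. If one site such as "indulgy.net" dominates, that points towards links or DNS records left over from someone else rather than random scanning.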

Ideally you can identify and contact the owner of the misconfigured domain, and they'll update their DNS.
Alternatively, you can create a separate name-based virtual host for www.example.org. That keeps the log files for your actual domain cleaner, and you can reduce the impact by serving a tiny error message instead of a nicely crafted HTML page.
