There's a general misconception (and misuse) associated with 403 Forbidden: it's not supposed to give anything away about what the server thinks of the request. It's specifically designed to say, "I understand what you're requesting, but I'm not going to handle it, no matter what you try. So stop trying." Any UA or client should interpret that to mean the request will never work, and respond accordingly.
This has implications for clients making requests on behalf of users: if a user isn't logged in, or mistypes, the client should reply "I'm sorry, but I can't do anything" the first time it gets the 403 and stop making future requests. Obviously, if you want a user to still be able to request access to their personal information after a failure, this is user-hostile behavior.
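As a sketch of that rule (all names here, including the do_request callback, are illustrative rather than from any particular HTTP library), a rule-following client might record URLs that returned 403 and refuse to request them again, while still allowing retries after a 401 or 404:

```python
FORBIDDEN = set()  # URLs that have returned 403; never to be retried


def should_retry(url, status):
    """Decide whether a future request to this URL is worth making."""
    if status == 403:
        FORBIDDEN.add(url)   # "stop trying": give up on this URL for good
        return False
    if status in (401, 404):
        return True          # server makes no promise about future requests
    return status >= 500     # 5xx is transient; 2xx/3xx need no retry logic


def fetch(url, do_request):
    """do_request(url) -> (status, body); stands in for a real HTTP call."""
    if url in FORBIDDEN:
        return None          # honour the earlier 403 without a new request
    status, body = do_request(url)
    if 200 <= status < 300:
        return body
    should_retry(url, status)  # records permanent failures as a side effect
    return None
```

After the first 403, fetch never issues another request for that URL, which is exactly the behavior the status code asks for.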
403 is in contrast to 401 Unauthorized (often presented as "Authorization Required"), which does give away that the server will handle the request as long as you pass the correct credentials. This is usually what people think of when they hear 403.
It's also in contrast with 404 Not Found which, as others have pointed out, is designed not only to say "I can't find that page" but also to tell the client that the server makes no claims of success or failure for future requests.
With 401 and 404, the server says nothing to the client or UA about how to proceed: they can keep trying in hopes of getting a different response. So 404 is the appropriate way to handle a page you don't want to show to everyone when you don't want to give away why you won't show it in certain situations.
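A hypothetical handler illustrating this approach (USERS and profile_status are made-up names): private profiles and genuinely missing profiles get the same 404, so a rule-following client can't tell the two cases apart.

```python
USERS = {
    "foo": {"private": False},
    "bar": {"private": True},
}


def profile_status(username, requester):
    """Return the HTTP status code for a profile request."""
    user = USERS.get(username)
    if user is None:
        return 404                      # genuinely missing
    if user["private"] and requester != username:
        return 404                      # hidden: indistinguishable from missing
    return 200                          # visible to this requester
```

Returning 403 on the private branch instead would confirm to any requester that the account exists.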
Of course, this assumes the client making the request cares about such petty RFC niceties. A sufficiently malicious client isn't going to care about the status code returned except incidentally. It will know a page is a hidden user page (or a potential hidden user page) by comparing it to other, known user pages.
That is, let's say your handler is users/*. If I know users/foo, users/bar, and users/baaz work, the server returning a 401, 403, or 404 for users/quux doesn't mean I'm not going to try it, especially if I have reason to believe there is a quux user. A standard example scenario is Facebook: my profile is private, but my comments on public profiles are not. A malicious client knows I exist even if you return 404 on my profile page.
So status codes aren't for the malicious use cases; they're for the clients playing by the rules. And for those clients, a 401 or a 404 response is most appropriate.
I'm afraid adding a Web Service layer is probably the correct solution to your problem.
Separating the client from the underlying database implementation will probably help you in the long run too.
Adding a web service layer doesn't necessarily have to hurt performance...
Indeed, with an appropriate API, a web service can actually improve performance, by batching together multiple database queries within the data center LAN, rather than requiring multiple round trips over the WAN.
And of course a web service layer can often be scaled horizontally, and add appropriate caching to your database queries, perhaps even a change notification mechanism.
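The batching point can be sketched roughly like this, with load_many standing in for a single IN-list database query on the LAN side (all names are illustrative):

```python
def load_many(ids, db):
    # One LAN-side query, e.g. SELECT ... WHERE id IN (...),
    # instead of one query per id.
    return {i: db[i] for i in ids if i in db}


def handle_batch(ids, db):
    """Service endpoint: N lookups, one DB round trip, one WAN response."""
    found = load_many(ids, db)
    # Missing ids come back as None so the client sees every key it asked for.
    return {i: found.get(i) for i in ids}
```

A client that previously made N WAN round trips now makes one, and the N database queries collapse into one as well.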
A server layer adds security that you cannot possibly ensure with apps running on a remote client. Anything that runs on a client can be "hacked" and should not really be considered in any way trusted. You should only really put presentation logic in the client, and host anything important on hardware you have complete control of.
I don't know about your apps, but my web apps are naturally split into several layers, with the presentation code separated from the persistence layer by at least one level of business logic that keeps the two apart. I find this makes it much easier to reason about my app, and so much faster to add or modify functionality. If the layers are separated anyway, it is relatively easy to keep the presentation layer in the client, and the rest on a server under my control.
So while you can solve your problems without introducing a "web service" layer, by the time you have written all the stored procedures (or equivalent) necessary to fill in the holes in the standard database security implementation, you would probably be better off writing a server-side application that you can write proper unit tests for.
Best Answer
If you want to play by the rules, 403 Forbidden, or 403.6 IP address rejected (IIS specific) would be the correct response.
Giving a 200 response (and ignoring the comment) may just increase the load on the server: the spam bot will presumably continue submitting spam on future occasions, unaware that it is having no effect. A 4XX response at least says "go away, you need to check your facts" and is likely to diminish future attempts.
In the unlikely event you have firewall access, blocking blacklisted IP addresses at the firewall would minimize server load and make it appear to the spammer that your server didn't exist.
I was going to suggest a 302 redirect to the spammer's own IP address, but this would probably have no effect, as there would be no reason for the bot to follow the redirect.
If dealing with manually submitted spam, making the spam only visible by the IP address that submitted it is a good tactic. The spammer goes away happy and contented (and does not vary his approach to work around your defences), and the other users never see the spam.
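A minimal sketch of that tactic (often called shadow banning), assuming each comment is a plain dict with an is_spam flag set by whatever spam detection you already have:

```python
def visible_comments(comments, viewer_ip):
    """Return the comments this viewer should see.

    Spam stays in the store but is only shown to the IP that posted it,
    so the spammer sees their comment "live" while everyone else does not.
    """
    return [c for c in comments
            if not c["is_spam"] or c["ip"] == viewer_ip]
```

Because the spammer's own view looks successful, they have no signal telling them to vary their approach.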