Indexes, limits, bounds, and the like are always security-relevant in C code because of overflow exploits. Specifically, if you get the number wrong, your code is a candidate for exploitation.
Assigning the constant to a symbol does two things: it makes the number easier to verify in a review because it's immediately obvious what the number means, and more importantly, it ensures that every time you use the number in your code, you're using the SAME value.
Imagine, for example, that the number represents the length of an input field, and at some point you increase that length to accommodate larger names. If you've specified the value numerically in your code, you have to go track down every instance of that number and replace it, but only if the instance of that number represents your specific field (you can't just use search and replace to change every instance of 128 to 256 because the number could mean different things).
Furthermore, in some instances you may be using N+1 (e.g. to allow for termination), so you'd have to track down every instance of 128 and every instance of 129. And was there a reason to specify 130 as well? Oh, now it's difficult to remember. But don't miss any of them or you'll create a classic buffer overflow exploit.
If instead you just put
#define FIRST_NAME_LENGTH 128
in one of your include files, and key all the corresponding values off that, then you can just change the number once and be done with it.
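To make that concrete, here is a minimal sketch of what "keying everything off the constant" might look like. The struct person type and the set_first_name() helper are hypothetical names made up for illustration; only FIRST_NAME_LENGTH and its value come from the text above. Note that the N+1 for the terminator is written as an expression, so 129 never appears anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Maximum number of characters in a first name (value from the text). */
#define FIRST_NAME_LENGTH 128

/* Hypothetical record type: the buffer size is derived from the
 * constant, so nobody ever has to type (or hunt for) 129. */
struct person {
    char first_name[FIRST_NAME_LENGTH + 1]; /* +1 for the terminating '\0' */
};

/* Hypothetical helper: copies at most FIRST_NAME_LENGTH characters and
 * always terminates the result.
 * Returns 0 on success, -1 if the input was too long and was truncated. */
int set_first_name(struct person *p, const char *input)
{
    size_t len;
    int truncated;

    assert(p != NULL && input != NULL);

    len = strlen(input);
    truncated = len > FIRST_NAME_LENGTH;
    if (truncated)
        len = FIRST_NAME_LENGTH;

    memcpy(p->first_name, input, len);
    p->first_name[len] = '\0';
    return truncated ? -1 : 0;
}
```

If the field later needs to grow to 256, changing the one #define resizes the buffer and the bounds check together, which is the whole point.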
This is true even in instances where you only use the number once, because while you're CURRENTLY only using the number once, someone may need to extend the code in the future.
This is such an important issue that you should have been taught from day 1 to avoid "magic numbers" in your code. If the number "means" something, then you should make its meaning explicit.
Best Answer
In general, anything entered by a user (or by an untrusted machine or software using an API) needs to be escaped before it is embedded in code (HTML, JavaScript, etc.) that is interpreted. Escaping is, I think, what you mean by "making it harmless". Most libraries have APIs (like htmlentities()) to facilitate this.
If you don't escape it, storing it is basically the only safe thing you can do. Analysis can be OK, as long as the analyzer cannot be commandeered by its input (i.e. it is robust and defensive and has no exploitable flaw).
Modifying input (e.g. stripping dangerous characters) can also be effective, but it is hard to do it in such a way that legitimate characters are not suppressed (false positives). For example, if someone's name is John O'Malley-O'Hara, you don't want the system to remove the apostrophes (or the text between them), even though they look like the single-quote delimiters common in code. In other words, it is so hard to make sure that input modification is done right that it is perhaps better not to do it at all.
I think the best approach is to treat all input carefully and escape it when displaying it. Some languages and frameworks can assist you with this (see "taint mode").
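As a rough illustration of escaping at display time, here is a small sketch in C. The html_escape() function is a hypothetical helper written for this answer; it is not a reimplementation of htmlentities() or of any real library's API, and it only covers the five characters that matter in HTML text and quoted attribute values. Other contexts (for example, text embedded inside JavaScript) need different escaping rules.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes an HTML-escaped copy of `in` into `out`,
 * whose capacity is `out_size` bytes including the terminator.
 * Returns 0 on success, -1 if the escaped text does not fit. */
int html_escape(const char *in, char *out, size_t out_size)
{
    size_t used = 0;

    assert(in != NULL && out != NULL);
    if (out_size == 0)
        return -1;

    for (; *in != '\0'; in++) {
        const char *rep;
        char single[2];
        size_t n;

        switch (*in) {
        case '&':  rep = "&amp;";  break;
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '"':  rep = "&quot;"; break;
        case '\'': rep = "&#39;";  break;
        default:
            /* Ordinary character: pass it through unchanged. */
            single[0] = *in;
            single[1] = '\0';
            rep = single;
            break;
        }

        n = strlen(rep);
        /* Need room for n bytes plus the final '\0'. */
        if (n >= out_size - used) {
            out[0] = '\0';
            return -1;
        }
        memcpy(out + used, rep, n);
        used += n;
    }

    out[used] = '\0';
    return 0;
}
```

With this approach the stored name stays exactly as the user typed it (John O'Malley-O'Hara keeps both apostrophes), and only the copy sent to the browser has them replaced with &#39;. Nothing legitimate is stripped, which avoids the false-positive problem described above.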