I think you are attacking it from the wrong angle by trying to encode all posted data.
Note that a "<
" could also come from other outside sources, like a database field, a configuration, a file, a feed and so on.
Furthermore, "<
" is not inherently dangerous. It's only dangerous in a specific context: when writing strings that haven't been encoded to HTML output (because of XSS).
In other contexts different sub-strings are dangerous, for example, if you write an user-provided URL into a link, the sub-string "javascript:
" may be dangerous. The single quote character on the other hand is dangerous when interpolating strings in SQL queries, but perfectly safe if it is a part of a name submitted from a form or read from a database field.
The bottom line is: you can't filter random input for dangerous characters, because any character may be dangerous under the right circumstances. You should encode at the point where some specific characters may become dangerous because they cross into a different sub-language where they have special meaning. When you write a string to HTML, you should encode characters that have special meaning in HTML, using Server.HtmlEncode. If you pass a string to a dynamic SQL statement, you should encode different characters (or better, let the framework do it for you by using prepared statements or the like)..
When you are sure you HTML-encode everywhere you pass strings to HTML, then set ValidateRequest="false"
in the <%@ Page ... %>
directive in your .aspx
file(s).
In .NET 4 you may need to do a little more. Sometimes it's necessary to also add <httpRuntime requestValidationMode="2.0" />
to web.config (reference).
What if I want to obtain a distinct list based on one or more properties?
Simple! You want to group them and pick a winner out of the group.
List<Person> distinctPeople = allPeople
.GroupBy(p => p.PersonId)
.Select(g => g.First())
.ToList();
If you want to define groups on multiple properties, here's how:
List<Person> distinctPeople = allPeople
.GroupBy(p => new {p.PersonId, p.FavoriteColor} )
.Select(g => g.First())
.ToList();
Note: Certain query providers are unable to resolve that each group must have at least one element, and that First is the appropriate method to call in that situation. If you find yourself working with such a query provider, FirstOrDefault may help get your query through the query provider.
Note2: Consider this answer for an EF Core (prior to EF Core 6) compatible approach. https://stackoverflow.com/a/66529949/8155
Best Answer
So you want to create a search engine internal to your website. There are a couple of different options
A search engine in general exists out of (some of) the following parts:
These are non trivial things to create especially if you want a rich feature set like stemming (returning documents containing plural forms of search terms), highlighting results, indexing different document formats like pdf, rtf, html etc... so you want to use something already made for this purpose. This would only leave the task of connecting and orchestrating the different parts, writing the flow control logic.
You could use Lucene.net a opensource project with a lot of features. http://usoniandream.blogspot.com/2007/10/tutorial-implementing-lucenenet-search.html explains how to get started with it.
The other option is Microsoft indexing service which comes with windows but I would advice against it since it's difficult to tweak to work like you want and the results are sub-optimal in my opinion.