Use the pretty_generate() function, built into later versions of the JSON library. For example:
require 'json'
my_object = { :array => [1, 2, 3, { :sample => "hash"} ], :foo => "bar" }
puts JSON.pretty_generate(my_object)
Which gets you:
{
  "array": [
    1,
    2,
    3,
    {
      "sample": "hash"
    }
  ],
  "foo": "bar"
}
It's a common misconception that user input can be filtered. PHP even has a (now deprecated) "feature", called magic-quotes, that builds on this idea. It's nonsense. Forget about filtering (or cleaning, or whatever people call it).
What you should do to avoid problems is quite simple: whenever you embed a piece of data within foreign code, you must treat it according to the formatting rules of that code. Be aware, though, that such rules can be too complicated to follow manually. For example, in SQL, the rules for strings, numbers and identifiers are all different. For your convenience, in most cases there is a dedicated tool for such embedding. For example, when you need to use a PHP variable in an SQL query, you should use a prepared statement, which will take care of all the proper formatting/treatment.
Another example is HTML: if you embed strings within HTML markup, you must escape them with htmlspecialchars. This means that every single echo or print statement should use htmlspecialchars.
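As a rough illustration of the HTML case, here is the same idea sketched in Ruby rather than PHP. It assumes the standard library's ERB::Util.html_escape as a counterpart to htmlspecialchars; the comment variable is a hypothetical piece of user input, not something from the answer above.

```ruby
require 'erb'

# Hypothetical user-supplied value containing markup.
comment = '<script>alert("hi")</script>'

# Escape at the moment the value is embedded in HTML, not when it is received.
escaped = ERB::Util.html_escape(comment)
html = "<p>#{escaped}</p>"

puts html
# prints: <p>&lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;</p>
```

The stored value stays untouched; only the copy that goes into the markup is escaped, which is the point of treating data according to the rules of the target format.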
A third example could be shell commands: if you are going to embed strings (such as arguments) in external commands, and call them with exec, then you must use escapeshellcmd and escapeshellarg.
Also, a very compelling example is JSON. The rules are so numerous and complicated that you would never be able to follow them all manually. That's why you should never ever create a JSON string manually, but always use a dedicated function, json_encode(), that will correctly format every bit of data.
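To make the JSON point concrete, here is a small sketch in Ruby (an assumption on my part, chosen because it needs only the standard json library; in PHP the dedicated function would be json_encode()). The name value is hypothetical and was picked because it contains quotes and a newline.

```ruby
require 'json'

# Hypothetical value with characters that are special inside a JSON string.
name = "Bobby \"Tables\"\nJr."

# Hand-built JSON: the embedded quotes and raw newline make it invalid.
handmade = "{\"name\": \"#{name}\"}"
begin
  JSON.parse(handmade)
  handmade_ok = true
rescue JSON::ParserError
  handmade_ok = false
end

# Letting the library do the formatting produces valid JSON.
generated = JSON.generate({ "name" => name })
round_trip = JSON.parse(generated)["name"]

puts "hand-built parses: #{handmade_ok}"
puts "generated: #{generated}"
puts "round trip intact: #{round_trip == name}"
```

The hand-built string fails to parse, while the generated one survives a round trip with the original value intact.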
And so on and so forth ...
The only case where you need to actively filter data is if you're accepting preformatted input, for example, if you let your users post HTML markup that you plan to display on the site. However, you would be wise to avoid this at all costs, since no matter how well you filter it, it will always be a potential security hole.
Ryan Grove's Sanitize goes a lot farther than Rails 3 sanitize. It ensures the output HTML is well-formed and has three built-in whitelists:
Sanitize::Config::RESTRICTED Allows only very simple inline formatting markup. No links, images, or block elements.
Sanitize::Config::BASIC Allows a variety of markup including formatting tags, links, and lists. Images and tables are not allowed, links are limited to FTP, HTTP, HTTPS, and mailto protocols, and a rel="nofollow" attribute is added to all links to mitigate SEO spam.
Sanitize::Config::RELAXED Allows an even wider variety of markup than BASIC, including images and tables. Links are still limited to FTP, HTTP, HTTPS, and mailto protocols, while images are limited to HTTP and HTTPS. In this mode, rel="nofollow" is not added to links.