Sql – How does SQL query parameterisation work

Securitysqlsql-injection

I feel a little silly for asking this since I seem to be the only person in the world who doesn't get it, but here goes anyway. I'm going to use Python as an example. When I use raw SQL queries (I usually use ORMs) I use parameterisation, like this example using SQLite:

Method A:

username = "wayne"
query_params = (username)
cursor.execute("SELECT * FROM mytable WHERE user=?", query_params)

I know this works and I know this is the generally recommended way to do it. A SQL injection-vulnerable way to do the same thing would be something like this:

Method B:

username = "wayne"
cursor.execute("SELECT * FROM mytable WHERE user='%s'" % username)

As far I can tell I understand SQL injection, as explained in this Wikipedia article. My question is simply: How is method A really different to method B? Why is the end result of method A not the same as method B? I assume that the cursor.execute() method (part of Python's DB-API specification) takes care of correctly escaping and type-checking the input, but this is never explicitly stated anywhere. Is that all that parameterisation in this context is? To me, when we say "parameterisation", all that means is "string substitution", like %-formatting. Is that incorrect?

Best Answer

A parameterized query doesn't actually do string replacement. If you use string substitution, then the SQL engine actually sees a query that looks like

SELECT * FROM mytable WHERE user='wayne'

If you use a ? parameter, then the SQL engine sees a query that looks like

SELECT * FROM mytable WHERE user=<some value>

Which means that before it even sees the string "wayne", it can fully parse the query and understand, generally, what the query does. It sticks "wayne" into its own representation of the query, not the SQL string that describes the query. Thus, SQL injection is impossible, since we've already passed the SQL stage of the process.

(The above is generalized, but it more or less conveys the idea.)