The correct way to avoid SQL injection attacks, no matter which database you use, is to separate the data from SQL, so that data stays data and will never be interpreted as commands by the SQL parser. It is possible to create SQL statement with correctly formatted data parts, but if you don't fully understand the details, you should always use prepared statements and parameterized queries. These are SQL statements that are sent to and parsed by the database server separately from any parameters. This way it is impossible for an attacker to inject malicious SQL.
You basically have two options to achieve this:
Using PDO (for any supported database driver):
$stmt = $pdo->prepare('SELECT * FROM employees WHERE name = :name');
$stmt->execute([ 'name' => $name ]);
foreach ($stmt as $row) {
// Do something with $row
}
Using MySQLi (for MySQL):
$stmt = $dbConnection->prepare('SELECT * FROM employees WHERE name = ?');
$stmt->bind_param('s', $name); // 's' specifies the variable type => 'string'
$stmt->execute();
$result = $stmt->get_result();
while ($row = $result->fetch_assoc()) {
// Do something with $row
}
If you're connecting to a database other than MySQL, there is a driver-specific second option that you can refer to (for example, pg_prepare()
and pg_execute()
for PostgreSQL). PDO is the universal option.
Correctly setting up the connection
Note that when using PDO to access a MySQL database real prepared statements are not used by default. To fix this you have to disable the emulation of prepared statements. An example of creating a connection using PDO is:
$dbConnection = new PDO('mysql:dbname=dbtest;host=127.0.0.1;charset=utf8', 'user', 'password');
$dbConnection->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$dbConnection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
In the above example the error mode isn't strictly necessary, but it is advised to add it. This way the script will not stop with a Fatal Error
when something goes wrong. And it gives the developer the chance to catch
any error(s) which are throw
n as PDOException
s.
What is mandatory, however, is the first setAttribute()
line, which tells PDO to disable emulated prepared statements and use real prepared statements. This makes sure the statement and the values aren't parsed by PHP before sending it to the MySQL server (giving a possible attacker no chance to inject malicious SQL).
Although you can set the charset
in the options of the constructor, it's important to note that 'older' versions of PHP (before 5.3.6) silently ignored the charset parameter in the DSN.
Explanation
The SQL statement you pass to prepare
is parsed and compiled by the database server. By specifying parameters (either a ?
or a named parameter like :name
in the example above) you tell the database engine where you want to filter on. Then when you call execute
, the prepared statement is combined with the parameter values you specify.
The important thing here is that the parameter values are combined with the compiled statement, not an SQL string. SQL injection works by tricking the script into including malicious strings when it creates SQL to send to the database. So by sending the actual SQL separately from the parameters, you limit the risk of ending up with something you didn't intend.
Any parameters you send when using a prepared statement will just be treated as strings (although the database engine may do some optimization so parameters may end up as numbers too, of course). In the example above, if the $name
variable contains 'Sarah'; DELETE FROM employees
the result would simply be a search for the string "'Sarah'; DELETE FROM employees"
, and you will not end up with an empty table.
Another benefit of using prepared statements is that if you execute the same statement many times in the same session it will only be parsed and compiled once, giving you some speed gains.
Oh, and since you asked about how to do it for an insert, here's an example (using PDO):
$preparedStatement = $db->prepare('INSERT INTO table (column) VALUES (:column)');
$preparedStatement->execute([ 'column' => $unsafeValue ]);
Can prepared statements be used for dynamic queries?
While you can still use prepared statements for the query parameters, the structure of the dynamic query itself cannot be parametrized and certain query features cannot be parametrized.
For these specific scenarios, the best thing to do is use a whitelist filter that restricts the possible values.
// Value whitelist
// $dir can only be 'DESC', otherwise it will be 'ASC'
if (empty($dir) || $dir !== 'DESC') {
$dir = 'ASC';
}
REST is the underlying architectural principle of the web. The amazing thing about the web is the fact that clients (browsers) and servers can interact in complex ways without the client knowing anything beforehand about the server and the resources it hosts. The key constraint is that the server and client must both agree on the media used, which in the case of the web is HTML.
An API that adheres to the principles of REST does not require the client to know anything about the structure of the API. Rather, the server needs to provide whatever information the client needs to interact with the service. An HTML form is an example of this: The server specifies the location of the resource and the required fields. The browser doesn't know in advance where to submit the information, and it doesn't know in advance what information to submit. Both forms of information are entirely supplied by the server. (This principle is called HATEOAS: Hypermedia As The Engine Of Application State.)
So, how does this apply to HTTP, and how can it be implemented in practice? HTTP is oriented around verbs and resources. The two verbs in mainstream usage are GET
and POST
, which I think everyone will recognize. However, the HTTP standard defines several others such as PUT
and DELETE
. These verbs are then applied to resources, according to the instructions provided by the server.
For example, Let's imagine that we have a user database that is managed by a web service. Our service uses a custom hypermedia based on JSON, for which we assign the mimetype application/json+userdb
(There might also be an application/xml+userdb
and application/whatever+userdb
- many media types may be supported). The client and the server have both been programmed to understand this format, but they don't know anything about each other. As Roy Fielding points out:
A REST API should spend almost all of its descriptive effort in
defining the media type(s) used for representing resources and driving
application state, or in defining extended relation names and/or
hypertext-enabled mark-up for existing standard media types.
A request for the base resource /
might return something like this:
Request
GET /
Accept: application/json+userdb
Response
200 OK
Content-Type: application/json+userdb
{
"version": "1.0",
"links": [
{
"href": "/user",
"rel": "list",
"method": "GET"
},
{
"href": "/user",
"rel": "create",
"method": "POST"
}
]
}
We know from the description of our media that we can find information about related resources from sections called "links". This is called Hypermedia controls. In this case, we can tell from such a section that we can find a user list by making another request for /user
:
Request
GET /user
Accept: application/json+userdb
Response
200 OK
Content-Type: application/json+userdb
{
"users": [
{
"id": 1,
"name": "Emil",
"country: "Sweden",
"links": [
{
"href": "/user/1",
"rel": "self",
"method": "GET"
},
{
"href": "/user/1",
"rel": "edit",
"method": "PUT"
},
{
"href": "/user/1",
"rel": "delete",
"method": "DELETE"
}
]
},
{
"id": 2,
"name": "Adam",
"country: "Scotland",
"links": [
{
"href": "/user/2",
"rel": "self",
"method": "GET"
},
{
"href": "/user/2",
"rel": "edit",
"method": "PUT"
},
{
"href": "/user/2",
"rel": "delete",
"method": "DELETE"
}
]
}
],
"links": [
{
"href": "/user",
"rel": "create",
"method": "POST"
}
]
}
We can tell a lot from this response. For instance, we now know we can create a new user by POST
ing to /user
:
Request
POST /user
Accept: application/json+userdb
Content-Type: application/json+userdb
{
"name": "Karl",
"country": "Austria"
}
Response
201 Created
Content-Type: application/json+userdb
{
"user": {
"id": 3,
"name": "Karl",
"country": "Austria",
"links": [
{
"href": "/user/3",
"rel": "self",
"method": "GET"
},
{
"href": "/user/3",
"rel": "edit",
"method": "PUT"
},
{
"href": "/user/3",
"rel": "delete",
"method": "DELETE"
}
]
},
"links": {
"href": "/user",
"rel": "list",
"method": "GET"
}
}
We also know that we can change existing data:
Request
PUT /user/1
Accept: application/json+userdb
Content-Type: application/json+userdb
{
"name": "Emil",
"country": "Bhutan"
}
Response
200 OK
Content-Type: application/json+userdb
{
"user": {
"id": 1,
"name": "Emil",
"country": "Bhutan",
"links": [
{
"href": "/user/1",
"rel": "self",
"method": "GET"
},
{
"href": "/user/1",
"rel": "edit",
"method": "PUT"
},
{
"href": "/user/1",
"rel": "delete",
"method": "DELETE"
}
]
},
"links": {
"href": "/user",
"rel": "list",
"method": "GET"
}
}
Notice that we are using different HTTP verbs (GET
, PUT
, POST
, DELETE
etc.) to manipulate these resources, and that the only knowledge we presume on the client's part is our media definition.
Further reading:
(This answer has been the subject of a fair amount of criticism for missing the point. For the most part, that has been a fair critique. What I originally described was more in line with how REST was usually implemented a few years ago when I first wrote this, rather than its true meaning. I've revised the answer to better represent the real meaning.)
Best Answer
There are a lot of answers here, and problems with quite a few of them.
Things you should NOT do:
If you need to read the session token from JavaScript, you're doing something horribly wrong. Your session identifier cookie should ALWAYS have HTTPOnly set on it so its not available to scripts.
This one protection makes it so that the impact of XSS is considerably reduced, since an attacker will no longer be able to get a logged in users session token, which for all intents and purposes are the equivalent of credentials in the application. You don't want one error to give keys to the kingdom.
The session identifier should not be written to the contents of the page. This is for the same reasons you set HTTPOnly. This means that that your csrf token can not be your session id. They need to be different values.
Things you should do:
Follow OWASP's guidance:
Specifically, if this is a REST application you can require double-submission of CSRF tokens. If you do this, just be sure that you define it to a specific full-domain (www.mydomain.com) and not a parent domain (example.com), and that you also utilize the "samesite" cookie attribute which is gaining popularity.
Simply create something cryptographically random, store it in ASCII Hex or Base64 encode, and add it as a cookie and to your forms when the server returns the page. On the server side make sure that the cookie value matches the form value. Voila, you've killed CSRF, avoided extra prompts for your users, and not opened yourself up to more vulnerabilities.
NOTE: As @krubo states below the double-submission technique has been found to have some weaknesses (See Double-Submission). Since this weakness requires that:
I kind of think the weakness falls more in the category of a "Cool Defcon Talk" rather than a "Realworld Security Risk". In any case, if you are going to use double-submission it doesn't hurt to take a few extra steps to protect yourself fully.
New Update 07/06/2020
My new favorite way to do double-submission is to create and pass a cryptographic random string in the body of the request as before; but rather than have the cookie be the same exact value have the cookie be the encoded value of the string being signed by a certificate. This is still just as easy to validate on the server side, but is MUCH harder for an attacker to mimic. You should still use the samesite Cookie attribute and other protections outlined earlier in my post.