Imagine a game of Simon Says. It's like any other game of Simon Says, but there's one special rule: at the end of each round, one of the players gets to comment on their performance by saying a few words. The game might play out like this:
"Simon says touch your toes." "That was easy!" "Simon says spin around in a circle." "Whoa, now I'm dizzy." "Clap your hands...you're out! Simon didn't say clap your hands." "Shucks."
But Simon Says is a game based on doing what words say. What if the players used command words in their replies?
"Simon says touch your toes." "That was easy!" "Simon says spin around in a circle." "SIMON SAYS..." "Clap your hands...you're out! Simon didn't say clap your hands." "Really? Because it sounds EXACTLY like he did..."
SQL injection works like this. The "player" is the user, and their input can affect the outcome of any "game" (database query) where it's allowed to be treated the same as the words spoken by Simon (the programmer). Of course, while the consequences of cheating in a game are trivial, the consequences of injecting SQL statements into user input can be devastating. From theft to destruction of data, bad things can happen when SQL is not properly sanitized against injections. When writing code that uses SQL to access databases, there are two rules to follow: 1) Never trust user input, and 2) Always use prepared statements.
In Ruby, we can create methods to take arguments we then pass into an SQL query, like this:
But this technique is exactly what not to do, because it breaks both of the cardinal rules. By allowing name to be any string, it ignores the chances of a malicious user adding a semicolon and an additional query, or adding a boolean to affect how the database returns the results of the query (for more examples, check out this great resource.) It also lets the code pass right into an SQL query without preparing the query first. We don't want that. So let's fix it up a bit:
So what's different here? Well, in the first example, the execution happened on everything at once - the query code and the inserted argument were all processed at the same time. But here, the query statement is prepared separately from the argument, so the program knows the exact scope and intention of the query before it even looks at the string passed in as the argument. This statement preparation (also known as query parameterization) is the most secure way to prevent SQL injection. Notice that the change isn't even that complicated. If there was more than one argument being passed, there would be one question mark for each variable, and the variables passed into the execute statement would have to be written in order, like so:
"Simon says touch your toes, and whatever you say next can only be treated as a reply and never as the start of a new round." "SIMON SAYS..." "Nice try, but I was prepared for you to try something like that, and it's only going to count as a reply to the last round." "Whoa, you're good at this security stuff." "Simon says you can be too, if you use prepared statements."
Since it's not a huge hassle to prevent against SQL injection, and since the consequences of unsterilized SQL can be so dire, there's really no excuse not to use techniques like this to keep your programs nice and secure. Try it out for yourself. Happy coding!