28 Jan, 2015

Scala-Flavored Assortment of Play Injection Prevention Techniques, Part I: SQL

by Jack Mannino

Although many database access libraries are touted as injection-proof or secure by default, there are often plenty of exceptions in the fine print. Using the libraries in the “intended” way may magically remove the risk of injection attacks (unless there are flaws in the libraries themselves), but if you deviate from that way, you’re on your own. Often, developers revert back to plain old SQL simply because they can’t get their ORM to play nicely. Or, their generated code ends up being significantly slower and less efficient than churning out the SQL by hand. Or, they build apps with NoSQL databases and assume that NoSQL = No Problem.

In this three-part series, we’ll walk through the secure way to implement a few popular libraries for accessing databases and caches within Scala web applications. We’ll focus on libraries commonly used in Play web applications targeting SQL databases, MongoDB, Redis, Apache Hive, and Apache Cassandra.

If you’ve never heard of Injection or one of its most notorious variants (SQL Injection) before, start here first: https://www.owasp.org/index.php/SQL_Injection.

Slick

Slick allows you to interact with stored data as if you were working with Scala collections. It has good support for MySQL and PostgreSQL and also provides you with type safety. When combined with Lifted Embedding, Slick gives some protection against SQL Injection. However, it also allows you to drop back to plain SQL if you need to, which introduces the risk of SQL Injection. Let’s take a look at both approaches.

Lifted Embedding

With Lifted Embedding, Slick takes your collections and converts or “lifts” them into Rep objects. The original type is preserved and passed into the constructor to Rep, allowing for type safety and clean separation between code and data when communicating with the database. The example below demonstrates how Slick views your Double and String values after lifting occurs.

Upon lifting the variables, a Double becomes Rep[Double] and a String becomes Rep[String]. To take a deeper look into how Slick achieves some of this magic behind the scenes, take a dive into the scala.slick.lifted package within the source code:  https://github.com/slick/slick/tree/master/src/main/scala/scala/slick/lifted

Direct Embedding is another option, although at this time it’s only supported as an experimental feature (that you probably shouldn’t use in production). Direct Embedding takes a slightly different approach; it uses Scala Macros. To see an example of Direct Embedding in action, here is an Activator Template.

Plain SQL

There are two primary ways to satisfy your craving for plain old SQL using Slick. The first is via the sql”” and sqlu”” string interpolation prefixes. Parameterizing your queries using this approach is fairly simple. The default method is to reference a variable with a $ in front of it, like $waffles:

The unsafe way to do it is to use #$ to directly insert your literal into the SQL. While you may have noble ambitions, use this functionality cautiously as this kind of statement is vulnerable to injection:

The other way to use plain SQL is by utilizing the StaticQuery class. Thankfully, there are built-in options to make binding your variables straightforward. The example below shows the email value being substituted with a ? within the SQL string:

So that’s the right way to do it. Now let’s see it done the wrong way:

The email parameter’s value is concatenated into the rest of the SQL statement and, as a result, gets treated as code when it executes in the database. The attacker-controlled email value can be used to run arbitrary commands or modify queries in unexpected ways.

There’s also the concatenation operator +?, which automatically binds any variables concatenated. It’s a handy little piece of syntactic sugar, though it tends to result in the occasional omission of the ?, which won’t set off any warnings at compile time:

Under the hood, you can see exactly what Slick does when you provide the +? operator. The dynamic value is replaced by a bound variable (?) and placed within a set of parameters that are passed along to the database.

Slick also allows you to do other interesting things, such as generate your database (less boilerplate) and customize and override many behaviors. We won’t be touching on these in this post, though.

Anorm

Anorm is a data access layer that uses plain SQL for interacting with databases. In comparison to Slick, Anorm doesn’t try to abstract away as many details. There is no lifted embedding and there are no code generators. Instead, Anorm’s goal is to make working with JDBC less tedious. Anorm’s approach to SQL is very different from Slick, and Anorm’s documentation helps make this point:

You don’t need another DSL to access relational databases

SQL is already the best DSL for accessing relational databases. We don’t need to invent something new. Moreover the SQL syntax and features can differ from one database vendor to another.From a security perspective, Anorm makes it clear that you’re working with SQL. Anorm offers straightforward syntax to ensure that your values are properly parameterized as they are inserted into your statements. In this example, the email parameter is properly handled by enclosing it within brackets {} and providing the value within the .on method:

To implement the same query unsafely, just remove the brackets around your variable and concatenate the string directly into the statement. Then close your eyes and hope for the best:

Additionally, for Scala 2.10 and up, Anorm supports another pattern that allows you to easily parameterize your queries. By prefixing your variable names with $, you can safely perform interpolation:

Please come back soon for the next post in this series, which will focus on MongoDB using ReactiveMongo and tricks for safely using Redis within your Play applications.