Problems with DB abstraction

In a previous post< we have described in some detail various aspects of selecting nodes and terms ensuring that Drupal's node access system is respected. However, new and exciting examples continue to arrive at the CVS review doorstep.

Here's some more fopars we deal with on a regular basis...

  1. $sql = "SELECT nid FROM {node} WHERE type = '%s' AND title = '%s' LIMIT 1";
  2. $duplicate = db_result(db_query($sql, 'story', $item->title));  

Let's tackle the first obvious mistake here. We are selecting nodes from the {node} table. So straight away we should know that the SQL should be wrapped in db_rewrite_sql()<. Failing to do this is an access by-pass violation. Since this is a security issue this makes this application an automatic decline.

I suppose it is worth point out that since this seeks a duplication check the author may actually want to check against all nodes. If this is the case the code should have a comment stating that fact to a) tell me this and b) tell himself this in 6months when (s)he's forgotten (or the next maintainer who takes it over).

Additionally, the node's status flag isn't being checked so this statement will match unpublished story nodes. As noted above, this may well have been the intention of the author as this statement appears to be seeking to avoid a duplication. In that case, again, a comment should be provided indicating that the lack of a status check is intentional.

But there's more here than just that. Note how the SQL includes the use of the LIMIT statement? That's just not compatible across all potential database server implementations. Drupal gets around this issue with the db_query_range()< function. Not using this shows a lack of full understanding of Drupal's database abstraction layer.

Lastly, this is a pretty minor point but I'll note it anyway. The query is designed to select node types of "story". This is a module choice made by the author. The type of node here isn't passed in from user input. So there's really no need in this instance to use the '%s' token to place the fixed string within the SQL. Might as well just bury the variable directly into the SQL statement and save some processing. The token replace is only needed when variables are being placed into the SQL statement.

I personally would have used something similar to the following...

  1.   $sql = db_rewrite_sql("
  2.    SELECT COUNT(*)
  3.    FROM {node}
  4.    WHERE
  5.      status = 1 AND
  6.      type = 'story' AND
  7.      title = '%s'");
  8.   $is_dup = (int)db_result(db_query($sql, $item->title)) ? TRUE : FALSE;  

or, if I intentionally want to avoid any title clash this...

  1.   // We want to avoid all possible name clashes so here
  2.   // db_rewrite_sql() is not used and status is not checked
  3.   // so we test all node titles of type 'story'.
  4.   $sql = "
  5.    SELECT COUNT(*)
  6.    FROM {node}
  7.    WHERE
  8.      type = 'story' AND
  9.      title = '%s'";
  10.   $is_dup = (int)db_result(db_query($sql, $item->title)) ? TRUE : FALSE;