Nuts and bolts - part 1

Before starting with the book’s topic I want to explain how to set up an efficient environment and some good practice which can improve the code’s readability and quality. As somebody will notice these methods are completely opposite to the general code style trends. I’ll try to give the motivation for each rule. Anyway, in general because the SQL universe is a strange place this requires strange approach. In order to write and read effectively the SQL the coder should gain the capability to get a mental map between the query’s sections and the underlying logic. This can be facilitated using a clear and well defined formatting.

Code formatting

The SQL language is based on statements. Each statement is terminated by a specific character, by default ``;’’ . In order to understand the logic of the SQL statements is very important to get the statements formatted in the same way the database executes them. We’ll look first how to format the SELECT, then the DML and finally the DDL. Each of those queries needs a different structure because of the different logic. The common formatting rules are the following.

  • There is a tab separator of 7 spaces
  • All the keywords are in upper case
  • After a round bracket there is a carriage return and one tab indented
  • The opening and closing round brackets are indented at the same level 

SELECT

When processing a query the parser works backward. It starts from the innermost and lowest complete statements and moves upward in order to get the entire picture. There is one remarkable exception. The WITH statements are computed first because required in the rest of the query. Actually the WITH acts like a temporary table. In order to see immediately the way PostgreSQL processes the query the indention should follow the same logic.
Basic formatting rules.

  • After the word SELECT there is a carriage return
  • The columns are separated by a comma and a carriage return
  • The columns indent one tab from the respective SELECT
  • The word FROM indents at the same level of SELECT
  • The relation indent one tab from the word FROM
  • The word WHERE indents at the same level of SELECT and FROM
  • A single condition indents one tab from the WHERE
  • Multiple conditions indent two tabs from the where and the logical operators indent one tab from the where
  • Between the logical operators and the condition there is one tab
  • Group by and order by follow the same rules of the select list
  • The queries joined by UNION, INTERSECT, and EXCEPT indent one tab
  • The terminator character indents at the same level of the SELECT,FROM and WHERE

Let’s format a simple select from one relation and with one or more condition in the where section.

The first thing to do is to change all the keywords in uppercase.

We’ll then format the select list.

Finally  the from and the where.

Having multiple conditions requires an extra tab in order to keep aligned the logical operators and the conditions. 

Joins.

  • If using the implicit joins the relations are separated by a comma and a carriage return, and indented at the same level
  • When using explicit joins specify the kind of join [INNER,LEFT OUTER,CROSS etc.]
  • After the left hand relation there is a carriage return
  • The join clause indents at the same level of the left hand relation
  • After the words [INNER,LEFT OUTER,CROSS] JOIN follows the relation’s name and a carriage return
  • The word ON indents one tab from the JOIN and there is a carriage return
  • The join conditions follow the same rules of the WHERE condition
  • Nested joins indent one tab from the upper join
  • The usage of the aliases, with the optional word AS is strongly encouraged 

Query with implicit join 

Query with explicit inner join

Subqueries

  • All the subqueries will indent one tab from the surrounding brackets
  • If the subquery is short in can stay in line with the round brackets (e.g. SELECT now())

Query with subselect in the SELECT’s list

Query with subselect in the FROM’s list