Google's SQL Pipe Syntax
ccs2199 - all
Overview
Researchers at Google have found a way to improve SQL, with the key word being improve. SQL is a complicated language to write, learn, and understand. It has been the foundation of many database systems for decades, and the many attempts to change it over its lifetime have had little success. Since its creation in the 1970s, researchers have tried to address the difficulty of SQL by creating other languages. These new languages, however, have not been widely adopted by users, who feel that while SQL is hard, staying with it is better than adopting a new language. That is where the Google researchers come in. They have adapted SQL not by creating a new language but by extending it. By adding a piped data flow syntax to the language, they have made SQL easier to learn and write and, by retaining the same base language, they have also retained SQL’s existing user base. More than that, Google's SQL pipe syntax maintains full backwards compatibility and interoperability with existing SQL.
Current Issues
The issues with SQL that the researchers wanted to address were its strict clause order, redundancy, reliance on subqueries, confusing data flow, and the overall difficulty of working with and learning the language. Clauses must be written in a fixed order, and anything that does not fit that order, such as filtering or aggregating again after an aggregation, forces the use of subqueries and repeated expressions. The language also does not map cleanly onto relational algebra, because the order in which SQL clauses are written does not match the order in which they are semantically evaluated. For example, SELECT is written first but is evaluated after FROM, WHERE, and GROUP BY.
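To make the subquery problem concrete, here is a minimal sketch in standard SQL, run through Python's built-in sqlite3 module. The `orders` table and its rows are made up for illustration and do not come from the paper or from Project 1. The question "how many customers placed each number of orders?" needs two aggregations, which a single SELECT ... GROUP BY block cannot express, so a subquery is required.

```python
import sqlite3

# Hypothetical data: an in-memory table of orders, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES
        ('ann', 10), ('ann', 20), ('bob', 5),
        ('cat', 7), ('cat', 8), ('dan', 1);
""")

# Standard SQL: the first aggregation has to be wrapped in a subquery
# so that its result can be aggregated a second time.
# Note that the query is read top to bottom, but evaluation starts
# at the innermost FROM and works outward.
query = """
    SELECT num_orders, COUNT(*) AS num_customers
    FROM (
        SELECT customer, COUNT(*) AS num_orders
        FROM orders
        GROUP BY customer
    ) AS per_customer
    GROUP BY num_orders
    ORDER BY num_orders
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Reading the query from the top, the outer SELECT appears first even though it is the last thing evaluated, which is the mismatch between syntax and data flow described above.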
How does the technology solve the problem?
Pipe syntax allows operations to be composed arbitrarily and in any order. Whereas the syntactic clause order in standard SQL does not match the semantic evaluation order, in the extended version each pipe operator is applied in the order it is written, so the syntax matches the evaluation. It also reduces redundancy and the need for subqueries through several new pipe operators, including an AGGREGATE operator. By giving queries a logical top-to-bottom flow and eliminating the need for extra subqueries to work around these issues, the researchers have made SQL a much easier language to learn and write.
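The idea that each pipe operator takes a table in and hands a table on can be sketched in plain Python. This is only a stand-in for the data flow, not GoogleSQL itself: the helper names (`where`, `aggregate_count`, `order_by`, `pipe`) and the `orders` rows are hypothetical, and the pipe-syntax query in the comment is the query the Python steps are meant to mirror.

```python
from collections import defaultdict
from functools import reduce

def where(pred):
    """Keep only the rows that satisfy pred, like |> WHERE."""
    return lambda rows: [r for r in rows if pred(r)]

def aggregate_count(key, out):
    """Count rows per value of key, like |> AGGREGATE COUNT(*) AS out GROUP BY key."""
    def step(rows):
        counts = defaultdict(int)
        for r in rows:
            counts[r[key]] += 1
        return [{key: k, out: n} for k, n in counts.items()]
    return step

def order_by(key):
    """Sort rows by key, like |> ORDER BY."""
    return lambda rows: sorted(rows, key=lambda r: r[key])

def pipe(rows, *steps):
    """Apply each step to the output of the previous one, in written order."""
    return reduce(lambda acc, step: step(acc), steps, rows)

# Hypothetical input table, invented for this sketch.
orders = [
    {"customer": "ann", "amount": 10}, {"customer": "ann", "amount": 20},
    {"customer": "bob", "amount": 5},
    {"customer": "cat", "amount": 7}, {"customer": "cat", "amount": 8},
    {"customer": "dan", "amount": 1},
]

# Pipe-syntax query that these steps mirror:
#   FROM orders
#   |> WHERE amount > 1
#   |> AGGREGATE COUNT(*) AS num_orders GROUP BY customer
#   |> AGGREGATE COUNT(*) AS num_customers GROUP BY num_orders
#   |> ORDER BY num_orders
result = pipe(
    orders,
    where(lambda r: r["amount"] > 1),
    aggregate_count("customer", "num_orders"),
    aggregate_count("num_orders", "num_customers"),
    order_by("num_orders"),
)
print(result)
```

The second aggregation simply follows the first as another step, with no subquery, and the steps could be reordered or repeated freely because each one only depends on the table it receives.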

Other Technology & Pros/Cons
PRQL is another technology that bears similarities to SQL and Google’s pipe syntax. PRQL is a separate front-end language whose queries are compiled to SQL. Because it is a new language that translates to SQL rather than extending it, users still have to learn it from scratch. Google's approach is different in that it is not a new language; it extends SQL and fixes its issues from within. The main pro is therefore that Google's version retains the same base language, so SQL users do not need to learn a new language. A con, though, is that pipe syntax is not always easy to read.
Connection to COMSW4111
Looking at what we did in class with translating relational algebra to SQL, Google’s extended version will make that job much easier. For example, in standard SQL, the order of the relational algebra expression when read from right to left (starting at the innermost part of the expression) does not match its corresponding SQL code when read from top to bottom.
Using the pipe data flow syntax, reading the relational algebra expression right to left matches reading the corresponding SQL top to bottom, because the extended SQL reduces redundancy, corrects the order of data flow, and creates a 1:1 relationship between pipe operators and relational operations. For instance, in π_name(σ_cond(R)), the innermost R becomes FROM R, the selection becomes |> WHERE cond, and the outer projection becomes |> SELECT name.
For this example, I have chosen one of the queries written for Project 1. This example asks you to rewrite the following SQL query in Google’s SQL Pipe Syntax.
Google’s pipe syntax cannot be readily installed or set up at this time. That being said, if you wish to use it for a project, follow this link LINK and it will take you to a form that lets you enroll your project in the first preview of the syntax in BigQuery. The tutorial steps below demonstrate how the above SQL code can be written in pipe syntax, with clear and easy instructions for even the most basic SQL user.
Step 1: Write the SQL query’s relational algebra expression.
Step 2: Map the order of commands from the inside out.
Step 3: Begin with the FROM clause and, using the |> syntax at the start of each following line and the attached table of pipe operators, rewrite the SQL following the order of the mapped relational algebra (Step 2).
Image from "SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL" by Jeff Shute et al.
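The three steps can also be sketched mechanically. The example below does not use the Project 1 query; it assumes a made-up expression, π_name(σ_grade='A'(Students ⋈ Enrolled)), over hypothetical Students and Enrolled tables. The expression is written as nested Python objects (Step 1), walked from the inside out (Step 2), and each relational operation is emitted as one pipe operator line (Step 3). The class names and the `to_pipe` function are illustrative only.

```python
from dataclasses import dataclass
from typing import List, Union

# Step 1: a tiny representation of a relational algebra expression.
@dataclass
class Table:
    name: str

@dataclass
class Join:
    left: "Expr"
    right: str   # name of the table being joined in
    on: str      # join condition

@dataclass
class Select:    # relational selection (sigma)
    cond: str
    child: "Expr"

@dataclass
class Project:   # relational projection (pi)
    cols: List[str]
    child: "Expr"

Expr = Union[Table, Join, Select, Project]

def to_pipe(expr: Expr) -> List[str]:
    """Steps 2 and 3: emit the innermost operation first, then work outward."""
    if isinstance(expr, Table):
        return [f"FROM {expr.name}"]
    if isinstance(expr, Join):
        return to_pipe(expr.left) + [f"|> JOIN {expr.right} ON {expr.on}"]
    if isinstance(expr, Select):
        return to_pipe(expr.child) + [f"|> WHERE {expr.cond}"]
    if isinstance(expr, Project):
        return to_pipe(expr.child) + [f"|> SELECT {', '.join(expr.cols)}"]
    raise TypeError(f"unsupported expression: {expr!r}")

# Hypothetical expression: project name from the 'A' rows of Students joined with Enrolled.
expr = Project(
    ["name"],
    Select(
        "grade = 'A'",
        Join(Table("Students"), "Enrolled", "Students.sid = Enrolled.sid"),
    ),
)

lines = to_pipe(expr)
print("\n".join(lines))
```

The printed query starts at FROM and adds one |> line per relational operation, so reading it top to bottom follows the same order as reading the nested expression from the inside out.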