Data Science and ML

How to Write Basic SQL Queries in BigQuery

How to Write Basic SQL Queries in BigQuery
Image by Editor | Ideogram

 

This tutorial introduces the basics of SQL querying with Google BigQuery. While very similar, BigQuery SQL has some syntax differences with standard SQL, some of which will be highlighted along the post. For those familiar with SQL, adapting to BigQuery should be pretty straightforward.

Throughout examples, we will explore basic SELECT-FROM-WHERE queries and discover how to sort retrieved results in BigQuery. The examples shown below revolve around three tables loaded as illustrated in the previous article in this BigQuery series: ‘chinese_cuisine’, ‘japanese_cuisine’, and ‘thai_cuisine’, describing popular recipes from different asian cuisines. These three tables have identical schema, consisting of five columns:

  • name: string
  • description: string
  • type: string
  • preparation_time_mins: string
  • difficulty: string

 

Basic SELECT Statements

 

One of the simplest queries in BigQuery is the default one shown when opening a new query tab. For instance, when opening a new query tab for the ‘chinese_cuisine’ table, the default query is:

SELECT FROM `bigquery-project-432209.asian_cuisines.chinese_cuisine` LIMIT 1000

 

Executing this query by clicking the “RUN” button will throw an error. The reason is that it misses an important element: the column(s) of the table to be retrieved. In the simplest case that we want to retrieve all columns, then we only need to add an asterisk ‘*’ after the SELECT keyword.

SELECT * FROM `bigquery-project-432209.asian_cuisines.chinese_cuisine` LIMIT 1000

 

This query returns the entire table, or the first 1000 rows if it were larger (due to the ‘LIMIT 1000’ clause at the end).

name                 description                                                                                                                                               type          preparation_time_mins    difficulty
Egg Fried Rice       Fried rice with eggs, vegetables, and sometimes meat, seasoned with soy sauce.                                                                            Side Dish     20                       Low
Spring Rolls         Crispy rolls filled with vegetables, meat, or seafood, served with dipping sauce.                                                                         Appetizer     45                       Low
Sichuan Hotpot       A communal dish where diners cook various meats, vegetables, and tofu in a spicy and numbing broth at the table.                                          Main Course   90                       High
Mooncake             A traditional Chinese pastry typically eaten during the Mid-Autumn Festival, filled with lotus seed paste or red bean paste.                              Dessert       120                      High
Peking Duck          A famous Beijing dish featuring roasted duck with crispy skin, served with pancakes, hoisin sauce, and scallions.                                         Main Course   240                      High
Ma Po Tofu           A Sichuan dish made with soft tofu set in a spicy, oily, and salty chili-and-bean-based sauce, usually with minced meat.                                  Main Course   25                       Medium
Kung Pao Chicken     A spicy stir-fry dish made with chicken, peanuts, vegetables, and chili peppers.                                                                          Main Course   30                       Medium
Chow Mein            Stir-fried noodles with vegetables and a choice of meat, often seasoned with soy sauce.                                                                   Main Course   30                       Medium
Sweet and Sour Pork  A popular dish featuring deep-fried pork coated in a tangy, sweet, and sour sauce with bell peppers and pineapple.                                        Main Course   35                       Medium
Sesame Chicken       Deep-fried chicken pieces coated in a sweet and savory sauce, garnished with sesame seeds.                                                                Main Course   35                       Medium
Hot and Sour Soup    A flavorful soup with a tangy and spicy broth, typically made with mushrooms, tofu, bamboo shoots, and pork.                                              Soup          40                       Medium
Wonton Soup          A clear broth soup containing wontons (dumplings) filled with seasoned meat or seafood.                                                                   Soup          50                       Medium
Steamed Dumplings    Soft, steamed dumplings filled with meat, seafood, or vegetables, often served with a dipping sauce.                                                      Appetizer     50                       Medium
Dim Sum              A variety of bite-sized portions of food traditionally served in small steamer baskets or on small plates.                                                Appetizer     60                       Medium
Red Bean Buns        Steamed buns filled with sweet red bean paste, a popular dessert.                                                                                         Dessert       90                       Medium

 

Just like in standard SQL, retrieving one or more specific columns is as easy as specifying their names separated by commas in the SELECT clause.

SELECT name FROM `bigquery-project-432209.asian_cuisines.chinese_cuisine` LIMIT 1000

 

This query returns the names of chinese dishes in our table:

Egg Fried Rice
Spring Rolls
Sichuan Hotpot
Mooncake
Peking Duck
Ma Po Tofu
Kung Pao Chicken
Chow Mein
Sweet and Sour Pork
Sesame Chicken
Hot and Sour Soup
Wonton Soup
Steamed Dumplings
Dim Sum
Red Bean Buns

 

Meanwhile, this query retrieves a list of japanese dish names along with their preparation time in minutes, by selecting the two associated columns accordingly:

SELECT name, preparation_time_mins FROM `bigquery-project-432209.asian_cuisines.japanese_cuisine` LIMIT 1000

 

Output:

name                  preparation_time_mins
Miso Soup             15
Onigiri               20
Matcha Ice Cream      30
Sushi	              60
Sukiyaki	      40
Mochi	              90
Tempura	              45
Ramen	              60
Tonkatsu	      30
Takoyaki	      40

 

 

Filtering Data with the WHERE Clause

 

So far, the above examples retrieve all the data instances from the queried table without any filtering. If instead of retrieving all thai dishes from the ‘thai_cuisine’ table, we only were interested in retrieving those dishes labeled with a low difficulty level of preparation, how can we apply this condition in our query?

This is where the ‘WHERE’ clause comes into scene. A ‘WHERE’ clause contains a boolean condition that can be either true or false, and generally involves checking the values of one or more columns in the table. This condition acts as a filter in the query, returning only those rows (recipes) where the condition holds true.

For example, adding the following filter helps retrieve the name, dish type, and preparation time of thai dishes with low difficulty.

SELECT name, type, preparation_time_mins FROM `bigquery-project-432209.asian_cuisines.thai_cuisine` WHERE difficulty="Low"

 

Some bad news here: only two of the thai dishes in our table seem to be easy to prepare!

name                type        preparation_time_mins
Mango Sticky Rice   Dessert     20
Thai Iced Tea       Beverage    10

 

Filtering based on numerical data -for instance based on the time of preparation- is typically done via comparison operators like , =,

SELECT name, type, difficulty FROM `bigquery-project-432209.asian_cuisines.chinese_cuisine` WHERE preparation_time_mins 

The ORDER BY clause is useful to present query results in a certain order, e.g. alphabetically, in decreasing or increasing order of a numerical column values, etc.

This example sorts the names of all japanese dishes alphabetically and presents the name and description of each recipe in such alphabetical order:

name              description
Matcha Ice Cream  "A popular Japanese dessert made from green tea powder (matcha) mixed into creamy ice cream."
Miso Soup         "A traditional Japanese soup made from a stock called dashi, miso paste, tofu, and seaweed."
Mochi             "A traditional Japanese rice cake made from glutinous rice, often filled with sweet fillings like red bean paste or ice cream."
Onigiri           "Rice balls often wrapped in nori and filled with various ingredients like pickled plum, salmon, or tuna."
Ramen             "A noodle soup dish with a meat or fish-based broth, typically flavored with soy sauce or miso, and topped with sliced pork, nori, and egg."
Sukiyaki          "A hot pot dish of thinly sliced beef, vegetables, tofu, and noodles, simmered in a sweet and savory broth."
Sushi             "Vinegared rice served with a variety of ingredients such as raw fish, vegetables, and seaweed."
Takoyaki          "Ball-shaped snacks made from a batter filled with octopus, tempura scraps, pickled ginger, and green onion, topped with takoyaki sauce and bonito flakes."
Tempura           "Lightly battered and deep-fried vegetables, seafood, or meat, often served with a dipping sauce."
Tonkatsu          "A breaded and deep-fried pork cutlet, typically served with shredded cabbage and tonkatsu sauce."

Interestingly, few japanese dish names seem to start with the first letters in the roman alphabet!

The ASC and DESC options can be optionally appended to the SORT BY clause to specific an ascending or descending order, respectively. The default order is ascending or alphabetical (ASC), whereas DESC would display the results in inverse alphabetical order.

This last example applies the SORT BY clause to a numerical column, sorting japanese recipes of both medium and high difficulty in decreasing time of preparation.

A few things to note here. First, when using the WHERE and ORDER BY clauses jointly, ORDER BY normally comes last. Second, to apply a filter that checks for more than one possible categorical (string) value, we use the IN() operator that contains the values to verify in the filter, separated by commas.

In the next post of this series, we will delve into more advance queries, some of them involving multiple tables jointly.
 
 

Related Articles

Leave a Reply

Your email address will not be published. Required fields are marked *

Back to top button