100% found this document useful (1 vote)

821 views64 pages

Big Query

Uploaded by

Susan F

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

100% found this document useful (1 vote)

821 views64 pages

Big Query

Uploaded by

Susan F

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 64

Glossary terms from module 3

Terms and definitions for Course 3, Module 3

Administrative metadata: Metadata that indicates the technical source of a digital asset
CSV (comma-separated values) file: A delimited text file that uses a comma to separate
values
Data governance: A process for ensuring the formal management of a company’s data assets
Descriptive metadata: Metadata that describes a piece of data and can be used to identify it
at a later point in time
Foreign key: A field within a database table that is a primary key in another table (Refer to
primary key)
FROM: The section of a query that indicates where the selected data comes from
Geolocation: The geographical location of a person or device by means of digital information
Metadata: Data about data
Metadata repository: A database created to store metadata
Naming conventions: Consistent guidelines that describe the content, creation date, and
version of a file in its name
Normalized database: A database in which only related data is stored in each table
Notebook: An interactive, editable programming environment for creating data reports and
showcasing data skills
Primary key: An identifier in a database that references a column in which each value is
unique (Refer to foreign key)
Redundancy: When the same piece of data is stored in two or more places
Schema: A way of describing how something, such as data, is organized
SELECT: The section of a query that indicates the subset of a dataset
Structural metadata: Metadata that indicates how a piece of data is organized and whether
it is part of one or more than one data collection
WHERE: The section of a query that specifies criteria that the requested data must meet
World Health Organization: An organization whose primary role is to direct and coordinate
international health within the United Nations system
Get started with BigQuery
BigQuery is a data warehouse on the Google Cloud Platform used to query and filter large datasets,
aggregate results, and perform complex operations. Throughout this program, you’re going to use
BigQuery to practice your SQL skills and collect, prepare, and analyze data. At this point, you have set
up your own account. Now, explore some of the important elements of the SQL workspace. This will
prepare you for the upcoming activities in which you will use BigQuery. Note that BigQuery updates
its interface frequently, so your console might be slightly different from what is described in this
reading. That’s okay; use your troubleshooting skills to find what you need!

When you log in to BigQuery using the landing page, you will automatically open your project space.
This is a high-level overview of your project, including the project information and the current
resources being used. From here, you can check your recent activity.

Navigate to your project’s BigQuery Studio by selecting BigQuery from the navigation menu and
BigQuery Studio from the dropdown menu.
BiqQuery Studio components

Once you have navigated to BigQuery from the project space, most of the major components of the
BigQuery console will be present: the Navigation pane, the Explorer pane, and the SQL Workspace.

The Navigation pane

On the console page, find the Navigation pane. This is how you navigate from the project space to
the BigQuery tool. This menu also contains a list of other Google Cloud Project (GCP) data tools.
During this program, you will focus on BigQuery, but it’s useful to understand that the GCP has a
collection of connected tools data professionals use every day.

The Explorer pane

The Explorer pane lists your current projects and any starred projects you have added to your
console. It’s also where you’ll find the + ADD button, which you can use to add datasets.
This button opens the Add dialog that allows you to open or import a variety of datasets.
Add Public Datasets

BigQuery offers a variety of public datasets from the Google Cloud Public Dataset Program. Scroll
down the Add dialog to the Public Datasets option.

Select Public Datasets. This takes you to the Public Datasets Marketplace, where you can search for
and select public datasets to add to your BigQuery console. For example, search for the "noaa
lightning" dataset in the Marketplace search bar. When you search for this dataset, you will find
NOAA’s Cloud-to-Ground Lightning Strikes data.

Select the dataset to read its description. Select View dataset to create a tab of the dataset’s
information within the SQL workspace.
The Explorer Pane lists the noaa_lightning and other public datasets.

Star and examine Public Datasets

You added the public noaa_lightning dataset to your BigQuery Workspace, so the Explorer pane
displays the noaa_lightning dataset, along with the list of other public datasets. These datasets are
nested under bigquery-public-data. Star bigquery-public-data by navigating to the top of the Explorer
pane and selecting the star next to bigquery-public-data.
Starring bigquery-public-data will enable you to search for and add public datasets by scrolling in the
Explorer pane or by searching for them in the Explorer search bar.

For example, you might want to select a different public dataset. If you select the second dataset,
"austin_311," it will expand to list the table stored in it, “311_service_requests.”
The Explorer pane with the “bigquery-public data” and “austin_311” datasets expanded, revealing
the “311_service_requests” table

When you select a table, its information is displayed in the SQL Workspace. Select the
311_service_requests table to examine several tabs that describe it, including:

 Schema, which displays the column names in the dataset

 Details, which contains additional metadata, such as the creation date of the dataset

 Preview, which shows the first rows from the dataset

Additionally, you can select the Query button from the menu bar in the SQL Workspace to query this
table.

The SQL Workspace

The final menu pane in your console is the SQL Workspace. This is where you will actually write and
execute queries in BigQuery.

The SQL Workspace also gives you access to your personal and project history, which stores a record
of the queries you’ve run. This can be useful if you want to return to a query to run it again or use
part of it in another query.

Upload your data

In addition to offering access to public datasets, BigQuery also gives you the ability to upload your
own data directly into your workspace. Access this feature by opening the + ADD menu again or by
clicking the three vertical dots next to your project’s name in the Explorer pane. This will give you the
option to create your own dataset and upload your own tables. You will have the opportunity to
upload your own data in an upcoming activity to practice using this feature!

Key takeaways

BigQuery's SQL workspace allows you to search for public datasets, run SQL queries, and even upload
your own data for analysis. Whether you're working with public datasets, running SQL queries, or
uploading your own data, BigQuery’s SQL workspace offers a range of features to support all kinds of
data analysis tasks. Throughout this program, you will be using BigQuery to practice your SQL skills,
so being familiar with the major components of your BigQuery console will help you navigate it
effectively in the future!

Step-by-Step: BigQuery in action

This reading provides you with the steps the instructor performs in the following video, BigQuery in
action. The video focuses on how to create a query to view a small section of data from a large
dataset.

Keep this guide open as you watch the video. It can serve as a helpful reference if you need
additional context or clarification while following the video steps. This is not a graded activity, but
you can complete these steps to practice the skills demonstrated in the video.

What you'll need

To follow along with the examples in this video, log in to your BigQuery account and follow the
instructions to star bigquery-public-data in The Explorer pane section of the previous reading, Get
Started with BigQuery.

Example 1: Preview a section from a table viewer

A database is a collection of data stored in a computer system. Query languages such as SQL enable
communication between databases and data analysts. You discovered earlier that a relational
database is made up of several tables that may be joined together to create relationships. Primary
and foreign keys serve as representations of these relationships. To extract data from these tables,
data analysts use queries. To learn more about that, explore BigQuery in action:

1. Log in to BigQuery and go to your console. You should find the Welcome to your SQL
Workspace! landing page open. Select COMPOSE A NEW QUERY In the Bigquery console.
Make sure that no tabs are open so that the entire workspace is displayed, including the
Explorer pane.

2. Enter sunroof in the search bar. In the search results, expand sunroof_solar and then select
the solar_potential_by_postal_code dataset.

3. Observe the Schema tab of the Explorer pane to explore the table fields.

4. Select the Preview tab to view the regions, states, yearly sunlight, and more.
Example 2: Writing a query

In order to view the entire dataset, you will need to write a query.

1. The first step is finding out the complete, correct path to the table you want to work with.
Select the ellipses (three vertical dots) by the dataset solar_potential_by_postal_code, then
select Query. A new tab will populate on your screen. Select the tab. The path to the table
should be written inside two backticks.

2. Select the full path by highlighting the text including the backticks and copy it. (Note: You can
also get the full path to the project, database, and table directly by clicking the ellipses next
to the table's name in the Explorer panel on the left and selecting Copy ID.)

3. Now, click on the plus sign to create a new query. Notice that BigQuery doesn’t automatically
generate a SELECT statement in this window. Enter SELECT and add a space after it.

4. Put an asterisk * after SELECT to indicate you want to return the entire dataset. The asterisk
lets the database know to include all columns. Without this shortcut, you would have to
manually enter every column name!

5. Next, press the Enter/Return key and enter FROM on the second line. FROM indicates where
the data is coming from. After FROM, add another space.

6. Paste in the path to the table that you copied earlier. It will read `bigquery-public-
data.sunroof_solar.solar_potential_by_postal_code`

7. Execute the query by selecting the RUN button.

Important!

Many of the public databases on BigQuery are living records and, as such, are periodically updated
with new data. Throughout this course (and others in this certificate program), if your results differ
from those you encounter in videos or screenshots, there's a good chance it is due to a data refresh.
You can verify when a table has been refreshed by selecting it from the Explorer panel and clicking
Details. You'll find the date the table was created, when it was last modified, as well as other useful
information.

Example 3: Use SQL to view a piece of data

If the project doesn’t require every field to be completed, you can use SQL to see a particular piece,
or pieces, of data. To do this, specify a certain column name in the query.

1. For example, you might only need data from Pennsylvania. You’d begin your query the same
way you just did in the previous examples: Click on the plus sign, enter SELECT, add a space,
an asterisk (*), and then press Enter/Return.

2. Enter FROM and then paste `bigquery-public-

data.sunroof_solar.solar_potential_by_postal_code`. Press Enter/Return.

3. This time, add WHERE. It will be on the same line as the FROM statement. Add a space and
enter state_name with a space before state and a space after name. state_name is a column
name in the table.

4. Because you only want data from Pennsylvania, add = and 'Pennsylvania' on the same line as
state_name. In SQL, single quotes represent the beginning and ending of a string.

5. Execute the query with the RUN button.

6. Review the data on solar potential for Pennsylvania. Scroll through the query results.

Keep in mind that SQL queries can be written in a lot of different ways and still return the same
results. You might discover other ways to write these queries!

Think of a database like a well-organized warehouse. In this warehouse, data is stored in tables,
which are like different sections for storing specific types of items. For example, one table might
contain information about customers, while another table might store data about their orders.

Now, imagine you want to find information about a specific customer and their order history. You
wouldn't want to search through the entire warehouse, right? That's where SQL queries come in.

A SQL query is like a request you send to the database, asking it to retrieve specific information from
those tables. It's like giving the warehouse manager a detailed list of what you need, and they fetch it
for you.

Here's a breakdown of how a SQL query works:

 SELECT: This part of the query specifies what information you want to retrieve. It's like
saying, "I want to see the customer's name and order date."

 FROM: This tells the database where to find the information. You specify the table or tables
containing the data you need. It's like saying, "Get this information from the 'Customers' and
'Orders' tables."

 WHERE: This is where you get specific. The "WHERE" clause lets you filter the data based on
certain conditions. For example, you could say, "Show me only the orders placed by
customers living in California."

So, a simple SQL query might look like this:

SELECT customer_name, order_date

FROM Customers, Orders

WHERE customer_state = 'California';

This query would instruct the database to retrieve the customer name and order date from the
"Customers" and "Orders" tables, but only for customers located in California.

SQL queries can be much more complex than this, allowing you to perform calculations, sort data,
group information, and much more. The key is understanding the basic building blocks and how they
work together to help you extract meaningful insights from your data.

Keep practicing writing queries, and you'll become more comfortable with the language and its
capabilities. Remember, every data analyst starts somewhere, and with persistence, you'll master
these skills in no time!

You have recently been introduced to BigQuery, a data warehouse on Google Cloud that data
analysts use to query, filter large datasets, aggregate results, and perform complex operations. In this
activity, you will explore the BigQuery interface; upload public data to your console; and write some
simple SQL queries using SELECT, FROM, and WHERE.

By the time you complete this activity, you will be more familiar with writing queries in the BigQuery
interface. This will enable you to practice SQL, which is important for working with databases in your
career as a data analyst.

Step-By-Step Instructions

Follow the instructions to complete each step of the activity. Then answer the questions at the end
of the activity before going to the next course item.

Step 1: Get a BigQuery account

For this activity, you will need a BigQuery account. If you haven’t made one already, follow the
instructions from the Using BigQuery reading.

Once you have your account, start exploring!

Step 2: Open your BigQuery console

1. Log in to BigQuery.

2. Select the Go to console button on the BigQuery homepage. This will open a new tab with your
console.
3. Take a moment to explore your console. The Explorer menu includes a search bar you can use to
find resources, pinned projects, and the + ADD button for adding data. The Editor welcome page is
where you will navigate to a query editor, try sample data, add local data files, add Google cloud
storage, or add other external connections. You can also find your job history, query history, and
saved queries here.

Step 3: Access public data in BigQuery

In order to start writing queries, you will need some data to work with. Once you’re familiar with the
BigQuery interface, you can access a public dataset directly from your console.
1. Select the search bar in the Explorer pane.

2. Enter “london bicycle” in the search box and press enter; this will return the london_bicycles
database from the Greater London Authority. Select the database for more details. If you cannot find
it, make sure you're searching in all projects. The london_bicycles database is in the bigquery-public-
data project.

3. Select the arrow to the left of the london_bicycles database name. This expands the dataset to
reveal two table names: cycle_hire and cycle_stations. Select the cycle_hire table name within the
Explorer pane.
Screenshot of BigQuery. The titles cycle_hire and cycle_stations are shown beneath london_bicycles
in the explorer pane. Cycle_hire is selected.

This will pull the cycle_hire schema into the console. Take a moment to explore the field names and
the associated information.

4. Now, select the PREVIEW tab to find a sample of the data that you’ll be working with.

Once you have finished previewing the data, write a query!

Step 4: Review basic parts of a query

So far, you’ve learned three basic parts of a query: SELECT, FROM, and WHERE. As a refresher:

 SELECT is the section of a query that indicates what data you want SQL to return to you.

 FROM is the section of a query that indicates which table the desired data comes from. You
must provide a full path to the table. The path includes the project name, database name,
and table name, each separated by a period.

 WHERE is the section of a query that indicates any filters you’d like to apply to your table.

Step 5: Write a basic query

Now, construct a simple command using the basic parts of a query you have already learned! For
example, you can select a specific column from the cycle_hire table, such as the end_station_name
column.

1. Select the Blue + button or QUERY - In new tab to start a new query.

2. Start your query with a SELECT clause, and indicate which column you want to select from the
table; in this case, you’ll input end_station_name.

3. After you have indicated which column you are selecting, write your FROM clause. Specify the
table you are querying from by inputting the following location: `bigquery-public-
data.london_bicycles.cycle_hire`;

The completed query should appear like this:

SELECT

end_station_name

FROM

`bigquery-public-data.london_bicycles.cycle_hire`;

4. Run your completed query by selecting the blue RUN button.

This query may take a few seconds to execute. Once it has finished, you will find the list of station
names you requested under the Query Results console pane.

Step 6: Write a query to answer a question

After running the first basic query, try answering a specific question about the data. For example,
how many bike trips lasted for 20 minutes or longer?

1. Select the Blue + button or QUERY - In new tab to start a new query. Start with your SELECT
statement again. This time, include the two columns duration and start_station_name in the query.
The data in these columns will tell where the trip started and the duration of the trip. Be sure to
separate each column name with a comma.

2. Next, add your FROM statement. You will be using the same table as the previous query: FROM
`bigquery-public-data.london_bicycles.cycle_hire`;. Note: The backticks around the table in this line
of code are optional.

3. Finally, add a WHERE statement to specify that you want to filter for only bike rides 20 minutes or
longer. If you check the preview of this data, you might notice that the duration is recorded in
seconds, so you’ll specify 1200 seconds in your query. Write that as WHERE duration >= 1200;

Your completed query should be written like this:

1
2

SELECT

duration,

start_station_name

FROM

`bigquery-public-data.london_bicycles.cycle_hire`

WHERE

duration >= 1200;

4. Run your completed query by clicking the RUN button.

This query may take a few seconds to execute. Once it has finished, you will find a list of rides from
this table that fit your criteria. There are millions of rows with bike trips that are 20 minutes or
longer!

Optional Step 7: Up for a challenge?

If you’re comfortable using queries to answer questions, try creating and running queries to
complete the tasks below:

 What is the name of the station whose start_station_id is 111?

 Return all the rental_ids, station IDs, and station names that bike_id 1710 started from.

 What is the bike_model of bike_id 58782?

Step 8: Check your work

Use the solutions doc to check your work: Intro to BigQuery Solutions

Or download the file directly here:

Intro to BigQuery solutions

DOCX File

Question 1

Reflection

Run another query on your table:

SELECT

end_station_name

FROM

`bigquery-public-data.london_bicycles.cycle_hire`

WHERE

rental_id = 57635395;

At what station did the bike trip with rental_id 57635395 end?

1 point

Notting Hill Gate Station, Notting Hill

Southwark Street, Bankside

East Village, Queen Elizabeth Olympic Park

Tower Gardens, Tower

Question 2

In this activity, you had an opportunity to get more familiar with BigQuery and writing SQL queries. In
the text box below, write 2-3 sentences (40-60 words) in response to each of the following questions:

 How do you think you can use public datasets in BigQuery to help develop your data analysis
skills?
 How do you think understanding basic query syntax will help you write more complex
queries in the future?

1 point

Intro to BigQuery solutions

These are the solutions to the Intro to BigQuery activity. You can refer to this document if you

are troubleshooting your own queries. You can also copy and paste these solutions to test them

in your own BigQuery console.

1. What is the name of the station whose start_station_id is 111?

SELECT

start_station_name

FROM

`bigquery-public-data.london_bicycles.cycle_hire`

WHERE

start_station_id = 111;

2. Return all of the rental_ids, station IDs, and station names that bike_id 1710 started

from.

SELECT

rental_id,

start_station_id,

start_station_name

FROM

`bigquery-public-data.london_bicycles.cycle_hire`

WHERE

bike_id = 1710;

3. What is the bike_model of bike_id 58782?

SELECT

bike_model

FROM
`bigquery-public-data.london_bicycles.cycle_hire`

WHERE

bike_id = 58782;

Capitalization and case sensitivity

With SQL, capitalization usually doesn’t matter. You could enter SELECT or select or SeLeCT. They all
work! But if you use capitalization as part of a consistent style your queries will look more
professional.

To enter SQL queries like a pro, it is always a good idea to use all caps for clause starters (e.g. SELECT,
FROM, WHERE, etc.). Functions should also be in all caps (e.g. SUM()). Column names should be all
lowercase (refer to the section on snake_case later in this guide). Table names should be in
CamelCase (refer to the section on CamelCase later in this guide). This helps keep your queries
consistent and easier to read while not impacting the data that will be pulled when you run them.
The only time that capitalization does matter is when it is inside quotes (more on quotes below).

Vendors of SQL databases may use slightly different variations of SQL. These variations are called SQL
dialects. Some SQL dialects are case sensitive. BigQuery is one of them. Vertica is another. But most,
like MySQL, PostgreSQL, and SQL Server, aren’t case sensitive. This means if you searched for
country_code = ‘us’, it will return all entries that have 'us', 'uS', 'Us', and 'US'. This isn’t the case with
BigQuery. BigQuery is case sensitive, so that same search would only return entries where the
country_code is exactly 'us'. If the country_code is 'US', BigQuery wouldn’t return those entries as
part of your result.

Single or double quotes: '' or " "

For the most part, it also doesn’t matter if you use single quotes ' ' or double quotes " " when
referring to strings. For example, SELECT is a clause starter. If you put SELECT in quotes like 'SELECT'
or "SELECT", then SQL will treat it as a text string. Your query will return an error because your query
needs a SELECT clause.

But there are two situations where it does matter what kind of quotes you use:

1. When you want strings to be identifiable in any SQL dialect

2. When your string contains an apostrophe or quotation marks

Within each SQL dialect there are rules for what is accepted and what isn’t. But a general rule across
almost all SQL dialects is to use single quotes for strings. This helps get rid of a lot of confusion. So if
we want to reference the country US in a WHERE clause (e.g. country_code = 'US'), then use single
quotes around the string 'US'.

The second situation is when your string has quotes inside it. Suppose you have a column
favorite_food in a table called FavoriteFoods and the other column corresponds to each friend.

friend favorite_food

Rachel DeSantos Shepherd’s pie

Sujin Lee Tacos

friend favorite_food

Najil Okoro Spanish paella

You might notice how Rachel’s favorite food contains an apostrophe. If you were to use single quotes
in a WHERE clause to find the friend who has this favorite food, it would look like this:

SELECT

friend

FROM

FavoriteFoods

WHERE

favorite_food = 'Shepherd's pie'

This won’t work. If you run this query, you will get an error in return. This is because SQL recognizes
a text string as something that starts with a quote ' and ends with another quote '. So in the bad
query above, SQL thinks that the favorite_food you are looking for is 'Shepherd', because the
apostrophe in Shepherd's ends the string.

Generally speaking, this should be the only time you would use double quotes instead of single
quotes. So your query would look like this instead:

SELECT

friend

FROM

FavoriteFoods
WHERE

favorite_food = "Shepherd's pie"

SQL understands text strings as either starting with a single quote ' or double quote ". Since this
string starts with double quotes, SQL will expect another double quote to signal the end of the string.
This keeps the apostrophe safe, so it will return "Shepherd's pie" and not 'Shepherd'.

Comments as reminders

As you get more comfortable with SQL, you will be able to read and understand queries at a glance.
But it never hurts to have comments in the query to remind yourself of what you are trying to do.
And if you share your query, it also helps others understand it.

For example:

--This is an important query used later to join with the accounts table

SELECT

rowkey, -key used to join with account_id

Info.date, -date is in spring format YYYY-MM-DD HH:MM:SS

Info.code -e.g. 'pub-###'

FROM

Publishers

You can use # in place of the two dashes, --, in the above query but keep in mind that # isn’t
recognized in all SQL dialects (MySQL doesn’t recognize #). So it is best to use -- and be consistent
with it. When you add a comment to a query using --, the database query engine will ignore
everything in the same line after --. It will continue to process the query starting on the next line.

snake_case names for columns

It is important to always make sure that the output of your query has easy-to-understand names. If
you create a new column (say from a calculation or from concatenating new fields), the new column
will receive a generic default name (e.g. f0). For example:

2
3

SELECT

SUM(tickets),

COUNT(tickets),

SUM(tickets) AS total_tickets,

COUNT(tickets) AS number_of_purchases

FROM

Purchases

Results are:

f0 f1 total_tickets number_of_purchases

8 4 8 4

The first two columns are named f0 and f1 because they weren’t named in the above query. SQL
defaults to f0, f1, f2, f3, and so on. We named the last two columns total_tickets and
number_of_purchases so these column names show up in the query results. This is why it is always
good to give your columns useful names, especially when using functions. After running your query,
you want to be able to quickly understand your results, like the last two columns we described in the
example.

On top of that, you might notice how the column names have an underscore between the words.
Names should never have spaces in them. If total_tickets had a space and looked like total tickets
then SQL would throw a syntax error because it wouldn't know what to do with the second word
(tickets). So, spaces are bad in SQL names. Never use spaces.

The best practice is to use snake_case. This means that 'total tickets', which has a space between the
two words, should be entered as total_tickets with an underscore instead of a space.

CamelCase names for tables

You can also use CamelCase capitalization when naming your table. CamelCase capitalization means
that you capitalize the start of each word, like a two-humped (Bactrian) camel. So the table
TicketsByOccasion uses CamelCase capitalization. Please note that the capitalization of the first word
in CamelCase is optional; camelCase is also used. Some people differentiate between the two styles
by calling CamelCase, PascalCase, and reserving camelCase for when the first word isn't capitalized,
like a one-humped (Dromedary) camel; for example, ticketsByOccasion.

At the end of the day, CamelCase is a style choice. There are other ways you can name your tables,
including:
 All lower or upper case, like ticketsbyoccasion or TICKETSBYOCCASION

 With snake_case, like tickets_by_occasion

Keep in mind, the option with all lowercase or uppercase letters can make it difficult to read your
table name, so it isn’t recommended for professional use.

The second option, snake_case, is technically okay. With words separated by underscores, your table
name is easy to read, but it can get very long because you are adding the underscores. It also takes
more time to enter. If you use this table a lot, it can become a chore.

In summary, it is up to you to use snake_case or CamelCase when creating table names. Just make
sure your table name is easy to read and consistent. Also be sure to find out if your company has a
preferred way of naming their tables. If they do, always go with their naming convention for
consistency.

Indentation

As a general rule, you want to keep the length of each line in a query <= 100 characters. This makes
your queries easy to read. For example, check out this query with a line with >100 characters:

SELECT

CASE WHEN genre = 'horror' THEN 'Will not watch' WHEN genre = 'documentary'

THEN 'Will watch alone' ELSE 'Watch with others' END AS

watch_category, COUNT(movie_title) AS number_of_movies

FROM

MovieTheater

GROUP BY

This query is hard to read and just as hard to troubleshoot or edit. Now, here is a query where we
stick to the <= 100 character rule:

2
3

SELECT

CASE

WHEN genre = 'horror' THEN 'Will not watch'

WHEN genre = 'documentary' THEN 'Will watch alone'

ELSE 'Watch with others'

END AS watch_category, COUNT(movie_title) AS number_of_movies

FROM

MovieTheater

GROUP BY

Now it is much easier to understand what you are trying to do in the SELECT clause. Sure, both
queries will run without a problem because indentation doesn’t matter in SQL. But proper
indentation is still important to keep lines short. And it will be valued by anyone reading your query,
including yourself!

Multi-line comments

If you make comments that take up multiple lines, you can use -- for each line. Or, if you have more
than two lines of comments, it might be cleaner and easier is to use /* to start the comment and */
to close the comment. For example, you can use the -- method like below:

7
-- Date: September 15, 2020

-- Analyst: Jazmin Cisneros

-- Goal: Count the number of rows in the table

SELECT

COUNT() number of rows -- the stands for all so count all

FROM

table

Or, you can use the /* */ method like below:

/* Date: September 15, 2020

Analyst: Jazmin Cisneros

Goal: Count the number of rows in the table

SELECT

COUNT() number of rows -- the stands for all so count all

FROM

table

In SQL, it doesn’t matter which method you use. SQL ignores comments regardless of what you use:
#, --, or /* and */. So it is up to you and your personal preference. The /* and */ method for multi-
line comments usually looks cleaner and helps separate the comments from the query. But there
isn’t one right or wrong method.

SQL text editors

When you join a company, you can expect each company to use their own SQL platform and SQL
dialect. The SQL platform they use (e.g. BigQuery, MySQL, or SQL Server) is where you will enter and
run your SQL queries. But keep in mind that not all SQL platforms provide native script editors to
enter SQL code. SQL text editors give you an interface where you can enter your SQL queries in an
easier and color-coded way. In fact, all of the code we have been working with so far was entered
with an SQL text editor!

Examples with Sublime Text

If your SQL platform doesn’t have color coding, you might want to think about using a text editor like
Sublime Text or Atom. This section shows how SQL is displayed in Sublime Text. Here is a query in
Sublime Text:

With Sublime Text, you can also do advanced editing like deleting indents across multiple lines at the
same time. For example, suppose your query somehow had indents in the wrong places and looked
like this:

This is really hard to read, so you will want to eliminate those indents and start over. In a regular SQL
platform, you would have to go into each line and press BACKSPACE to delete each indent per line.
But in Sublime, you can get rid of all the indents at the same time by selecting all lines and pressing
Command (or CTRL in Windows) + [. This eliminates indents from every line. Then you can select the
lines that you want to indent (i.e., lines 2, 4, and 6) by pressing the Command key (or the CTRL key in
Windows) and selecting those lines. Then while still holding down the Command key (or the CTRL key
in Windows), press ] to indent lines 2, 4, and 6 at the same time. This will clean up your query and
make it look like this instead:
Sublime Text also supports regular expressions. Regular expressions (or regex) can be used to search
for and replace string patterns in queries. We won’t cover regular expressions here, but you might
want to learn more about them on your own because they are a very powerful tool.

Skip to main content

Search in course

 English

 Menu pop-up Collapsed

Hide menu

 Work with databases

 Manage data with metadata

 Access different data sources

 Sort and filter data

 Large datasets in SQL

o
Video: VideoGet to know BigQuery, including sandbox and billing options

. Duration: 2 minutes2 min

Reading: ReadingSet up your BigQuery account

. Duration: 20 minutes20 min

Reading: ReadingGet started with BigQuery

. Duration: 20 minutes20 min

Reading: ReadingStep-by-Step: BigQuery in action

. Duration: 20 minutes20 min

Video: VideoBigQuery in action

. Duration: 6 minutes6 min

Practice Quiz: Hands-On Activity: Introduction to BigQuery

. Duration: 1 hour1h

Reading: ReadingIn-depth guide: SQL best practices

. Duration: 20 minutes20 min

Ungraded App Item: Ungraded App ItemHands-On Guided Project: Select a new city using BigQuery

. Duration: 25 minutes25 min

Practice Quiz: Hands-On Activity: Create a custom table in BigQuery

. Duration: 1 hour1h

Practice Quiz: Hands-On Activity: Choose the right tool for the job

. Duration: 1 hour1h

Practice Quiz: Hands-On Activity: More practice with SQL

. Duration: 1 hour1h

Practice Quiz: Test your knowledge on using SQL with large datasets

. Duration: 8 minutes8 min

 Module 3 challenge

2. Module 3

3. Hands-On Activity: Create a custom table in BigQuery

PreviousNext

Hands-On Activity: Create a custom table in BigQuery

Ready to review what you’ve learned before taking the quiz? I’m here to help.

Help me practiceLet's chat

Submit your assignment

Resume assignment

Receive grade

To Pass100% or higher

Your grade

-Not available

Like
Dislike

Report an issue

Hi, Dr!

How can I help?

Help me practice

Let's chat

Coach is powered by AI, so check for mistakes and don't share sensitive info. Your data will be used in
accordance with Coursera's Privacy Notice.

Ctrl+M

Back

Hands-On Activity: Create a custom table in BigQuery

Practice Quiz. • 1h. • 2 total points available.2 total points

English

DueAug 7, 11:59 PM IST

Activity Overview

Recently, you’ve been thinking about identifying good data sources that would be useful for analysis.
You also spent some time in a previous activity exploring a public dataset in BigQuery and writing
some basic SQL queries. In addition to using public data in BigQuery, your future data career will
involve importing data from other sources. In this activity, you will create a custom table and dataset
that you’ll load into a new table and query.

By the time you complete this activity, you will be able to load your own data into BigQuery for
analysis. This will enable you to import your own data sources into BigQuery, which is a skill that will
enable you to more effectively analyze data from different sources.

Step-By-Step Instructions

Follow the instructions to complete each step of the activity. Then answer the questions at the end
of the activity before going to the next course item.

Step 1: Access the data source

To get started, download the baby names data zip file. This file contains about 7 MB of data about
popular baby names from the U.S. Social Security Administration website.

Select the link to the baby names data zip file to download it.

Link to baby names data: names.zip

Step 2: Unzip the file

Unzip the file you downloaded onto your computer to access it on BigQuery. Once you have
unzipped the file, find a .pdf file titled NationalReadMe that contains more information about the
dataset. This dataset tracks the popularity of baby names for each year; you can find text files labeled
by the year they contain. Open yob2014.txt to preview the data. You will notice that it’s a .csv file
with three columns. Remember where you saved this folder so you can reference it later.

Step 3: Create a dataset

Before uploading your .txt file and creating a table to query, you will need to create a dataset to
upload your data into and store your tables.

1. From the BigQuery console, go to the Explorer pane in your workspace and select the three dots
next to your project to open a menu. From here, select Create dataset. Note that unless you have
already specified your own project name, a unique name is assigned to your project by BigQuery,
typically in the format of two words and a number, separated by hyphens (e.g. loyal-glass-371423 in
the image below).
2. This will open the Create dataset menu. This is where you will fill out information about the
dataset. Input the Dataset ID as babynames and set the Data location to Multi-region (US). Once
you have finished filling out this information, select the blue CREATE DATASET button at the bottom
of the menu.
Step 4: Create a table

Now that you have a custom database stored in your project space, this is where you will add your
table.

1. Select the newly created babynames database. Check the tabs in your Dataset info window and
select the first blue + CREATE TABLE button. This will open another menu in your console.
2. In the Source section, select the Upload option in Create table from. Then select Browse to open
your files. Find and open the yob2014.txt file. Set the file format to .csv. In the Destination section,
in the Table data box, name your table as names_2014. For Schema, select Edit as text and input the
following code:

name:string,

gender:string,

count:integer

3. This will establish the data types of the three columns in the table. Leave the other parameters as
they are, and select Create table.
Once you have created your table titled names_2014, it will appear in your explorer pane under the
database babynames that you created earlier.
4. Select the table to open it in your workspace. Here, you can check the table schema. Then, go to
the Preview tab to explore your data. The table should have three columns: name, gender, and
count.

Step 5: Query your custom table

Now that your table is set up, you’re ready to start writing queries and answering questions about
this data. For example, let’s say you were interested in the top five baby names for boys in the United
States in 2014.

Select COMPOSE NEW QUERY to start a new query for this table. Then, enter this code:

3
4

SELECT

name,

count

FROM

your-project.babynames.names_2014

WHERE

gender = 'M'

ORDER BY

count DESC

LIMIT

NOTE: Making sure that your FROM statement is correct is key to making this query run! The
database needs the query to tell it the location of the table you just uploaded so that it can fetch the
data. It’s like giving the query a map to your table. That map will include your unique BigQuery
project name (replace your-project in the code above with your unique project name), the database
name (babynames), and the table name (names_2014). The location names for each of these
elements are separated by periods. The final result will be something like this:

loyal-glass-371423.babynames.names_2014

Note that loyal-glass-371423 is just an example of a project name. You must use your project’s name
in your FROM statement.

This query selects the name and count columns from the names_2014 table. Using the WHERE
clause, you are filtering for a specific gender for your results. Then, you’re sorting how you want your
results to appear with ORDER BY. Because you are ordering by the count in descending order, you
will get names and the corresponding count from largest to smallest. Finally, LIMIT tells SQL to only
return the top five most popular names and their counts.

Once you have input this in your console, select RUN to get your query results.
Optional Step 6: Up for a challenge?

If you are comfortable creating your own custom tables, try uploading more files from the baby
names dataset into tables you can query. For example, you could upload each of the files from 2015
to 2019 to find the top baby names for those years.

Question 1

Reflection

After running the query on your new table, what was the third-most popular baby name for boys in
2014?

1 point

Jacob

Mason

Noah

William

Question 2

In this activity, you explored public data in BigQuery and used it to create a custom table. In the text
box below, write 2-3 sentences (40-60 words) in response to each of the following questions:

 Why is being able to use data from different sources useful as a data analyst?

 How can you use BigQuery custom tables and datasets in your future analysis projects?

1 point

Skip to main content

Search in course

 English




 Menu pop-up Collapsed

Hide menu

 Work with databases

 Manage data with metadata

 Access different data sources

 Sort and filter data

 Large datasets in SQL

Video: VideoGet to know BigQuery, including sandbox and billing options

. Duration: 2 minutes2 min

Reading: ReadingSet up your BigQuery account

. Duration: 20 minutes20 min

Reading: ReadingGet started with BigQuery

. Duration: 20 minutes20 min

Reading: ReadingStep-by-Step: BigQuery in action

. Duration: 20 minutes20 min

Video: VideoBigQuery in action

. Duration: 6 minutes6 min

Practice Quiz: Hands-On Activity: Introduction to BigQuery

. Duration: 1 hour1h

Reading: ReadingIn-depth guide: SQL best practices

. Duration: 20 minutes20 min

Ungraded App Item: Ungraded App ItemHands-On Guided Project: Select a new city using BigQuery

. Duration: 25 minutes25 min

Practice Quiz: Hands-On Activity: Create a custom table in BigQuery

. Duration: 1 hour1h

Practice Quiz: Hands-On Activity: Choose the right tool for the job

. Duration: 1 hour1h

Practice Quiz: Hands-On Activity: More practice with SQL

. Duration: 1 hour1h

Practice Quiz: Test your knowledge on using SQL with large datasets

. Duration: 8 minutes8 min

 Module 3 challenge

2. Module 3

3. Hands-On Activity: Choose the right tool for the job

PreviousNext
Hands-On Activity: Choose the right tool for the job

Review Learning Objectives

Ready to review what you’ve learned before taking the quiz? I’m here to help.

Help me practiceLet's chat

Submit your assignment

Resume assignment

Receive grade

To Pass80% or higher

Your grade

-Not available

Dislike

Report an issue

Hi, Dr!

How can I help?

Help me practice

Let's chat

Coach is powered by AI, so check for mistakes and don't share sensitive info. Your data will be used in
accordance with Coursera's Privacy Notice.




Ctrl+M

Back

Hands-On Activity: Choose the right tool for the job

Practice Quiz. • 1h. • 4 total points available.4 total points

English

DueAug 7, 11:59 PM IST

Activity Overview

As a data analytics professional, part of your job will be understanding which tools are the right ones
for the job. At this point, you’ve gotten familiar with spreadsheets and SQL as potential tools for data
analysis. This activity is an opportunity to apply your new knowledge about data analytics tools to
workplace scenarios. After reading each scenario, you will be prompted to choose whether you
would use a spreadsheet or SQL to work with the data and to explain why you selected that tool.

Reflect on the scenarios

Review each scenario and provide your reflective response before going to the next course item.

Question 1

Identify top 10 pet supply products

You are a junior data analyst at a major online pet supplies business. The product development
department has asked you to identify the top 10 products with the highest sales in the last month so
that they can decide what types of products to develop next. These products come from multiple
tables within the company’s database: one containing product details and another with sales data.

Reflection

Consider whether you would use a spreadsheet or SQL in the scenario above.

 Which tool would provide the desired output?

 Which tool is appropriate considering the dataset’s size and complexity?

 How will you access the data?

Now, write 2-3 sentences (40-60 words) in response to these questions. Identify which tool you
would use and then justify your thinking. Enter your response in the text box below.

1 point

Your answer cannot be more than 10000 characters.

Question 2

Determine new store location

In this scenario, you work for a small fitness tech company. Your manager has asked you to
determine which cities have a large population of customers; this will help them decide where to
build the next retail store. You have a .csv file with 300 names, phone numbers, and addresses. The
manager also requested a quick, simple visualization, such as a bar chart, so the team can make
some quick comparisons.

Reflection

Consider whether you would use a spreadsheet or SQL in the scenario above.

 Which tool would provide the desired output?

 Which tool is appropriate considering the dataset’s size and complexity?

 How will you access the data?

Now, write 2-3 sentences (40-60 words) in response to these questions. Identify which tool you
would use and then justify your thinking. Enter your response in the text box below.

1 point

Your answer cannot be more than 10000 characters.

Question 3

Customer satisfaction survey results

The marketing team of a fashion retailer conducted a survey on customer satisfaction. The results are
in a .csv file. The team wants to calculate the average satisfaction score and compare it to the
average score from their last survey. They also want to segment responses based on demographic
data such as age, gender, and location. This segmentation will enable the marketing team to tailor
their marketing strategies to meet the specific needs and preferences of various customer groups.

Reflection

Consider whether you would use a spreadsheet or database in the scenario above.

 Which tool would provide the desired output?

 Which tool is appropriate considering the dataset’s size and complexity?

 How will you access the data?

Now, write 2-3 sentences (40-60 words) in response to these questions. Identify which tool you
would use and then justify your thinking. Enter your response in the text box below.

1 point

Your answer cannot be more than 10000 characters.

Question 4

Calculate course completion rates

The manager of an online education platform wants to know which courses have the lowest
completion rates so that they can invest in course enhancements and more targeted student support
for those courses. The data is stored in a relational database with separate tables for user
registration, course enrollment, and course completion.

Reflection

Consider whether you would use a spreadsheet or database in the scenario above.

 Which tool would provide the desired output?

 Which tool is appropriate considering the dataset’s size and complexity?

 How will you access the data?

Now, write 2-3 sentences (40-60 words) in response to these questions. Identify which tool you
would use and then justify your thinking. Enter your response in the text box below.

1 point

Your answer cannot be more than 10000 characters.

Skip to main content

Search in course

 English

 Menu pop-up Collapsed

D
Hide menu

 Work with databases

 Manage data with metadata

 Access different data sources

 Sort and filter data

 Large datasets in SQL

Video: VideoGet to know BigQuery, including sandbox and billing options

. Duration: 2 minutes2 min

Reading: ReadingSet up your BigQuery account

. Duration: 20 minutes20 min

Reading: ReadingGet started with BigQuery

. Duration: 20 minutes20 min

Reading: ReadingStep-by-Step: BigQuery in action

. Duration: 20 minutes20 min

Video: VideoBigQuery in action

. Duration: 6 minutes6 min

Practice Quiz: Hands-On Activity: Introduction to BigQuery

. Duration: 1 hour1h

o
Reading: ReadingIn-depth guide: SQL best practices

. Duration: 20 minutes20 min

Ungraded App Item: Ungraded App ItemHands-On Guided Project: Select a new city using BigQuery

. Duration: 25 minutes25 min

Practice Quiz: Hands-On Activity: Create a custom table in BigQuery

. Duration: 1 hour1h

Practice Quiz: Hands-On Activity: Choose the right tool for the job

. Duration: 1 hour1h

Practice Quiz: Hands-On Activity: More practice with SQL

. Duration: 1 hour1h

Practice Quiz: Test your knowledge on using SQL with large datasets

. Duration: 8 minutes8 min

 Module 3 challenge

2. Module 3

3. Hands-On Activity: More practice with SQL

PreviousNext

Hands-On Activity: More practice with SQL

Ready to review what you’ve learned before taking the quiz? I’m here to help.

Help me practiceLet's chat

Submit your assignment

Resume assignment

Receive grade

To Pass100% or higher

Your grade

-Not available

Dislike

Report an issue

Hi, Dr!

How can I help?

Help me practice

Let's chat

Coach is powered by AI, so check for mistakes and don't share sensitive info. Your data will be used in
accordance with Coursera's Privacy Notice.


Ctrl+M

Back

Hands-On Activity: More practice with SQL

Practice Quiz. • 1h. • 2 total points available.2 total points

English

DueAug 7, 11:59 PM IST

Activity Overview

In previous lessons, you learned how to work with data in spreadsheets. Now, you’ll practice using
SQL to work with data in databases. By the time you complete this activity, you’ll be able to use SQL
to write queries that retrieve data from databases. Further, you’ll practice navigating large public
datasets in BigQuery in order to become proficient in SQL queries—an essential skill for your future
career as a data analyst.

Scenario

Review the following scenario. Then complete the step-by-step instructions.

In this activity, you’ll query a public dataset that contains information about the trees in New York
City in 2005. You’ll also use the AVG function to determine the average diameter of all NYC trees in
2005. You’ll then have the opportunity to write queries using the data for 1995 and 2015 to compare
the average tree diameters.

Step-By-Step Instructions

Follow the instructions to complete each step of the activity. Then answer the questions at the end
of the activity before going to the next course item.

Step 1: Set up your Data

1. Log in to BigQuery Sandbox. On the BigQuery page, click the Go to BigQuery button.

If you have a free-of-charge trial version of BigQuery, you can use that instead.

Note: BigQuery Sandbox frequently updates its user interface. The latest changes may not be
reflected in the screenshots presented in this activity, but the principles remain the same. Adapting
to changes in software updates is an essential skill for data analysts, and it’s helpful for you to
practice troubleshooting. You can also reach out to your community of learners on the discussion
forum for help.

2. If you haven’t done so already, create a BigQuery project. (If you have a project, select it in the
Explorer pane.)
a. In the BigQuery console, select the dropdown list to the right of the Google Cloud logo to open the
Select a project dialog box.

b. In the Select a project dialog box, select the CREATE PROJECT button.

c. Give your project a name that will help you identify it later. This can be a unique project ID or use
an auto-generated one. You do not need to select an organization.

d. Select the CREATE button to create the project.

3. The three main sections of BigQuery are now onscreen: the BigQuery navigation menu; the
Explorer pane, which you can use to search for public datasets and open projects; and the Details
pane, which shows details of the database or dataset you’ve selected in the Explorer pane and
displays windows for you to enter queries.

Notice that you can use the <| symbol in the BigQuery navigation menu section to collapse it. There
is a similar symbol to collapse the Explorer pane.

Step 2: Choose a dataset

Follow these steps to find and select the NYC Trees dataset for this activity:

1. In the Explorer pane, select the + ADD button.

2. In the Add box that pops up, scroll down the Additional sources list. Select Public Datasets.

3. A new box opens where you can search public datasets that are available through Google Cloud. In
the Search Marketplace text box, search for New York City Trees.

4. Select the search result NYC Street Trees, then select the View Dataset button.
Heading is NYC Street Trees. Link to
City of New York provided. Subheading is New York City Street Tree Census data. Select View dataset
button to link to dataset.

5. Google Cloud opens a new browser tab displaying BigQuery with the bigquery-public-data
collection open in the Explorer pane. To ensure the bigquery-public-data database remains in your
project’s Explorer pane, select the star next to the dataset.

6. The BigQuery Details pane contains information about the new_york_trees dataset. This
information includes the date the dataset was created, when it was last modified, and the Dataset ID.
The Details pane displaying the
new_york_trees data description including the dataset ID, when it was created, default table
expiration, when it was last modified, the data location, description, the default collation, the default
rounding mode, case insensitive, the labels, and the tags.

Step 3: Choose a table

1. In the Explorer pane, select the arrow next to the new_york_trees dataset to display the tables it
contains.

Note: If the new_york_trees dataset is not in the Explorer pane, type new_york_trees into the
Search text box in the Explorer pane. (This will work if you have pinned bigquery-public-data in the
Explorer pane.) If search doesn’t return the needed results, follow the steps above to search for the
new_york_trees dataset.

2. Notice that the new_york_trees dataset contains three tree census tables from 1995, 2005, and
2015. It also contains a table that lists the tree species.
In the Explorer pane,
bigquery-public-data is open and the new_york_trees dataset is expanded to show the tables in the
dataset are: tree_census_1995, tree_census_2005, tree_census_2015, and tree_species.

These are all tables contained in the dataset. Now, examine the data for all trees cataloged in New
York City for three specific years.

3. Select the tree_census_2005 table. BigQuery displays the table’s structure in the Details pane.

4. In the Details pane, select Query > In new tab to open a new query window.
5. Notice that BigQuery populates the Query Window with a SELECT query. This query is incomplete
because it doesn’t contain anything in between SELECTand FROM.

Step 4: Query the data

This SELECT statement in the Details pane is incomplete because the columns to display have not
been specified. So, either list the columns separated by commas or use the asterisk to have BigQuery
return all columns in the table.

1. Type an asterisk * after the SELECT command in line one of the Query Editor. Your query should
now read SELECT * FROM followed by your table location. This command tells BigQuery to return all
of the columns in the tree_census_2005 table.

2. In the Query Editor, select the Run button to run the query. The results will be displayed as a table
in the Query Results pane below the Query Editor.
Results in the preview mode with columns with data populated including the row, object ID,
cen_year, tree_dbh, tree_loc, pit_type, soil_lvl, status, spc_latin, and spc_common.

This query returns all columns for the first 1,000 rows from the table. BigQuery returns only the first
1,000 rows because the SELECT query includes a LIMIT 1000 clause. This limits the rows returned to
reduce the processing time required.

3. Next, write a query to find out the average diameter of all NYC trees in 2005. On line 1, replace the
* after the SELECT command with AVG(tree_dbh). Select the Run button to execute the query.

This returns your answer, 12.833 (which means the average diameter of NYC trees in 2005 was
12.833 inches).

Step 5: Write your own queries

Now, come up with some questions and answer them with your own SQL queries. For example,
query the 1995 and the 2015 tables to find the average diameter of trees. You can then compare the
average diameter of the trees in all three datasets to determine whether the trees in NYC have
grown on average. Note that the field name for tree diameter in the tree_census_1995 table is
diameter.

You are also free to choose another publicly available dataset in BigQuery and write your own
queries for extra practice—there are many interesting choices!

Question 1

Reflection
You want to compare tree sizes in 2005 to tree sizes in 2015. In the activity above, you calculated the
average diameter of NYC trees in 2005. Based on the new_york_trees dataset, what is the average
diameter of NYC trees in 2015?

1 point

11.439

12.981

12.334

11.279

Question 2

During this activity, you practiced writing SQL queries to return information from datasets. In the text
box below, write a 2-3 sentence (40-60 words) response to each of the following questions:

 What do you think might happen if you wrote each component of a query correctly, but
rearranged the order?

 How can you use SQL queries to grow as a data analyst?

1 point

Your answer cannot be more than 10000 characters.

Coursera Honor Code Learn more

I, Susan Francis, understand that submitting work that isn’t my own may result in permanent failure
of this course or deactivation of my Coursera account.

SubmitSave draft

Dislike

Report an issue

Skip to main content

Search in course

Search
 English

 Menu pop-up Collapsed

Hide menu

 Work with databases

 Manage data with metadata

 Access different data sources

 Sort and filter data

 Large datasets in SQL

Video: VideoGet to know BigQuery, including sandbox and billing options

. Duration: 2 minutes2 min

Reading: ReadingSet up your BigQuery account

. Duration: 20 minutes20 min

Reading: ReadingGet started with BigQuery

. Duration: 20 minutes20 min

Reading: ReadingStep-by-Step: BigQuery in action

. Duration: 20 minutes20 min

Video: VideoBigQuery in action

. Duration: 6 minutes6 min

Practice Quiz: Hands-On Activity: Introduction to BigQuery

. Duration: 1 hour1h

Reading: ReadingIn-depth guide: SQL best practices

. Duration: 20 minutes20 min

Ungraded App Item: Ungraded App ItemHands-On Guided Project: Select a new city using BigQuery

. Duration: 25 minutes25 min

Practice Quiz: Hands-On Activity: Create a custom table in BigQuery

. Duration: 1 hour1h

Practice Quiz: Hands-On Activity: Choose the right tool for the job

. Duration: 1 hour1h

Practice Quiz: Hands-On Activity: More practice with SQL

. Duration: 1 hour1h

Practice Quiz: Test your knowledge on using SQL with large datasets

. Duration: 8 minutes8 min

 Module 3 challenge

2. Module 3
3. Test your knowledge on using SQL with large datasets

PreviousNext

Test your knowledge on using SQL with large datasets

Review Learning Objectives

Ready to review what you’ve learned before taking the quiz? I’m here to help.

Help me practiceLet's chat

Submit your assignment

Resume assignment

Receive grade

To Pass75% or higher

Your grade

-Not available

Dislike

Report an issue

Hi, Dr!

How can I help?

Help me practice

Let's chat

Coach is powered by AI, so check for mistakes and don't share sensitive info. Your data will be used in
accordance with Coursera's Privacy Notice.




Ctrl+M

Back

Test your knowledge on using SQL with large datasets

Practice Quiz. • 8 min. • 4 total points available.4 total points

English

DueAug 7, 11:59 PM IST

Question 1

Fill in the blank: When using SQL, the _____ clause can be used to filter a dataset of customers to
only include people who have made a purchase in the past month.

1 point

WHERE

FILTER

FROM

SELECT

Question 2

Which cases are most often used for column names in a database table and represent a SQL best
practice? Select all that apply.

1 point

Camel case

Snake case

Sentence case

Lowercase

3.
Question 3

A database table is named WebTrafficAnalytics. What type of case is this?

1 point

Sentence case

Snake case

Camel case

Lowercase

Question 4

What can be removed from the following query without preventing it from running or changing the
results?

SELECT *

FROM `Uni_dataset.new_table`

WHERE ID = 'Lawrence'

1 point

Backticks (`)

WHERE

SELECT

The asterisk (*)

Django postgreSQL
No ratings yet
Django postgreSQL
6 pages
(MarketCipher A & B) Algo Version
No ratings yet
(MarketCipher A & B) Algo Version
4 pages
ALGOX
No ratings yet
ALGOX
25 pages
مستند نصي جديد
No ratings yet
مستند نصي جديد
29 pages
Germencik at
No ratings yet
Germencik at
4 pages
Dashboard and Signals Table
No ratings yet
Dashboard and Signals Table
12 pages
Target Indicator
No ratings yet
Target Indicator
3 pages
ICT Immediate Rebalance Toolkit (LuxAlgo)
No ratings yet
ICT Immediate Rebalance Toolkit (LuxAlgo)
18 pages
Strategy 3x Supertrend
No ratings yet
Strategy 3x Supertrend
38 pages
Magnet v5
No ratings yet
Magnet v5
5 pages
Order Block Matrix
100% (1)
Order Block Matrix
20 pages
Vwap Done Only Color Left
No ratings yet
Vwap Done Only Color Left
3 pages
StratifyTrade - Volume Footprint Voids
No ratings yet
StratifyTrade - Volume Footprint Voids
5 pages
At Indicator 1.0
No ratings yet
At Indicator 1.0
3 pages
Text To PDF (4) PDF To Text 727 932
No ratings yet
Text To PDF (4) PDF To Text 727 932
4 pages
Super Scalper Strategy v3.3 Update (3) Taitrading
No ratings yet
Super Scalper Strategy v3.3 Update (3) Taitrading
10 pages
Version 4
No ratings yet
Version 4
11 pages
Banana For Camel
No ratings yet
Banana For Camel
28 pages
Trading Order Block Script
No ratings yet
Trading Order Block Script
5 pages
Elitealgo v30
No ratings yet
Elitealgo v30
30 pages
Phoenix Ascending 2.201
No ratings yet
Phoenix Ascending 2.201
3 pages
BigBeluga - Smart Money Concepts
No ratings yet
BigBeluga - Smart Money Concepts
28 pages
NY Trading Sessions Script
No ratings yet
NY Trading Sessions Script
3 pages
WaveTrend 3D (WT3D)
No ratings yet
WaveTrend 3D (WT3D)
11 pages
Trend Li̇ne
No ratings yet
Trend Li̇ne
9 pages
MACD PULBA CHART REDDIT COMMENT SCRIPT by ITCHY
No ratings yet
MACD PULBA CHART REDDIT COMMENT SCRIPT by ITCHY
2 pages
Aitrading System
0% (1)
Aitrading System
2 pages
Vscore July1
No ratings yet
Vscore July1
2 pages
QQEX Indikator
No ratings yet
QQEX Indikator
6 pages
VanPhu Dashboard
No ratings yet
VanPhu Dashboard
7 pages
RoyalPrince Oscillator With Crossover
No ratings yet
RoyalPrince Oscillator With Crossover
1 page
Trading Indicator Script
No ratings yet
Trading Indicator Script
11 pages
Pola Koda Pola Zvijer
No ratings yet
Pola Koda Pola Zvijer
4 pages
DOT Gann Time Cycle Dates
No ratings yet
DOT Gann Time Cycle Dates
2 pages
Trade Algo Dut
No ratings yet
Trade Algo Dut
11 pages
ICT
No ratings yet
ICT
8 pages
Script 92%
No ratings yet
Script 92%
12 pages
Message
No ratings yet
Message
4 pages
Daily Algo
No ratings yet
Daily Algo
12 pages
Ago Algo Market Cipher
No ratings yet
Ago Algo Market Cipher
10 pages
c1 s6 Plotting
No ratings yet
c1 s6 Plotting
7 pages
Structure Maping
No ratings yet
Structure Maping
6 pages
ALGOX
No ratings yet
ALGOX
23 pages
Indicator Volume-Weighted Average Price VWAP For TradingView
No ratings yet
Indicator Volume-Weighted Average Price VWAP For TradingView
3 pages
Super Indicator
No ratings yet
Super Indicator
41 pages
HalfTrend + TMA Centered Bands Indicator v0.42 BY LUPEN
No ratings yet
HalfTrend + TMA Centered Bands Indicator v0.42 BY LUPEN
6 pages
Signal Indicator
No ratings yet
Signal Indicator
2 pages
Free Indicator 4
No ratings yet
Free Indicator 4
4 pages
Super Scalper
No ratings yet
Super Scalper
2 pages
Simplealgo v3 4 - PDF
No ratings yet
Simplealgo v3 4 - PDF
1 page
Luxalgo Premium
No ratings yet
Luxalgo Premium
3 pages
This Pine Script
No ratings yet
This Pine Script
4 pages
VPMaps Lusk Aglo OBpmk 07 SRjordanfray
No ratings yet
VPMaps Lusk Aglo OBpmk 07 SRjordanfray
16 pages
VWAP Strategy
No ratings yet
VWAP Strategy
2 pages
Support Resistance Levels
No ratings yet
Support Resistance Levels
2 pages
EMA Crossover Trading Strategy With MACD, ADX, and Trailing Stops
No ratings yet
EMA Crossover Trading Strategy With MACD, ADX, and Trailing Stops
21 pages
Killzone Strategy Script Guide
No ratings yet
Killzone Strategy Script Guide
3 pages
турбо
No ratings yet
турбо
13 pages
1 Script 5X1 Elite
No ratings yet
1 Script 5X1 Elite
2 pages
CDA C2 R 200 en File 22.en
No ratings yet
CDA C2 R 200 en File 22.en
7 pages
OPJMBA - Beginner's Guide To Spreadsheets - Introduction To Spreadsheets - Session Overview
No ratings yet
OPJMBA - Beginner's Guide To Spreadsheets - Introduction To Spreadsheets - Session Overview
13 pages
Flyer
No ratings yet
Flyer
1 page
Responsible AI Development
No ratings yet
Responsible AI Development
19 pages
Collegedekho 250615 093743
No ratings yet
Collegedekho 250615 093743
3 pages
A TOOTH in The EYE-Dr Christopher Liu, Rawya Abdelhadi, and Angela Busuttil
No ratings yet
A TOOTH in The EYE-Dr Christopher Liu, Rawya Abdelhadi, and Angela Busuttil
7 pages
Phy - Waves
No ratings yet
Phy - Waves
26 pages
ISCR RWE-ICTD-Webinar-27th May2025
No ratings yet
ISCR RWE-ICTD-Webinar-27th May2025
2 pages
ISC Physics XI XII Revised 020424 2026
No ratings yet
ISC Physics XI XII Revised 020424 2026
24 pages
Health Cybersecurity Roadmap
No ratings yet
Health Cybersecurity Roadmap
2 pages
History and Civics
No ratings yet
History and Civics
63 pages
LS3 Firm and Consumer
No ratings yet
LS3 Firm and Consumer
69 pages
Women in AI Leadership
No ratings yet
Women in AI Leadership
56 pages
Health Cybersecurity MBA Career Path
No ratings yet
Health Cybersecurity MBA Career Path
1 page
Whey Proteins in Child Nutrition
No ratings yet
Whey Proteins in Child Nutrition
14 pages
Parinitha Rt-3
No ratings yet
Parinitha Rt-3
9 pages
Module 14
No ratings yet
Module 14
14 pages
Transcription+ +Thought+Leadership+Masterclass
No ratings yet
Transcription+ +Thought+Leadership+Masterclass
21 pages
Les Pronoms Personnels Completed
100% (2)
Les Pronoms Personnels Completed
11 pages
Sense Organs
No ratings yet
Sense Organs
11 pages
Mechanism of Protein in Weight Loss
No ratings yet
Mechanism of Protein in Weight Loss
8 pages
Beginner's Guide to Python and Data Science
No ratings yet
Beginner's Guide to Python and Data Science
4 pages
Analgesic Efficacy of Collagen Peptide in Knee Osteoarthritis: A Meta-Analysis of Randomized Controlled Trials
No ratings yet
Analgesic Efficacy of Collagen Peptide in Knee Osteoarthritis: A Meta-Analysis of Randomized Controlled Trials
11 pages
OHV Cable Test Solution
No ratings yet
OHV Cable Test Solution
10 pages
Report 2022 EPA Punjab
No ratings yet
Report 2022 EPA Punjab
136 pages
Criminology Study for Students
100% (5)
Criminology Study for Students
11 pages
Poetry Class 11
No ratings yet
Poetry Class 11
76 pages
Dry Bath Incubator Series
No ratings yet
Dry Bath Incubator Series
12 pages
Purposive Communication Guide
No ratings yet
Purposive Communication Guide
20 pages
Satish K. Tripathi
No ratings yet
Satish K. Tripathi
4 pages
Trident Dental Manual
No ratings yet
Trident Dental Manual
4 pages
QC Repro Health and Population Ordinance PDF
No ratings yet
QC Repro Health and Population Ordinance PDF
11 pages
Company Agreement: Mobile Working
No ratings yet
Company Agreement: Mobile Working
8 pages
Concrete Pouring RAMS
No ratings yet
Concrete Pouring RAMS
19 pages
Jee Adv Class2
No ratings yet
Jee Adv Class2
4 pages
Business Communication Business Communication: Lesikar + Flatley + Rentz Lesikar + Flatley + Rentz
No ratings yet
Business Communication Business Communication: Lesikar + Flatley + Rentz Lesikar + Flatley + Rentz
32 pages
Linear Algebra Course Guide
No ratings yet
Linear Algebra Course Guide
3 pages
Peace Fellowships Relevant Experience Eligibility Guide en
No ratings yet
Peace Fellowships Relevant Experience Eligibility Guide en
1 page
Notes - Classification of Elements and Periodicity in Properties0204161702frsufs
No ratings yet
Notes - Classification of Elements and Periodicity in Properties0204161702frsufs
11 pages
Intro-To-Crim Diffinition of Terms
No ratings yet
Intro-To-Crim Diffinition of Terms
22 pages
DITTE Civil & Building CURRICULUM
No ratings yet
DITTE Civil & Building CURRICULUM
90 pages
Limiter6 Manual
No ratings yet
Limiter6 Manual
22 pages
Semiotics - D - Alleva
No ratings yet
Semiotics - D - Alleva
9 pages
DS2000 Manual en Unlocked
No ratings yet
DS2000 Manual en Unlocked
142 pages
Aspiring Civil Engineer's Journey
No ratings yet
Aspiring Civil Engineer's Journey
1 page
Samacheer Kalvi 11th Accountancy Solutions Chapter 3 Books of Prime Entry - Samacheer Guru
No ratings yet
Samacheer Kalvi 11th Accountancy Solutions Chapter 3 Books of Prime Entry - Samacheer Guru
62 pages
Software Engineering Slide nr.2
No ratings yet
Software Engineering Slide nr.2
19 pages
Timeline of Philippine Educational System
No ratings yet
Timeline of Philippine Educational System
40 pages
14C13823-001 2 Ex2
No ratings yet
14C13823-001 2 Ex2
5 pages
TR100 7831
No ratings yet
TR100 7831
512 pages
ASME PaperTemplate-ECTC2012
No ratings yet
ASME PaperTemplate-ECTC2012
2 pages
SOP For Covid 19
No ratings yet
SOP For Covid 19
3 pages
DLP Convex
No ratings yet
DLP Convex
4 pages

Big Query

Uploaded by

Big Query

Uploaded by

Glossary terms from module 3

Terms and definitions for Course 3, Module 3

The Navigation pane

The Explorer pane

Star and examine Public Datasets

 Schema, which displays the column names in the dataset

 Preview, which shows the first rows from the dataset

The SQL Workspace

Upload your data

Step-by-Step: BigQuery in action

What you'll need

Example 1: Preview a section from a table viewer

7. Execute the query by selecting the RUN button.

Example 3: Use SQL to view a piece of data

2. Enter FROM and then paste `bigquery-public-

5. Execute the query with the RUN button.

Here's a breakdown of how a SQL query works:

So, a simple SQL query might look like this:

SELECT customer_name, order_date

FROM Customers, Orders

WHERE customer_state = 'California';

Step 1: Get a BigQuery account

Once you have your account, start exploring!

Step 2: Open your BigQuery console

Step 3: Access public data in BigQuery

Once you have finished previewing the data, write a query!

Step 4: Review basic parts of a query

Step 5: Write a basic query

The completed query should appear like this:

4. Run your completed query by selecting the blue RUN button.

Step 6: Write a query to answer a question

Your completed query should be written like this:

duration >= 1200;

4. Run your completed query by clicking the RUN button.

Optional Step 7: Up for a challenge?

 What is the name of the station whose start_station_id is 111?

 What is the bike_model of bike_id 58782?

Step 8: Check your work

Or download the file directly here:

Run another query on your table:

Notting Hill Gate Station, Notting Hill

Southwark Street, Bankside

East Village, Queen Elizabeth Olympic Park

Tower Gardens, Tower

Intro to BigQuery solutions

in your own BigQuery console.

1. What is the name of the station whose start_station_id is 111?

3. What is the bike_model of bike_id 58782?

Capitalization and case sensitivity

Single or double quotes: '' or " "

1. When you want strings to be identifiable in any SQL dialect

2. When your string contains an apostrophe or quotation marks

Rachel DeSantos Shepherd’s pie

Sujin Lee Tacos

Najil Okoro Spanish paella

favorite_food = 'Shepherd's pie'

favorite_food = "Shepherd's pie"

rowkey, -key used to join with account_id

Info.date, -date is in spring format YYYY-MM-DD HH:MM:SS

Info.code -e.g. 'pub-###'

snake_case names for columns

CamelCase names for tables

 With snake_case, like tickets_by_occasion

THEN 'Will watch alone' ELSE 'Watch with others' END AS

watch_category, COUNT(movie_title) AS number_of_movies

WHEN genre = 'horror' THEN 'Will not watch'

WHEN genre = 'documentary' THEN 'Will watch alone'

ELSE 'Watch with others'

END AS watch_category, COUNT(movie_title) AS number_of_movies

-- Analyst: Jazmin Cisneros

-- Goal: Count the number of rows in the table

COUNT(*) number of rows -- the * stands for all so count all

Or, you can use the /* */ method like below:

/* Date: September 15, 2020

Analyst: Jazmin Cisneros

Goal: Count the number of rows in the table

COUNT() number of rows -- the stands for all so count all

COUNT() number of rows -- the stands for all so count all