A Practical Introduction To Databases
A Practical Introduction To Databases
The illustration above shows a table named “fruit_stand” with three columns named “item”,
“price”, and “unit”. Although the illustration does not show the data types, we might
infer that the item and unit columns contain text and the price column contains
decimal numbers. Each row of fruit_stand contains information about one kind of fruit sold
at the fruit stand.
Structured query language (SQL) is the most popular query language for relational
databases, and is an example of a declarative language.
Even though a standard exists for SQL, relational databases have slightly different
implementations of the relational model, and similarly different “dialects” of SQL.
Despite these differences, most SQL dialects share enough in common that, with care, highly
portable SQL code can be written.
The basic element of SQL is the statement or query. While a distinction can be made
between these two (statements act to change the state of the database, while queries act
to retrieve data), it is common to use the terms interchangeably. Statements in SQL read
much like English sentences, although with very strict rules.
In its simplest form, the SELECT statement can be used to retrieve all data from a table.
We just need to know the table name:
Here, fruit_stand is the name of the table. The * is a special symbol used with SELECT
statements to mean “all columns in the table”.
The statement (or query) above is said to have two clauses; a clause is a part of a SQL
statement, usually starting with a SQL keyword. The two clauses in the statement above are
the SELECT clause, “SELECT *” and the FROM clause, “FROM fruit_stand”. Most clauses are
optional in the sense that they are not required in every query, although they will be
necessary to produce certain desired results.
We can specify the columns we wish to retrieve, and in what order, by replacing the * in
our SELECT clause with a comma-separated list of columns:
item TEXT,
price NUMERIC,
unit TEXT
);
*the data types we have used here are not standard SQL
When defining a table in SQL, you list the columns you want, giving the name and the data
type of the column.
Be sure to use single quotes around any string values you put into a table.
SQL statements are properly terminated by semicolons. In some software tools, single
statements are allowed to be unterminated. However, we will always show the semicolon in
our examples, as they become very important in settings where you want to send a list
of statements to the database at one time.
It is entirely permissible and (in many cases preferable, as in our CREATE TABLE examples) to
write statements on multiple lines. The query below is correct, and equivalent to the same
query expressed all on one line:
SELECT *
FROM fruit_stand;
SQL keywords are case-insensitive. We can write and get the same result for each query:
1.1.7. Comments
SELECT *
/* here is a
multi-line comment */
fruit_stand;
To see just a subset of rows, we include a WHERE clause in our query. The WHERE clause
consists of the keyword WHERE followed by an expression that evaluates to true or
false (a Boolean expression).The WHERE clause is placed after the FROM clause.
String literals in SQL are enclosed with single quotes - not double quotes. Double quotes
are used in SQL for a different purpose.
Queries can return zero, one, or many rows. If no rows match the WHERE condition,
zero rows are returned.
One fact about relational databases is that the rows in a table are not necessarily
ordered in any particular fashion. Relational DBMSes (RDBMSes) are permitted to store data
in whatever fashion is most convenient or efficient, as well as to retrieve data however is most
convenient.
SQL provides a mechanism by which we can put rows in order by whatever criteria we wish.
This is accomplished via the ORDER BY clause, which always comes last in any query. The
key phrase ORDER BY is followed by a comma-separated list of expressions, which must
evaluate to some type that can be put in order: numbers, character strings, dates, etc. By
default, numbers are sorted from smallest to largest, and dates from earliest to latest.
Character strings are a bit trickier, because different databases order them differently.
SQLite, defaults to lexicographic ordering based on ASCII values.
Ordering is initially applied using the first expression after the ORDER BY keyword. If
any two rows are equal according to that first expression, and there are additional
expressions in the ORDER BY clause, the next expression is then applied to groups of rows
that have equal values for the first expression, and so forth. For example, suppose you are
organizing books for a library or bookstore where books are grouped by genre and then
alphabetized by title. You could write the following query to help with this task:
FROM simple_books
It is possible to reverse the ordering for any or all of the criteria using the DESC
(“descending”) keyword. (You can also use ASC for “ascending”, but, as that is the default, it
is usually omitted.) If we want to see all books listed from most recent to least recent, we
can write:
It is usually good practice to set up database tables in such a way that each record in the
table is unique; that is, for each row, there will be no other row in the table that contains
exactly the same data in every column.
However, queries that SELECT a subset of the columns of a table can easily end up with
duplicate results; this may or may not be desired. Suppose you were interested in browsing
the books in our database for particular genres of books, but you weren’t sure what genres
the database puts books into - that is, you need to determine what would be valid choices
given the data. You could simply run the query:
SQL provides a keyword, DISTINCT, that can be added after the SELECT keyword and tells
SQL that we only want unique results, and if there are duplicates, it should discard
them. This will give us the desired result, a unique set of genres that we can choose
from: