SQL is a simple language used for retrieving and updating data in a database. A SQL statement is constructed based on what we call “clauses”. We assemble SQL clauses together to return, manipulate, filter, and update data.
The simplest SQL statement is:
SELECT 1;
This tutorial is interactive, run the command in the Postgres shell to the right. You’ll see that result returns a single value of “1”. What if we changed it to:
SELECT 'two';
Now, you’ll see that it returns a string with the value ‘two’. This clause we
are running is the SELECT clause. Notice, all SQL statements are finalized by
a semicolon: ; .
We have populated this database with a table called “regions” that houses a few
regions around the world. To return data from the database, then we will use a
FROM clause:
SELECT regions.* FROM regions;
Once we append the FROM regions clause it means we now have access to the
values in the regions table. The * can be through of as “return all columns on
this table.” If we only want to return a single column, we can be explicit about
the values in the SELECT clause:
SELECT regions.region_name FROM regions;
Next, we can use a WHERE clause to restrict the filter to certain regions:
SELECT regions.* FROM regions WHERE regions.region_id = 1;
Experiment changing the values of the WHERE conditional to see how the
different values work. In the where clause, you have access to all types of
comparison operators, include greater than, less than, equal, contained in a
list, etc:
SELECT regions.* FROM regions WHERE regions.region_id < 2;
SELECT regions.* FROM regions WHERE regions.region_id > 2;
SELECT regions.* FROM regions WHERE regions.region_id >= 2;
SELECT regions.* FROM regions WHERE regions.region_name IN ('Asia', 'Americas');
Additionally, you can use AND and OR logic to create complex conditionals
(this SQL statement is getting larger, so let’s break it into multiple lines —
notice how we break it along the clauses):
SELECT
regions.*
FROM regions
WHERE regions.region_id >= 2
AND regions.region_name IN ('Asia', 'Americas');
SQL gives us the ability to append a sorting clause called ORDER BY:
SELECT
regions.*
FROM regions
ORDER BY regions.region_name;
And, with that sorting clause, we can reverse order (the DESC means
descending, if not specified, ordering is assumed to be ASC or ascending):
SELECT
regions.*
FROM regions
ORDER BY regions.region_name DESC;
Because SQL is built on clauses, we can assemble the ORDER BY clause together
with the prior WHERE clause:
SELECT
regions.*
FROM regions
WHERE regions.region_id >= 2
AND regions.region_name IN ('Asia', 'Americas')
ORDER BY region_name;
See how these SQL clauses are used to assemble an entire statement? SQL is
really just assembling clauses together. If you only want to return a specific
number of rows, then we can use the LIMIT clause:
SELECT
regions.*
FROM regions
LIMIT 1;
As with all clauses, the LIMIT clause can be integrated with other clauses:
SELECT
regions.*
FROM regions
WHERE regions.region_id >= 2
AND regions.region_name IN ('Asia', 'Americas')
ORDER BY region_name
LIMIT 1;
Let me just take a second to say: SQL requires your clauses to be ordered
properly. So far, we have been using a regions table that has only an
region_id and a region_name field. SQL databases are not static. Far from
it, we can easily manipulate and update the tables. Guess how we manipulate the
tables? We also have SQL clauses for that. Below, we will use the ALTER TABLE
and ADD COLUMN clauses:
ALTER TABLE regions
ADD COLUMN region_population FLOAT DEFAULT NULL;
COMMENT ON COLUMN regions.region_population IS 'population in billions';
Excellent, run that command above, and it will add new column to the regions
table. We have also used the COMMENT clause to add metadata about the column
to the schema, this is optional but helpful when you have potentially ambiguous
data. Now, if you run SELECT * FROM regions; you’ll see this new column, but
it will be empty. Next, let’s use an UPDATE statement to update the population
for each of the regions:
UPDATE regions
SET region_population = 1.002
WHERE regions.region_name = 'Americas';
Above, you’ll see that the UPDATE statements use the SET and WHERE
clauses. The SET clause specifies the values to set on the updated table and
the WHERE clause restricts which rows the values are set for. UPDATE
statements are powerful, and can change the values for many rows all at once.
For instance, what if we left off the WHERE clause above and ran the following
it would update all rows:
UPDATE regions
SET region_population = 1.002;
The response in the terminal is UPDATE 4, which means we updated 4 ROWS! Uh oh
… now if you run SELECT * FROM regions , you’ll see the same value is set for
all regions (that’s ok, we can fix that):
UPDATE regions
SET region_population = 1.002
WHERE regions.region_name = 'Americas';
UPDATE regions
SET region_population = 0.493
WHERE regions.region_name = 'Middle East and Africa';
UPDATE regions
SET region_population = 4.561
WHERE regions.region_name = 'Asia';
UPDATE regions
SET region_population = 0.7464
WHERE regions.region_name = 'Europe';
As you can see from usage of the ALTER TABLE, COMMENT, and UPDATE
statements we are assembling clauses together in fairly predictable ways. Now,
let’s go back to the SELECT statement and see what we can do with this new
data that we’ve added.
Using SQL, we can manipulate values as well. For instance, we can SUM all of
the values returned by a column:
SELECT sum(regions.region_population)
FROM regions;
This will sum all populations for all regions, but we can use the WHERE
conditional to sum the values of Europe, Asia, Middle East, and Africa:
SELECT sum(regions.region_population)
FROM regions
WHERE regions.region_name IN ('Middle East and Africa', 'Asia', 'Europe');
I see that we are missing Australia from the list of regions! Let us add
Australia using an INSERT statement:
INSERT INTO regions (region_name, region_population)
VALUES ('Australia', 0.02569);
Now, when running SELECT * FROM regions; you’ll see Australia added to the
list.
Summary
SQL is just a statement with a series of clauses appended to the statement to modify behavior. The SQL statements that we covered are just a small number of the total number of statements available:
- SELECT
- INSERT
- UPDATE
- DELETE
- CREATE TABLE
- ALTER TABLE
Within SQL statements, we have a list of clauses, and those clauses are sometimes shared across different statements. The clauses we used were:
- FROM
- WHERE
- LIMIT
- SET
- GROUP BY
- ADD COLUMN