
Before we start..
In some databases like SQL Server, PostgreSQL and SQLite we use the EXCEPT
operator. In other databases (MySQL and Oracle e.g.) this operator goes under a unique name: MINUS
. Each MINUS
and EXCEPT
work in the very same way.
TL;DR:
EXCEPT == MINUS
For the remainder of this text we’ll use SQL Server examples with EXCEPT
; when you use one other database, just swap this out for MINUS
where needed.
Let’s start with a quite simple example. We’ll first create a table that incorporates countries: we have now an id, a reputation and a population. Then we’ll insert 10 countries:
CREATE TABLE countries (
id INT,
name VARCHAR(50),
population INTEGER,
);INSERT INTO countries (id, name, population) VALUES
(1, 'Armenia', 2978763),
(2, 'Belgium', 11730997),
(3, 'Canada', 37742154),
(4, 'Denmark', 5792202),
(5, 'Egypt', 102334404),
(6, 'Ghana', 31072940),
(7, 'Hungary', 9660351),
(8, 'Iraq', 40222493),
(9, 'Kiribati', 119449),
(10, 'Liechtenstein', 38128);
Now we’ll create two queries:
- small countries: select countries with a population of lower than 1 million
- c-countries: selects countries whose name incorporates the letter ‘c’
-- small countries (pop < 1M)
SELECT name FROM countries WHERE population < 1000000
--> Kiribati & Liechtenstein-- c-countries (name incorporates a c)
SELECT name FROM countries WHERE NAME LIKE '%c%'
--> Canada and Liechtenstein
Next we’ll reveal how one can compare and mix the outcomes of those queries into one result set.
1.1 Except
Now we are able to use EXCEPT
to pick out all small countries EXCEPT
people who have a ‘c’ of their name:
SELECT name FROM countries WHERE population < 1000000
EXCEPT
SELECT name FROM countries WHERE NAME LIKE '%c%'
--> Kiribati
1.2 Intersect
Use INTERSECT
to seek out out the overlap: small c-countries:
SELECT name FROM countries WHERE population < 1000000
INTERSECT
SELECT name FROM countries WHERE NAME LIKE '%c%'
--> Liechtenstein
1.3 UNION (and UNION ALL)
Union takes the results of each queries and merges them together in a single result set:
SELECT name FROM countries WHERE population < 1000000
UNION
SELECT name FROM countries WHERE NAME LIKE '%c%'
--> Canada, Kiribati, LiechtensteinSELECT name FROM countries WHERE population < 1000000
UNION ALL
SELECT name FROM countries WHERE NAME LIKE '%c%'
--> Kiribati, Liechtenstein, Canada, Liechtenstein
In the instance code above UNION ALL
produces one other Liechtenstein! It is because the regular UNION
removes duplicates; UNION ALL
returns all rows from each queries.
SQL analyzes the outcomes of the queries and determines whether there may be or isn’t an intersection based on the columns that get returned from either query. Broadly speaking, there are two rules regarding the queries we’re comparing:
- The columns have to be comparable (same data type)
- The variety of columns have to be equal
RULE 1: comparable columns
The code below will fail because we cannot compare id
(integer) and name
(string-type).
-- FAILS: Conversion failed when converting the varchar value 'Canada' to data type int.
SELECT id FROM countries WHERE population < 1000000
EXCEPT
SELECT name FROM countries WHERE NAME LIKE '%c%'
RULE 2: equal variety of columns
The code below will fail because we select two columns from query1 and just one from query2:
-- All queries combined using a UNION, INTERSECT or EXCEPT operator should have an equal variety of expressions of their goal lists.
SELECT id, name FROM countries WHERE population < 1000000
EXCEPT
SELECT name FROM countries WHERE NAME LIKE '%c%'
Following the principles
After we adhere to those rules we can even perform queries just like the one below: matching all small-country id
‘s with the population
of ‘c’-countries. Although that is perfectly valid syntax; it’s going to yield no records.
SELECT id FROM countries WHERE population < 1000000
INTERSECT
SELECT population FROM countries WHERE NAME LIKE '%c%'
--> returns no rows (id and population match nowhere)
In the identical way we are able to UNION
the outcomes.
SELECT id FROM countries WHERE population < 1000000
UNION
SELECT population FROM countries WHERE NAME LIKE '%c%'
Will result into:
Also notice that the names of the columns don’t should match. If the names conflict, the column-name of the primary query can be used.
After all, we don’t have to check data from the identical table, we are able to use different tables as well, so long as the information structures resulting from the SELECT
statements match. I’ve prepared the next examples with this statement.
1. EXCEPT example:
In this instance we compare records from the products
table with records from the sales
table: we wish to know if there are any products with a price < 100 cents which were sold anywhere however the Netherlands:
SELECT id as product_id FROM products WHERE price < 100
EXCEPT
SELECT product_id FROM sales WHERE country = 'Netherlands'
--> Ends in product_id 3
2. INTERSECT example
Next up, we’ll select any product with SALES
within the USA that even have returns:
SELECT product_id FROM sales WHERE country = 'USA'
INTERSECT
SELECT product_id from returns
--> Ends in product_id 2
3. UNION example
Lastly, we’ll UNION the outcomes of three queries: low-cost products, products with any sales within the Netherlands or any product that’s returned:
SELECT id as product_id FROM products WHERE price < 100
UNION
SELECT product_id FROM sales WHERE country = 'Netherlands'
UNION
SELECT product_id from returns
--> Returns product_id 1, 2, and three
I believe of JOIN
‘s as combining columns from two or more tables data sets using a related column between them. The UNION
, INTERSECT
and EXCEPT
operator, however, mix or compares data from two or more tables after which appends this data.
I like to think about a join operating “horizontally”, “zipping” query results together. The UNION
, INTERSECT
and EXCEPT
operator to work “vertically”, “stacking up” result sets. Let’s reveal with some images:
Visualizing a join
Below a visible representation from joining two tables: we take two columns from table1 and mix them with one column from table2 using values in related columns (table1.Id <-> table2.SomeFk):
Consider it this fashion: using a JOIN
we “zip up” to tables. We make the table wider than it was before: we add to it horizontally.
Visualizing a UNION
Now let’s take a look at a UNION
statement. Notice that the outcomes are pretty independent of one another. They don’t have any related columns; we just “stack” the result sets on top of one another, expanding the table vertically. We will do that because each queries output the identical variety of columns and all column data types match.
On this case we use a UNION operator to stack the records on top of one another but INTERSECT
works the identical way; only keeping the overlap. In the identical way EXCEPT
“subtracts” some records from the “stack”.
I hope to have clarified the workings of those three wonderful operators and that you simply had a good time reading this text.
I hope this text was as clear as I hope it to be but when this isn’t the case please let me know what I can do to make clear further. Within the meantime, take a look at my other articles on every kind of programming-related topics like these:
Pleased coding!
— Mike
P.S: like what I’m doing? Follow me!