Home Artificial Intelligence 2 Necessary SQL CASE WHEN Examples You Need To Know in 2023 Example 1: Create a Points Table For Sport Tournament Example 2: Find Latest and Repeated Customers For an eCommerce Website

2 Necessary SQL CASE WHEN Examples You Need To Know in 2023 Example 1: Create a Points Table For Sport Tournament Example 2: Find Latest and Repeated Customers For an eCommerce Website

0
2 Necessary SQL CASE WHEN Examples You Need To Know in 2023
Example 1: Create a Points Table For Sport Tournament
Example 2: Find Latest and Repeated Customers For an eCommerce Website

Let’s start with the query which was suggested by a Sports Analyst. Sometimes, their analytics team must create a points table based on the entire matches played between different teams.

And that’s why they ask this query in every data analyst job interview.

It is a classic scenario of converting a table from a long-form (variety of rows > variety of columns ) to a large form (variety of columns > variety of rows). This can be called Data Pivoting, which is a vital use-case of CASE WHEN in SQL.

On this scenario, you’ve a table containing the names of the teams who played the match and the winner. It’s good to create a points table where you get details about what number of matches each team played, what number of matches they won, lost and what number of matches draw.

Variety of example | Image by Writer

Let’s see the way to tackle this sort of query —

Here is the input table where each row belongs to a match between two teams and the column winner indicates which team won. The worth NULL within the winner column indicates that the match was a draw i.e. not one of the team won the match.

Input data | Image by Writer

Let’s divide this query into the next sub-tasks.

  1. Find the entire variety of matches each team won
  2. Find the entire variety of matches each team lost
  3. Find the entire variety of matches where not one of the teams won
  4. Find the entire variety of matches each team played

To know what number of matches a team won, it is advisable to understand for every match which team won the match. You possibly can do that by comparing the team_1 and team_2 columns with the winner column.

So, for a particular row when the values in columns team_1 and winner are equal then team_1 is the winner.

You possibly can translate the exact same logic using CASE..WHEN..THEN statement in SQL as shown below.

SELECT team_1
, team_2
, winner
, CASE WHEN team_1 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.teams

As shown within the above query, you’ll create an extra column win_flag. When a team is a winner, you’ll assign the worth 1 to this column. Similarly, if the winner column is NULL, you then’ll assign the worth 1 to the draw_flag column.

So the above query will create the next output for all of the teams within the column team_1.

First partial output for team_1 | Image by Writer

Similarly, when the values within the columns team_2 and winner are equal then team_2 is the winner. So you’ll be able to write precisely the similar query for all of the teams in team_2

SELECT team_1
, team_2
, winner
, CASE WHEN team_2 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.teams

where you’ll get the next output for the values in team_2

Partial output for team_2 | Image by Writer

Well, the above two queries are only on your understanding. In point of fact, you’ll be able to create a single CTE for the win_flag and draw_flag of every team within the columns team_1 and team_2 as shown below.

WITH win_draw_flag AS
(
SELECT team_1 as team
, CASE WHEN team_1 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.teams
UNION ALL
SELECT team_2 as team
, CASE WHEN team_2 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.teams
)

It will create a CTE like this — I’ve shown this only on your understanding.

The output of CTE | Image by Writer

Remember you might be still within the long type of the table and also you now have information on whether or not each team wins the match.

Next, it is advisable to simply aggregate the columns to get the entire variety of matches each team played, won, and lost. You possibly can do it as simply as the next query.

SELECT team
, COUNT(*) AS matches_played
, SUM(win_flag) AS matches_won
, COUNT(*) - SUM(win_flag) - SUM(draw_flag) AS matches_lost
, SUM(draw_flag) AS matches_draw
FROM win_draw_flag
GROUP BY team
ORDER BY team

Where COUNT(*) gives you the entire variety of times each team occurred within the CTE win_draw_flag and subtracting matches won & draw from it will provide you with the entire variety of matches each team lost.

Final output — Points table | Image by Writer

Without making a CTE individually, you can even write the query like the next and pass the complete CASE..WHEN query as a sub-query.

SELECT team
, COUNT(*) AS matches_played
, SUM(win_flag) AS matches_won
, COUNT(*) - SUM(win_flag) - SUM(draw_flag) AS matches_lost
, SUM(draw_flag) AS matches_draw
FROM
(
SELECT team_1 as team
, CASE WHEN team_1 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.teams
UNION ALL
SELECT team_2 as team
, CASE WHEN team_2 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.teams
) AS win_draw_flag
GROUP BY team
ORDER BY team

It’s going to also lead to the exact same output as mentioned above.

Well, there could be multiple ways to unravel this query — I discovered this approach easier. When you find another solution to this query, be happy to say it within the comments.

That is one among the classic examples of comparing dates after which implementing If..Else logic using CASE..WHEN statement. You possibly can encounter this sort of problem in any company that deals with customers.

The scenario is — You’ve got an eCommerce website where on daily basis customers visit and buy products. Your task is to discover on every day how many purchasers were recent and the way many purchasers were repeated.

Variety of query | Image by Writer

Here is an input table — orders — where you’ll be able to see the shoppers with customer_id ABC101, BCD201, and ABD101 visited the web site on multiple days and purchased different products.

Input table with dummy data | Image by Writer

Let’s break down the query into the next sub-tasks —

  1. Find the primary time i.e. the primary date when the shopper visited the web site
  2. Compare the primary date with the order date to choose whether the shopper is a repeated or first-time visitor

You possibly can easily solve the primary sub-task through the use of GROUP BY to group all of the records by customer_id and find the minimum of the order_date, as shown below.

SELECT customer_id
, MIN(order_date) as first_order_date
FROM analyticswithsuraj.orders
GROUP BY customer_id
Minimum order date | Image by Writer

This was easy!

Next, to check the first_order_date with each order_date, first, it is advisable to bring each columns in a single table.

You possibly can easily try this using JOIN on customer_id as shown below. Here you’ll be able to create a CTE using the above query so that you’re going to get a brief table to affix with the input table.

WITH first_orders AS
(
SELECT customer_id
, MIN(order_date) as first_order_date
FROM analyticswithsuraj.orders
GROUP BY customer_id
)

SELECT t1.*
, t2.first_order_date
FROM analyticswithsuraj.orders AS t1
INNER JOIN first_orders AS t2 ON t1.customer_id = t2.customer_id

Inner Join output | Image by Writer

Now, as you bought each the columns in a single table, you’ll be able to compare order_date with the first_order_date and implement the next If..Else logic.

  1. If first_order_date & order_date are the identical then the shopper is the brand new customer
  2. If first_order_date & order_date are different then the shopper is repeated customer

So ideally, it is advisable to create two columns to implement the above two If..Else statements using CASE WHEN in SQL.

You don’t must create any separate table, slightly you’ll be able to add two more columns within the above query where you joined two tables. Here is how it may well be done.

WITH first_orders AS
(
SELECT customer_id
, MIN(order_date) as first_order_date
FROM analyticswithsuraj.orders
GROUP BY customer_id
)

SELECT t1.*
, t2.first_order_date
, CASE WHEN t1.order_date=t2.first_order_date THEN 1 ELSE 0 END AS new_customer_flag
, CASE WHEN t1.order_date!=t2.first_order_date THEN 1 ELSE 0 END AS repeat_customer_flag
FROM analyticswithsuraj.orders AS t1
INNER JOIN first_orders AS t2 ON t1.customer_id = t2.customer_id

CASE WHEN Output creating two recent columns | Image by Writer

Because of this, the column new_customer_flag will likely be 1 when the columns first_order_date and order_date are equal. Similarly, the column repeat_customer_flag will likely be 1 when the columns first_order_date and order_date are different.

Now the last step is just to group all of the records by order date and sum up the columns new_customer_flag and repeat_customer_flag.

To do that, you’ll need the above table which you’ll achieve by creating one other CTE as shown below.

WITH first_orders AS
(
SELECT customer_id
, MIN(order_date) as first_order_date
FROM analyticswithsuraj.orders
GROUP BY customer_id
),

customers AS
(
SELECT t1.*
, t2.first_order_date
, CASE WHEN t1.order_date=t2.first_order_date THEN 1 ELSE 0 END AS new_customer_flag
, CASE WHEN t1.order_date!=t2.first_order_date THEN 1 ELSE 0 END AS repeat_customer_flag
FROM analyticswithsuraj.orders AS t1
INNER JOIN first_orders AS t2 ON t1.customer_id = t2.customer_id
)

SELECT order_date
, SUM(new_customer_flag) AS number_of_new_customers
, SUM(repeat_customer_flag) AS number_of_repeat_customers
FROM customers
GROUP BY order_date
ORDER BY order_date

Number of latest and repeat customers every day | Image by Writer

That is the way you’ll get the required output. You possibly can cross-check the outcomes by comparing them with the input table.

Again, you’ll be able to have a unique approach to solving this query — that is the only approach I discovered. Don’t forget to say your approach within the comments below.

LEAVE A REPLY

Please enter your comment!
Please enter your name here