Technical Interview Help for Data Professionals
If you’re interviewing for roles such as data scientist, data analyst, or data engineer, you are likely to encounter several technical interviews that require live coding, often involving SQL. While later interviews might require other programming languages such as Python, which is common in the data domain, let’s focus on the typical SQL questions that I’ve encountered during these interviews. For the purpose of this discussion, I’ll assume that you’re already familiar with fundamental SQL concepts such as SELECT, FROM, and WHERE, as well as aggregate functions like SUM and COUNT. Let’s get into the specifics!
1. Mastering Joins and Table Types
Without a doubt, the most common SQL question is about table joins. It may seem too obvious, but every interview I’ve participated in has centered on this topic. You should feel comfortable with inner joins and left joins. Proficiency in handling self-joins and unions is also useful. Equally important is the ability to execute these joins across different table types, particularly fact and dimension tables. Here are my loose definitions for these two terms:
Fact Table: A table containing many rows but relatively few attributes or columns. Imagine an online retailer that maintains an “orders” table with columns like: date, customer_id, order_id, product_id, units, amount. This table has few attributes but contains a huge volume of records.
Dimension Table: A table with fewer rows but many attributes. For instance, the same online retailer’s “customer” table might hold one row per customer, with attributes such as customer_id, first_name, last_name, ship_street_addr, ship_zip_code, and more.
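To make the two table types and the two core joins concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The column names come from the definitions above; the column types and the sample rows are my own assumptions, invented purely for illustration. It shows how an inner join keeps only customers with matching orders, while a left join keeps every customer.

```python
import sqlite3

# In-memory database; the schema follows the columns listed above.
# Column types and all sample rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (          -- dimension table: one row per customer
    customer_id      INTEGER PRIMARY KEY,
    first_name       TEXT,
    last_name        TEXT,
    ship_street_addr TEXT,
    ship_zip_code    TEXT
);
CREATE TABLE orders (            -- fact table: many rows, few columns
    date        TEXT,
    customer_id INTEGER,
    order_id    INTEGER,
    product_id  INTEGER,
    units       INTEGER,
    amount      REAL
);
INSERT INTO customer VALUES
    (1, 'Ada',   'Lovelace', '1 Main St', '90210'),
    (2, 'Alan',  'Turing',   '2 Oak Ave', '10001'),
    (3, 'Grace', 'Hopper',   '3 Elm Rd',  '90210');
INSERT INTO orders VALUES
    ('2023-01-05', 1, 100, 10, 2, 20.0),
    ('2023-02-11', 1, 101, 11, 1, 15.0),
    ('2023-03-02', 2, 102, 10, 4, 40.0);
""")

# INNER JOIN: only customers that have at least one order appear.
inner_rows = conn.execute("""
    SELECT c.customer_id, o.order_id
    FROM customer AS c
    INNER JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id, o.order_id
""").fetchall()

# LEFT JOIN: every customer appears; order columns are NULL (None in Python)
# for customers without orders.
left_rows = conn.execute("""
    SELECT c.customer_id, o.order_id
    FROM customer AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id, o.order_id
""").fetchall()

print(inner_rows)
print(left_rows)
```

Notice that customer 1 shows up twice in both results because the fact table holds one row per order. That row multiplication is exactly what you need to keep in mind when counting customers after a join.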
Understanding these two main table types is essential. It’s crucial to understand why and how to join fact and dimension tables to ensure accurate results. Let’s consider a real-world example: the interview question presents two tables (“orders” and “customer”) and asks:
How many customers have purchased at least 3 units in their lifetime and have a shipping zip code of 90210?