Skip to main content

Command Palette

Search for a command to run...

8 Week SQL Challenge - Case Study 2 (Pizza Metrics)

Updated
4 min read
M
Mary has been writing about Data Science on her blog (http://maryjonah.me) for years. In her free time, she likes to read books and have a good rest.

For the Case Study #2 - Pizza Runner, the questions were segmented by focus areas which were 5 in total.

Data Cleaning

Before answering any of these questions in any of the focus areas, there needed to be cleaning of columns in customer_orders and runner_orders tables.
They both had inconsistent use of ‘null’, NULL, and ‘‘which should be represented by a single NULL value in the columns: exclusions and extras (in the customer_orders table) and pickup_time, distance, and duration (in the runner_orders table).
Also used it as the opportunity to convert each column into its appropriate data type using the CAST keyword.

-- Data cleaning for customer_orders and runner_orders table
-- 1. Standardize use of NULL in both tables
CREATE TABLE customer_orders_clean AS 
SELECT
    CAST(order_id AS SIGNED) as order_id,
    CAST(customer_id AS SIGNED) as customer_id,
    CAST(pizza_id as SIGNED) as pizza_id,
    CASE WHEN exclusions IN ('null', 'NULL', '') THEN NULL ELSE exclusions END AS exclusions,
    CASE WHEN extras IN ('null', 'NULL', '') THEN NULL ELSE extras END AS extras,
    CAST(order_time AS DATETIME) as order_time
FROM customer_orders;

CREATE TABLE runner_orders_clean
    SELECT 
        CAST(order_id AS SIGNED) AS order_id,
        CAST(runner_id AS SIGNED) AS runner_id,
        CASE WHEN pickup_time IN ('null', 'NULL', '') THEN NULL ELSE CAST(pickup_time AS DATETIME) END AS pickup_time,
        CASE WHEN distance IN ('null', 'NULL', '') THEN NULL ELSE CAST(REPLACE(distance, 'km', '') AS DECIMAL(5,2)) END AS distance,
        CASE WHEN duration IN ('null', 'NULL', '') THEN NULL ELSE CAST(REPLACE(REPLACE(REPLACE(duration, 'minutes', ''), 'minute', ''), 'mins', '') AS SIGNED) END AS duration,
        CASE WHEN cancellation IN ('null', 'NULL', '') THEN NULL ELSE cancellation END AS cancellation
FROM runner_orders;


-- 2. Delete the original table
DROP TABLE customer_orders;
DROP TABLE runner_orders;


-- 3. Rename cleaned tables to their original table names
RENAME TABLE customer_orders_clean TO customer_orders;
RENAME TABLE runner_orders_clean TO runner_orders;

Pizza Metrics

How many pizzas were ordered?

SELECT
    COUNT(pizza_id) AS total_number_of_pizzas
FROM customer_orders;

How many unique customer orders were made?

SELECT
    COUNT(DISTINCT order_id) AS total_number_of_orders
FROM customer_orders;

How many successful orders were delivered by each runner?

SELECT 
    runner_orders.runner_id,
    COUNT(*) AS total_successful_order
FROM runner_orders
WHERE cancellation IS NULL
GROUP BY runner_orders.runner_id;

How many of each type of pizza was delivered?

SELECT 
    pizza_names.pizza_name,
    COUNT(*) AS total_number_of_pizza_delivered
FROM customer_orders
INNER JOIN runner_orders
    ON customer_orders.order_id = runner_orders.order_id
LEFT JOIN pizza_names
    ON customer_orders.pizza_id = pizza_names.pizza_id
WHERE runner_orders.cancellation IS NULL
GROUP BY pizza_names.pizza_name;

How many Vegetarian and Meatlovers were ordered by each customer?

SELECT
    customer_orders.customer_id,
    pizza_names.pizza_name,
    COUNT(*) AS total_pizza_ordered
FROM customer_orders
INNER JOIN pizza_names
    ON customer_orders.pizza_id = pizza_names.pizza_id
GROUP BY customer_orders.customer_id, pizza_names.pizza_name
ORDER BY customer_orders.customer_id;

What was the maximum number of pizzas delivered in a single order?

WITH total_pizzas_per_order AS (
    SELECT
        customer_orders.order_id,
        COUNT(*) AS total_pizzas_per_order
    FROM customer_orders
    INNER JOIN runner_orders
        ON customer_orders.order_id = runner_orders.order_id
    WHERE runner_orders.cancellation IS NULL
    GROUP BY customer_orders.order_id
)

SELECT
    MAX(total_pizzas_per_order) AS max_single_order
FROM total_pizzas_per_order;

For each customer, how many delivered pizzas had at least 1 change and how many had no changes?

SELECT 
    customer_orders.customer_id,
    SUM(
        CASE
            WHEN (customer_orders.exclusions IS NOT NULL and customer_orders.exclusions != '') 
                OR (customer_orders.extras IS NOT NULL and customer_orders.extras != '') THEN 1 ELSE 0
        END) AS LeastOneChange,
    SUM(
        CASE
            WHEN (customer_orders.exclusions IS NULL or customer_orders.exclusions = '')
                AND (customer_orders.extras IS NULL or customer_orders.extras = '') THEN 1 ELSE 0
        END
    ) AS NoChanges
FROM customer_orders
INNER JOIN runner_orders
    ON customer_orders.order_id = runner_orders.order_id
WHERE runner_orders.cancellation IS NULL
GROUP BY customer_orders.customer_id;

How many pizzas were delivered that had both exclusions and extras?

SELECT
    COUNT(*) AS total_pizzas_with_exclusions_and_extras
FROM customer_orders
INNER JOIN runner_orders
    ON customer_orders.order_id = runner_orders.order_id
WHERE runner_orders.cancellation IS NULL 
    AND customer_orders.exclusions IS NOT NULL
    AND customer_orders.extras IS NOT NULL;

What was the total volume of pizzas ordered for each hour of the day?

SELECT
    HOUR(customer_orders.order_time) AS hour_ordered,
    COUNT(*) AS total_volume_per_hour
FROM customer_orders
GROUP BY hour_ordered
ORDER BY hour_ordered;

What was the volume of orders for each day of the week?

SELECT
    DAYOFWEEK(customer_orders.order_time) AS day_of_week,
    COUNT(*) AS total_volume_per_day_of_week
FROM customer_orders
GROUP BY day_of_week
ORDER BY day_of_week;

My answers to the next focus area B. Runner and Customer Experience in the challenge are shared in the next article: 8 Week SQL Challenge - Case Study 2 (Runner and Customer Experience)