Skip to main content

Command Palette

Search for a command to run...

8 Week SQL Challenge - Case Study 2 (Runner and Customer Experience)

Updated
3 min read
M
Mary has been writing about Data Science on her blog (http://maryjonah.me) for years. In her free time, she likes to read books and have a good rest.

The previous article 8 Week SQL Challenge - Case Study 2 answered all the questions under the first focus group for the use case: A. Pizza Metrics
This article answers the questions in the B. Runner and Customer Experience focus area in the of the second week of the second case study: Pizza Runner.

NB:
Remember the tables used here have already been cleaned.
Refer to Data Cleaning section in A. Pizza Metrics to remind yourself what was done.

How many runners signed up for each 1 week period? (i.e. week starts 2021-01-01)

SELECT
    DATE_ADD('2021-01-01', INTERVAL (DATEDIFF(registration_date, '2021-01-01') DIV 7) * 7 DAY) AS week_start,
    COUNT(*) AS runners_signed_up
FROM runners
GROUP BY week_start
ORDER BY week_start;

What was the average time in minutes it took for each runner to arrive at the Pizza Runner HQ to pickup the order?

WITH distinct_customer_orders AS (
    SELECT DISTINCT
        customer_orders.order_id,
        customer_orders.order_time
    FROM customer_orders
)

SELECT
    runner_orders.runner_id,
    AVG(TIMESTAMPDIFF(minute, distinct_customer_orders.order_time, runner_orders.pickup_time)) AS avg_arrival_time
FROM distinct_customer_orders
INNER JOIN runner_orders
    ON distinct_customer_orders.order_id = runner_orders.order_id
WHERE runner_orders.cancellation IS NULL
GROUP BY runner_orders.runner_id;

Is there any relationship between the number of pizzas and how long the order takes to prepare?

WITH total_pizza_per_order AS (
    SELECT
        customer_orders.order_id,
        COUNT(*) AS total_pizzas
    FROM customer_orders
    INNER JOIN runner_orders
        ON customer_orders.order_id = runner_orders.order_id
    WHERE runner_orders.cancellation IS NULL
    GROUP BY customer_orders.order_id
),
cook_duration AS (
    SELECT DISTINCT
        customer_orders.order_id,
        TIMESTAMPDIFF(minute, customer_orders.order_time, runner_orders.pickup_time) AS preparation_period
    FROM customer_orders
    INNER JOIN runner_orders
        ON customer_orders.order_id = runner_orders.order_id
    WHERE runner_orders.cancellation IS NULL
)

SELECT 
    total_pizza_per_order.order_id,
    total_pizza_per_order.total_pizzas,
    cook_duration.preparation_period
FROM total_pizza_per_order
INNER JOIN cook_duration
    ON total_pizza_per_order.order_id = cook_duration.order_id;

What was the average distance travelled for each customer?

WITH distinct_orders AS (
    SELECT DISTINCT 
        customer_orders.order_id,
        customer_orders.customer_id
    FROM customer_orders
)

SELECT
    distinct_orders.customer_id,
    ROUND(AVG(runner_orders.distance), 2) AS avg_distance_travelled
FROM distinct_orders
INNER JOIN runner_orders
    ON distinct_orders.order_id = runner_orders.order_id
WHERE runner_orders.cancellation IS NULL
GROUP BY distinct_orders.customer_id;

What was the difference between the longest and shortest delivery times for all orders?

WITH cleaned_runner_table AS (
    SELECT
        runner_orders.order_id,
        runner_orders.cancellation,
        runner_orders.duration AS duration
    FROM runner_orders
)

SELECT 
    MAX(cleaned_runner_table.duration) - MIN(cleaned_runner_table.duration) AS delivery_time_difference
FROM cleaned_runner_table
WHERE cleaned_runner_table.cancellation IS NULL;

What was the average speed for each runner for each delivery and do you notice any trend for these values?

WITH cleaned_data AS (
    SELECT DISTINCT  
        customer_orders.order_id,
        runner_orders.runner_id,
        runner_orders.distance AS distance_km,
        runner_orders.duration AS duration_min
    FROM customer_orders
    INNER JOIN runner_orders
    ON customer_orders.order_id = runner_orders.order_id
    WHERE runner_orders.cancellation IS NULL
)

SELECT 
    runner_id,
    order_id,
    distance_km,
    duration_min,
    ROUND(distance_km / duration_min, 2) AS avg_speed_kmh
FROM cleaned_data
ORDER BY runner_id, order_id;

What is the successful delivery percentage for each runner?

WITH total_run_per_runner AS (
    SELECT
        runner_orders.runner_id,
        COUNT(*) AS tt_runs
    FROM runner_orders
    GROUP BY runner_orders.runner_id
),
total_confirmed_run AS (
    SELECT 
        runner_orders.runner_id,
        COUNT(*) AS tt_confirmed
    FROM runner_orders
    WHERE runner_orders.cancellation IS NULL
    GROUP BY runner_orders.runner_id
)

SELECT 
    total_run_per_runner.runner_id,
    ROUND((total_confirmed_run.tt_confirmed / total_run_per_runner.tt_runs) * 100.0, 2) AS percentage_delivered
FROM total_run_per_runner
INNER JOIN total_confirmed_run
    ON total_run_per_runner.runner_id = total_confirmed_run.runner_id;