Skip to main content

Articles

A SQL Murder Mystery

AuthorAutumn Kloth

During my internship at The Action Company, my goal has been learning essential data analytics tools. While building dashboards with Tableau, I discovered how understanding your audience is crucial. This understanding helps in visually communicating key analytical insights. You can read more about my Tableau experience here.

With SQL, I’ve learned about the importance of data management, which includes retrieving, modifying, and storing data. SQL is also important for business operations, enabling companies to efficiently manage and analyze their data.

I learned the basics of SQL at college, and have been able to apply that to this internship. However, the most engaging way I’ve grasped SQL is through solving the SQL Murder Mystery! This activity was a great way to refresh my SQL skills and to gain a better understanding of the language as a whole. In this article, I will explain the elements of SQL, common statements used to solve this murder mystery, and show how you can solve it, too!

What is SQL?

Structured Query Language (SQL) is the standard language for relational database management, creation, and manipulation. SQL statements perform tasks such as retrieving data from a database, updating records, creating stored procedures, and more. In 1986, SQL became a standard language of the American National Standards Institute (ANSI), and of the International Organization for Standardization (ISO) in 1987.

What are Relational Databases?

Relational databases provide access to data points with predefined relationships, between them, usually in the form of a table. In these databases, every row in a table is a record with a unique ID (called a key) and contains data representing a real-life event. Each column in the table describes the data the row holds, such as providing the descriptive name or specifying the data type. Data in different tables can be joined together by primary or foreign keys, which I will explain later.

What are Queries?

Queries are statements you write to produce data. Think of a query as a “question” composed of statements that instruct your output. Several elements or statements make up a query.

SQL’s Popularity

According to the current April 2024 TIOBE index, SQL is the 9th most popular programming language in the world, with Python being the 1st most popular. Also, Oracle Database (which uses SQL) is the most popular database management system in the world. This shows the importance of SQL as a skill when starting a career in data analytics.

Starting the SQL Murder Mystery

SQL Murder Mystery detective illustration
Detective illustration courtesy of Vectors by Vecteezy

It’s been a long and frustrating night. A detective at SQL City gives you a crime scene report of a murder that happened there, but somehow, you managed to lose it! The only thing you remember is that the crime took place on Jan 15, 2018. To solve the case and bring justice to the victims, you need to retrieve the correct crime scene report from the police department’s database, using only your SQL skills. But where do you begin?

A huge database contains all the clues, witnesses, and suspects. You need SQL to query and uncover these details. Since there is no documentation, a relational diagram (ERD) helps navigate through the tables.

Note: This SQL murder mystery was actually inspired by a crime in neighboring Terminal City, but all of the names, SSNs, events, etc. are fake.

Using the ERD

Entity Relationship Diagram for the SQL Murder Mystery

An ERD (Entity Relationship Diagram) visually shows the relationship between all tables in a database. Each table name is bolded and contains a list of column names and their data type. The ERD also contains primary and foreign keys, which are important for establishing relationships between tables.

Primary and Foreign Keys

A primary key is a unique, non-empty (non-NULL) identifier for every record in a table. A foreign key is an identifier used to reference the primary key of a record(s) in a “foreign” table. In the ERD above, the primary key is indicated by a gold key, and the foreign key is indicated by a blue arrow.

For example, the “person_id” field in the table “get_fit_now_member” is the foreign key to the variable of “id” in the table “person,” which is the primary key.

Understanding the relationship between primary keys and foreign keys is important for joining tables together, which we’ll discuss later.

Important Statements to Know

A SQL query contains statements, table names, wild cards, and more that communicate actions for your query to perform. There are many more statements used in SQL than listed below, but I will go over the most common statements that are used to solve this murder.

SELECT
The “SELECT” statement allows you to specify the column names from a table you want outputted. You can also use the asterisk symbol (*) to indicate you want all columns from the query. Naming specific columns allows you to return only the columns you want.

FROM
The “FROM” statement allows you to specify the primary table from which your data is coming.

WHERE
The “WHERE” statement acts as a filter to narrow down your results.

AND (Typically used in JOIN criteria and WHERE filters)
The “AND” statement connects multiple records together so that the result meets all of the criteria.

OR (Typically used in WHERE filters)
The “OR” statement is used to filter fields based on multiple conditions.

LIKE
The “LIKE” statement searches for a specified pattern in a column. The most common “wildcard” used with this operator is the (%) wildcard, which allows you to match based on partial strings.

Statements for the SQL Murder Mystery

Building Relationships Between Tables

Joining Tables – Join Statements

In this activity, you will need to use information that comes from two different tables. In order to query multiple tables at the same time, you need to use the “JOIN” statement. I will only go over one type of “Join” that is needed in this activity, but there are several different “JOIN” statements that are used to connect different tables together. This resource explains different SQL join types.

Relationships for the SQL Murder Mystery

The “INNER JOIN” statement, which can be written as just “JOIN” in SQL, returns records that have matching values in both tables, based on the join criteria. To join two tables together, you need to refer back to the ERD and look at the primary/foreign keys for each table. To join tables, you want to connect on the primary and foreign keys after the JOIN statement using an ON statement to join the tables as one. Generally, the order of the tables written in the JOIN statements will not affect the query result as long as you specify what columns to select. However, if you select all the columns in the tables, the order of the columns depends on the order of the tables.

Order of Table 1Order of Table 2

These examples show that when you don’t specify the columns in your SELECT statement, the order the tables are written will affect the order in which the columns appear.

Solving the SQL Murder Mystery

1. Query the Crime Scene Report

To solve the mystery, you first need to query the crime scene report that matches the information you remember of the crime: a murder that took place on January 15, 2018 in SQL City. Remember, there are multiple ways to do this!

Crime scene report

2. Identify the Witnesses.

You can now use the hints given above to identify the witnesses.

Witnesses 1Witnesses 2

3. Query the Witness Transcripts

Now that you know who the witnesses are, you can query their transcripts to get clues about the murderer. The tables “person” and “interview” share the key “id,” which is how you will join the tables together. Because a lot of unneeded information will return, you want to specify which columns you want.

Query the witness

4. Now, Solve the SQL Murder Mystery!

You have all the skills and knowledge you need to find the murderer! It will take awhile, especially if you are new to this. It took me around 3 hours, and I already had the SQL basics down. But solving this mystery will teach you important basics about SQL.

Connect with me on Linkedin if you need help finding the answer, or if you want to talk about your experience.