Skip to main content

Articles

A SQL Murder Mystery

AuthorAutumn Kloth

During my internship at The Action Company, my goal has been to learn essential data analytics tools. While building dashboards with Tableau, I discovered how crucial it is to understand your audience. This understanding helps in visually communicating key analytical insights. You can read about my Tableau experience here.

With SQL, I’ve learned about the importance of data management, which includes retrieving, modifying, and storing data. I’ve also learned the importance of using SQL in business operations, and how it can enable companies to efficiently manage and analyze their data.

I’ve learned the basics of SQL at college, and have been able to apply that to this internship, but the most engaging way that I’ve grasped the SQL language is through solving a murder mystery! I found this to be a great way to refresh my SQL skills and to gain a better understanding of the language as a whole. In this article, I will explain the elements of SQL, common statements used to solve this murder mystery, and show how you can solve it, too!

What is SQL?

Structured Query Language (SQL) is the standard language for relational database management, creation, and manipulation. SQL statements are used to perform tasks such as retrieving data from a database, updating records, creating stored procedures, and more. In 1986, SQL became a standard language of the American National Standards Institute (ANSI), and of the International Organization for Standardization (ISO) in 1987.

What are Relational Databases?

Relational databases are specific types of databases that provide access to data points which have pre-defined relationships between them, usually in the form of a table. In this database, every row in each table is a record that has a unique ID (called a key) and contains data representing some real-life event. Each column in the table describes the data the row holds, such as providing the descriptive name or specifying the data type. Data in different tables can be joined together by primary or foreign keys, which I will mention later on.

What are Queries?

Queries are statements you write to produce data. A query can be thought of as a “question” you make through the combination of statements which you want your output to answer. There are several elements or statements which make up a query.

SQL’s Popularity

According to the current April 2024 TIOBE index, SQL currently sits as the 9th most popular programming language in the world, with Python being the 1st most popular. Also, Oracle Database (which uses SQL) is the first most popular database management system in the world. This shows that SQL is a very important skill to have when beginning a career in data analytics.

Starting the Murder Mystery

Detective illustration courtesy of Vectors by Vecteezy

It’s been a long and frustrating night. A detective at SQL City has just given you a crime scene report of a murder that happened there, but somehow, you managed to lose it! The only thing you remember is that the crime took place on Jan 15, 2018. The only way you can solve the case and bring justice to the victims is to solve it by using the correct crime scene report from the police department’s database, and the only skill you have at your disposal is SQL. But where to begin?

A huge database contains all the clues, witnesses, and murderer(s) which require SQL to query and uncover. Because we’re not given documentation, we have a relational diagram (commonly known as an ERD) that will help us navigate through the tables.

Note: This mystery was actually inspired by a crime in neighboring Terminal City, but all of the names, SSNs, events, etc. are fake.

Using the ERD

An ERD is an Entity Relationship Diagram which visually shows the relationship between all tables in a database. Each name of the table is written in bold, and each table contains a list of column names and their data type. The ERD also contains primary and foreign keys, which are important for establishing relationships between tables.

Primary and Foreign Keys

A primary key is a unique, non-empty (non-NULL) identifier for every record in a table. A foreign key is an identifier used to reference the primary key of a record(s) in a “foreign” table. In the ERD above, the primary key is indicated by a gold key, and the foreign key is indicated by a blue arrow.

For example, the “person_id” field in the table “get_fit_now_member” is the foreign key to the variable of “id” in the table “person,” which is the primary key.

Understanding the relationship between primary keys and foreign keys is important for joining tables together, which we’ll discuss later.

Important Statements to Know

A SQL query contains statements, table names, wild cards, and more that communicate actions for your query to perform. There are many more statements used in SQL than listed below, but I will go over the most common statements that are used to solve this murder.

SELECT
The “SELECT” statement allows you to specify the column names from a table you want outputted. You can also use the asterisk symbol (*) to indicate you want all columns from the query. Naming specific columns allows you to return only the columns you want.

FROM
The “FROM” statement allows you to specify the primary table from which your data is coming.

WHERE
The “WHERE” statement acts as a filter to narrow down your results.

AND (Typically used in JOIN criteria and WHERE filters)
The “AND” statement connects multiple records together so that the result meets all of the criteria.

OR (Typically used in WHERE filters)
The “OR” statement is used to filter fields based on multiple conditions.

LIKE
The “LIKE” statement searches for a specified pattern in a column. The most common “wildcard” used with this operator is the (%) wildcard, which allows you to match based on partial strings.

Building Relationships Between Tables

Joining Tables – Join Statements

In this activity, you will need to use information that comes from two different tables. In order to query multiple tables at the same time, you need to use the “JOIN” statement. I will only go over one type of “Join” that is needed in this activity, but there are several different “JOIN” statements that are used to connect different tables together. This resource explains different SQL join types.

The “INNER JOIN” statement, which can be written as just “JOIN” in SQL, returns records that have matching values in both tables, based on the join criteria. To join two tables together, you need to refer back to the ERD and look at the primary/foreign keys for each table. To join tables, you want to connect on the primary and foreign keys after the JOIN statement using an ON statement to join the tables as one. Generally, the order of the tables written in the JOIN statements will not affect the query result as long as you specify what columns to select. However, if you select all the columns in the tables, the order of the columns depends on the order of the tables.

These examples show that when you don’t specify the columns in your SELECT statement, the order the tables are written will affect the order in which the columns appear.

Solving the Murder

1. Query the Crime Scene Report

To solve the mystery, you first need to query the crime scene report that matches the information you remember of the crime: a murder that took place on January 15, 2018 in SQL City. Remember, there are multiple ways to do this!

2. Identify the Witnesses.

You can now use the hints given above to identify the witnesses.

3. Query the Witness Transcripts

Now that you know who the witnesses are, you can query their transcripts to get clues about the murderer. The tables “person” and “interview” share the key “id,” which is how you will join the tables together. Because a lot of unneeded information will return, you want to specify which columns you want.

4. Now, Solve That Murder!

You have all the skills and knowledge you need to find the murderer! It will take awhile, especially if you are new to this. It took me around 3 hours, and I already had the SQL basics down. But solving this mystery will teach you important basics about SQL.

Connect with me on Linkedin if you need help finding the answer, or if you want to talk about your experience.