How Do SQL Database Engines Work

Agbaje Ayomide
4 min readMar 22, 2024

--

Introduction

Have you ever wondered how database applications like Microsoft Access or MySQL store and retrieve your data? The magic behind it is the SQL database engine. SQL, which stands for Structured Query Language, is a programming language used to manage and manipulate databases. And just like other programming languages, SQL requires an engine or interpreter to understand and execute the instructions.

In this post, I'll explain how these SQL database engines work in simple terms. I'll use some examples you might relate to from your life to illustrate the key concepts. By the end, you'll understand what happens when you click that "Submit" button on a database form.

How SQL Database Engines Work

At a high level, every SQL database engine consists of two main components:

  1. The Compiler
  2. The Virtual Machine

The compiler is like a translator that converts the SQL statements you submit and converts them into a low-level program that the database can execute. It's similar to how websites are written in high-level programming languages like JavaScript but must be translated into machine code for your computer to understand.

The virtual machine is what runs or interprets this low-level program created by the compiler. It's a piece of software that simulates a real computer's CPU, memory, etc. Using a virtual environment, the database engine can control precisely how the program executes and isolate it from the rest of the operating system.

Here's a simple example of how these two components work together:

Let's say you have a digital cookbook app that stores recipes in a database. When you click the button to view all the breakfast recipes, the app generates this SQL query:

SELECT name, instructions 
FROM recipes
WHERE category = 'breakfast';

The compiler takes this SQL statement and translates it to something like:

  1. Open the "recipes" data file
  2. For each entry in the file
  3. Check if the "category" field is "breakfast"
  4. If so, read the "name" and "instructions" fields
  5. Return those fields to the user interface

This low-level program is then passed to the virtual machine, which executes each instruction individually, retrieving and returning the requested recipe data.

Database Indexing

One essential part of optimizing database performance is through the use of indexes. Indexes are like the tables of contents at the beginning of books — they allow the database to quickly locate and retrieve specific data without reading every entry.

For example, your cookbook app tracks which vegetarian recipes are available. An index can be created on the "vegetarian" field of the recipes table.

CREATE INDEX veggie_idx ON recipes(vegetarian);

This index works similarly to a dictionary, keeping all vegetarian recipes grouped and sorted. So when you query for just the vegetarian breakfast recipes:

SELECT name, instructions
FROM recipes
WHERE category = 'breakfast'
AND vegetarian = 1;

The database first uses the index on the "vegetarian" field to only look at those entries. Then, it can filter that subset of data for the "breakfast" category rather than read through the entire table.

Indexes are like highway road signs — they help the database locate the data you want much faster. But you don't want too many indexes as that starts to waste memory, so finding the right balance is essential.

Ordering and Sorting Data

Another standard database operation is ordering or sorting the result set by specific fields. For example:

SELECT name, instructions
FROM recipes
ORDER BY prep_time ASC;

This query asks to return all recipes ordered or sorted by the "prep_time" field from shortest to longest preparation time.

The database engine uses a temporary sorting workspace in memory to accomplish this. It reads in chunks of data from the table, orders those chunks by the "prep_time" field, and streams out the sorted result set to the user. This is more efficient than the alternative of reading the entire table into memory and sorting the complete data.

Summary

In summary, here's what happens when you execute a SQL query:

The SQL statement is translated to a low-level program by the compiler

This program is fed into the database virtual machine

The virtual machine executes the program instructions

Requested data is retrieved from the disk and filtered/sorted in memory

The final result set is returned to you, the user

SQL database engines are lightweight, restricted programming languages explicitly tailored for data storage and querying. While the concepts can get complicated, even simple queries demonstrate robust data retrieval and filtering capabilities compared to manually coding such operations.

This high-level overview sheds some light on what happens behind the scenes when interacting with a database application! Let me know if you have any other questions.

--

--