The Lifecycle of an INSERT Statement in MySQL

1. Background: Basic MySQL Architecture

Overall, MySQL is roughly divided into two parts: the Server layer and the Storage Engine layer.

Server Layer

This includes the connection manager, query cache, parser, optimizer, executor, etc. Features such as stored procedures, triggers, and views are also implemented at this layer.

Connection Manager: Responsible for handling the connection between the client and the server. It accepts client requests, performs authentication and permission checks, and manages the connection lifecycle.
Query Cache: Available in older versions of MySQL but no longer recommended in recent versions. It caches queries and their results to improve query performance. However, in high-concurrency and large-scale databases, it can become a performance bottleneck due to locking and overhead issues.
Parser: Responsible for analyzing SQL queries, verifying their syntax and semantics, and ensuring correctness. It converts SQL statements into internal data structures used by the optimizer and executor.
Optimizer: Receives queries from the parser and determines the most efficient way to execute them. The optimizer aims to find the optimal execution path, selecting appropriate indexes, join orders, and access methods to improve performance.
Executor: Executes the execution plan generated by the optimizer, retrieves data from the storage engine, and handles client requests. It interacts with the storage engine to run the query and return results to the user.
Storage Engine Layer: Responsible for data storage and retrieval. MySQL supports multiple storage engines such as InnoDB, MyISAM, and Memory. In daily development, InnoDB is typically used. Since MySQL version 5.5, InnoDB has been the default storage engine.

Now that we’ve introduced the basic MySQL architecture, let’s look at what each component does when processing a write SQL statement.

2. Connection Manager

To execute a write SQL statement, you typically start from the MySQL client by entering a command to connect to the MySQL server. On the server side, it’s the connection manager that establishes the connection with your client, handles authentication, and manages the connection lifecycle.

Connection command:

mysql -h(ip address) -P(port) -u(username) -p

After entering the connection command and the correct password, and completing the classic TCP handshake, the connection to the MySQL server is successfully established.

Then, you can directly input a write SQL command and see the result:

mysql> insert into user_score_tab(user_id,score) values(888,10);
Query OK, 1 row affected (0.02 sec)

3. Query Cache

In MySQL versions 5.6 and earlier, after a successful connection, the query cache was available to optimize SQL queries. If the table you're querying is updated or inserted into, the cache will be cleared. Therefore, executing a write SQL will clear the cache.

In fact, in newer versions of MySQL (such as 8.0), the query cache has been completely deprecated. This is because in high-concurrency and large-scale database environments, query caching can cause performance issues. Tests have shown that disabling the query cache can actually improve overall performance and scalability.

4. Parser

When you submit a write SQL command to the MySQL server, it needs to parse the command to understand what it is supposed to do.

The parser first performs lexical analysis. The SQL you submit consists of strings and spaces, and MySQL will analyze what each string means. For example, in this insert SQL:

insert into user_score_tab(user_id,score) values(888,10);

It parses the keyword insert into, identifies user_score_tab as the table, and user_id, score as column names. After lexical analysis, syntax analysis follows.

Syntax analysis checks whether your SQL statement complies with MySQL syntax rules. For example, if your SQL is incorrect, it will return an error like:

mysql> inser into user_score_tab(user_id,score) value(888,10);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'inser into user_score_tab(user_id,score) value(888,10)' at line 1

After completing both lexical and syntax analysis, the system understands that this is an insert SQL.

5. Optimizer

For simple INSERT SQL statements, the optimizer does not perform complex query plan generation.

The optimizer is responsible for index selection and maintenance, though it does not engage in complex query optimization in this case. If the table has a primary key or index, the optimizer ensures that the index is updated to maintain data consistency and checks constraint conditions during data insertion.

The INSERT statement also results in the generation of an execution plan, which describes in detail how the database will access data, which indexes to use, and the order of data operations.

6. Executor

The executor is responsible for carrying out the actual SQL operation and is the core execution module of the database system. For an INSERT statement, the executor handles the actual data insertion process.

Determine Insertion Point: Based on the execution plan from the optimizer, the executor identifies the precise location in the table where the data should be inserted—such as the position dictated by a primary or unique key.
Load Data Pages: If the target data page is already in memory (Buffer Pool), it uses it directly. Otherwise, it loads the corresponding data page from disk into memory.
Update Indexes: If the table has indexes (like primary keys, unique indexes, or others), the executor updates them accordingly.

7. Buffer Pool

The Buffer Pool is a memory area in MySQL’s InnoDB storage engine that caches data pages, index pages, and other content from database tables. Its primary purpose is to improve read/write performance and reduce disk I/O operations.

The executor inserts new data into the appropriate data page within the Buffer Pool. This operation occurs in memory and does not immediately modify files on disk.

8. Undo Log

Before the data is actually inserted, InnoDB generates an undo log. For INSERT operations, the undo log records how to delete the newly inserted record (this is used to roll back the insert if the transaction is later aborted).

Why is an undo log generated?

When a transaction is rolled back, MySQL needs to undo uncommitted operations. With the undo log, MySQL can delete records that were inserted but not committed, ensuring the atomicity of transactions.

9. Redo Log

After the executor inserts the data, the operation is immediately recorded in the redo log.

To ensure data reliability, MySQL uses a Write-Ahead Logging (WAL) mechanism. Before the data is physically written to disk, the operation is first logged in the redo log.

What’s the process?

MySQL first writes the operation to the redo log and marks it as prepare (pre-commit) status. This means that in case of a crash, MySQL can replay the redo log to recover the data.

10. Writing to Binlog

While writing to the redo log, MySQL also writes the operation to the binlog for replication and disaster recovery purposes.

The binlog is MySQL’s logical log that records the details of SQL operations such as INSERT INTO. It differs from the redo log, which is a physical log.

11. Transaction Commit (Two-Phase Commit)

In the two-phase commit process, MySQL updates the redo log status to commit when the transaction is committed.

Why is two-phase commit necessary?

To ensure consistency between the binlog and redo log. If the system crashes, MySQL can replay operations from the redo log and recover missing data using the binlog.

12. Flushing Data to Disk

The executor does not immediately flush dirty pages in memory to disk. Background threads asynchronously flush dirty pages from the Buffer Pool to the tablespace file on disk according to certain strategies (e.g., periodic flushing). This avoids frequent disk I/O operations and improves performance.

We are Leapcell, your top choice for hosting backend projects.

Leapcell is the Next-Gen Serverless Platform for Web Hosting, Async Tasks, and Redis:

Multi-Language Support

Develop with Node.js, Python, Go, or Rust.

Deploy unlimited projects for free

pay only for usage — no requests, no charges.

Unbeatable Cost Efficiency

Pay-as-you-go with no idle charges.
Example: $25 supports 6.94M requests at a 60ms average response time.

Streamlined Developer Experience

Intuitive UI for effortless setup.
Fully automated CI/CD pipelines and GitOps integration.
Real-time metrics and logging for actionable insights.

Effortless Scalability and High Performance

Auto-scaling to handle high concurrency with ease.
Zero operational overhead — just focus on building.

Explore more in the Documentation!