Simulating Per-Row Iteration in SQL Using PostgreSQL's LATERAL JOIN
Olivia Novak
Dev Intern · Leapcell

Introduction
In the realm of relational databases, SQL is primarily designed for set-based operations, where an entire data set is processed at once. This paradigm often presents a challenge when we encounter scenarios that naturally lend themselves to row-by-row processing, similar to a "for-each" loop in conventional programming languages. While traditional SQL constructions can sometimes mimic this behavior, they often become convoluted or inefficient for specific use cases. This is where PostgreSQL's LATERAL JOIN emerges as a powerful and elegant solution. It allows a subquery to reference columns from a preceding FROM clause item, effectively enabling per-row computation and unlocking new possibilities for complex data manipulation. This article will explore how LATERAL JOIN bridges the gap between set-based and row-based processing in SQL, providing practical examples and insights into its utility.
Understanding LATERAL JOIN for Row-by-Row Logic
To fully appreciate the power of LATERAL JOIN, it's essential to first grasp a few core concepts.
Key Concepts
- Subquery Independence: Generally, a subquery executed for a
JOINcondition or in theSELECTlist cannot reference columns from the main query'sFROMclause, unless it's a correlated subquery which operates on a row-by-row basis but is often less efficient for complex operations. - Set-based vs. Row-based Processing: SQL fundamentally works on sets of data. A simple
JOINcombines sets. When we discuss "row-by-row" in SQL, we're referring to operations where the result of a subquery or function depends on the values of each individual row from another table. LATERALKeyword: TheLATERALkeyword, when applied to a subquery or aFROMclause item, enables that item to reference columns provided byFROMitems that appear before it in the sameFROMlist. This is the crucial enabler for "for-each" like behavior.
The Principle of LATERAL JOIN
Imagine you have a table of users, and for each user, you want to find their latest three orders. Without LATERAL JOIN, you might resort to complex window functions or multiple correlated subqueries. With LATERAL JOIN, you can think of it as iterating through each user row and, for each user, executing a subquery that retrieves their latest three orders.
The SQL engine, when encountering a LATERAL JOIN, processes the left-hand table (the one before LATERAL JOIN) first. For each row produced by the left-hand table, it then executes the LATERAL subquery (the right-hand side), passing the current row's values to the subquery. The results of this subquery are then joined with the current row from the left-hand table. This effectively simulates a "for-each" loop: for each row from the outer query, perform an operation using that row's data.
Implementation and Application Scenarios
Let's illustrate with practical examples.
Scenario 1: Finding Top N Related Records Per Group
Consider two tables: users and orders. We want to find the top 3 most recent orders for each user.
users table:
CREATE TABLE users ( user_id INT PRIMARY KEY, username VARCHAR(50) ); INSERT INTO users (user_id, username) VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie');
orders table:
CREATE TABLE orders ( order_id INT PRIMARY KEY, user_id INT, order_date DATE, amount DECIMAL(10, 2), FOREIGN KEY (user_id) REFERENCES users(user_id) ); INSERT INTO orders (order_id, user_id, order_date, amount) VALUES (101, 1, '2023-01-10', 50.00), (102, 1, '2023-01-15', 75.00), (103, 2, '2023-01-12', 20.00), (104, 1, '2023-01-20', 100.00), (105, 3, '2023-01-05', 30.00), (106, 2, '2023-01-18', 45.00), (107, 1, '2023-01-25', 120.00), (108, 3, '2023-01-10', 60.00), (109, 2, '2023-01-22', 90.00), (110, 1, '2023-01-28', 80.00);
Using LATERAL JOIN:
SELECT u.username, o.order_id, o.order_date, o.amount FROM users AS u, LATERAL ( SELECT * FROM orders AS ord WHERE ord.user_id = u.user_id ORDER BY ord.order_date DESC LIMIT 3 ) AS o;
In this example, for each user row (u), the LATERAL subquery (aliased as o) is executed. This subquery filters orders by the current u.user_id, orders them by order_date descending, and takes the LIMIT 3. The results are then joined back to the user row. This provides a clean way to get the top N related records per group, which is a common analytical task.
Scenario 2: Dynamic Column Generation or Complex Calculations
Imagine you have a table where each row contains parameters for a calculation, and you want to perform a different calculation for each row.
CREATE TABLE calculation_params ( param_id INT PRIMARY KEY, value1 INT, value2 INT ); INSERT INTO calculation_params (param_id, value1, value2) VALUES (1, 10, 5), (2, 20, 4), (3, 15, 3);
Now, let's say we want to calculate value1 * value2 and value1 + value2 for each row, but we want to put them into a structure returned by a function or a derived table.
SELECT cp.param_id, calcs.product_result, calcs.sum_result FROM calculation_params AS cp, LATERAL ( SELECT cp.value1 * cp.value2 AS product_result, cp.value1 + cp.value2 AS sum_result ) AS calcs;
Here, the LATERAL query effectively acts as a function applied per row, generating derived columns based on the original row's data. This can be particularly useful when combined with table-returning functions or VALUES clauses.
Scenario 3: Unnesting Arrays or JSONB Per Row
A common use case for LATERAL JOIN is with table-returning functions, especially unnest() or jsonb_array_elements(), when the array or JSONB data is stored in a column of your main table.
CREATE TABLE product_tags ( product_id INT PRIMARY KEY, name VARCHAR(100), tags TEXT[] -- an array of tags ); INSERT INTO product_tags (product_id, name, tags) VALUES (1, 'Laptop', ARRAY['electronics', 'computing', 'portable']), (2, 'Keyboard', ARRAY['electronics', 'peripherals']), (3, 'Mouse', ARRAY['electronics', 'peripherals', 'wireless']);
To list each product with its individual tags:
SELECT pt.product_id, pt.name, tag_element AS tag FROM product_tags AS pt, LATERAL unnest(pt.tags) AS tag_element;
Without LATERAL, unnest() would unnest all tags from all rows into a single set, making it difficult to associate them back with their respective products efficiently. LATERAL ensures that unnest() is called for each product_tags row, passing that row's tags array to it.
LEFT LATERAL JOIN for Optional Results
Just like LEFT JOIN, LEFT LATERAL JOIN allows the primary table's rows to be returned even if the LATERAL subquery produces no rows. This is crucial when the "for-each" operation might not yield results for every outer row, but you still want the outer row included.
-- Suppose user 3 has no orders in our 'orders' table -- If Charlie (user_id = 3) had only one order, and we tried to get 3, it would only return 1. -- If Charlie had 0 orders, a simple LATERAL JOIN would exclude Charlie. ALTER TABLE orders DISABLE TRIGGER ALL; -- Temporarily disable foreign key check for example DELETE FROM orders WHERE user_id = 3; ALTER TABLE orders ENABLE TRIGGER ALL; SELECT u.username, o.order_id, o.order_date, o.amount FROM users AS u LEFT JOIN LATERAL ( SELECT * FROM orders AS ord WHERE ord.user_id = u.user_id ORDER BY ord.order_date DESC LIMIT 3 ) AS o ON TRUE; -- ON TRUE is common as the join condition is handled within the LATERAL subquery's WHERE clause
This query would now include 'Charlie' even though he has no associated orders, with NULL values for the order columns.
Conclusion
LATERAL JOIN in PostgreSQL is a highly versatile and powerful feature that enables truly row-dependent operations in SQL, effectively bringing "for-each" loop semantics to set-based querying. Its ability to reference columns from preceding FROM items unlocks elegant solutions for problems like finding N correlated items per group, dynamically generating per-row derived data, and efficiently unnesting row-specific array or JSONB data. By understanding and utilizing LATERAL JOIN, developers can write more concise, readable, and often more performant queries for complex analytical and data transformation tasks, bridging the gap between typical procedural logic and the declarative nature of SQL. It empowers SQL to handle scenarios previously cumbersome or inefficient, making it an indispensable tool for advanced PostgreSQL users.

