Efficient Data Cleanup in Partitioned PostgreSQL Tables using Common Table Expressions

By Dwarka Rao, Database Engineer & Aakash Agarwal, Java Software Engineer.

Adyen
Adyen Tech

--

Introduction

Effective data management is critical in PostgreSQL, especially when dealing with sizable datasets; the same is true for any RDBMS engine. Partitioned tables offer a structured approach to organising data. Still, the challenge arises when it comes to efficiently cleaning up data within these partitions.

In this blog post, we’ll explore a method that utilises Common Table Expressions (CTEs) to streamline scanning and deleting relevant data in partitioned PostgreSQL tables.

Using Common Table Expressions (CTEs) as a crucial cleanup tool

Efficiently cleaning up data within partitioned tables in PostgreSQL introduces several intricate challenges that demand careful consideration. Identifying specific data for deletion requires a sophisticated approach, and Common Table Expressions (CTEs) emerge as a crucial tool in this context: they provide a structured mechanism to traverse the partition hierarchy and pinpoint relevant data. As the cleanup operation unfolds, the potential impact on performance looms large, given the distributed nature of data within partitioned tables.
Traditional DELETE operations can be resource-intensive, but one can mitigate the impact on overall performance by incorporating optimization techniques such as batch processing and leveraging indexes in WHERE conditions.
Also, maintaining consistency across partitions requires a meticulous balance that emphasises the importance of atomic transactions and well-designed CTEs to uphold database integrity.

Our Story

In a recent endeavour, we encountered a challenge while aiming to remove a substantial volume of data from numerous partitioned tables.
This task posed several complexities: each partitioned table approached a terabyte in size, so an immediate in-place deletion strategy within the partitions would have been careless. Potential restoration complications, especially in the event of validation errors, were also daunting. Additionally, devising precise logic to pinpoint the data eligible for deletion introduced an extra layer of complexity to the task.
Balancing the need for a thorough cleanup with the risk of irreversible actions became critical in tackling this challenge.

In the following sections, we’ll share details of the particular use case of this challenge, as well as our approach and implementation.

Use Case

We structured our table with weekly partitions and housed entity event overviews in an append-only format. These overviews functioned as consolidated snapshots of past events for a specific entity, with each version distinguished by the stateid column. Here is the parent table’s structure for reference:

CREATE TABLE purple.entityevent (
    entityeventid BIGINT PRIMARY KEY,
    entityid BIGINT,
    stateid INTEGER,
    eventoverview VARCHAR
);

However, Postgres’ limitations made things more complex when we attempted to update the same event overview: updates create dead row versions, triggering the vacuum and analyze processes, which presented complications for our sizable, high-throughput table.
As a result, we decided against modifying existing overviews. Instead, we read from partitions to retrieve the latest event overview using the entityid and stateid indexes. During write operations, we fetched the last overview, calculated the final version, and stored it in the latest weekly partition via inheritance. Below is what the child tables look like:

CREATE TABLE purple.entityevent_20210701_20210731 (
    entityeventid BIGINT PRIMARY KEY,
    entityid BIGINT,
    stateid INTEGER,
    eventoverview VARCHAR
);

ALTER TABLE purple.entityevent_20210701_20210731 INHERIT purple.entityevent;

This decision was strategic; it ensured that all prior iterations would be obsolete and never used again once we generated a new version.
When we recognised that the sheer volume of data in those partitions had surpassed practical limits, we had to switch our focus to efficiency. This focus prompted us to streamline the database by eliminating redundant and outdated records.
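The rule that makes records redundant here is simple: for each entity, only the row with the highest stateid matters. As a runnable illustration of identifying such rows with a CTE, the snippet below uses SQLite through Python's sqlite3 module rather than PostgreSQL; the table shape follows the article, but the data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE entityevent (
        entityeventid INTEGER PRIMARY KEY,
        entityid INTEGER,
        stateid INTEGER,
        eventoverview TEXT
    )
""")
# Three versions for entity 1, two for entity 2; only the highest stateid
# per entity is still relevant.
conn.executemany(
    "INSERT INTO entityevent VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 'v1'), (2, 1, 2, 'v2'), (3, 1, 3, 'v3'),
     (4, 2, 1, 'v1'), (5, 2, 2, 'v2')],
)

# CTE: compute the max stateid per entity, then keep only matching rows.
rows = conn.execute("""
    WITH latest AS (
        SELECT entityid, MAX(stateid) AS maxstateid
        FROM entityevent
        GROUP BY entityid
    )
    SELECT ee.entityeventid, ee.entityid, ee.stateid
    FROM entityevent ee
    JOIN latest l ON ee.entityid = l.entityid AND ee.stateid = l.maxstateid
""").fetchall()
print(rows)  # rows to KEEP; every other row is deletable
```

Everything not returned by the CTE-driven query is a candidate for removal, which is exactly the shape of the production query shown later.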

Approach

The new challenge was cleaning up data in nearly a hundred partitions. So, we devised an automated approach whose logic we could trust to identify deletable data, and which would be reversible if validation raised red flags. Splitting the entire task into phases, it covers the concepts below, in no particular order:

  • Identifying deletable data
  • Performance considerations
  • Maintaining database consistency
  • Testing and Validation

Let’s expound on each of them:

Identifying deletable data

All the tables we wanted to clean up had dependencies on other tables that helped us identify deletable data, so using CTEs made absolute sense. Common Table Expressions (CTEs) provide a powerful tool for creating temporary result sets within a query.
By leveraging CTEs, we can construct a query that traverses the partitions and identifies the data requiring deletion. CTEs also play a pivotal role in streamlining the identification and filtering of deletable data, especially when dealing with complex relationships across multiple partitioned tables: they offer a concise, organised way to articulate the deletion logic.

Consider the example below, which involves two CTEs in an e-commerce platform with tables for customers, orders, and products. The goal is to produce a report showing each customer’s top-selling products based on the total amount spent:

-- Sample tables for customers, orders, and products
CREATE TABLE customers (
    customer_id SERIAL PRIMARY KEY,
    customer_name VARCHAR(50)
);
CREATE TABLE products (
    product_id SERIAL PRIMARY KEY,
    product_name VARCHAR(50),
    unit_price DECIMAL
);
CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    customer_id INT REFERENCES customers(customer_id),
    product_id INT REFERENCES products(product_id),
    quantity INT,
    total_amount DECIMAL
);

-- Sample data
INSERT INTO customers (customer_name) VALUES
    ('Customer A'),
    ('Customer B');
INSERT INTO products (product_name, unit_price) VALUES
    ('Product X', 10),
    ('Product Y', 15),
    ('Product Z', 20);
INSERT INTO orders (customer_id, product_id, quantity, total_amount) VALUES
    (1, 1, 3, 30),
    (1, 2, 2, 30),
    (2, 2, 5, 75),
    (2, 3, 1, 20);

-- Query with two CTEs
WITH CustomerTotalSpent AS (
    SELECT
        o.customer_id,
        c.customer_name,
        SUM(o.total_amount) AS total_spent
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
    GROUP BY o.customer_id, c.customer_name
),
TopProductsPerCustomer AS (
    SELECT
        o.customer_id,
        c.customer_name,
        p.product_name,
        o.quantity,
        o.total_amount,
        RANK() OVER (PARTITION BY o.customer_id ORDER BY o.total_amount DESC) AS rank
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
    JOIN products p ON o.product_id = p.product_id
)
-- Main query combining results from the CTEs
SELECT
    cts.customer_id,
    cts.customer_name,
    tpc.product_name,
    tpc.quantity,
    tpc.total_amount
FROM CustomerTotalSpent cts
JOIN TopProductsPerCustomer tpc ON cts.customer_id = tpc.customer_id
WHERE tpc.rank = 1; -- retrieve only the top-selling product for each customer

Performance Considerations

One must carefully consider performance when delving into the intricate task of cleaning up data within partitioned tables. The sheer size of partitioned tables, often approaching terabytes, poses a challenge for the performance of deletion operations.

Optimization techniques are crucial in addressing this; they ensure the cleanup process is effective and resource-efficient.

One key optimization strategy is the implementation of batch processing. Rather than executing a single, resource-intensive DELETE operation on the entire dataset, breaking the operation into smaller, manageable batches can significantly mitigate the impact on system resources. This process prevents potential bottlenecks and allows for better monitoring and control over the deletion process.

Furthermore, leveraging the appropriate columns in WHERE clauses optimises performance. Making the most of Primary Keys and Indexes ensures that the database engine can quickly locate and process the relevant records for deletion.
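The batching strategy can be sketched as a loop that removes bounded slices keyed on an indexed column, committing between batches so no single statement holds resources for long. The snippet below is a minimal illustration using SQLite through Python's sqlite3 module (the article's setting is PostgreSQL; table and column names are invented for the sketch):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [(i, "x") for i in range(1, 1001)])
conn.commit()

BATCH_SIZE = 100
deleted_total = 0
while True:
    # Delete one bounded batch; selecting by the indexed primary key keeps
    # each statement cheap, and committing between batches lets other work
    # interleave instead of waiting behind one giant DELETE.
    cur = conn.execute("""
        DELETE FROM events
        WHERE id IN (SELECT id FROM events ORDER BY id LIMIT ?)
    """, (BATCH_SIZE,))
    conn.commit()
    deleted_total += cur.rowcount
    if cur.rowcount < BATCH_SIZE:
        break
print(deleted_total)  # 1000
```

The same shape appears later in the stored procedure, where a batch pointer advances over entityid ranges instead of deleting in place.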

Maintaining Database Consistency

Ensuring database consistency during the cleanup process is critical to avoiding potential pitfalls and maintaining data integrity.

Using transactions plays a pivotal role in achieving this goal. By encapsulating the data cleanup operations within a transaction block, administrators can guarantee that the changes are treated as a single atomic unit of work. This means that all cleanup operations are successfully executed, or none are. In the event of an error or interruption, the database can be rolled back to its original state, preventing partial or inconsistent modifications and safeguarding the overall consistency of the dataset.
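The all-or-nothing behaviour can be demonstrated in a few lines. Again using SQLite via Python's sqlite3 module as a stand-in for PostgreSQL: a failure partway through the cleanup rolls back every statement in the transaction, leaving the data untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE overview (id INTEGER PRIMARY KEY, val TEXT)")
conn.execute("INSERT INTO overview VALUES (1, 'keep')")
conn.commit()

try:
    # The connection used as a context manager commits on success and
    # rolls back on any exception, so both statements are atomic.
    with conn:
        conn.execute("DELETE FROM overview WHERE id = 1")
        raise RuntimeError("validation failed mid-cleanup")
except RuntimeError:
    pass

# The DELETE was rolled back together with the failed step.
count = conn.execute("SELECT COUNT(*) FROM overview").fetchone()[0]
print(count)  # 1
```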

Testing and Validation

Before applying cleanup operations to a production database, it’s essential to conduct extensive testing in a controlled environment. This includes testing different scenarios, edge cases, and the impact on database performance to identify and address potential issues.

Implementation

So, we started with the above-mentioned approach and spent some time finding the most efficient way to identify deletable data using CTEs.

'WITH entityIds as (
select distinct entityid from purple.entityevent_%s where
entityid >= %s::BIGINT order by entityid limit %s
),
entityEventData (entityid, maxstateid) as (
select entityid, max(stateid) from purple.entityevent_%s
join entityIds using (entityid) group by entityid
),
entityEventIdToKeep as (
select entityeventid from purple.entityevent_%s join entityEventData
using (entityid) where stateid = maxstateid
)
insert into purple.entityevent_%s_v2
select ee.* from purple.entityevent_%s ee
where ee.entityeventid in (
select entityeventid from entityEventIdToKeep
)'

We tested the CTEs thoroughly and moved on to writing a Stored Procedure to use across multiple partitions.
The idea was for the Stored Procedure to perform the following steps:

  • open a transaction
  • create an empty partition (new)
  • move all the relevant data to this new partition in customizable batch size
  • adjust the pointer to move to the next batch
  • iterate till it moves all relevant data.

Once the Stored Procedure completes these steps, it:

  • starts a loop to get a lock with the lock_timeout parameter set at the session level (this lock was sufficient in our use case because our application does not write to historical partitions)
  • removes the old partition from inheritance, attaches the new partition
  • renames the old partition with a suffix
  • renames the new partition as the old one
  • analyses the newly created partition and ends the transaction.

During all these steps, it prints the required metrics wherever necessary. Here is the complete Stored Procedure:

CREATE OR REPLACE PROCEDURE cleanup_entityevent(date_string text)
LANGUAGE plpgsql SECURITY DEFINER
AS
$$
DECLARE
rows_inserted BIGINT := 0;
total_rows_inserted BIGINT := 0;
start_time TIMESTAMP;
end_time TIMESTAMP;
elapsed_time INTERVAL;
start_point BIGINT := 0;
end_point BIGINT := 0;
batch_size BIGINT := 100001;
batch_num BIGINT := 0;
attach_lock_timeout CONSTANT INT := 1000;
attach_retries CONSTANT INT := 3;
attach_retry_sleep CONSTANT INT := 5; -- seconds
BEGIN
EXECUTE format('LOCK TABLE purple.entityevent_%s IN ACCESS SHARE MODE', date_string);
EXECUTE format('select max(entityid) from purple.entityevent_%s', date_string) into end_point;
EXECUTE format('CREATE TABLE purple.entityevent_%s_v2 (LIKE purple.entityevent_%s INCLUDING ALL)', date_string, date_string);
LOOP
start_time := clock_timestamp();
EXECUTE format(
'WITH entityIds as (
select distinct entityid from purple.entityevent_%s where
entityid >= %s::BIGINT order by entityid limit %s
),
entityEventData (entityid, maxstateid) as (
select entityid, max(stateid) from purple.entityevent_%s
join entityIds using (entityid) group by entityid
),
entityEventIdToKeep as (
select entityeventid from purple.entityevent_%s
join entityEventData
using (entityid) where stateid = maxstateid
)
insert into purple.entityevent_%s_v2
select ee.* from purple.entityevent_%s ee
where ee.entityeventid in (
select entityeventid from entityEventIdToKeep
)',
date_string, start_point, batch_size,
date_string, date_string, date_string, date_string);
end_time := clock_timestamp();
elapsed_time := end_time - start_time;
GET DIAGNOSTICS rows_inserted = ROW_COUNT;
RAISE NOTICE 'Batch % processed: % rows inserted, current point: %, time taken: %',
batch_num, rows_inserted, start_point, elapsed_time;
batch_num := batch_num + 1;
total_rows_inserted := rows_inserted + total_rows_inserted;
EXECUTE format(
'WITH entityIds as (
select distinct entityid from purple.entityevent_%s where
entityid >= %s::BIGINT order by entityid limit %s
)
select max(entityid)+1 from entityIds',
date_string,start_point,batch_size) INTO start_point;
EXIT WHEN start_point > end_point;
PERFORM pg_sleep(1);
END LOOP;
FOR loop_cnt in 1..attach_retries
LOOP
BEGIN
-- Add the new table as a partition of the parent table
EXECUTE FORMAT('SET local lock_timeout TO %L', attach_lock_timeout);
EXECUTE format('ALTER TABLE purple.entityevent_%s no inherit purple.entityevent', date_string);
EXECUTE format('ALTER TABLE purple.entityevent_%s_v2 inherit purple.entityevent', date_string);
EXECUTE format('ALTER TABLE purple.entityevent_%s rename to entityevent_%s_old', date_string,date_string);
EXECUTE format('ALTER TABLE purple.entityevent_%s_v2 rename to entityevent_%s', date_string,date_string);
-- Attaching succeeded. Exit the loop.
EXIT;
EXCEPTION
WHEN lock_not_available THEN
RAISE NOTICE 'Lock not available ';
IF loop_cnt = attach_retries THEN
RAISE NOTICE 'Attaching table failed';
END IF;
perform pg_sleep(attach_retry_sleep);
END;
END LOOP;
RAISE NOTICE 'Current Date and Time: %', NOW();
EXECUTE format('ANALYZE (VERBOSE) purple.entityevent_%s', date_string);
RAISE NOTICE 'Current Date and Time: %', NOW();
RAISE NOTICE 'All inserts completed: % rows affected in total', total_rows_inserted;
END;
$$;

With the Stored Procedure ready, we could call it for all the required partitions, increasing or decreasing the batch size based on our database’s available resources.

We execute the script below to call the procedure:

call cleanup_entityevent('20210701_20210731');

With this approach, the rollback plan is ready in case validation fails: to revert, we only need to remove the new partition from inheritance and add the old one back. If the validation is all green, we drop the old partition.
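That rollback amounts to swapping the inheritance links back. As a sketch, the small helper below generates the two ALTER statements for a given partition suffix; the names mirror the procedure above, and the helper itself is illustrative rather than part of the original tooling.

```python
def rollback_statements(date_string: str) -> list[str]:
    """Build the ALTER statements that undo the partition swap:
    detach the new partition and re-attach the renamed old one."""
    new = f"purple.entityevent_{date_string}"        # holds the cleaned data
    old = f"purple.entityevent_{date_string}_old"    # the original partition
    return [
        f"ALTER TABLE {new} NO INHERIT purple.entityevent",
        f"ALTER TABLE {old} INHERIT purple.entityevent",
    ]

for stmt in rollback_statements("20210701_20210731"):
    print(stmt)
```

Keeping the old partition around (merely renamed) until validation passes is what makes the whole cleanup reversible.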

In conclusion, navigating the complexities of cleaning up data within partitioned PostgreSQL tables demands a strategic blend of careful planning and the right tools. Leveraging Common Table Expressions (CTEs) proves instrumental in seamlessly traversing partition hierarchies and precisely identifying deletable data.

--
