Guessing the Price of Execution

When a database query planner receives a complex SQL query involving multiple joins and aggregations, there are often hundreds of different valid physical paths (Execution Plans) it could take to generate the final result.

If the planner chooses the wrong path, a query that should take 5 seconds could take 5 hours. To make the correct choice, modern analytical databases rely on a Cost-Based Optimizer (CBO).

The CBO’s job is to look at all the possible execution plans and mathematically estimate the “cost” of each one. Cost is a complex algorithm that factors in estimated CPU cycles, disk I/O (how much data needs to be read from storage), and network shuffling (how much data needs to be moved between servers in a distributed cluster). The CBO evaluates the options and selects the execution plan with the lowest total estimated cost.

The Importance of Statistics

A CBO cannot run the query to see how long it takes; it must guess the cost before execution begins. To make accurate guesses, the CBO relies entirely on table statistics.

If a query says SELECT * FROM customers WHERE country = 'Canada' AND loyalty_status = 'Platinum', the CBO needs to know how many rows that filter will return.

If it returns 10 rows, the CBO might choose a “Nested Loop Join” to connect it to the sales table.
If it returns 10 million rows, a Nested Loop Join would crash the database, so the CBO chooses a “Hash Join” instead.

To know this, the database maintains metadata statistics for every table and column: total row counts, minimum/maximum values, null counts, and Data Distribution Histograms (showing that 80% of customers are in the US, and only 5% in Canada).

Cost-Based Optimizer Architecture

The Lakehouse CBO Challenge

In a traditional, monolithic data warehouse (like Oracle or SQL Server), the database controls the storage layer, so it constantly updates its internal statistics every time data is inserted. The CBO always has fresh data.

In a modern data lakehouse, the storage layer consists of open Parquet files sitting on S3. An external Spark job might write 100 new Parquet files, and the query engine (like Dremio or Trino) won’t inherently know the statistics of that new data. If the engine’s CBO has stale statistics, it will make terrible execution decisions.

Apache Iceberg solves this fundamental lakehouse problem. Iceberg maintains column-level statistics (min, max, null counts) directly in its manifest metadata files. When a query engine reads the Iceberg table, it immediately accesses these rich statistics without having to scan the raw Parquet files. This provides the CBO with the accurate, up-to-date mathematical fuel it needs to generate highly optimized execution plans against decoupled cloud object storage.

Learn More

To dive deeper into these architectures and master the modern data ecosystem, check out the comprehensive books by Alex Merced available in our Books section.