Multi-Tenant Architecture

Sharing the Infrastructure

When building a software application or a data platform that will serve multiple distinct customers or independent internal departments, engineers must choose how to deploy the infrastructure.

In a Single-Tenant Architecture, every customer gets their own dedicated, isolated environment. If a SaaS company has 100 clients, it spins up 100 separate database servers and 100 separate web servers. This provides strong security and fault isolation (if Customer A’s server crashes, Customer B is unaffected), but it is expensive to run and burdensome to maintain. When a new software update is released, the engineering team must roll it out to 100 separate environments.

In a Multi-Tenant Architecture, a single instance of the software application and a single database cluster serve all 100 customers simultaneously. The infrastructure is shared, dramatically reducing cloud compute costs and allowing the engineering team to update the software for everyone with a single deployment. This is the architecture that powers modern cloud giants like Salesforce, Snowflake, and AWS.

The Challenge of Data Isolation

The critical challenge in multi-tenancy is isolation. If Customer A logs into the application, they must never be able to see Customer B’s financial data, even by accident.

Because all the data lives in the exact same database cluster, isolation must be enforced logically through software, rather than physically through separate hard drives.

Row-Level Security (RLS) is a widely used mechanism for multi-tenant data isolation. Every row in a shared database table is tagged with a tenant_id. When Customer A’s application queries the database, the query engine applies a WHERE tenant_id = 'A' filter to the SQL statement before it executes, without the application having to write that filter itself. As long as the policy is configured correctly and cannot be bypassed, Customer A can only retrieve rows that belong to them, even though they are querying the same physical table as everyone else.
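
The filtering idea above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular database implements RLS: it uses the standard-library sqlite3 module (which has no native row-level security) and a hypothetical TenantSession wrapper that adds the tenant_id filter to every query. The invoices table, its columns, and the sample rows are all assumptions made for the example. In databases with built-in RLS, such as PostgreSQL, the equivalent rule is declared once as a policy on the table and enforced by the database itself.

```python
import sqlite3


def make_shared_db() -> sqlite3.Connection:
    """Create one in-memory table shared by all tenants (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices (tenant_id TEXT NOT NULL, customer TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO invoices VALUES (?, ?, ?)",
        [("A", "Acme", 120.0), ("A", "Apex", 75.5), ("B", "Bolt", 300.0)],
    )
    return conn


class TenantSession:
    """Hypothetical wrapper that scopes every query to a single tenant.

    The tenant_id is fixed when the session is created (for example, at login)
    and callers cannot override it per query.
    """

    def __init__(self, conn: sqlite3.Connection, tenant_id: str) -> None:
        self._conn = conn
        self._tenant_id = tenant_id

    def invoices(self, min_amount: float = 0.0) -> list:
        # The tenant filter is always part of the statement and is bound as a
        # parameter, so it cannot be removed or altered by user input.
        cursor = self._conn.execute(
            "SELECT customer, amount FROM invoices "
            "WHERE tenant_id = ? AND amount >= ? "
            "ORDER BY customer",
            (self._tenant_id, min_amount),
        )
        return cursor.fetchall()


if __name__ == "__main__":
    db = make_shared_db()
    print(TenantSession(db, "A").invoices())  # only tenant A's rows
    print(TenantSession(db, "B").invoices())  # only tenant B's rows
```

Both sessions read from the same physical table, but each one only ever sees rows carrying its own tenant_id. Enforcing the filter in one place, rather than trusting every query author to remember it, is the design choice that RLS formalizes.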

Multi-Tenancy in the Lakehouse

When building a multi-tenant data platform using an Iceberg lakehouse (for example, a centralized data team serving the HR, Finance, and Marketing departments), data engineers must manage compute isolation as well as data isolation.

If the Marketing team runs a massive, poorly written SQL query that consumes 100% of the CPU on the shared Dremio compute cluster, the Finance team’s executive dashboards will slow down or freeze. This is known as the “Noisy Neighbor” problem.

Modern lakehouses solve this by decoupling storage and compute. The data lives in a single, multi-tenant S3 bucket. However, the data platform spins up separate, isolated compute clusters (engines) for each tenant. Marketing gets its own cluster, and Finance gets its own cluster. Both query the same multi-tenant storage, but if Marketing maxes out its CPU, Finance’s queries keep running on their own dedicated compute and are not slowed down by it.
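
The effect of per-tenant engines can be illustrated with a small simulation. This is a toy model under stated assumptions, not the API of Dremio or any other product: Engine and Platform are hypothetical classes, and "query slots" stand in for CPU capacity. Each tenant is routed to its own engine, so saturating one engine leaves the others untouched.

```python
from dataclasses import dataclass


@dataclass
class Engine:
    """A hypothetical compute cluster with a fixed number of query slots."""

    name: str
    slots: int
    running: int = 0

    def submit(self) -> bool:
        """Start one query if a slot is free; return False if saturated."""
        if self.running >= self.slots:
            return False
        self.running += 1
        return True


class Platform:
    """Hypothetical platform that routes each tenant to a dedicated engine."""

    def __init__(self, slots_per_engine: int) -> None:
        self._slots = slots_per_engine
        self._engines = {}  # maps tenant name -> Engine

    def engine_for(self, tenant: str) -> Engine:
        # Create the tenant's engine on first use; later queries reuse it.
        if tenant not in self._engines:
            self._engines[tenant] = Engine(f"{tenant}-engine", self._slots)
        return self._engines[tenant]

    def run_query(self, tenant: str) -> bool:
        # Capacity is checked against the tenant's own engine only.
        return self.engine_for(tenant).submit()


if __name__ == "__main__":
    platform = Platform(slots_per_engine=2)
    # Marketing floods its engine with more queries than it has slots for.
    print([platform.run_query("marketing") for _ in range(5)])
    # Finance still gets a slot, because its engine is separate.
    print(platform.run_query("finance"))
```

With a single shared engine, Marketing's first two queries would have taken every slot and Finance's query would have been rejected. Keying capacity by tenant is what removes the noisy neighbor, at the cost of running more engines.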

Learn More

To dive deeper into these architectures and master the modern data ecosystem, check out the comprehensive books by Alex Merced available in our Books section.
