Multi-Tenancy Is a Security Problem, Not a Scaling Problem

Eurtifact Platform Team

Context

Multi-tenant SaaS architectures serve multiple customers from a shared infrastructure. This reduces operational overhead and improves resource utilization.

In consumer applications, tenant isolation is a feature boundary. In regulated systems serving government agencies, healthcare providers, or financial institutions, it is a security boundary.

Without isolation enforced at the database level, one tenant’s queries can return another tenant’s records. Application-layer filtering is not sufficient.

Reality Check

Common Belief

“We filter queries by tenant_id in the application layer, so tenants only see their own data.”

Why That’s Incomplete

Application-layer filtering depends on every query including the correct WHERE tenant_id = X clause. A single missing filter exposes all tenant data.
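
To make the failure concrete, here is a minimal sketch using an in-memory SQLite table as a stand-in for a shared tenant table. SQLite has no row-level security, so it behaves like a database that trusts the application. The table contents and function names are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for a shared table with no database-level isolation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artifacts (id INTEGER PRIMARY KEY, tenant_id TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO artifacts (tenant_id, name) VALUES (?, ?)",
    [("tenant-a", "a-report"), ("tenant-a", "a-build"), ("tenant-b", "b-secrets")],
)


def list_artifacts_filtered(tenant_id):
    """Query with the tenant filter in place."""
    rows = conn.execute(
        "SELECT tenant_id, name FROM artifacts WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    )
    return rows.fetchall()


def list_artifacts_unfiltered():
    """The same query with the WHERE clause forgotten."""
    return conn.execute("SELECT tenant_id, name FROM artifacts ORDER BY id").fetchall()


filtered = list_artifacts_filtered("tenant-a")
leaked = list_artifacts_unfiltered()
print(filtered)  # only tenant-a rows
print(leaked)    # every tenant's rows, including tenant-b
```

Nothing in the database distinguishes the two calls. The only thing protecting tenant-b is that every developer remembers the filter every time.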

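The wider risk is not theoretical. On shared platforms, one failure reaches many customers at once:

  • 2019: Imperva disclosed that a stolen AWS API key gave an attacker access to a database snapshot containing Cloud WAF customer data
  • 2021: Codecov’s Bash Uploader script was modified in a supply chain attack to exfiltrate credentials from customers’ CI environments
  • 2023: CircleCI disclosed that an attacker using a stolen employee session token accessed secrets across customer accounts

None of these came down to a single missing WHERE clause, but each shows how far a failure spreads when nothing beneath the failed layer separates customers.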

Application code is not a security boundary. It is the implementation layer. Security boundaries must be enforced below the application.

Engineering Implications

Multi-tenant security requires isolation at the database level, not at the query level.

1. Row-Level Security (RLS) Enforcement

Requirement: The database must enforce tenant isolation regardless of what the application requests.

What this means:

  • PostgreSQL Row-Level Security (RLS) policies applied to all tenant-scoped tables
  • Policies check current_setting('app.tenant_id') set per database session
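  • Even SELECT * FROM artifacts without a WHERE clause returns only the current tenant’s rows, provided the session role is subject to the policy (superusers, BYPASSRLS roles, and by default the table owner are not)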

Example (PostgreSQL):

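-- Enable RLS on the table
ALTER TABLE artifacts ENABLE ROW LEVEL SECURITY;

-- Apply policies to the table owner as well (owners are exempt by default)
ALTER TABLE artifacts FORCE ROW LEVEL SECURITY;

-- Create policy: app_user can only read or write its own tenant's artifacts
CREATE POLICY tenant_isolation ON artifacts
  FOR ALL
  TO app_user
  USING (tenant_id = current_setting('app.tenant_id')::uuid);

-- Set tenant context for the session (repeat whenever a pooled connection changes tenant)
SET app.tenant_id = '550e8400-e29b-41d4-a716-446655440000';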

Common failure: Relying on ORM-level query filters (Django filter(tenant=X), Rails where(tenant_id: X)) without database enforcement.
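
Applying the same policy to every tenant-scoped table is easy to get wrong by hand. The sketch below generates the statements from a table list. The helper name rls_statements is hypothetical, api_keys is assumed to carry a tenant_id column like artifacts does, and FORCE ROW LEVEL SECURITY is included on the assumption that the table owner should be subject to the policy too.

```python
import re

# Conservative identifier check so table and role names cannot inject SQL.
_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


def rls_statements(table, role="app_user"):
    """Return the DDL that puts one tenant-scoped table under the isolation policy."""
    for name in (table, role):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    predicate = "tenant_id = current_setting('app.tenant_id')::uuid"
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY;",
        f"CREATE POLICY tenant_isolation ON {table} FOR ALL TO {role} USING ({predicate});",
    ]


# Assumed list of tenant-scoped tables; a real migration would maintain its own.
for table in ["artifacts", "api_keys"]:
    for statement in rls_statements(table):
        print(statement)
```

Generating the DDL from one list also gives a migration check something to compare against: any tenant-scoped table missing from the list is a table without a policy.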

2. Connection-Level Tenant Context

Requirement: Tenant identity must be set in the database session each time a connection is handed to a request, not passed along per query.

What this means:

  • Connection pools tagged with tenant ID
  • Session variables (SET app.tenant_id) configured immediately after the connection is established or checked out of the pool
  • No queries executed before tenant context is set

Common failure: Setting tenant_id in application memory but not in the database session, allowing direct SQL or admin queries to bypass isolation.
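
One practical detail when setting the session variable from application code: PostgreSQL’s set_config() function takes the value as an ordinary query parameter, which a plain SET statement generally does not, so it avoids string concatenation. The sketch below builds that call and validates the tenant ID first. The helper name is hypothetical, and the %s placeholder assumes a psycopg-style driver.

```python
import uuid

# set_config(name, value, is_local): is_local=false keeps the value for the whole session.
SET_TENANT_SQL = "SELECT set_config('app.tenant_id', %s, false)"


def tenant_context_statement(tenant_id):
    """Return (sql, params) that sets the session's tenant, rejecting non-UUID input."""
    # uuid.UUID raises ValueError for anything that is not a well-formed UUID.
    canonical = str(uuid.UUID(str(tenant_id)))
    return SET_TENANT_SQL, (canonical,)


sql, params = tenant_context_statement("550E8400-E29B-41D4-A716-446655440000")
print(sql, params)
```

With a real driver this would run as cursor.execute(sql, params) before any other statement on the connection. Passing true as the third argument scopes the value to the current transaction instead, which may suit poolers that share sessions between clients.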

3. Privilege Separation for Admin Operations

Requirement: Administrative operations (migrations, analytics, support queries) must use a separate database role that explicitly bypasses RLS.

What this means:

-- Application role: RLS enforced
CREATE ROLE app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON artifacts TO app_user;

-- Admin role: RLS bypassed for operational queries
CREATE ROLE admin_user BYPASSRLS;
GRANT ALL ON artifacts TO admin_user;

Application connections use app_user. Migrations and support tooling use admin_user with audit logging.

Common failure: Using a single database role for both application and admin operations, defeating the purpose of RLS.
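
Role separation can drift over time, so it helps to check it automatically. The sketch below inspects rows shaped like the output of SELECT rolname, rolsuper, rolbypassrls FROM pg_roles and flags any application role that can bypass RLS. The function name and the sample rows are illustrative, not taken from a real system.

```python
def find_role_violations(role_rows, app_roles):
    """Return application role names that are superusers or have BYPASSRLS."""
    violations = []
    for row in role_rows:
        if row["rolname"] in app_roles and (row["rolsuper"] or row["rolbypassrls"]):
            violations.append(row["rolname"])
    return sorted(violations)


# Sample rows in the shape returned by pg_roles (values are made up).
sample_roles = [
    {"rolname": "app_user", "rolsuper": False, "rolbypassrls": False},
    {"rolname": "admin_user", "rolsuper": False, "rolbypassrls": True},
    {"rolname": "reporting_app", "rolsuper": False, "rolbypassrls": True},
]

violations = find_role_violations(sample_roles, app_roles={"app_user", "reporting_app"})
print(violations)
```

Here admin_user is expected to bypass RLS and is not reported. The hypothetical reporting_app role is reported, because it is listed as an application role yet carries BYPASSRLS.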

4. Blast Radius Isolation

Requirement: Tenant data must be partitioned, logically or physically, to limit the scope of SQL injection or privilege escalation attacks.

What this means:

  • PostgreSQL table partitioning by tenant_id (if tenant count is manageable)
  • Separate schemas per tenant (e.g., tenant_a.artifacts, tenant_b.artifacts)
  • Physical database separation for high-security tenants (government, healthcare)

Trade-offs:

Approach                  Blast Radius       Operational Complexity   Cost
Shared table + RLS        All tenants        Low                      Low
Partitioned table + RLS   Single partition   Medium                   Low
Schema-per-tenant         Single schema      High                     Medium
Database-per-tenant       Single database    Very high                High

Common failure: Treating all tenants identically regardless of risk profile. A public sector tenant should not share a database with a startup pilot user.
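
Tiering works best when the placement rule is explicit rather than decided case by case. A minimal sketch, assuming the tenant’s sector is known at onboarding. Which sectors count as high risk, and which isolation level they map to, are policy decisions; the values below are placeholders.

```python
from enum import Enum


class Isolation(Enum):
    SHARED_TABLE_RLS = "shared table + RLS"
    SCHEMA_PER_TENANT = "schema-per-tenant"
    DATABASE_PER_TENANT = "database-per-tenant"


# Illustrative assumption: the high-risk list is a policy choice, not a fixed rule.
HIGH_RISK_SECTORS = {"government", "healthcare"}


def choose_isolation(sector):
    """Map a tenant's sector to an isolation strategy."""
    if sector.lower() in HIGH_RISK_SECTORS:
        return Isolation.DATABASE_PER_TENANT
    return Isolation.SHARED_TABLE_RLS


# Hypothetical tenants used only to show the rule in action.
placements = {
    name: choose_isolation(sector)
    for name, sector in [("city-council", "Government"), ("pilot-startup", "software")]
}
print(placements)
```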

Failure Modes

Pattern 1: Missing Tenant Filter in Admin Query

A support engineer runs a direct SQL query to investigate a customer issue:

SELECT * FROM api_keys WHERE user_id = 12345;

Without RLS, this returns API keys for user_id = 12345 across all tenants. The engineer accidentally exposes another customer’s secrets.

Why this fails: No database-level enforcement. The query succeeds because the admin role has unrestricted access and the session carries no tenant context, while user_id values are only unique within a tenant.

Pattern 2: ORM Query Bypassing Filter

A developer writes a raw SQL query for performance reasons:

# Correct: uses ORM filter
Artifact.objects.filter(tenant=request.tenant, status='active')

# Incorrect: raw SQL bypasses ORM filter
cursor.execute("SELECT * FROM artifacts WHERE status = 'active'")

The raw query returns artifacts from all tenants.

Why this fails: RLS was not enabled. The database trusts the application to filter correctly.

Pattern 3: Connection Pool Reuse Without Resetting Context

A connection pool reuses a connection from Tenant A for a request from Tenant B. The application sets tenant_id = B in memory but forgets to reset the database session variable.

# First request (Tenant A)
db.execute("SET app.tenant_id = 'tenant-a'")
db.execute("SELECT * FROM artifacts")  # Returns Tenant A's data

# Connection returned to pool

# Second request (Tenant B) - connection reused
# Forgot to reset session variable!
db.execute("SELECT * FROM artifacts")  # Still returns Tenant A's data

Why this fails: Session variables persist across connection pool reuse. Every connection checkout must reset tenant context.
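
One way to make the reset impossible to forget is to put it inside the pool’s checkout path. The sketch below uses a fake connection that keeps session settings in a dict, so it runs without a database. With a real driver, set_tenant would issue the set_config call and reset_tenant would issue RESET app.tenant_id. All class and method names are hypothetical.

```python
from contextlib import contextmanager


class FakeConnection:
    """Stand-in for a pooled connection; session settings live in a dict."""

    def __init__(self):
        self.session = {}

    def set_tenant(self, tenant_id):
        self.session["app.tenant_id"] = tenant_id

    def reset_tenant(self):
        self.session.pop("app.tenant_id", None)

    def current_tenant(self):
        return self.session.get("app.tenant_id")


class TenantPool:
    """Pool whose only checkout path sets and clears tenant context."""

    def __init__(self, size=1):
        self._idle = [FakeConnection() for _ in range(size)]

    @contextmanager
    def checkout(self, tenant_id):
        conn = self._idle.pop()
        conn.set_tenant(tenant_id)  # overwrite whatever the previous user left behind
        try:
            yield conn
        finally:
            conn.reset_tenant()  # clear before the connection goes back to the pool
            self._idle.append(conn)


pool = TenantPool(size=1)
with pool.checkout("tenant-a") as conn_a:
    seen_a = conn_a.current_tenant()
with pool.checkout("tenant-b") as conn_b:
    seen_b = conn_b.current_tenant()
    same_connection = conn_b is conn_a
leftover = pool._idle[0].current_tenant()
print(seen_a, seen_b, same_connection, leftover)
```

Because callers never receive a connection except through checkout, the second request sees its own tenant even though it reuses the first request’s connection, and an idle connection holds no tenant at all.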

What “Good” Looks Like

A secure multi-tenant system has these properties:

  1. Database-Enforced Isolation: RLS policies applied to all tenant-scoped tables. The database filters out rows that belong to other tenants and rejects writes that would place a row in another tenant, even if the application requests them.

  2. Session-Level Tenant Context: Tenant ID set via database session variables (SET app.tenant_id), not query parameters. Connection pools set tenant context on every checkout and clear it on release.

  3. Role-Based Privilege Separation: Application connections use a restricted role with RLS enforced. Administrative operations use a separate role with audit logging.

  4. Automated Testing of Isolation: Integration tests verify that queries from Tenant A cannot return Tenant B’s data, including edge cases (admin queries, raw SQL, ORM bypasses).

  5. Blast Radius Tiering: High-risk tenants (government, healthcare, critical infrastructure) use schema-per-tenant or database-per-tenant isolation. Shared tables reserved for low-risk tenants.

These are technical controls, not documentation artifacts.
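
The isolation test in item 4 can be sketched as a small harness. It takes any callable that runs a query as a given tenant and fails if a returned row belongs to someone else. The two fake backends below stand in for a real database connection so the sketch is runnable; the names and rows are illustrative.

```python
def assert_tenant_isolated(run_query, tenant_id, queries):
    """Run each query as tenant_id and fail if any row belongs to another tenant."""
    for sql in queries:
        for row in run_query(tenant_id, sql):
            if row["tenant_id"] != tenant_id:
                raise AssertionError(f"{sql!r} leaked a row from {row['tenant_id']}")


# Made-up rows shared by both fake backends.
ROWS = [
    {"tenant_id": "tenant-a", "name": "a-report"},
    {"tenant_id": "tenant-b", "name": "b-secrets"},
]


def enforcing_backend(tenant_id, sql):
    """Behaves like a database with RLS: rows are scoped regardless of the SQL."""
    return [row for row in ROWS if row["tenant_id"] == tenant_id]


def leaky_backend(tenant_id, sql):
    """Behaves like a database that trusts the application to filter."""
    return list(ROWS)


unfiltered_queries = ["SELECT * FROM artifacts"]
assert_tenant_isolated(enforcing_backend, "tenant-a", unfiltered_queries)

try:
    assert_tenant_isolated(leaky_backend, "tenant-a", unfiltered_queries)
    leak_detected = False
except AssertionError:
    leak_detected = True
print(leak_detected)
```

In a real suite, run_query would open a connection as the application role, set the tenant context, and execute deliberately unfiltered SQL, so the test exercises the database policy rather than the ORM.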

Limits & Trade-offs

This approach does not:

  • Eliminate application bugs: RLS prevents cross-tenant data leakage, but it does not prevent logic errors within a tenant’s scope (e.g., incorrect permission checks).
  • Solve performance at scale: RLS adds query overhead. For systems with millions of rows per tenant, partitioning or schema-per-tenant may be necessary.
  • Protect against compromised admin credentials: If an attacker obtains the admin_user role, RLS is bypassed. Credential security and audit logging remain critical.

Multi-tenancy isolation reduces attack surface. It does not eliminate the need for other security controls.

Key Takeaways

  • Application-layer filtering (WHERE tenant_id = X) is not a security boundary. It is a query pattern. Security must be enforced at the database level.
  • PostgreSQL Row-Level Security (RLS) makes tenant isolation a database guarantee, not an application responsibility.
  • Connection pools must reset tenant context on every connection checkout. Session variables persist across reuse.
  • Administrative operations (migrations, support queries) require a separate database role with audit logging. Shared roles defeat isolation.
  • High-risk tenants (government, healthcare) should use schema-per-tenant or database-per-tenant isolation to limit blast radius.

This article reflects the Eurtifact platform team’s experience with multi-tenant security in regulated environments. It is not legal or compliance advice. For questions specific to your system architecture, consult security and compliance specialists.