Multi-Tenancy Is a Security Problem, Not a Scaling Problem
Eurtifact Platform Team
Context
Multi-tenant SaaS architectures serve multiple customers from a shared infrastructure. This reduces operational overhead and improves resource utilization.
In consumer applications, tenant isolation is a feature boundary. In regulated systems serving government agencies, healthcare providers, or financial institutions, it is a security boundary.
Failure to enforce isolation at the database level has led to data breaches where one tenant’s queries returned another tenant’s records. Application-layer filtering is not sufficient.
Reality Check
Common Belief
“We filter queries by tenant_id in the application layer, so tenants only see their own data.”
Why That’s Incomplete
Application-layer filtering depends on every query including the correct WHERE tenant_id = X clause. A single missing filter exposes all tenant data.
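This is not theoretical. Real-world incidents show how a single weak control can expose many customers at once, even when the root cause is not a missing tenant filter:
- 2019: Imperva disclosed that a stolen AWS API key was used to access a database snapshot containing Cloud WAF customer data
- 2021: The Codecov supply chain attack modified the Bash Uploader script to exfiltrate credentials from customers’ CI environments
- 2023: The CircleCI security incident allowed attackers to access secrets across customer accounts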
Application code is not a security boundary. It is the implementation layer. Security boundaries must be enforced below the application.
Engineering Implications
Multi-tenant security requires isolation at the database level, not at the query level.
1. Row-Level Security (RLS) Enforcement
Requirement: The database must enforce tenant isolation regardless of what the application requests.
What this means:
- PostgreSQL Row-Level Security (RLS) policies applied to all tenant-scoped tables
- Policies check current_setting('app.tenant_id') set per database session
- Even SELECT * FROM users without a WHERE clause returns only the current tenant’s rows
Example (PostgreSQL):
-- Enable RLS on table
ALTER TABLE artifacts ENABLE ROW LEVEL SECURITY;
-- Create policy: users can only see their tenant's artifacts
CREATE POLICY tenant_isolation ON artifacts
FOR ALL
TO app_user
USING (tenant_id = current_setting('app.tenant_id')::uuid);
-- Set tenant context per connection
SET app.tenant_id = '550e8400-e29b-41d4-a716-446655440000';
Common failure: Relying on ORM-level query filters (Django filter(tenant=X), Rails where(tenant_id: X)) without database enforcement.
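Applying the policy to every tenant-scoped table is easy to get wrong by hand. The following is a minimal sketch, not a definitive migration tool, that generates the statements from the example above for a list of tables. The helper name rls_statements and the table list are hypothetical; the role app_user and the setting app.tenant_id follow the example. The sketch also emits FORCE ROW LEVEL SECURITY, because PostgreSQL otherwise exempts the table owner from policies.

```python
import re

# Conservative pattern for unquoted identifiers (assumption: table and role
# names in this sketch are plain lowercase identifiers).
_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def rls_statements(table: str, role: str = "app_user") -> list[str]:
    """Return the SQL statements that put `table` behind tenant isolation."""
    for name in (table, role):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        # Without FORCE, the table owner is not subject to the policy.
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY;",
        f"CREATE POLICY tenant_isolation ON {table} FOR ALL TO {role} "
        "USING (tenant_id = current_setting('app.tenant_id')::uuid);",
    ]


# Hypothetical list of tenant-scoped tables.
TENANT_TABLES = ["artifacts", "api_keys"]

if __name__ == "__main__":
    for table in TENANT_TABLES:
        for statement in rls_statements(table):
            print(statement)
```

Generating the statements from one list makes it harder to add a tenant-scoped table without its policy, and the list can double as input for isolation tests.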
2. Connection-Level Tenant Context
Requirement: Tenant identity must be set at the database session level each time a connection is acquired, not passed along with individual queries.
What this means:
- Connection pools tagged with tenant ID
- Session variables (SET app.tenant_id) configured immediately after connection establishment
- No queries executed before tenant context is set
Common failure: Setting tenant_id in application memory but not in the database session, allowing direct SQL or admin queries to bypass isolation.
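One way to enforce "no queries before tenant context is set" is a thin guard around the cursor. This is a minimal sketch under stated assumptions: the class name TenantScopedCursor is hypothetical, and the wrapped object is assumed to be a DB-API cursor with %s placeholders (psycopg style). It uses PostgreSQL's set_config function because a plain SET statement cannot take a bind parameter.

```python
import uuid


class TenantScopedCursor:
    """Wrap a DB-API cursor and refuse queries until tenant context is set.

    The wrapped cursor is assumed to use %s placeholders (psycopg style).
    """

    def __init__(self, cursor):
        self._cursor = cursor
        self._tenant_id = None

    def set_tenant(self, tenant_id: str) -> None:
        # uuid.UUID raises ValueError on malformed input, so a bad tenant ID
        # never reaches the database.
        normalized = str(uuid.UUID(tenant_id))
        # set_config(name, value, is_local): is_local=false sets the value
        # for the session rather than only the current transaction.
        self._cursor.execute(
            "SELECT set_config('app.tenant_id', %s, false)", (normalized,)
        )
        self._tenant_id = normalized

    def execute(self, sql: str, params=None):
        # Fail closed: no statement runs without a tenant context.
        if self._tenant_id is None:
            raise RuntimeError("tenant context must be set before any query")
        return self._cursor.execute(sql, params)
```

The guard only protects code paths that go through it. The RLS policy remains the actual boundary; the wrapper just turns a missing context into an immediate, visible error.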
3. Privilege Separation for Admin Operations
Requirement: Administrative operations (migrations, analytics, support queries) must use a separate database role that explicitly bypasses RLS.
What this means:
-- Application role: RLS enforced
CREATE ROLE app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON artifacts TO app_user;
-- Admin role: RLS bypassed for operational queries
CREATE ROLE admin_user BYPASSRLS;
GRANT ALL ON artifacts TO admin_user;
Application connections use app_user. Migrations and support tooling use admin_user with audit logging.
Common failure: Using a single database role for both application and admin operations, defeating the purpose of RLS.
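The audit-logging half of this requirement can be sketched in application code. The example below is illustrative only: run_admin_query, the logger name db.admin_audit, and the actor and reason fields are all hypothetical, and the cursor is assumed to be a DB-API cursor opened under the admin_user role.

```python
import json
import logging
from datetime import datetime, timezone

# Hypothetical logger name; route it to append-only storage in practice.
audit_log = logging.getLogger("db.admin_audit")


def run_admin_query(cursor, sql: str, params: tuple, *, actor: str, reason: str):
    """Execute `sql` on an admin_user connection, recording who ran it and why.

    `cursor` is assumed to be a DB-API cursor opened with the admin_user role.
    """
    if not actor or not reason:
        raise ValueError("admin queries require an actor and a reason")
    # Log before executing so failed or interrupted queries are recorded too.
    # Parameters are omitted from the entry because they may contain secrets.
    audit_log.info(json.dumps({
        "at": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "reason": reason,
        "sql": sql,
    }))
    cursor.execute(sql, params)
    return cursor.fetchall()
```

Application-side logging complements, but does not replace, database-side auditing: anyone holding the admin_user credentials can connect without going through this function.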
4. Blast Radius Isolation
Requirement: Tenant data must be logically partitioned to limit the scope of SQL injection or privilege escalation attacks.
What this means:
- PostgreSQL table partitioning by tenant_id (if tenant count is manageable)
- Separate schemas per tenant (e.g., tenant_a.artifacts, tenant_b.artifacts)
- Physical database separation for high-security tenants (government, healthcare)
Trade-offs:
| Approach | Blast Radius | Operational Complexity | Cost |
|---|---|---|---|
| Shared table + RLS | All tenants | Low | Low |
| Partitioned table + RLS | Single partition | Medium | Low |
| Schema-per-tenant | Single schema | High | Medium |
| Database-per-tenant | Single database | Very high | High |
Common failure: Treating all tenants identically regardless of risk profile. A public sector tenant should not share a database with a startup pilot user.
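The table above can be turned into an explicit provisioning rule. The sketch below is an assumption-laden illustration: the tier names (pilot, standard, regulated, high_security) and the mapping are hypothetical, and real assignments depend on contractual and regulatory requirements.

```python
from enum import Enum


class Isolation(Enum):
    SHARED_TABLE_RLS = "shared table + RLS"
    PARTITIONED_TABLE_RLS = "partitioned table + RLS"
    SCHEMA_PER_TENANT = "schema-per-tenant"
    DATABASE_PER_TENANT = "database-per-tenant"


# Hypothetical risk tiers mapped to the approaches from the trade-off table.
ISOLATION_BY_TIER = {
    "pilot": Isolation.SHARED_TABLE_RLS,
    "standard": Isolation.PARTITIONED_TABLE_RLS,
    "regulated": Isolation.SCHEMA_PER_TENANT,
    "high_security": Isolation.DATABASE_PER_TENANT,
}


def isolation_for(tier: str) -> Isolation:
    """Return the isolation model for a tenant risk tier.

    Unknown tiers fall back to the strictest model rather than the cheapest.
    """
    return ISOLATION_BY_TIER.get(tier, Isolation.DATABASE_PER_TENANT)
```

Defaulting unknown tiers to the strictest model means a misclassified tenant costs more to host rather than ending up in a shared table by accident.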
Failure Modes
Pattern 1: Missing Tenant Filter in Admin Query
A support engineer runs a direct SQL query to investigate a customer issue:
SELECT * FROM api_keys WHERE user_id = 12345;
Without RLS, this returns API keys for user_id = 12345 across all tenants. The engineer accidentally exposes another customer’s secrets.
Why this fails: No database-level enforcement. The query succeeds because the admin role has unrestricted access.
Pattern 2: ORM Query Bypassing Filter
A developer writes a raw SQL query for performance reasons:
# Correct: uses ORM filter
Artifact.objects.filter(tenant=request.tenant, status='active')
# Incorrect: raw SQL bypasses ORM filter
cursor.execute("SELECT * FROM artifacts WHERE status = 'active'")
The raw query returns artifacts from all tenants.
Why this fails: RLS was not enabled. The database trusts the application to filter correctly.
Pattern 3: Connection Pool Reuse Without Resetting Context
A connection pool reuses a connection from Tenant A for a request from Tenant B. The application sets tenant_id = B in memory but forgets to reset the database session variable.
# First request (Tenant A)
db.execute("SET app.tenant_id = 'tenant-a'")
db.execute("SELECT * FROM artifacts") # Returns Tenant A's data
# Connection returned to pool
# Second request (Tenant B) - connection reused
# Forgot to reset session variable!
db.execute("SELECT * FROM artifacts") # Still returns Tenant A's data
Why this fails: Session variables persist across connection pool reuse. Every connection checkout must reset tenant context.
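A checkout wrapper can make the reset automatic. This is a minimal sketch, assuming a hypothetical pool object with getconn() and putconn(conn) methods (the shape psycopg2's pools use), connections with commit() and rollback(), and cursors usable as context managers with %s placeholders. The function name tenant_connection is hypothetical, and tenant_id is assumed to be validated already.

```python
from contextlib import contextmanager


@contextmanager
def tenant_connection(pool, tenant_id: str):
    """Check out a connection with tenant context set; clear it on return."""
    conn = pool.getconn()
    try:
        with conn.cursor() as cur:
            # Overwrite whatever a previous request left in the session.
            cur.execute(
                "SELECT set_config('app.tenant_id', %s, false)", (tenant_id,)
            )
        # Commit so the session setting survives a later rollback.
        conn.commit()
        yield conn
    finally:
        try:
            # Discard any open or failed transaction, then clear the context
            # so the next checkout cannot inherit this tenant.
            conn.rollback()
            with conn.cursor() as cur:
                cur.execute("RESET app.tenant_id")
            conn.commit()
        finally:
            pool.putconn(conn)
```

Callers need to commit their own work before leaving the block, since the wrapper rolls back on release. Setting the context on checkout and clearing it on return are both deliberate: either one alone would fix the scenario above, but together they also cover connections that were returned to the pool by a code path that skipped the wrapper.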
What “Good” Looks Like
A secure multi-tenant system has these properties:
- Database-Enforced Isolation: RLS policies applied to all tenant-scoped tables. The database filters out other tenants’ rows and rejects writes that violate tenant boundaries, even if the application requests them.
- Session-Level Tenant Context: Tenant ID set via database session variables (SET app.tenant_id), not query parameters. Connection pools include tenant context in health checks.
- Role-Based Privilege Separation: Application connections use a restricted role with RLS enforced. Administrative operations use a separate role with audit logging.
- Automated Testing of Isolation: Integration tests verify that queries from Tenant A cannot return Tenant B’s data, including edge cases (admin queries, raw SQL, ORM bypasses).
- Blast Radius Tiering: High-risk tenants (government, healthcare, critical infrastructure) use schema-per-tenant or database-per-tenant isolation. Shared tables reserved for low-risk tenants.
These are technical controls, not documentation artifacts.
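The isolation tests listed above can be sketched as a small checker that an integration suite calls for each tenant-scoped table. Everything named here is hypothetical: find_isolation_leaks, the run_query callable (assumed to set the tenant context, run an unfiltered query such as SELECT * FROM artifacts, and return rows as dictionaries), and the tenant_id column name.

```python
def cross_tenant_rows(rows, tenant_id, key="tenant_id"):
    """Return the rows that do not belong to `tenant_id`.

    Rows without the key count as leaks, so the check fails closed.
    """
    return [row for row in rows if str(row.get(key)) != str(tenant_id)]


def find_isolation_leaks(run_query, tenant_ids):
    """Run an unfiltered query as each tenant and collect foreign rows.

    `run_query(tenant_id)` is a hypothetical callable supplied by the test
    suite. Returns a dict mapping tenant ID to leaked rows; an empty dict
    means no leak was observed.
    """
    leaks = {}
    for tenant_id in tenant_ids:
        leaked = cross_tenant_rows(run_query(tenant_id), tenant_id)
        if leaked:
            leaks[tenant_id] = leaked
    return leaks
```

A test would then assert that find_isolation_leaks(run_query, tenant_ids) returns an empty dict, with separate run_query variants for the ORM path, raw SQL, and support tooling.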
Limits & Trade-offs
This approach does not:
- Eliminate application bugs: RLS prevents cross-tenant data leakage, but it does not prevent logic errors within a tenant’s scope (e.g., incorrect permission checks).
- Solve performance at scale: RLS adds query overhead. For systems with millions of rows per tenant, partitioning or schema-per-tenant may be necessary.
- Protect against compromised admin credentials: If an attacker obtains the admin_user role, RLS is bypassed. Credential security and audit logging remain critical.
Multi-tenancy isolation reduces attack surface. It does not eliminate the need for other security controls.
Key Takeaways
- Application-layer filtering (WHERE tenant_id = X) is not a security boundary. It is a query pattern. Security must be enforced at the database level.
- PostgreSQL Row-Level Security (RLS) makes tenant isolation a database guarantee, not an application responsibility.
- Connection pools must reset tenant context on every connection checkout. Session variables persist across reuse.
- Administrative operations (migrations, support queries) require a separate database role with audit logging. Shared roles defeat isolation.
- High-risk tenants (government, healthcare) should use schema-per-tenant or database-per-tenant isolation to limit blast radius.
This article reflects the Eurtifact platform team’s experience with multi-tenant security in regulated environments. It is not legal or compliance advice. For questions specific to your system architecture, consult security and compliance specialists.