Library

Aug 18, 2024Approaches to Multitenancy


Introduction to the different ways you can create logical separation between tenants.

Christian Yalamov


Software-as-a-Service (SaaS) has emerged as a popular distribution model for software.

SaaS software is deployed and hosted by the vendor for use by its customers— we tend to refer to these as 'tenants.' A tenant can represent something like a client or workspace. Each tenant has its own data

Storing everyone's data together, but making it seem as if your application runs independently for every tenant can be challenging. This is achieved bylogically separating tenants' data, and making sure they can only see and manipulate the data that relates to them.

Separating your application's data and behaviour for different tenants is typically called multi-tenancy. The best multi-tenancy architecture will depend on a lot of factors to do with the nature and design of your application.

Brief note

If you are implementing a multi-tenant architecture, you will usually have a lot more things to worry about than just your database. It is important to consider your data model as a whole, and the infrastructure on which your application will run. Factors like the 'noisy neighbour' problem— a scenario where one tenant consumes a comparatively large proportion of resources— may influence your technical decisions.

It also helps to assess which parts of your data model will need to be isolated between tenants. Keep in mind that you will always have some data that will be shared, and some that will be siloed.

And finally, this is not a tutorial. There are some code examples in this post, but they are purely for demonstration purposes. Designing a multi-tenant architecture is something pretty hard, and this post is just a summary of some of the patterns or approaches you are likely to come across.

Approach I: Application-level filtering

A common way of going about multi-tenancy is to just filter all records, based on a field which indicates the tenant they belong to (simply put, use a WHERE clause to filter tenantID=$currentTenant).

The great thing about this approach is that it’s easy to implement. However you run queries, and whatever your database, the level of added complexity won’t be too high.

Example 1: Adding the check manually, using a parametrised query.

SELECT * FROM orders AS o WHERE o.tenant_id = $tenantID

Example 2: Using DrizzleORM (ORM for Javascript), manually add a filter

const results = db.select().from(orders).where(eq(orders.tenant, currentTenant))

Example 3: Using DrizzleORM (ORM for Javascript), create a higher-order function which enhances queries by adding a tenant filter

function withTenantFilter(query, tenant, table) {
	// the tenant can either be passed to this function,
	// or dynamically extracted, e.g. based on data
	// in an HTTP request
	return query.$dynamic().where(eq(table.tenant, tenant))
}

const resultsWithTenantFilter = await withTenantFilter(
	db.select().from(orders), 
	currentTenant, 
	orders
)
Interesting alternative approach proposed by @rsslldnphy: create a function which returns a table with an applied filter:

https://github.com/drizzle-team/drizzle-orm/discussions/1539#discussioncomment-7639604

// Function which produces an already filtered table
export const person = (tenant: { id: string }) =>
  db
    .select()
    .from(s.person)
    .where(eq(s.person.tenantId, tenant.id))
    .as("person");

// ...
  
  const people = await db
    .select()
    .from(person(tenant))
    .where(like(person(tenant).name, "Croenberg"))
  );

Example 4: Using Entity Framework Global Filters (.NET)

public class OrdersDbContext : DbContext
{
    private readonly string _tenantId;

		// The TenantProvider is injected as a scoped service
		// which detects the tenant (e.g. based off the request headers)
		// and supplies it to the DbContext
    public OrdersDbContext(DbContextOptions<OrdersDbContext> options, TenantProvider tenantProvider)
        : base(options)
    {
        _tenantId = tenantProvider.TenantId;
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
		    // Applying a filter on every query on the Orders table
        modelBuilder.Entity<Order>().HasQueryFilter(o => o.TenantId == _tenantId);
    }
}

Approach II: Row-Level Security

Some database products offer a feature called Row-Level Security (RLS). This allows you to set policies which can exclude particular rows, depending on whether they meet certain criteria. Postgres and Microsoft SQL Server are the two main products which offer this functionality.

This is similar to how we implemented application-level filtering using the Entity Framework Global Filters in .NET— except, this is now happening on the database side.

We can configure such a policy in Postgres like so:

CREATE POLICY orders_policy
  ON orders
  USING (tenant_id::TEXT = current_setting("context.tenant"));

Here, we’re only allowing rows to be visible if theirtenant_id matches the context.tenant configuration parameter.

You may wonder how we set this parameter value in the first place. In Postgres, this is done either with set_config(key, value, is_local) or the SQL syntax SET [LOCAL] key to value. A setting being local means that it is only scoped to the current transaction— after the transaction ends, the setting no longer exists. We can do this at the start of the transaction:

SET LOCAL "context.tenant" TO 'ACME Corp'

and then have the value available until the end.

NB Postgres has specific requirements for configuration parameters. Refer to Chapter 20.1 of the official documentation.

You might also wonder why we’re now doing the check on the database, instead of the application— what difference does it make?

One reason is certainty. With the first approach (filtering at the application level), it is possible to forget to add a WHERE clause somewhere, leaving data unfiltered.

Another reason is to do with your infrastructure. Application-level filtering is done on the server, whose job it is to query your database and send a response to the client. But what if you didn’t have a server? With platforms like Supabase, you can get away with having most of your application logic purely on your database— and with Postgres, that’s actually quite practical! Extensions like postgREST, pgJWT (and more) make it easy to add enough features to your DB that you no longer need a backend application server.

However, when you don’t have a server where you can enforce tenants’ data separation, and you let your users directly access your database, Row-Level Security can still help you achieve isolated multi-tenancy.

Approach III: Infrastructure replication

In most cases, multi-tenant products place all tenants together on the same, shared infrastructure. But there are other cases, where it might make more sense to create separate deployments on dedicated infrastructure for individual clients.

To do this, you can replicate components of your infrastructure for each new tenant.

Docker is a containerisation tool which allows you to package your applications into ‘images.’ Because the Docker runtime virtualises the operating system (not the hardware, like with a Virtual Machine), you can deploy these images on almost any host infrastructure.
IaC stands for Infrastructure as Code, and gives you a way to deploy infrastructure declaratively. Inside a code file, you declare the different infrastructure components required for your solution and your IaC tool (such as Pulumi or Terraform) handles their deployment and setup.

Using technologies like Docker and IaC, you can easily deploy components for new tenants, automatically. Microsoft refers to this pattern as ‘deployment stamps.’

Which components you replicate will again depend on your unique case. For instance, you may only need to create separate databases, while keeping application servers shared. Or, you may wish to have dedicated application servers for compliance reasons, but retain some services (like authentication) shared between everyone.

Separating infrastructure for tenants is also not trivial to implement, and it makes many aspects of your system design more complex.

Managing database schemas can become more complicated, as you are no longer working with one database for the whole system, but one database per tenant. Database migrations may succeed for some tenants (but not all), putting you in a position where the data model is inconsistent across tenants (the kind of situation that requires manual intervention).

If you are considering this deployment pattern, it’s also important to note it carries much higher costs per-tenant, than hosting everyone on shared infrastructure.

· · ·

When offering hosted software to different customers, you will have to make a lot of not-straightforward decisions about the design of your systems. In this article, three approaches are described for achieving multi-tenancy are described. None of them are likely to be a perfect fit for your use case. Your solution will probably include a mixture of different patterns and infrastructure components.