Identity Engine is our first-party data identity resolution application, designed to create a consolidated view of customer data. As demand for the application grew, there was no secure, scalable way to deploy separate instances for multiple customers. LiveRamp’s Engineering Team set out to close this gap by enabling multi-tenancy within a single, trustworthy application.
Our challenge was to implement a robust multi-tenant architecture that could scale as we onboarded more customers. After evaluating various infrastructure approaches, we chose a single Cloud SQL instance with a dedicated database for each tenant.
Here’s a look at the process and tools used in this update:
1. Evaluating multi-tenancy models
We started by reviewing common multi-tenancy models and their pros and cons:
- Single Cloud SQL instance with shared database: All tenants share a single database, with tables including a tenant identifier column. While cost-effective, this approach can make schema changes complex and tenant data isolation tricky.
- Single Cloud SQL instance with separate databases per tenant: Each tenant has their own database. This approach offers strong data isolation and flexibility, but infrastructure costs are higher than with a shared database.
- Multiple Cloud SQL instances per tenant: Each tenant gets their own dedicated Cloud SQL instance. While this offers complete isolation, it becomes costly and difficult to manage as tenants grow.
We chose the second option, a single Cloud SQL instance with separate databases per tenant, because it balances security, scalability, and ease of maintenance:
- Data isolation: Each tenant’s data is completely separate, improving security and compliance.
- Scalability: Cloud SQL allows us to efficiently provision and manage databases.
- Maintenance flexibility: We can perform updates or maintenance for one tenant without affecting others.
2. Managing database schemas with Flyway
Handling schema management for multiple databases can get complex fast. We streamlined this using Flyway, a database migration tool that brings version control to our database schemas. Flyway supports SQL-based migrations, is lightweight and easy to set up, and records applied migrations in a schema history table in each database, so each versioned migration runs only once per database.
Our setup uses a main database with its own schema, along with individual tenant databases. On application startup, Flyway handles migrations in two stages:
1. Main database migration: Flyway migrates all the scripts for the main database.
2. Tenant database migration: After the main database is ready, we fetch all tenant information from it and trigger parallel Flyway migrations for each tenant database.
This approach ensures that both the main and tenant databases are always up-to-date and consistent.
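The two-stage startup flow above can be sketched as follows. This is a minimal illustration, not our production code: `StartupMigrator`, its `Status` values, and the two injected functions are hypothetical names. In a real application the `migrateDatabase` function would wrap a Flyway call such as `Flyway.configure().dataSource(url, user, password).load().migrate()`, and the tenant list would come from a query against the main database.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical orchestrator for the two-stage startup migration. */
final class StartupMigrator {

    /** Outcome recorded per tenant database (illustrative values). */
    enum Status { MIGRATED, FAILED }

    // Runs all pending migration scripts for one database (assumed to wrap Flyway).
    private final Consumer<String> migrateDatabase;
    // Reads tenant database names from the main database.
    private final Supplier<List<String>> tenantDatabases;
    private final int parallelism;

    StartupMigrator(Consumer<String> migrateDatabase,
                    Supplier<List<String>> tenantDatabases,
                    int parallelism) {
        this.migrateDatabase = migrateDatabase;
        this.tenantDatabases = tenantDatabases;
        this.parallelism = parallelism;
    }

    Map<String, Status> migrateAll(String mainDatabase) {
        // Stage 1: the main database must be migrated first; a failure here
        // propagates to the caller because nothing else can proceed without it.
        migrateDatabase.accept(mainDatabase);

        // Stage 2: fetch the tenant list, then migrate tenant databases in parallel.
        List<String> tenants = tenantDatabases.get();
        Map<String, Status> results = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            CompletableFuture<?>[] jobs = tenants.stream()
                .map(db -> CompletableFuture.runAsync(() -> {
                    try {
                        migrateDatabase.accept(db);
                        results.put(db, Status.MIGRATED);
                    } catch (RuntimeException e) {
                        // One tenant's failure is recorded without blocking the others.
                        results.put(db, Status.FAILED);
                    }
                }, pool))
                .toArray(CompletableFuture[]::new);
            CompletableFuture.allOf(jobs).join();
        } finally {
            pool.shutdown();
        }
        return results;
    }
}
```

Catching failures per tenant, rather than aborting the whole startup, is a design choice in this sketch: it lets healthy tenants come up while failed ones are flagged for follow-up.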
3. Centralizing tenant metadata in a main database
To effectively manage tenants and check the availability of their databases, we introduced a main database that stores metadata about each tenant, including their database IDs and statuses. Our DatasourceRouter dynamically routes connections based on this metadata, ensuring that each request reaches the correct tenant database.
This main database serves several purposes:
1. Tenant information storage: It holds details about each tenant, such as their database name and current status.
2. Availability checks: When a request comes in, our custom filter references this database to confirm that the tenant’s database is available before processing the request.
3. Centralized management: It allows us to easily monitor and manage tenant databases from a single source. The status helps identify databases with failed migrations.
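To make the role of this metadata concrete, here is a small in-memory sketch of the lookups described above. Everything in it is an assumption for illustration: the `TenantDirectory` class, the `TenantRecord` fields, and the `DbStatus` values are hypothetical, and a real implementation would read from a table in the main database rather than a map.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

/** Hypothetical in-memory stand-in for the tenant table in the main database. */
final class TenantDirectory {

    /** Illustrative status values; the real set of statuses is not specified here. */
    enum DbStatus { ACTIVE, PROVISIONING, MIGRATION_FAILED }

    /** One row of tenant metadata: tenant ID, database name, and current status. */
    static final class TenantRecord {
        final String tenantId;
        final String databaseName;
        final DbStatus status;

        TenantRecord(String tenantId, String databaseName, DbStatus status) {
            this.tenantId = tenantId;
            this.databaseName = databaseName;
            this.status = status;
        }
    }

    private final Map<String, TenantRecord> byTenantId = new ConcurrentHashMap<>();

    /** Inserts or replaces the metadata for a tenant. */
    void upsert(TenantRecord record) {
        byTenantId.put(record.tenantId, record);
    }

    Optional<TenantRecord> find(String tenantId) {
        // ConcurrentHashMap rejects null keys, so guard before the lookup.
        return tenantId == null
            ? Optional.empty()
            : Optional.ofNullable(byTenantId.get(tenantId));
    }

    /** Availability check: the tenant must exist and its database must be ACTIVE. */
    boolean isAvailable(String tenantId) {
        return find(tenantId).map(r -> r.status == DbStatus.ACTIVE).orElse(false);
    }

    /** Centralized management: list tenants in a given state, e.g. failed migrations. */
    List<TenantRecord> withStatus(DbStatus status) {
        return byTenantId.values().stream()
            .filter(r -> r.status == status)
            .collect(Collectors.toList());
    }
}
```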
4. Routing requests dynamically with Spring
A key challenge in a multi-tenant setup is routing requests to the correct tenant database. We achieved this using Spring’s AbstractRoutingDataSource, which fits the database-per-tenant model and works well with Spring Data JPA and Hibernate.
Here is the implementation process we followed:
1. Tenant context: We built a TenantContext class to store the tenant ID for each request. It holds the tenant ID in a ThreadLocal, so each thread gets its own copy of the `CONTEXT` variable and a tenant ID set by one thread cannot interfere with a tenant ID set by another.
2. Dynamic data source routing: We extended AbstractRoutingDataSource to select the right data source based on the tenant ID.
3. Data source registry: We maintain a registry of data sources for each tenant and add new ones when tenants are onboarded.
4. Request interceptor: This extracts the tenant ID from the API request header and sets it in the TenantContext.
5. Request filtering and validation: We implemented a custom filter that validates the tenant ID from the request header and checks the main database for availability before setting the tenant context.
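The pieces in this list can be sketched together using only the JDK. This is a simplified illustration under stated assumptions, not our actual code: `TenantRouter` is a stand-in for a subclass of Spring's AbstractRoutingDataSource (whose `determineCurrentLookupKey()` method plays the role of `currentLookupKey()` here), the `X-Tenant-Id` header name is hypothetical, and the numeric status codes returned by `TenantFilter` are illustrative choices.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

/** Holds the tenant ID for the current thread. */
final class TenantContext {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    private TenantContext() {}

    static void set(String tenantId) { CONTEXT.set(tenantId); }
    static String get() { return CONTEXT.get(); }
    static void clear() { CONTEXT.remove(); }
}

/** Stand-in for a routing data source; T would be javax.sql.DataSource in practice. */
final class TenantRouter<T> {
    // Registry of per-tenant targets; new tenants can be added at runtime.
    private final Map<String, T> targets = new ConcurrentHashMap<>();

    void register(String tenantId, T target) { targets.put(tenantId, target); }

    /** Mirrors the lookup-key hook of Spring's AbstractRoutingDataSource. */
    String currentLookupKey() { return TenantContext.get(); }

    T resolve() {
        String key = currentLookupKey();
        T target = key == null ? null : targets.get(key);
        if (target == null) {
            throw new IllegalStateException("No data source for tenant: " + key);
        }
        return target;
    }
}

/** Validates the tenant header and availability before setting the context. */
final class TenantFilter {
    static final String TENANT_HEADER = "X-Tenant-Id"; // hypothetical header name

    // Availability check, assumed to be backed by the main database.
    private final Predicate<String> isAvailable;

    TenantFilter(Predicate<String> isAvailable) { this.isAvailable = isAvailable; }

    /** Returns an HTTP-like status code (illustrative values). */
    int handle(Map<String, String> headers, Runnable next) {
        String tenantId = headers.get(TENANT_HEADER);
        if (tenantId == null || tenantId.isBlank()) {
            return 400; // missing or empty tenant header
        }
        if (!isAvailable.test(tenantId)) {
            return 503; // unknown tenant or database not ready
        }
        TenantContext.set(tenantId);
        try {
            next.run(); // downstream code resolves its data source via TenantRouter
            return 200;
        } finally {
            // Always clear, since pooled threads are reused across requests.
            TenantContext.clear();
        }
    }
}
```

Clearing the context in a `finally` block matters in this sketch because request threads are typically pooled; without it, a tenant ID could leak into the next request handled by the same thread.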
Benefits of the multi-tenant architecture
Since implementation, our multi-tenant architecture has delivered several key benefits:
- Data security: Tenants’ application data remains completely isolated.
- Operational flexibility: We can perform tenant-specific updates and data fixes without disrupting others.
- Scalability: Cloud SQL makes it easy to grow and manage the system as more customers join.
Ready to simplify your multi-tenant architecture?
By automating tenant onboarding, database creation, and schema migrations, our team transformed a time-consuming process into a streamlined, scalable, secure solution. Flyway played a key role in ensuring database consistency and version control across tenants. Additionally, we implemented automatic tenant database provisioning and connection pool updates, eliminating the need for application restarts when onboarding new tenants.
These optimizations have reduced tenant onboarding time from days to less than an hour, reinforcing LiveRamp’s commitment to scalable, secure, and efficient data collaboration as the ecosystem continues to evolve.
Want to see how LiveRamp can help modernize your identity infrastructure? Get in touch with our team to explore what’s possible.