The Challenge of Database Migrations
Database schema changes are among the riskiest operations in software deployment. Application code can be rolled back by deploying a previous version, but a database migration is often irreversible once data has been dropped or transformed, and it can cause downtime if not handled carefully. At Nexis Limited, we have developed practices to run schema migrations safely across our SaaS products without taking applications offline.
The Expand-and-Contract Pattern
The expand-and-contract pattern breaks a potentially destructive migration into safe, backward-compatible steps:
- Expand: Add the new column, table, or index without removing anything. Both old and new application versions work correctly.
- Migrate data: Copy or transform data from the old structure to the new structure. This can happen gradually, in batches.
- Contract: Once all application instances use the new structure and data migration is complete, remove the old column or table.
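The three phases above can be sketched in a few lines of Python. This is a minimal illustration, not our production tooling: it uses the standard-library sqlite3 module so it runs anywhere, and the users table, the fullname and display_name columns, and the function names are all hypothetical. In a real rollout the application would also write to both columns between the expand and contract steps.

```python
import sqlite3


def expand(conn):
    # Expand: add the new column as nullable so existing code keeps working.
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
    conn.commit()


def migrate_batch(conn, batch_size=1000):
    # Migrate: backfill at most batch_size rows and return how many were updated.
    cur = conn.execute(
        """
        UPDATE users SET display_name = fullname
        WHERE id IN (
            SELECT id FROM users
            WHERE display_name IS NULL AND fullname IS NOT NULL
            LIMIT ?
        )
        """,
        (batch_size,),
    )
    conn.commit()
    return cur.rowcount


def contract(conn):
    # Contract: drop the old column once nothing reads it any more.
    # Note: DROP COLUMN needs SQLite 3.35 or newer.
    conn.execute("ALTER TABLE users DROP COLUMN fullname")
    conn.commit()


def backfill(conn, batch_size=1000):
    # Run small batches until no rows are left, keeping each transaction short.
    total = 0
    while True:
        updated = migrate_batch(conn, batch_size)
        if updated == 0:
            return total
        total += updated
```

Committing after every batch keeps each transaction short, so the backfill never holds locks for long, whatever the table size.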
Rules for Safe Migrations
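- Never rename a column in one step. During a rolling deploy, instances still running the old code would query a name that no longer exists. Add a new column, copy data, update application code, then drop the old column.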
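- Never add a NOT NULL column without a default. In PostgreSQL the statement fails outright if the table already has rows, and once the column exists, inserts from application versions that do not set it are rejected. Add the column as nullable (or with a constant default), backfill it in batches, and only then enforce NOT NULL. Before PostgreSQL 11, adding a column with a default rewrote the whole table under an exclusive lock, which can cause downtime on large tables.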
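- Create indexes concurrently. In PostgreSQL, use CREATE INDEX CONCURRENTLY to build the index without blocking writes to the table. The statement cannot run inside a transaction block, so the migration that issues it must be non-transactional, and a failed build leaves an invalid index that has to be dropped before retrying.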
- Use small, incremental migrations. Each migration should do one thing. If a migration fails, its scope is limited and easier to debug.
- Test migrations against production-like data. A migration that runs in milliseconds on a test database with 100 rows may take hours on a production table with millions of rows.
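Some of these rules can be checked mechanically before a migration is reviewed. The following is a rough sketch of such a check, assuming migrations are available as plain PostgreSQL-style SQL text. The lint_migration function and its messages are made up for illustration, and the regular expressions are deliberately naive (for example, they split on every semicolon, including any inside string literals).

```python
import re

# Patterns for statements that the rules above advise against.
RULES = [
    ("renames a column in one step",
     re.compile(r"\bRENAME\s+COLUMN\b", re.IGNORECASE)),
    ("creates an index without CONCURRENTLY",
     re.compile(r"\bCREATE\s+(?:UNIQUE\s+)?INDEX\b(?!\s+CONCURRENTLY\b)",
                re.IGNORECASE)),
]
ADD_COLUMN = re.compile(r"\bADD\s+COLUMN\b", re.IGNORECASE)
NOT_NULL = re.compile(r"\bNOT\s+NULL\b", re.IGNORECASE)
DEFAULT = re.compile(r"\bDEFAULT\b", re.IGNORECASE)


def lint_migration(sql):
    """Return a list of rule violations found in a migration's SQL text."""
    findings = []
    for statement in sql.split(";"):
        statement = statement.strip()
        if not statement:
            continue
        for message, pattern in RULES:
            if pattern.search(statement):
                findings.append(message)
        # ADD COLUMN ... NOT NULL is only flagged when no DEFAULT is given.
        if (ADD_COLUMN.search(statement)
                and NOT_NULL.search(statement)
                and not DEFAULT.search(statement)):
            findings.append("adds a NOT NULL column without a default")
    return findings
```

A check like this is a safety net at best. It cannot judge whether a migration is small enough or how long it will run on production data, so it complements review and realistic testing rather than replacing them.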
Tools We Use
For Django projects (Ultimate HRM, Digital Menu, Digital School), we use Django's built-in migration framework with custom management commands for data migrations. For Go projects (Bondorix), we use golang-migrate with SQL migration files, giving us full control over the migration SQL.
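golang-migrate reads migrations from files named {version}_{title}.up.sql and {version}_{title}.down.sql. When several branches add migrations in parallel, version numbers can collide. A small check along the following lines can flag that early. It is a sketch only: the check_migration_files function and the file names in the usage note are hypothetical, and it is written in Python purely to keep the examples in this article in one language.

```python
import re

# golang-migrate file naming: {version}_{title}.up.sql / {version}_{title}.down.sql
MIGRATION_NAME = re.compile(r"^(\d+)_(.+)\.(up|down)\.sql$")


def check_migration_files(names):
    """Return a list of problems found in a collection of migration file names."""
    seen = {}  # version -> set of directions already seen
    problems = []
    for name in sorted(names):
        match = MIGRATION_NAME.match(name)
        if not match:
            problems.append(name + ": does not follow the version_title.up.sql naming scheme")
            continue
        version = int(match.group(1))
        direction = match.group(3)
        directions = seen.setdefault(version, set())
        if direction in directions:
            # Two files claim the same version and direction, e.g. from parallel branches.
            problems.append("%s: duplicate %s migration for version %d" % (name, direction, version))
        directions.add(direction)
    for version in sorted(seen):
        # A down file with no matching up file can never have been applied.
        if "up" not in seen[version]:
            problems.append("version %d: down migration without an up migration" % version)
    return problems
```

For example, passing the names 000002_add_display_name.up.sql and 000002_add_index.up.sql together would report a duplicate up migration for version 2.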
Migration Testing in CI
We run migration tests in our CI pipeline that apply all migrations to a fresh database and verify the final schema matches expectations. This catches migration conflicts, ordering issues, and schema drift before code reaches production.
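The idea behind such a test can be sketched as follows. This is an illustrative version rather than our actual CI code: it uses an in-memory SQLite database instead of PostgreSQL, and the MIGRATIONS list, the expected schema, and the helper names are all assumptions made for the example.

```python
import sqlite3

# Hypothetical migrations, listed in the order they would be applied.
MIGRATIONS = [
    "CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT)",
    "ALTER TABLE users ADD COLUMN display_name TEXT",
    "CREATE INDEX idx_users_display_name ON users (display_name)",
]

# The schema we expect once every migration has run.
EXPECTED_TABLES = {"users": ["id", "fullname", "display_name"]}
EXPECTED_INDEXES = {"idx_users_display_name"}


def apply_all(conn, migrations):
    # Apply every migration in order to the given (fresh) database.
    for sql in migrations:
        conn.execute(sql)
    conn.commit()


def read_tables(conn):
    # Map each table name to its column names, in definition order.
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {
        name: [col[1] for col in conn.execute("PRAGMA table_info(%s)" % name)]
        for name in names
    }


def read_indexes(conn):
    # Collect explicitly created indexes, skipping SQLite's internal ones.
    return {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND name NOT LIKE 'sqlite_%'")}


def test_migrations_produce_expected_schema():
    conn = sqlite3.connect(":memory:")
    apply_all(conn, MIGRATIONS)
    assert read_tables(conn) == EXPECTED_TABLES
    assert read_indexes(conn) == EXPECTED_INDEXES
```

Because the test starts from an empty database, it catches ordering and drift problems, but not slow migrations. Those still need a run against production-like data volumes.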
Conclusion
Zero-downtime migrations require discipline and planning, but the practices are well-established. Use the expand-and-contract pattern, keep migrations small and backward-compatible, and always test against realistic data volumes before deploying to production.