What
Active-Active means two or more PostgreSQL databases can accept writes and stay in sync in near real-time. Unlike standard streaming replication (primary → replicas), this setup allows bi-directional writes.
We’ll use AWS DMS to achieve this:
- Full Load: copy existing schema + data
- Change Data Capture (CDC): replicate ongoing changes from the WAL
Databases can be RDS, EC2 PostgreSQL, or on-premises. Replication latency is usually a few seconds.
Why
- Multi-region, hybrid infrastructure, disaster recovery
- True bi-directional sync is complex; DMS simplifies it
- Avoid downtime and manual syncing
- Most failures come from setup mistakes, not from the concept itself
How
1. Decide DMS Deployment
- Provisioned: recommended for CDC. Runs 24/7, predictable performance.
- Serverless: flexible scaling, but cost varies with load and it is unsuitable for constant CDC.
Rule: Always use Provisioned for bi-directional replication.
2. Enable Logical Replication on PostgreSQL
RDS:
Create/modify a parameter group:
rds.logical_replication = 1
- Assign to database
- Manual reboot required (UI reboot insufficient)
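For anyone scripting the RDS steps instead of clicking through the console, here is a minimal sketch. It assumes boto3 is installed, AWS credentials are configured, and the parameter group is already attached to the instance. The names `logical_replication_params` and `apply_and_reboot` are made up for this example.

```python
def logical_replication_params(group_name):
    """Build keyword arguments for rds.modify_db_parameter_group."""
    return {
        "DBParameterGroupName": group_name,
        "Parameters": [
            {
                "ParameterName": "rds.logical_replication",
                "ParameterValue": "1",
                # Static parameter: it only takes effect after a reboot.
                "ApplyMethod": "pending-reboot",
            }
        ],
    }


def apply_and_reboot(group_name, instance_id):
    """Apply the parameter, then reboot. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helper above works without it

    rds = boto3.client("rds")
    rds.modify_db_parameter_group(**logical_replication_params(group_name))
    rds.reboot_db_instance(DBInstanceIdentifier=instance_id)
```

Calling `apply_and_reboot("my-pg-params", "my-db-instance")` would change the group and restart the instance; both identifiers are placeholders.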
Local PostgreSQL:
Edit postgresql.conf:
wal_level = logical
max_wal_senders = 10
max_replication_slots = 10
Edit pg_hba.conf to allow external connections.
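A quick way to catch a typo before restarting PostgreSQL is to parse the config text and check the three settings. This is a rough sketch, not a full postgresql.conf parser: it only understands `name = value` lines, expects the settings to be set explicitly as above, and the minimum of 1 for the two counters is an assumption of this example.

```python
REQUIRED = {"wal_level": "logical"}
MINIMUMS = {"max_wal_senders": 1, "max_replication_slots": 1}  # assumed floor


def parse_conf(text):
    """Parse 'name = value' lines from postgresql.conf-style text."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip("'")
    return settings


def check_logical_replication(text):
    """Return a list of problems; an empty list means the checks passed."""
    settings = parse_conf(text)
    problems = []
    for name, expected in REQUIRED.items():
        if settings.get(name) != expected:
            problems.append(f"{name} should be {expected}")
    for name, minimum in MINIMUMS.items():
        value = settings.get(name, "0")  # a missing setting is flagged
        if not value.isdigit() or int(value) < minimum:
            problems.append(f"{name} should be at least {minimum}")
    return problems
```

Feed it the file contents, for example `check_logical_replication(open("postgresql.conf").read())`, and fix whatever it lists.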
3. Create DMS Endpoints
Endpoints are your source and target databases. A DMS endpoint is either a source or a target, so each database needs one of each. You’ll then need two tasks for bi-directional replication:
- DMS Task A: DB1 → DB2
- DMS Task B: DB2 → DB1
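To make the bookkeeping concrete, here is a small sketch that lists what needs to exist. It assumes each database is registered once as a source endpoint and once as a target endpoint, and that every ordered pair of databases gets its own task. `plan_replication` and the task naming scheme are hypothetical.

```python
from itertools import permutations


def plan_replication(databases):
    """List the DMS endpoints and tasks needed for the given databases."""
    endpoints = [
        {"database": db, "endpoint_type": role}
        for db in databases
        for role in ("source", "target")
    ]
    # One task per direction between every pair of databases.
    tasks = [
        {"name": f"{src}-to-{dst}", "source": src, "target": dst}
        for src, dst in permutations(databases, 2)
    ]
    return endpoints, tasks
```

For `["DB1", "DB2"]` this yields four endpoints and the two tasks listed above.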
4. Configure Pre-Migration Checks
The DMS pre-migration check may fail on:
- wal_level not logical
- PGLOGICAL not configured
This is expected. Fix whatever is actually misconfigured; otherwise, proceed.
5. Configure Bi-Directional Loop Prevention
To stop changes from replicating back and forth in an infinite loop, add this to the DMS task JSON settings:
"LoopbackPreventionSettings": {
  "EnableLoopbackPrevention": true,
  "SourceSchema": "public",
  "TargetSchema": "public"
}
Note: The schema doesn’t have to be “public”; you can use any schema as long as it exists in your database. Using “public” makes it convenient to see all tables in a single schema. Be advised that AWS will create a few tables in this schema for loopback control.
This setting ensures that records replicated by Task A are ignored when they come back via Task B. Apply it to both DMS tasks.
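If you generate task settings from code, the loopback block can be built once and reused for both directions. A minimal sketch, with `loopback_settings` as a made-up helper name and "public" as the default schema only because that is what this post uses:

```python
import json


def loopback_settings(source_schema="public", target_schema="public"):
    """Return the loopback-prevention fragment of a DMS task settings document."""
    return {
        "LoopbackPreventionSettings": {
            "EnableLoopbackPrevention": True,
            "SourceSchema": source_schema,
            "TargetSchema": target_schema,
        }
    }


# One settings document per direction; both tasks carry the block.
task_a_settings = json.dumps(loopback_settings(), indent=2)  # DB1 -> DB2
task_b_settings = json.dumps(loopback_settings(), indent=2)  # DB2 -> DB1
```

Each resulting string is only this fragment; merge it into the rest of your task settings before handing it to DMS.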
6. Run Full Load + CDC Tasks
- Full Load: run only once per database
- CDC Tasks: ongoing changes only
- Avoid re-running the full load after the initial sync; it will fail if the data already exists on the target
7. Provisioned Instance Sizing Tips
- Start with larger instance for full load
- Once full load completes, scale down to save cost
- One provisioned instance can run multiple DMS tasks (e.g., 6 tasks for hub-and-spoke 3-region setup)
8. Verification
- Check replication latency
- Confirm records are synced both ways
- Monitor replication slots, WAL sender processes
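For the replication slot check, a sketch along these lines can run against each database. It assumes PostgreSQL 10 or later (for `pg_current_wal_lsn` and `pg_wal_lsn_diff`) and any DB-API cursor, such as one from psycopg2. The `slot_report` name and the 1 GB threshold are arbitrary choices for this example.

```python
SLOT_LAG_SQL = """
SELECT slot_name,
       active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
FROM pg_replication_slots
"""


def slot_report(cursor, max_retained_bytes=1_000_000_000):
    """Run the slot query on a DB-API cursor and flag suspicious slots."""
    cursor.execute(SLOT_LAG_SQL)
    report = []
    for slot_name, active, retained_bytes in cursor.fetchall():
        retained = int(retained_bytes or 0)  # NULL restart_lsn counts as 0
        report.append({
            "slot": slot_name,
            "active": bool(active),
            "retained_bytes": retained,
            # A slot is healthy if it is in use and not holding back much WAL.
            "ok": bool(active) and retained <= max_retained_bytes,
        })
    return report
```

An inactive slot that keeps retaining WAL is the one to worry about, because the disk fills up while nothing consumes the changes.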
Large Caveat: Code-Level Design
Database replication is only half the solution. Your application code must handle ID generation properly to prevent conflicts:
- If IDs are serial, conflicts can occur when multiple databases insert simultaneously
- Options:
- Use UUIDs for all records
- For low-conflict serials, use a central server to assign numbers on demand
- Add region ID into UUID to completely remove risk of collisions
Why: Once a conflicting record enters the database, fixing it at DB level is far more difficult than at code level.
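Here is roughly what the UUID options look like in application code. This is a sketch rather than a prescription: the function names and the `region:uuid` layout are invented for this example, and a prefixed ID has to be stored in a text column rather than a native uuid column.

```python
import uuid


def new_record_id():
    """Random UUID (version 4); no coordination between regions needed."""
    return str(uuid.uuid4())


def new_region_scoped_id(region_id):
    """Prefix a random UUID with a region tag, e.g. 'us-east-1:<uuid>'."""
    if not region_id or ":" in region_id:
        raise ValueError("region_id must be non-empty and contain no ':'")
    return f"{region_id}:{uuid.uuid4()}"


def region_of(record_id):
    """Recover the region tag from a region-scoped ID."""
    return record_id.split(":", 1)[0]
```

With the region tag in the ID, two regions can never produce the same value, and you can tell at a glance where a record was created.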
Additional Thought: Purpose of This Setup
Each database is close to the server serving clients in that region:
- Clients get low-latency responses
- A future post will cover how servers interact with clients, how client locks move between regions, and how to maintain low latency as clients move geographically
Thoughts / Caveats
- DMS simplifies active-active but conflicts must be handled carefully
- Pre-migration check failures are normal; setting wal_level to logical and configuring PGLOGICAL resolves them
- Provisioned instances give control, predictable cost, and support multiple tasks
- Full load task should never run repeatedly
- Code-level ID strategy is critical to avoid replication conflicts