Skip to main content

Source: ocean/docs/adr/0047-ebisu-data-warehouse-integration.md | ✏️ Edit on GitHub

ADR-0047: Real-time Tenant DB Sync with Postgres Logical Replication

Date: 2025-09-02

Status

Proposed

Context

We need to ensure that data in our multi-tenant databases on Neon is kept up-to-date with changes originating from our master trade database (Ebisu) hosted on Crunchy Bridge. The previous approach considered a scheduled ETL job, which would introduce data latency and require application-level logic to manage the process.

With the standardization on PostgreSQL 17, we can leverage its native logical replication capabilities for a more efficient, real-time solution.

Decision

We will implement a real-time, fan-out synchronization system using PostgreSQL 17's native logical replication features. This approach eliminates the need for intermediate ETL scripts and cron jobs.

  1. Publisher Database: The Ebisu database on Crunchy Bridge will be configured as the Publisher. A PUBLICATION will be created for the specific tables containing the master trade data that needs to be synchronized.

  2. Subscriber Databases: Each tenant database on Neon will be configured as a Subscriber. During tenant provisioning, a SUBSCRIPTION will be created in the tenant's database, pointing to the Ebisu publication.

  3. Data Flow: When a record is inserted, updated, or deleted in a published table on Ebisu, the change will be automatically and nearly instantaneously replicated to all subscribed tenant databases.

Consequences

Positive

  • Near Real-Time Synchronization: Data is propagated in milliseconds, not hours, ensuring tenants always have the freshest data.
  • Reduced Application Complexity: Eliminates the need to write, maintain, and schedule custom ETL scripts.
  • High Efficiency: Leverages a highly optimized, built-in C-level implementation within Postgres for minimal overhead.
  • Scalability: This is the standard, horizontally scalable way to distribute data from one source to many destinations in PostgreSQL.

Negative

  • Infrastructure & Security Requirements: This architecture requires:
    • Direct, secure network connectivity from the Neon tenant databases to the Crunchy Bridge master database.
    • Elevated database user permissions (REPLICATION) on both source and destination databases, which must be carefully managed.
  • Read-Only Data Constraint: The application logic for tenants must treat the subscribed tables as strictly read-only. Any tenant-side changes to this data will cause replication conflicts and break the subscription.
  • Provisioning Complexity: The tenant provisioning process must be updated to reliably create the SUBSCRIPTION for each new tenant.
  • Initial Data Sync: Logical replication handles ongoing changes. A separate process will be needed to perform the initial bulk-load of master data into a new tenant's database before the subscription is enabled.

Alternatives Considered

  • Scheduled ETL Job (Cron Job): This was the initial proposal. It was rejected because it introduces significant data latency, is less efficient, and requires more complex application-level code to manage state (e.g., tracking the last update time).

Decision Makers

  • ryan-taylor
  • Gemini