Skip to main content

Source: ocean/docs/adr/024-schema-evolution-with-jsonb.md | ✏️ Edit on GitHub

ADR-024: Schema Evolution with JSONB

Date: 2024-08-20

Status

Accepted

Context

We needed to track complex, evolving provisioning status across multiple services (Stripe, Neon, etc.). Initially, we tried using separate columns for each status, but this led to:

  1. Schema migrations every time we added a new service
  2. Complex queries to check overall status
  3. Difficulty representing nested status information

Decision

We will use JSONB columns for data that:

  1. Has a complex, nested structure
  2. Is likely to evolve over time
  3. Needs flexibility in what fields are stored
  4. Benefits from JSON querying capabilities

Example: Provisioning Status

-- Instead of multiple columns:
-- stripe_status TEXT,
-- stripe_customer_id TEXT,
-- neon_status TEXT,
-- neon_tenant_id TEXT,
-- etc...

-- Use a single JSONB column:
provisioning_status JSONB DEFAULT jsonb_build_object(
'stripe', jsonb_build_object(
'status', 'pending',
'customerId', null,
'error', null,
'lastUpdated', null
),
'neon', jsonb_build_object(
'status', 'pending',
'tenantId', null,
'databaseHost', null,
'error', null,
'lastUpdated', null
),
'overall', 'pending'
);

Querying Patterns

-- Check overall status
WHERE provisioning_status->>'overall' = 'completed'

-- Check specific service
WHERE provisioning_status->'stripe'->>'status' = 'completed'

-- Update nested field
UPDATE organizations
SET provisioning_status = jsonb_set(
provisioning_status,
'{stripe,customerId}',
'"cus_123"'::jsonb
);

When to Use JSONB

Good Use Cases:

  • Service integration status
  • Feature flags and configuration
  • Audit logs and event metadata
  • User preferences
  • Dynamic form data

Avoid For:

  • Frequently queried individual fields
  • Fields that need constraints/foreign keys
  • Simple scalar values
  • Data that needs indexing for performance

Consequences

Positive

  • Schema flexibility without migrations
  • Can represent complex nested data
  • Easy to add new fields
  • Good query performance with GIN indexes
  • Natural representation of JSON APIs

Negative

  • Less type safety than columns
  • Can't use foreign key constraints
  • Requires careful documentation
  • Risk of inconsistent structure
  • Slightly larger storage size

Best Practices

  1. Define TypeScript interfaces for JSONB structures
  2. Create database functions for common updates
  3. Add CHECK constraints for critical fields
  4. Document the expected structure
  5. Use GIN indexes for frequently queried paths

Validation

-- Add constraint to ensure required structure
ALTER TABLE organizations
ADD CONSTRAINT valid_provisioning_status
CHECK (
provisioning_status ? 'stripe' AND
provisioning_status ? 'neon' AND
provisioning_status ? 'overall' AND
provisioning_status->>'overall' IN ('pending', 'in_progress', 'completed', 'failed')
);

This approach has proven invaluable for handling the complexity of multi-service provisioning while maintaining schema flexibility.