Source: ocean/docs/ISSUE_RESOLUTION_PROJECT.md

Issue Resolution Project Plan - Ocean Platform

Project Start Date: August 26, 2025
Target Completion: September 30, 2025
Priority: High
Document Version: 1.1
Last Updated: August 26, 2025

✅ Completed Tasks

  • PostgreSQL 17 Standardization (Phase 2) - Completed August 26, 2025
    • See ADR-025 for implementation details
  • Sentry Integration in Edge Functions (Phase 3.1) - Already Complete
    • All 13 Edge Functions have Sentry integration properly configured
    • Note: The original issue referenced non-existent functions
  • Cleanup Automation for Failed Provisioning (Phase 3.2) - Completed August 26, 2025
    • Created comprehensive cleanup system for failed provisions and unverified accounts
    • Added database functions, Edge Functions, and scheduling configuration

Executive Summary

This project plan addresses the critical issues and gaps identified in the System Baseline Analysis. Resolving them will improve the security, consistency, and operational efficiency of the Ocean platform.

Project Objectives

  1. Resolve Security Vulnerabilities - Rotate exposed API keys and enhance security measures
  2. Standardize PostgreSQL Versions - Ensure consistent PostgreSQL 17 usage
  3. Complete Missing Integrations - Add Sentry to all Edge Functions
  4. Clarify Architecture - Document or remove unimplemented features
  5. Improve Operational Resilience - Add monitoring and cleanup automation

Issue Priority Matrix

| Priority | Issue                                | Impact                | Effort | Risk   | Status              |
| -------- | ------------------------------------ | --------------------- | ------ | ------ | ------------------- |
| P0       | Exposed PostHog API Keys             | Security breach risk  | Low    | High   |                     |
| P0       | PostgreSQL Version Inconsistency     | Deployment failures   | Medium | Medium | ✅ RESOLVED         |
| P1       | Missing Sentry in 2 Edge Functions   | Reduced observability | Low    | Low    | ✅ ALREADY COMPLETE |
| P1       | No Cleanup for Failed Provisioning   | Resource waste        | Medium | Medium | ✅ RESOLVED         |
| P2       | Unused Feature Flags                 | Missing A/B testing   | Low    | Low    |                     |
| P2       | Unimplemented Features Documentation | Developer confusion   | Low    | Low    |                     |
| P3       | Missing Rate Limit Monitoring        | API quota issues      | Medium | Low    |                     |

Phase 1: Critical Security Fixes (Week 1)

Task 1.1: Rotate PostHog API Keys

Owner: Security Team
Duration: 2 days
Dependencies: None

Steps:

  1. Generate new PostHog API keys in PostHog dashboard

  2. Update all environment variables:

    VITE_POSTHOG_API_KEY
    VITE_POSTHOG_HOST
    VITE_POSTHOG_PROJECT_ID
    POSTHOG_PERSONAL_API_KEY
  3. Update GitHub Secrets

  4. Update Vercel environment variables

  5. Update Supabase Edge Function secrets

  6. Remove exposed keys from documentation

  7. Audit codebase for any hardcoded keys

Validation:

  • No API keys in source code
  • All services connect successfully with new keys
  • Documentation contains only placeholders
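
The audit in step 7 and the first validation item could be partly automated with a small scanner. The sketch below is only an illustration and is not the project's `scan:secrets` implementation. It assumes that PostHog project keys start with `phc_` and personal API keys with `phx_`; the helper name `findPosthogKeys` and the 20-character minimum are hypothetical choices.

```typescript
// Hypothetical helper: flags strings that look like PostHog keys.
// Assumption: project keys begin with "phc_" and personal keys with "phx_".
const POSTHOG_KEY_PATTERN = /\bph[cx]_[A-Za-z0-9]{20,}\b/g

interface KeyFinding {
  line: number // 1-based line number in the scanned text
  preview: string // truncated match, so a report never repeats the full key
}

function findPosthogKeys(text: string): KeyFinding[] {
  const findings: KeyFinding[] = []
  text.split('\n').forEach((content, index) => {
    // With the global flag, match() returns every key-like token on the line.
    const matches = content.match(POSTHOG_KEY_PATTERN) || []
    matches.forEach((token) => {
      findings.push({ line: index + 1, preview: `${token.slice(0, 8)}...` })
    })
  })
  return findings
}
```

Placeholders such as `<your-posthog-key>` do not match the pattern, so documentation that contains only placeholders produces no findings.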

Task 1.2: Security Audit Follow-up

Owner: Security Team
Duration: 1 day
Dependencies: Task 1.1

Steps:

  1. Run secret scanning on entire codebase:

    pnpm run scan:secrets
  2. Review and update .gitleaks.toml configuration

  3. Ensure pre-commit hooks are working

  4. Document security best practices

Deliverables:

  • Security scan report
  • Updated security documentation

Phase 2: PostgreSQL Standardization ✅ COMPLETED

Task 2.1: Standardize to PostgreSQL 17 ✅

Owner: Backend Team
Duration: 3 days
Dependencies: None
Status: COMPLETED (August 26, 2025)

Files Updated:

  1. /supabase/config.toml - Updated from v15 to v17 ✅
  2. /supabase/functions/graphql-v2/services/atomic-provisioning.ts - Updated from v16 to v17 ✅
  3. /supabase/functions/graphql-v2/services/provisioning.ts - Updated from v16 to v17 ✅
  4. /supabase/functions/provision-tenant-resources/index.ts - Already using v17 ✅

ADR Created: See /docs/adr/025-postgresql-17-standardization.md

Implementation:

// atomic-provisioning.ts
const projectResponse = await fetch('https://console.neon.tech/api/v2/projects', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${neonApiKey}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    project: {
      name: `org-${organizationId}`,
      region_id: neonRegion,
      pg_version: 17, // Standardized to v17
    },
  }),
})

Testing:

  • TypeScript compilation successful
  • ESLint validation passed
  • All provisioning services aligned
  • End-to-end testing with new organizations
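
One way to keep the provisioning services aligned is to build the request body in a single shared helper. The following is a minimal sketch under that assumption, not the code that was shipped. `PG_VERSION` and `buildNeonProjectRequest` are hypothetical names, and only the fields shown in the implementation snippet are used.

```typescript
// Hypothetical shared constant so every provisioning service requests the same version.
const PG_VERSION = 17

interface NeonProjectRequest {
  project: {
    name: string
    region_id: string
    pg_version: number
  }
}

// Builds the JSON body for the project-creation call shown in the implementation snippet.
function buildNeonProjectRequest(organizationId: string, neonRegion: string): NeonProjectRequest {
  return {
    project: {
      name: `org-${organizationId}`,
      region_id: neonRegion,
      pg_version: PG_VERSION,
    },
  }
}
```

A unit test can then assert `buildNeonProjectRequest(...).project.pg_version === 17` once, instead of checking each service separately.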

Task 2.2: Document PostgreSQL 17 Features ✅

Owner: Backend Team
Duration: 1 day
Dependencies: Task 2.1
Status: COMPLETED (August 26, 2025)

Deliverables:

  • ✅ ADR-025 created documenting standardization decision
  • ✅ Documented PostgreSQL 17 features and compatibility
  • ✅ Updated version references across codebase

Phase 3: Complete Missing Integrations (Week 2)

Task 3.1: Add Sentry to Missing Edge Functions ✅ ALREADY COMPLETE

Owner: Backend Team
Duration: 2 days
Dependencies: None
Status: COMPLETED (Already implemented)

Findings:

  • The original issue referenced incorrect function names
  • /supabase/functions/provision-user-resources/index.ts doesn't exist
  • The actual function is provision-tenant-resources which already has Sentry
  • /supabase/functions/send-slack-alert/index.ts already has Sentry integration

Verified Edge Functions with Sentry (13/13):

  1. ✅ check-tenant-health
  2. ✅ provision-tenant-resources
  3. ✅ send-slack-alert
  4. ✅ handle-stripe-webhook
  5. ✅ graphql-v2
  6. ✅ stripe-billing
  7. ✅ stripe-portal
  8. ✅ stripe-products
  9. ✅ stripe-setup-intent
  10. ✅ stripe-subscription
  11. ✅ sync-stripe-data
  12. ✅ cleanup-resources
  13. ✅ scheduled-cleanup

No action required - All Edge Functions properly integrated with observability

Task 3.2: Implement Cleanup Automation ✅ COMPLETED

Owner: Backend Team
Duration: 3 days
Dependencies: None
Status: COMPLETED (August 26, 2025)

Implementation Summary:

  1. ✅ Created cleanup-failed-provisions Edge Function
  2. ✅ Enhanced scheduled-cleanup to handle both unverified accounts and failed provisions
  3. ✅ Added database functions:
    • get_failed_provisions() - Identifies failed/stuck provisions
    • get_cleanup_candidates() - Finds unverified accounts
    • cleanup_unverified_accounts() - Deletes unverified accounts
    • record_cleanup_attempt() - Audit trail for cleanups
  4. ✅ Created monitoring view for cleanup status
  5. ✅ Added dry-run mode for safe testing
  6. ✅ Documented scheduling options (Supabase cron, GitHub Actions, etc.)

Features Implemented:

  • ✅ Identifies stuck provisioning attempts (configurable threshold)
  • ✅ Cleans up partial Neon projects via rollback
  • ✅ Cancels incomplete Stripe customers
  • ✅ Provides detailed audit trail in provisioning_events
  • ✅ Supports batch processing with limits
  • ✅ Includes monitoring and alerting capabilities
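
The threshold, batch limit, and dry-run behaviour listed above can be sketched as follows. This is an illustration of the idea, not the code in `cleanup-failed-provisions`. The record shape, option names, and both function names are hypothetical.

```typescript
// Hypothetical shape of a provisioning record; field names are illustrative only.
interface ProvisionRecord {
  id: string
  status: 'pending' | 'completed' | 'failed'
  startedAt: Date
}

interface CleanupOptions {
  thresholdMinutes: number // how long a 'pending' provision may run before it counts as stuck
  batchLimit: number // maximum number of records handled per run
  dryRun: boolean // when true, report candidates without removing anything
}

// Selects failed provisions plus pending ones older than the threshold, capped at the batch limit.
function selectCleanupCandidates(records: ProvisionRecord[], now: Date, options: CleanupOptions): ProvisionRecord[] {
  const cutoff = now.getTime() - options.thresholdMinutes * 60000
  return records
    .filter((r) => r.status === 'failed' || (r.status === 'pending' && r.startedAt.getTime() < cutoff))
    .slice(0, options.batchLimit)
}

// Returns the ids that were (or, in dry-run mode, would be) cleaned up.
function runCleanup(
  records: ProvisionRecord[],
  now: Date,
  options: CleanupOptions,
  remove: (record: ProvisionRecord) => void,
): string[] {
  const candidates = selectCleanupCandidates(records, now, options)
  if (!options.dryRun) {
    candidates.forEach(remove)
  }
  return candidates.map((r) => r.id)
}
```

Because the dry run returns the same id list as a real run, its output can be reviewed by hand before the job is allowed to delete anything.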

Documentation: See /docs/cleanup-automation.md

Phase 4: Architecture Clarification (Week 2-3)

Task 4.1: Document or Remove Unimplemented Features

Owner: Architecture Team
Duration: 2 days
Dependencies: None

Decision Points:

  1. CrunchyBridge Integration

    • Keep as future feature → Document in roadmap
    • Remove references → Clean up code and docs
  2. Drizzle ORM

    • Plan to implement → Create ADR
    • Remove references → Update documentation
  3. Industry Enrichment

    • Implement basic version → Design API
    • Remove feature → Update UI and docs

Deliverables:

  • Updated README.md
  • Architecture Decision Records (ADRs)
  • Cleaned up codebase

Task 4.2: Remove or Implement master_db

Owner: Backend Team
Duration: 2 days
Dependencies: Task 4.1

Options:

  1. Remove master_db references:

    // Update provisioning_status type
    provisioning_status: {
      stripe: 'pending' | 'completed' | 'failed'
      neon: 'pending' | 'completed' | 'failed'
      // Remove: master_db
    }
  2. Implement basic master_db:

    • Use shared Supabase tables
    • Add industry lookup table
    • Simple REST API for enrichment
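
If option 1 is chosen, stored status objects may still carry a `master_db` entry after the type is narrowed. The sketch below shows one way to express the narrowed type and drop the legacy field. `StepStatus`, `ProvisioningStatus`, and `stripMasterDb` are hypothetical names, and the assumption that legacy records hold a `master_db` status is not confirmed by this plan.

```typescript
type StepStatus = 'pending' | 'completed' | 'failed'

// Narrowed type: only the two provisioning steps that remain.
interface ProvisioningStatus {
  stripe: StepStatus
  neon: StepStatus
}

// Hypothetical migration helper: copies the known fields and leaves master_db behind.
function stripMasterDb(legacy: ProvisioningStatus & { master_db?: StepStatus }): ProvisioningStatus {
  const { stripe, neon } = legacy
  return { stripe, neon }
}
```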

Phase 5: Operational Improvements (Week 3-4)

Task 5.1: Implement Neon API Monitoring

Owner: DevOps Team
Duration: 2 days
Dependencies: None

Implementation:

  1. Add rate limit tracking to atomic-provisioning.ts
  2. Create monitoring dashboard in PostHog
  3. Set up alerts for approaching limits

Code Addition:

// Track API usage
await posthog.capture({
  distinctId: 'system',
  event: 'neon_api_call',
  properties: {
    endpoint: 'create_project',
    remaining_quota: response.headers.get('X-RateLimit-Remaining'),
    reset_time: response.headers.get('X-RateLimit-Reset'),
  },
})
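
For the "approaching limits" alert, the header values need to be parsed and compared against a threshold. The sketch below assumes the two header names used in the snippet above; whether the Neon API actually returns them is not confirmed here. `HeaderReader`, `readRateLimit`, and `isApproachingLimit` are hypothetical names.

```typescript
// Minimal view of a fetch Headers object, so the sketch runs without a real response.
interface HeaderReader {
  get(name: string): string | null
}

interface RateLimitInfo {
  remaining: number | null // null when the header is missing or not a number
  resetTime: string | null // passed through as-is; its format is not assumed
}

function readRateLimit(headers: HeaderReader): RateLimitInfo {
  const raw = headers.get('X-RateLimit-Remaining')
  const parsed = raw === null ? NaN : parseInt(raw, 10)
  return {
    remaining: isNaN(parsed) ? null : parsed,
    resetTime: headers.get('X-RateLimit-Reset'),
  }
}

// Hypothetical threshold check used to decide when to raise an alert.
function isApproachingLimit(info: RateLimitInfo, minRemaining: number): boolean {
  return info.remaining !== null && info.remaining <= minRemaining
}
```

Treating a missing header as `null` rather than `0` avoids false alerts if the API does not send quota information on a given response.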

Task 5.2: Activate PostHog Feature Flags

Owner: Product Team
Duration: 3 days
Dependencies: None

Implementation Plan:

  1. Review defined feature flags

  2. Create rollout strategy

  3. Implement in dashboard component:

    // Uncomment and configure
    const showNewDashboard = useFeatureFlag('NEW_DASHBOARD')
    const betaFeaturesEnabled = useFeatureFlag('BETA_FEATURES')
  4. Set up A/B test for new features

  5. Configure gradual rollout

Feature Flags to Activate:

  • NEW_DASHBOARD - 10% initial rollout
  • BETA_FEATURES - Opt-in for early adopters
  • ADVANCED_ANALYTICS - Premium tier only
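
PostHog evaluates rollout percentages itself, so the plan does not require custom bucketing code. The sketch below only illustrates what a "10% initial rollout" means in practice: each user is mapped to a stable bucket from 0 to 99, and the flag is on for buckets below the rollout percentage. The hash and the names `bucketFor` and `isInRollout` are hypothetical and are not PostHog's algorithm.

```typescript
// Maps a flag/user pair to a stable bucket in the range 0..99
// using a simple polynomial string hash kept within 32 bits.
function bucketFor(flagKey: string, userId: string): number {
  const input = `${flagKey}:${userId}`
  let hash = 0
  for (let i = 0; i < input.length; i++) {
    hash = (hash * 31 + input.charCodeAt(i)) >>> 0
  }
  return hash % 100
}

// A user is in the rollout when their bucket falls below the percentage.
function isInRollout(flagKey: string, userId: string, rolloutPercent: number): boolean {
  return bucketFor(flagKey, userId) < rolloutPercent
}
```

Because the bucket depends only on the flag key and user id, a user who sees the new dashboard at 10% keeps seeing it when the rollout is raised to 25% or 50%.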

Task 5.3: Enhance Health Monitoring

Owner: DevOps Team
Duration: 3 days
Dependencies: Task 5.1

Improvements:

  1. Increase health check frequency for critical tenants
  2. Add performance metrics collection
  3. Create alerting rules for degraded performance
  4. Build operational dashboard

Phase 6: Documentation & Training (Week 4)

Task 6.1: Update All Documentation

Owner: Technical Writing Team
Duration: 3 days
Dependencies: All previous tasks

Documents to Update:

  • README.md
  • CLAUDE.md
  • API documentation
  • Deployment guides
  • Security policies

Task 6.2: Team Training

Owner: Team Leads
Duration: 2 days
Dependencies: Task 6.1

Training Topics:

  • New security procedures
  • PostgreSQL 17 features
  • Monitoring dashboards
  • Feature flag management

Success Metrics

Security

  • Zero exposed API keys in codebase
  • All secrets rotated and secured
  • Security scans passing

Technical Debt

  • 100% PostgreSQL 17 usage ✅ (Completed August 26, 2025)
  • All Edge Functions have Sentry ✅ (Already implemented)
  • Automated cleanup for failed provisions ✅ (Completed August 26, 2025)
  • No references to unimplemented features

Operational

  • Automated cleanup running daily
  • API monitoring dashboard live
  • Feature flags active and measured

Documentation

  • All changes documented
  • Team trained on new procedures
  • Architecture diagrams updated

Risk Mitigation

| Risk                                   | Mitigation Strategy                                        |
| -------------------------------------- | ---------------------------------------------------------- |
| Service disruption during key rotation | Use rolling update strategy, test in staging first         |
| PostgreSQL 17 compatibility issues     | Maintain rollback plan, test thoroughly                    |
| Feature flag bugs                      | Start with small rollout percentage                        |
| Cleanup job deleting valid resources   | Implement dry-run mode, require manual approval initially  |

Communication Plan

Weekly Updates

  • Monday: Standup with progress update
  • Wednesday: Mid-week check-in
  • Friday: End-of-week report

Stakeholder Communication

  • Week 1: Security fixes completed notification
  • Week 2: Technical standardization update
  • Week 3: Feature activation announcement
  • Week 4: Project completion report

Budget & Resources

Team Allocation

  • Security Team: 3 days
  • Backend Team: 10 days
  • DevOps Team: 5 days
  • Architecture Team: 2 days
  • Product Team: 3 days
  • Technical Writing: 3 days

Total Effort: ~26 person-days

Project Timeline

Approval & Sign-off

  • Engineering Manager: ________ Date: ___
  • Security Lead: ________ Date: ___
  • Product Manager: ________ Date: ___
  • CTO: ________ Date: ___

Next Steps:

  1. Review and approve project plan
  2. Assign team members to tasks
  3. Create tracking tickets in project management system
  4. Schedule kick-off meeting
  5. Begin Phase 1 implementation

Project Tracking: Create GitHub Project board with all tasks
Communication Channel: #ocean-issue-resolution in Slack