Source: ocean/docs/adr/022-observability-and-error-tracking.md

ADR-022: Observability and Error Tracking Standards

Date: 2024-08-20

Status

Accepted

Context

We discovered that several Edge Functions lacked proper error tracking:

  1. Some functions had no Sentry initialization
  2. Others used console.error instead of structured logging
  3. Error handling patterns were inconsistent across functions
  4. Error reports lacked context

As a result, errors went untracked or unreported, which made production issues extremely difficult to debug.

Decision

We will implement comprehensive observability standards:

  1. Mandatory Sentry initialization for all Edge Functions
  2. Structured logging using our Logger class
  3. Consistent error handling patterns
  4. Contextual information in all error reports

Implementation Standards

  1. Function Initialization:

    import { initSentry, Logger } from '../_shared/observability.ts'

    // Initialize at function start
    initSentry('function-name')
    const logger = new Logger({ functionName: 'function-name' })
  2. Error Handling:

    try {
      // operation
    } catch (error) {
      logger.error('Operation failed', error, {
        userId,
        organizationId,
        operation: 'specificOperation',
      })
      throw error // Re-throw after logging
    }
  3. Structured Logging:

    // ✅ Good - Structured with context
    logger.info('User provisioned', {
      userId,
      organizationId,
      duration: Date.now() - startTime,
    })

    // ❌ Bad - Unstructured console.log
    console.log(`User ${userId} provisioned`)
  4. Required Context:

    • Function name
    • Organization ID (when available)
    • User ID (when available)
    • Operation being performed
    • Error details and stack traces
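The standards above assume a logger that accepts a message, an optional error, and a context object, and that always attaches the function name. The real Logger class in ../_shared/observability.ts is not shown in this ADR, so the following is only a minimal sketch of what such a logger might look like. The names SketchLogger, LogEntry, LogContext, and the injectable sink parameter are hypothetical and chosen for illustration.

```typescript
// Hypothetical sketch: the actual Logger in ../_shared/observability.ts may differ.
type LogContext = Record<string, unknown>;

interface LogEntry {
  level: "info" | "error";
  message: string;
  functionName: string;
  timestamp: string;
  context: LogContext;
  error?: { name: string; message: string; stack?: string };
}

class SketchLogger {
  private readonly functionName: string;
  private readonly sink: (entry: LogEntry) => void;

  constructor(
    base: { functionName: string },
    // The sink is injectable (an assumption) so output can be captured in tests.
    // By default each entry is written as one JSON line.
    sink: (entry: LogEntry) => void = (entry) => console.log(JSON.stringify(entry)),
  ) {
    this.functionName = base.functionName;
    this.sink = sink;
  }

  info(message: string, context: LogContext = {}): void {
    this.sink(this.entry("info", message, context));
  }

  error(message: string, error: unknown, context: LogContext = {}): void {
    // Normalize non-Error throwables so name, message, and stack are always present.
    const err = error instanceof Error ? error : new Error(String(error));
    this.sink({
      ...this.entry("error", message, context),
      error: { name: err.name, message: err.message, stack: err.stack },
    });
  }

  // Every entry carries the function name and a timestamp alongside caller context.
  private entry(level: LogEntry["level"], message: string, context: LogContext): LogEntry {
    return {
      level,
      message,
      functionName: this.functionName,
      timestamp: new Date().toISOString(),
      context,
    };
  }
}
```

Attaching the function name in the constructor means callers only need to supply the per-call fields from the required context list (organization ID, user ID, operation).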

Consequences

Positive

  • Faster debugging of production issues
  • Better understanding of error patterns
  • Proactive error detection
  • Improved mean time to resolution (MTTR)

Negative

  • Slight performance overhead
  • Additional code in each function
  • Need to manage Sentry costs
  • Privacy considerations for logged data

Cost Optimization

  1. Sampling Strategy:

    • 100% error sampling (applied to errors that pass the filter below)
    • 10% transaction sampling in production
    • Filter out non-critical errors (CORS, validation) before they are reported
  2. Data Scrubbing:

    • Remove PII from error context
    • Redact sensitive headers
    • Mask API keys and passwords
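The sampling and scrubbing rules above can be sketched as a small set of pure helpers that run before an error report is sent. This is an illustration under stated assumptions, not the project's actual implementation: the names SAMPLING, NON_CRITICAL_PATTERNS, SENSITIVE_KEY, shouldSampleTransaction, scrub, and prepareReport are hypothetical, and the list of sensitive keys is an example rather than an agreed policy. In practice this logic would likely be wired into the Sentry SDK's own sampling and pre-send hooks.

```typescript
// Hypothetical helpers: names, patterns, and key lists are illustrative only.
const SAMPLING = {
  errorSampleRate: 1.0, // 100% of errors
  transactionSampleRate: 0.1, // 10% of transactions in production
};

// Errors whose name or message matches one of these are treated as non-critical.
const NON_CRITICAL_PATTERNS: RegExp[] = [/cors/i, /validation/i];

// Keys whose values are redacted wherever they appear in the context (assumed list).
const SENSITIVE_KEY = /authorization|cookie|api[-_]?key|password|secret|token|email/i;

function shouldSampleTransaction(random: number): boolean {
  // `random` is expected to be a value in [0, 1), for example Math.random().
  return random < SAMPLING.transactionSampleRate;
}

function isNonCritical(error: Error): boolean {
  return NON_CRITICAL_PATTERNS.some((p) => p.test(error.name) || p.test(error.message));
}

// Recursively copy a value, replacing anything stored under a sensitive key.
function scrub(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(scrub);
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = SENSITIVE_KEY.test(key) ? "[REDACTED]" : scrub(inner);
    }
    return out;
  }
  return value;
}

// Returns the scrubbed context to report, or null when the error should be dropped.
function prepareReport(
  error: Error,
  context: Record<string, unknown>,
): Record<string, unknown> | null {
  if (isNonCritical(error)) return null;
  return scrub(context) as Record<string, unknown>;
}
```

Matching on key names is a deliberately simple heuristic; it will not catch sensitive values stored under neutral keys or embedded in free-text messages.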

Monitoring Checklist

For each Edge Function:

  • Sentry initialized
  • Logger instance created
  • All errors logged with context
  • No console.error usage
  • Sensitive data scrubbed
  • Performance metrics tracked
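Some of the checklist items can be checked mechanically. The sketch below is a rough, regex-based heuristic that scans a function's source text for the first, second, and fourth items; it is not an existing tool in the repository, and the function name checkEdgeFunctionSource is hypothetical. The remaining items (context on every error, data scrubbing, performance metrics) still need human review.

```typescript
// Hypothetical heuristic check: a text scan, not a real parser, so it can be fooled
// by comments or strings that happen to contain the matched text.
function checkEdgeFunctionSource(source: string): string[] {
  const problems: string[] = [];
  if (!/\binitSentry\s*\(/.test(source)) {
    problems.push("Sentry not initialized");
  }
  if (!/\bnew\s+Logger\s*\(/.test(source)) {
    problems.push("Logger instance not created");
  }
  if (/\bconsole\.error\s*\(/.test(source)) {
    problems.push("console.error usage found");
  }
  return problems;
}
```

An empty result means only that the scanned items passed, not that the function satisfies the whole checklist.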