Source:
ocean/docs/adr/022-observability-and-error-tracking.md
ADR-022: Observability and Error Tracking Standards
Date: 2024-08-20
Status
Accepted
Context
We discovered several Edge Functions were missing proper error tracking:
- Some functions had no Sentry initialization
- Others used console.error instead of structured logging
- Inconsistent error handling patterns
- Missing context in error reports
This made debugging production issues extremely difficult as errors were not being properly tracked or reported.
Decision
We will implement comprehensive observability standards:
- Mandatory Sentry initialization for all Edge Functions
- Structured logging using our Logger class
- Consistent error handling patterns
- Contextual information in all error reports
Implementation Standards
- Function Initialization:
    import { initSentry, Logger } from '../_shared/observability.ts'
    // Initialize at function start
    initSentry('function-name')
    const logger = new Logger({ functionName: 'function-name' })
- Error Handling:
    try {
      // operation
    } catch (error) {
      logger.error('Operation failed', error, {
        userId,
        organizationId,
        operation: 'specificOperation',
      })
      throw error // Re-throw after logging
    }
- Structured Logging:
    // ✅ Good - Structured with context
    logger.info('User provisioned', {
      userId,
      organizationId,
      duration: Date.now() - startTime,
    })
    // ❌ Bad - Unstructured console.log
    console.log(`User ${userId} provisioned`)
- Required Context:
  - Function name
  - Organization ID (when available)
  - User ID (when available)
  - Operation being performed
  - Error details and stack traces
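
This ADR does not show the contents of `_shared/observability.ts`. As a rough illustration of how a `Logger` with the call signatures used above could attach the required context to every entry, here is a minimal sketch. The JSON line format, the `LogContext` and `Sink` types, the optional `sink` constructor argument, and the `NonError` label are all assumptions made for this example, not the project's actual implementation. Forwarding errors to Sentry is left out.

```typescript
// Hypothetical sketch: the real Logger in ../_shared/observability.ts may differ.
type LogContext = Record<string, unknown>
type Sink = (line: string) => void

class Logger {
  private readonly base: LogContext
  private readonly sink: Sink

  // `sink` is an assumed extension point (defaults to stdout) that eases testing.
  constructor(base: LogContext, sink: Sink = (line) => console.log(line)) {
    this.base = base
    this.sink = sink
  }

  info(message: string, context: LogContext = {}): void {
    this.write('info', message, context)
  }

  error(message: string, error: unknown, context: LogContext = {}): void {
    // Non-Error throwables are stringified so the report is never empty.
    const details = error instanceof Error
      ? { name: error.name, message: error.message, stack: error.stack }
      : { name: 'NonError', message: String(error) }
    this.write('error', message, context, details)
  }

  private write(
    level: 'info' | 'error',
    message: string,
    context: LogContext,
    error?: object,
  ): void {
    const entry: Record<string, unknown> = {
      level,
      message,
      timestamp: new Date().toISOString(),
      // Per-call context is merged over the base context (e.g. functionName).
      context: { ...this.base, ...context },
    }
    if (error) entry.error = error
    // One JSON object per line keeps entries machine-parseable.
    this.sink(JSON.stringify(entry))
  }
}
```

With this shape, `new Logger({ functionName: 'function-name' })` carries the function name into every entry, and each call site only supplies what it knows locally, such as `userId`, `organizationId`, and `operation`.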
Consequences
Positive
- Faster debugging of production issues
- Better understanding of error patterns
- Proactive error detection
- Improved mean time to resolution (MTTR)
Negative
- Slight performance overhead
- Additional code in each function
- Need to manage Sentry costs
- Privacy considerations for logged data
Cost Optimization
- Sampling Strategy:
  - 100% error sampling
  - 10% transaction sampling in production
  - Filter out non-critical errors (CORS, validation)
- Data Scrubbing:
  - Remove PII from error context
  - Redact sensitive headers
  - Mask API keys and passwords
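
The sampling and scrubbing rules above could be expressed roughly as follows. Sentry's JavaScript SDKs accept `sampleRate`, `tracesSampleRate`, and a `beforeSend` hook that returns `null` to drop an event. Everything else in this sketch is an assumption for illustration: the local `ErrorEvent` type stands in for the SDK's event type, the `category` tag is a hypothetical way of marking CORS and validation errors, and the header and key lists are placeholders. How these values are wired into `initSentry` is not shown in this ADR.

```typescript
// Hypothetical sketch: a local type stands in for the Sentry SDK's event type.
interface ErrorEvent {
  message?: string
  tags?: { category?: string }
  request?: { headers?: Record<string, string> }
  extra?: Record<string, unknown>
}

// Rates taken from the sampling strategy above.
const sampling = {
  sampleRate: 1.0, // 100% of errors
  tracesSampleRate: 0.1, // 10% of transactions in production
}

// Assumed names; adjust to the headers and fields the functions actually see.
const NON_CRITICAL = new Set(['cors', 'validation'])
const SENSITIVE_HEADERS = new Set(['authorization', 'cookie', 'x-api-key'])
const SENSITIVE_KEYS = /password|secret|token|api[_-]?key/i
const PII_KEYS = new Set(['email', 'phone', 'ipAddress'])

function scrubRecord(record: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {}
  for (const [key, value] of Object.entries(record)) {
    if (PII_KEYS.has(key)) continue // remove PII entirely
    out[key] = SENSITIVE_KEYS.test(key) ? '[REDACTED]' : value // mask secrets
  }
  return out
}

// Shaped like a beforeSend hook: return null to drop, or the scrubbed event.
function beforeSend(event: ErrorEvent): ErrorEvent | null {
  const category = event.tags?.category
  if (category !== undefined && NON_CRITICAL.has(category)) return null

  const scrubbed: ErrorEvent = { ...event }
  if (event.request) {
    const headers: Record<string, string> = {}
    for (const [name, value] of Object.entries(event.request.headers ?? {})) {
      // Header names are case-insensitive, so compare in lower case.
      headers[name] = SENSITIVE_HEADERS.has(name.toLowerCase()) ? '[REDACTED]' : value
    }
    scrubbed.request = { ...event.request, headers }
  }
  if (event.extra) scrubbed.extra = scrubRecord(event.extra)
  return scrubbed
}
```

Filtering inside the hook means dropped events never leave the function, which serves the cost goal as well as the privacy goal. Key-name matching only catches secrets stored under recognizable keys, so values embedded in free-text messages would need separate handling.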
Monitoring Checklist
For each Edge Function:
- Sentry initialized
- Logger instance created
- All errors logged with context
- No console.error usage
- Sensitive data scrubbed
- Performance metrics tracked
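
Some checklist items can be verified mechanically. The sketch below is a hypothetical text-matching check, not an existing project tool: `checkFunctionSource` and `ChecklistResult` are names invented for this example. It covers only the three items that can be detected by pattern, and it can be fooled by aliased imports or by matches inside comments, so it complements review instead of replacing it.

```typescript
// Hypothetical checker: plain text matching over one function's source code.
interface ChecklistResult {
  sentryInitialized: boolean
  loggerCreated: boolean
  noConsoleError: boolean
}

function checkFunctionSource(source: string): ChecklistResult {
  return {
    // Looks for a call such as initSentry('function-name').
    sentryInitialized: /\binitSentry\s*\(/.test(source),
    // Looks for a construction such as new Logger({ ... }).
    loggerCreated: /\bnew\s+Logger\s*\(/.test(source),
    // Passes only when no console.error( call appears anywhere in the text.
    noConsoleError: !/\bconsole\.error\s*\(/.test(source),
  }
}
```

Run over each Edge Function's entry file in CI, a check like this would flag the gaps described in the Context section before they reach production.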