Data Model¶
11 PostgreSQL tables with full schema definitions, indexes, and constraints.
Entity Relationship Diagram¶
erDiagram
Organization ||--o{ Project : "has many"
Organization ||--o{ ProviderConnection : "has many"
Organization ||--o{ BillingPeriod : "has many"
Organization ||--o{ AuditPack : "has many"
Project ||--o{ Workload : "has many"
ProviderConnection ||--o{ Workload : "has many"
Workload ||--o{ TelemetryEvent : "has many"
TelemetryEvent ||--|| CarbonCalculation : "has one"
BillingPeriod ||--o{ CarbonReceipt : "has many"
BillingPeriod ||--o{ PriorPeriodAdjustment : "has many"
CarbonFactors }o--o{ CarbonCalculation : "referenced by"
CarbonFactors {
uuid id PK
string version
string model_tier
jsonb model_patterns
float energy_per_token_prefill_j
float energy_per_token_decode_j
float energy_per_token_cached_j
float pue
float grid_intensity_kg_per_kwh
float uncertainty_pct
timestamp created_at
}
Organization {
uuid id PK
string clerk_org_id UK
string stripe_customer_id UK
enum plan_tier
timestamp created_at
timestamp updated_at
}
Project {
uuid id PK
uuid org_id FK
string name
boolean is_default
timestamp created_at
timestamp updated_at
}
ProviderConnection {
uuid id PK
uuid org_id FK
enum provider
string secrets_manager_arn
enum status
string sync_cursor
timestamp last_polled_at
int consecutive_failures
uuid active_workload_id FK
timestamp created_at
timestamp updated_at
}
Workload {
uuid id PK
uuid project_id FK
uuid connection_id FK
enum workload_type
boolean is_active
timestamp created_at
timestamp updated_at
}
TelemetryEvent {
uuid id PK
uuid workload_id FK
uuid org_id FK
string provider
string model
string idempotency_hash UK
timestamp bucket_start
timestamp bucket_end
bigint input_tokens_uncached
bigint input_tokens_cached
bigint input_tokens_cache_creation
bigint output_tokens
timestamp event_timestamp
jsonb raw_payload
timestamp created_at
}
CarbonCalculation {
uuid id PK
uuid event_id FK
float energy_joules
float energy_kwh
float co2_kg
float co2_lower_bound_kg
float co2_upper_bound_kg
string model_tier
float pue
float grid_intensity
float uncertainty_pct
string factors_version
timestamp created_at
}
BillingPeriod {
uuid id PK
uuid org_id FK
date period_start
date period_end
enum status
float total_co2_kg
float total_credits_retired
string stripe_invoice_id
timestamp closed_at
timestamp created_at
timestamp updated_at
}
CarbonReceipt {
uuid id PK
uuid billing_period_id FK
uuid org_id FK
string serial_number UK
float co2_retired_kg
jsonb credit_serial_numbers
string payload_hash
string signature
string public_key
int key_version
string pdf_url
string verification_url
timestamp created_at
}
AuditPack {
uuid id PK
uuid org_id FK
int year
int month
string s3_key
string manifest_hash
timestamp generated_at
timestamp created_at
}
PriorPeriodAdjustment {
uuid id PK
uuid org_id FK
uuid original_period_id FK
uuid adjustment_period_id FK
uuid event_id FK
float co2_delta_kg
string reason
timestamp created_at
}
Entity Definitions¶
Core Entities¶
Organization¶
The top-level account entity. Maps 1:1 to a Clerk organization and a Stripe customer.
| Field | Type | Constraints | Notes |
|---|---|---|---|
| id | UUID | PK, default gen | |
| clerk_org_id | String | Unique, Not Null | From Clerk JWT org_id claim |
| stripe_customer_id | String | Unique, Nullable | Set on first billing interaction |
| plan_tier | Enum(free/starter/growth/scale/enterprise) | Not Null, Default: free | |
| created_at | Timestamp(tz) | Not Null, auto | |
| updated_at | Timestamp(tz) | Not Null, auto-update |
- Indexes: ix_org_clerk_org_id (unique), ix_org_stripe_customer_id (unique)
Project¶
Logical grouping of workloads within an organization. Each org gets a "Default" project auto-created.
| Field | Type | Constraints | Notes |
|---|---|---|---|
| id | UUID | PK | |
| org_id | UUID | FK → Organization, Not Null | |
| name | String(200) | Not Null | |
| is_default | Boolean | Not Null, Default: false | True for auto-created project |
| created_at | Timestamp(tz) | Not Null | |
| updated_at | Timestamp(tz) | Not Null |
- Indexes: uq_project_org_name (unique composite: org_id + name)
Provider Entities¶
ProviderConnection¶
A registered API key for an AI provider. The actual key is stored in AWS Secrets Manager; only the ARN is in the database.
| Field | Type | Constraints | Notes |
|---|---|---|---|
| id | UUID | PK | |
| org_id | UUID | FK → Organization, Not Null | |
| provider | Enum(openai/anthropic/openrouter) | Not Null | |
| secrets_manager_arn | String(500) | Not Null | ARN of the Secrets Manager secret |
| status | Enum(active/error/disabled/validating) | Not Null, Default: validating | |
| sync_cursor | String(500) | Nullable | Last-seen timestamp or page token |
| last_polled_at | Timestamp(tz) | Nullable | |
| consecutive_failures | Integer | Not Null, Default: 0 | Reset on success, disable at 5 |
| active_workload_id | UUID | FK → Workload, Nullable | Currently active workload routing |
| created_at | Timestamp(tz) | Not Null | |
| updated_at | Timestamp(tz) | Not Null |
- Indexes: ix_conn_org_id, ix_conn_status
- State transitions: validating → active (on key validation success) → error (on transient failure) → active (on recovery) or disabled (on 5+ permanent failures)
Workload¶
Links a connection to a project. A connection can have multiple workloads over time (when re-mapped), but only one is active.
| Field | Type | Constraints | Notes |
|---|---|---|---|
| id | UUID | PK | |
| project_id | UUID | FK → Project, Not Null | |
| connection_id | UUID | FK → ProviderConnection, Not Null | |
| workload_type | Enum(inference) | Not Null, Default: inference | Extensible for future types |
| is_active | Boolean | Not Null, Default: true | Only one active per connection |
| created_at | Timestamp(tz) | Not Null | |
| updated_at | Timestamp(tz) | Not Null |
- Indexes: ix_workload_connection_id, ix_workload_project_id
Telemetry Entities¶
TelemetryEvent¶
Usage record from a provider poll. Token counts are split into four categories. Core identity fields (hash, model, timestamps) are immutable; token counts may be updated via upsert during reconciliation.
| Field | Type | Constraints | Notes |
|---|---|---|---|
| id | UUID | PK | |
| workload_id | UUID | FK → Workload, Not Null | |
| org_id | UUID | FK → Organization, Not Null | Denormalized for query performance |
| provider | String(50) | Not Null | Denormalized from connection |
| model | String(200) | Not Null | e.g., "gpt-4o" |
| idempotency_hash | String(64) | Unique, Not Null | SHA-256 hex of provider:org:model:bucket |
| bucket_start | Timestamp(tz) | Not Null | Start of the usage aggregation window |
| bucket_end | Timestamp(tz) | Not Null | End of the usage aggregation window |
| input_tokens_uncached | BigInteger | Not Null, Default: 0 | |
| input_tokens_cached | BigInteger | Not Null, Default: 0 | Anthropic cache_read only |
| input_tokens_cache_creation | BigInteger | Not Null, Default: 0 | Anthropic cache_creation only |
| output_tokens | BigInteger | Not Null, Default: 0 | |
| event_timestamp | Timestamp(tz) | Not Null | For billing period assignment |
| raw_payload | JSONB | Nullable | Original provider response for debugging |
| created_at | Timestamp(tz) | Not Null |
- Indexes: uq_telemetry_idempotency_hash (unique), ix_telemetry_org_event_ts (org_id + event_timestamp), ix_telemetry_workload_id, ix_telemetry_model
Immutability
Only input_tokens_*, output_tokens, and raw_payload may be updated via upsert (on idempotency_hash conflict). The hash, model, bucket_start, and event_timestamp are never modified after creation.
CarbonCalculation¶
1:1 with TelemetryEvent. Stores the full emissions breakdown with the factors version used.
| Field | Type | Constraints | Notes |
|---|---|---|---|
| id | UUID | PK | |
| event_id | UUID | FK → TelemetryEvent, Unique, Not Null | 1:1 relationship |
| energy_joules | Float | Not Null | Total energy across all phases |
| energy_kwh | Float | Not Null | energy_joules / 3,600,000 |
| co2_kg | Float | Not Null | Central estimate |
| co2_lower_bound_kg | Float | Not Null | co2_kg × (1 - uncertainty) |
| co2_upper_bound_kg | Float | Not Null | co2_kg × (1 + uncertainty) |
| model_tier | String(50) | Not Null | Tier from fnmatch mapping |
| pue | Float | Not Null | Power Usage Effectiveness |
| grid_intensity | Float | Not Null | kg CO2 per kWh |
| uncertainty_pct | Float | Not Null | Uncertainty percentage |
| factors_version | String(20) | Not Null | e.g., "v1.0" |
| created_at | Timestamp(tz) | Not Null |
- Indexes: uq_calc_event_id (unique), ix_calc_factors_version
Billing Entities¶
BillingPeriod¶
Monthly aggregation scoped by event_timestamp. Lifecycle managed via state machine.
| Field | Type | Constraints | Notes |
|---|---|---|---|
| id | UUID | PK | |
| org_id | UUID | FK → Organization, Not Null | |
| period_start | Date | Not Null | First day of billing month |
| period_end | Date | Not Null | Last day of billing month |
| status | Enum(open/closing/closed/failed) | Not Null, Default: open | |
| total_co2_kg | Float | Nullable | Aggregated at close |
| total_credits_retired | Float | Nullable | kg CO2 equivalent retired |
| stripe_invoice_id | String(200) | Nullable | |
| closed_at | Timestamp(tz) | Nullable | When receipt was generated |
| created_at | Timestamp(tz) | Not Null | |
| updated_at | Timestamp(tz) | Not Null |
- Indexes: uq_billing_org_period (unique composite: org_id + period_start), ix_billing_status
State machine
open → closing (payment succeeded) → closed (T+48h, receipt generated) or failed (payment failed / no inventory)
CarbonReceipt¶
Signed retirement proof. Immutable after creation.
| Field | Type | Constraints | Notes |
|---|---|---|---|
| id | UUID | PK | |
| billing_period_id | UUID | FK → BillingPeriod, Not Null | |
| org_id | UUID | FK → Organization, Not Null | |
| serial_number | String(20) | Unique, Not Null | Format: CL-YYYYMM-XXXXX |
| co2_retired_kg | Float | Not Null | |
| credit_serial_numbers | JSONB | Not Null | Array of credit serial numbers |
| payload_hash | String(64) | Not Null | SHA-256 hex of canonical payload |
| signature | String(128) | Not Null | Ed25519 signature hex |
| public_key | String(64) | Not Null | Ed25519 public key hex |
| key_version | Integer | Not Null | For key rotation |
| pdf_url | String(500) | Nullable | S3 presigned URL or path |
| verification_url | String(500) | Not Null | Public verification link |
| created_at | Timestamp(tz) | Not Null |
- Indexes: uq_receipt_serial (unique), ix_receipt_org_id, ix_receipt_billing_period_id
Reference & Compliance Entities¶
CarbonFactors¶
Versioned global lookup table. Immutable per version - new versions are appended.
| Field | Type | Constraints | Notes |
|---|---|---|---|
| id | UUID | PK | |
| version | String(20) | Not Null | e.g., "v1.0" |
| model_tier | String(50) | Not Null | e.g., "tier_1" |
| model_patterns | ARRAY(String) | Not Null | fnmatch globs for model matching |
| energy_per_token_prefill_j | Float | Not Null | Joules per prefill token |
| energy_per_token_decode_j | Float | Not Null | Joules per decode (output) token |
| energy_per_token_cached_j | Float | Not Null | Joules per cached token (~10% of prefill) |
| pue | Float | Not Null | Power Usage Effectiveness (typically 1.1-1.4) |
| grid_intensity_kg_per_kwh | Float | Not Null | kg CO2 per kWh |
| uncertainty_pct | Float | Not Null | Uncertainty percentage |
| created_at | Timestamp(tz) | Not Null |
- Indexes: uq_factors_version_tier (unique composite: version + model_tier)
AuditPack¶
Monthly zip bundle for compliance reporting.
| Field | Type | Constraints | Notes |
|---|---|---|---|
| id | UUID | PK | |
| org_id | UUID | FK → Organization, Not Null | |
| year | Integer | Not Null | |
| month | Integer | Not Null | |
| s3_key | String(500) | Not Null | S3 object key for the zip |
| manifest_hash | String(64) | Not Null | SHA-256 of zip contents manifest |
| generated_at | Timestamp(tz) | Not Null | |
| created_at | Timestamp(tz) | Not Null |
- Indexes: uq_audit_org_year_month (unique composite: org_id + year + month)
PriorPeriodAdjustment¶
Late-arriving telemetry that belongs to a closed billing period is recorded here and rolled into the next open period.
| Field | Type | Constraints | Notes |
|---|---|---|---|
| id | UUID | PK | |
| org_id | UUID | FK → Organization, Not Null | |
| original_period_id | UUID | FK → BillingPeriod, Not Null | The closed period this data belongs to |
| adjustment_period_id | UUID | FK → BillingPeriod, Not Null | The open period where adjustment is applied |
| event_id | UUID | FK → TelemetryEvent, Not Null | |
| co2_delta_kg | Float | Not Null | Positive = undercount, negative = overcount |
| reason | String(500) | Not Null | e.g., "late_telemetry" |
| created_at | Timestamp(tz) | Not Null |
- Indexes: ix_ppa_org_id, ix_ppa_original_period, ix_ppa_adjustment_period
Key Design Decisions¶
- UUIDs everywhere: All primary keys are UUIDv4. Avoids sequential ID enumeration attacks. Generated by the application (not the database) for testability.
- Denormalized org_id on TelemetryEvent: Avoids a 3-table join (telemetry → workload → connection → org) on every dashboard query. The org_id is set at ingest time and never changes.
- JSONB for raw_payload and credit_serial_numbers: Flexible schema for provider-specific data and variable-length credit arrays without additional junction tables.
- Timestamp(tz) everywhere: All timestamps are timezone-aware UTC. No naive timestamps.
- Soft state machine over soft delete: BillingPeriod and ProviderConnection use status enums with defined transitions rather than soft delete flags. Records are never deleted - only their status changes.