Cost Calculation

How Lumina calculates per-instance costs using Savings Plans and Reserved Instance allocation

This page describes the algorithms Lumina uses to calculate per-instance costs, including how AWS Savings Plans and Reserved Instances are allocated. It also documents known limitations and differences from AWS’s actual billing.

Overview

Lumina approximates AWS’s cost allocation algorithm to estimate how Savings Plans (SPs) and Reserved Instances (RIs) are applied to running EC2 instances. The goal is real-time cost visibility for Kubernetes capacity management, not an exact reproduction of AWS billing.

Key Concepts

Rate-Based vs Cumulative:

  • Lumina uses an instantaneous rate-based model ($/hour snapshot)
  • AWS billing uses cumulative tracking within each billing hour (AWS Billing documentation)
  • Lumina’s costs are therefore estimates that assume the currently running instances keep running
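
To make the distinction concrete, here is a minimal sketch that contrasts a rate snapshot with a charge that accrues over one hour. The function names are hypothetical and the arithmetic is deliberately simplified; it is not Lumina's code and it does not model AWS's billing rules in detail.

```go
package main

import "fmt"

// snapshotRate is the $/hour figure a rate-based model reports while an
// instance is running, regardless of how long it has been up.
func snapshotRate(hourlyPrice float64) float64 {
	return hourlyPrice
}

// accruedCharge is what accumulates over one billing hour for an instance
// that ran for minutesRun out of 60 minutes.
func accruedCharge(hourlyPrice float64, minutesRun int) float64 {
	return hourlyPrice * float64(minutesRun) / 60
}

func main() {
	const onDemand = 1.00 // m5.xlarge ShelfPrice used in this page's examples
	fmt.Printf("snapshot while running: $%.2f/hr\n", snapshotRate(onDemand))
	fmt.Printf("accrued after 30 minutes: $%.2f\n", accruedCharge(onDemand, 30))
}
```

An instance that stops halfway through the hour still shows the full hourly rate in a snapshot taken while it runs, which is why the two models can disagree.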

ShelfPrice vs EffectiveCost:

  • ShelfPrice: On-demand rate with no discounts (e.g., $1.00/hr for m5.xlarge)
  • EffectiveCost: Actual estimated cost after all discounts (e.g., $0.34/hr with SP)

Priority Order

AWS applies discounts in strict priority order (AWS Savings Plans application order):

  1. Spot Pricing – Spot instances always pay spot market rate (no RIs/SPs apply)
  2. Reserved Instances (RIs) – Applied before any Savings Plan, to exact instance type + AZ matches
  3. EC2 Instance Savings Plans – Applied to specific instance family + region
  4. Compute Savings Plans – Applied to any instance family, any region
  5. On-Demand – Remaining uncovered usage pays full on-demand rates
graph TD
    classDef spot fill:#E8F0FE,stroke:#4285F4,color:#333
    classDef ri fill:#E6F4EA,stroke:#34A853,color:#333
    classDef sp fill:#FFF3E0,stroke:#FB8C00,color:#333
    classDef od fill:#FCE4EC,stroke:#E91E63,color:#333
    classDef check fill:#F3E5F5,stroke:#9C27B0,color:#333

    START["EC2 Instance"]:::check
    IS_SPOT{"Spot instance?"}:::check
    SPOT_RATE["Pay spot market rate"]:::spot
    HAS_RI{"Matching RI<br/>available?"}:::check
    RI_RATE["RI-covered<br/>EffectiveCost = $0"]:::ri
    HAS_EC2SP{"EC2 Instance SP<br/>with capacity?"}:::check
    EC2SP_RATE["EC2 Instance SP rate"]:::sp
    HAS_CSP{"Compute SP<br/>with capacity?"}:::check
    CSP_RATE["Compute SP rate"]:::sp
    OD_RATE["On-Demand rate"]:::od

    START --> IS_SPOT
    IS_SPOT -->|Yes| SPOT_RATE
    IS_SPOT -->|No| HAS_RI
    HAS_RI -->|Yes| RI_RATE
    HAS_RI -->|No| HAS_EC2SP
    HAS_EC2SP -->|Yes| EC2SP_RATE
    HAS_EC2SP -->|No| HAS_CSP
    HAS_CSP -->|Yes| CSP_RATE
    HAS_CSP -->|No| OD_RATE

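Lumina applies discounts in this same order. Where its matching rules are narrower than AWS’s (for example, Regional RIs and RI size flexibility), see Known Limitations.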
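
The priority order above can be read as a first-match decision. The following is a minimal sketch under that reading. The `Instance` struct, its boolean fields, and the `"spot"` label are assumptions made for illustration; the other labels are the CoverageType values used on this page.

```go
package main

import "fmt"

// Instance carries only the facts the decision needs. Field names are
// illustrative and are not Lumina's actual types.
type Instance struct {
	IsSpot               bool
	HasMatchingRI        bool // an unused RI matches type, AZ, and account
	EC2SPHasCapacity     bool // a matching EC2 Instance SP has commitment left
	ComputeSPHasCapacity bool // a Compute SP has commitment left
}

// coverageType walks the priority list top to bottom and returns the first
// pricing category that applies.
func coverageType(i Instance) string {
	switch {
	case i.IsSpot:
		return "spot" // assumed label; RIs and SPs never apply to spot
	case i.HasMatchingRI:
		return "reserved_instance"
	case i.EC2SPHasCapacity:
		return "ec2_instance_savings_plan"
	case i.ComputeSPHasCapacity:
		return "compute_savings_plan"
	default:
		return "on_demand"
	}
}

func main() {
	// A spot instance stays on spot pricing even if an RI would match.
	fmt.Println(coverageType(Instance{IsSpot: true, HasMatchingRI: true}))
	// Both SP types have capacity: the EC2 Instance SP is checked first.
	fmt.Println(coverageType(Instance{EC2SPHasCapacity: true, ComputeSPHasCapacity: true}))
	// Nothing applies: full on-demand rate.
	fmt.Println(coverageType(Instance{}))
}
```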

Reserved Instances Algorithm

AWS Documentation: How Reserved Instances are applied

Matching Rules

Reserved Instances match based on (AWS RI Matching Rules):

  • Instance Type: Exact match (e.g., RI for m5.xlarge only covers m5.xlarge)
  • Availability Zone: Exact match (e.g., RI in us-west-2a only covers us-west-2a)
  • Account: RIs only apply within the same AWS account
  • Lifecycle: RIs do NOT apply to spot instances
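
As a sketch, the four rules above reduce to one predicate. The struct and function names here are hypothetical, and the account ID is a placeholder.

```go
package main

import "fmt"

// ReservedInstance and Instance are illustrative types holding only the
// attributes the matching rules mention.
type ReservedInstance struct {
	InstanceType     string
	AvailabilityZone string
	AccountID        string
}

type Instance struct {
	InstanceType     string
	AvailabilityZone string
	AccountID        string
	IsSpot           bool
}

// riMatches reports whether the RI may cover the instance: exact type,
// exact AZ, same account, and never a spot instance.
func riMatches(ri ReservedInstance, inst Instance) bool {
	return !inst.IsSpot &&
		ri.InstanceType == inst.InstanceType &&
		ri.AvailabilityZone == inst.AvailabilityZone &&
		ri.AccountID == inst.AccountID
}

func main() {
	ri := ReservedInstance{"m5.xlarge", "us-west-2a", "111111111111"}
	fmt.Println(riMatches(ri, Instance{"m5.xlarge", "us-west-2a", "111111111111", false})) // true
	fmt.Println(riMatches(ri, Instance{"m5.xlarge", "us-west-2b", "111111111111", false})) // false: different AZ
	fmt.Println(riMatches(ri, Instance{"m5.xlarge", "us-west-2a", "111111111111", true}))  // false: spot
}
```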

Allocation Algorithm

1. Group RIs by (instance_type, availability_zone, account_id)
2. For each group:
   a. Find all matching running instances (not spot)
   b. Sort instances by launch time (oldest first)
   c. Apply RI coverage to oldest instances until RI count exhausted
3. Mark covered instances:
   - EffectiveCost = $0 (RIs are pre-paid)
   - RICoverage = ShelfPrice (what the RI contributed)
   - CoverageType = "reserved_instance"
graph TD
    classDef step fill:#E8F0FE,stroke:#4285F4,color:#333
    classDef decision fill:#F3E5F5,stroke:#9C27B0,color:#333
    classDef result fill:#E6F4EA,stroke:#34A853,color:#333

    GROUP["Group RIs by<br/>instance_type + AZ + account"]:::step
    FIND["Find matching running instances<br/>(exclude spot)"]:::step
    SORT["Sort instances by launch time<br/>(oldest first)"]:::step
    HAS_RI{"RI count<br/>remaining?"}:::decision
    APPLY["Apply RI coverage<br/>EffectiveCost = $0"]:::result
    NEXT["Move to next instance"]:::step
    DONE["Remaining instances<br/>eligible for SP coverage"]:::result

    GROUP --> FIND --> SORT --> HAS_RI
    HAS_RI -->|Yes| APPLY --> NEXT --> HAS_RI
    HAS_RI -->|No| DONE

RI coverage is binary: An instance is either fully RI-covered or not covered at all.

Example

Scenario: 5 RIs for m5.xlarge in us-west-2a, 10 running instances

Result:

  • 5 oldest instances: RI-covered (EffectiveCost = $0)
  • 5 newest instances: Not RI-covered (eligible for SP coverage next)
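
The example can be reproduced with a short sketch of the oldest-first allocation. It assumes the instances passed in already belong to one (instance_type, availability_zone, account_id) group and exclude spot. The type and function names, and the use of Unix seconds for launch time, are illustrative choices rather than Lumina's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// Instance is an illustrative type for one running, non-spot instance.
type Instance struct {
	ID         string
	LaunchUnix int64 // launch time as Unix seconds
	RICovered  bool
}

// applyRIs sorts the group oldest first, marks up to riCount instances as
// RI-covered, and returns how many were covered. Coverage is binary.
func applyRIs(group []Instance, riCount int) int {
	sort.SliceStable(group, func(a, b int) bool {
		return group[a].LaunchUnix < group[b].LaunchUnix
	})
	covered := 0
	for i := range group {
		if covered >= riCount {
			break
		}
		group[i].RICovered = true
		covered++
	}
	return covered
}

func main() {
	// 10 running instances, appended newest first so the sort has work to do.
	var group []Instance
	for n := 10; n >= 1; n-- {
		group = append(group, Instance{ID: fmt.Sprintf("i-%02d", n), LaunchUnix: int64(n) * 3600})
	}
	covered := applyRIs(group, 5) // 5 RIs in this group
	fmt.Println("RI-covered instances:", covered)
	for _, inst := range group {
		fmt.Printf("%s covered=%v\n", inst.ID, inst.RICovered)
	}
}
```

Instances left with `RICovered == false` are the ones that move on to Savings Plans allocation.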

Savings Plans Algorithm

AWS Documentation: How Savings Plans apply to your AWS usage

SP Types

EC2 Instance Savings Plans (docs):

  • Apply to specific instance family (e.g., “m5”) in specific region (e.g., “us-west-2”)
  • Higher priority than Compute SPs
  • Example: SP for “m5 in us-west-2” covers m5.large, m5.xlarge, m5.2xlarge, etc.

Compute Savings Plans (docs):

  • Apply to ANY instance family in ANY region
  • Lower priority than EC2 Instance SPs
  • Example: SP covers m5, c5, r5, across all regions

Matching Rules

Savings Plans match based on (AWS SP application rules):

  • Instance Family: EC2 Instance SPs require matching family; Compute SPs match all
  • Region: EC2 Instance SPs require matching region; Compute SPs match all regions
  • Account: SPs apply within the same AWS account
  • Lifecycle: SPs do NOT apply to spot instances
  • Existing Coverage: See Simplified Model Decisions
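
A minimal sketch of these rules as an eligibility check follows. The `PlanKind` constants, struct fields, and function name are hypothetical; the `SPCovered` check reflects the simplified one-SP-per-instance model described under Simplified Model Decisions.

```go
package main

import "fmt"

// PlanKind distinguishes the two Savings Plan types discussed on this page.
type PlanKind int

const (
	EC2InstanceSP PlanKind = iota
	ComputeSP
)

// SavingsPlan and Instance are illustrative types, not Lumina's own.
type SavingsPlan struct {
	Kind           PlanKind
	InstanceFamily string // only meaningful for EC2InstanceSP
	Region         string // only meaningful for EC2InstanceSP
	AccountID      string
}

type Instance struct {
	InstanceFamily string
	Region         string
	AccountID      string
	IsSpot         bool
	RICovered      bool
	SPCovered      bool
}

// eligible reports whether the plan may cover the instance.
func eligible(sp SavingsPlan, inst Instance) bool {
	if inst.IsSpot || inst.RICovered || inst.SPCovered {
		return false
	}
	if sp.AccountID != inst.AccountID {
		return false
	}
	if sp.Kind == EC2InstanceSP {
		return sp.InstanceFamily == inst.InstanceFamily && sp.Region == inst.Region
	}
	return true // Compute SPs match any family in any region
}

func main() {
	m5West := SavingsPlan{Kind: EC2InstanceSP, InstanceFamily: "m5", Region: "us-west-2", AccountID: "111111111111"}
	compute := SavingsPlan{Kind: ComputeSP, AccountID: "111111111111"}
	c5 := Instance{InstanceFamily: "c5", Region: "us-east-1", AccountID: "111111111111"}

	fmt.Println(eligible(m5West, c5))  // false: wrong family and region
	fmt.Println(eligible(compute, c5)) // true
}
```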

Allocation Algorithm

For each Savings Plan (in priority order: EC2 Instance SPs first, then Compute SPs):

1. Find all eligible instances:
   - Match SP criteria (family, region)
   - Not spot instances
   - Not already RI-covered
   - Not already SP-covered (simplified model)

2. Calculate savings for each instance:
   - ShelfPrice (on-demand rate)
   - SP Rate (discounted rate)
   - Savings % = (ShelfPrice - SP Rate) / ShelfPrice

3. Sort instances by priority:
   a. Highest savings % first (maximize cost reduction) - AWS behavior
   b. Tie-breaker: lowest SP rate first (stretch commitment further) - AWS behavior
   c. Tie-breaker: oldest launch time (stability) - Lumina-specific for stable metrics
   d. Tie-breaker: instance ID (determinism) - Lumina-specific for stable metrics

   Note: Tie-breakers (c) and (d) are Lumina's own decisions, not AWS's
   documented behavior. They ensure SP allocation remains consistent
   across reconciliation loops (every 5 minutes).

4. Apply SP coverage in priority order:
   For each instance:
     a. Calculate SP contribution = min(SP rate, remaining commitment)
     b. If commitment exhausted (partial coverage):
        - SP contributes what's left
        - Instance pays: (ShelfPrice - SP contribution) at on-demand rate
     c. Update instance:
        - EffectiveCost = SP rate (if full) OR on-demand spillover (if partial)
        - SavingsPlanCoverage = SP contribution (what SP paid)
        - CoverageType = "compute_savings_plan" or "ec2_instance_savings_plan"
     d. Consume SP commitment

5. Track SP utilization:
   - CurrentUtilizationRate = commitment consumed
   - RemainingCapacity = commitment - utilization
   - UtilizationPercent = (utilization / commitment) * 100
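
The ordering in step 3 can be expressed as a single comparator. This is a sketch under the assumption that each candidate carries its ShelfPrice, SP rate, launch time, and instance ID. The `Candidate` type and function names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Candidate is an illustrative type for one instance eligible for a plan.
type Candidate struct {
	InstanceID string
	ShelfPrice float64 // on-demand $/hr
	SPRate     float64 // discounted $/hr under the plan
	LaunchUnix int64   // launch time as Unix seconds
}

// savingsPercent implements (ShelfPrice - SP Rate) / ShelfPrice.
func savingsPercent(c Candidate) float64 {
	return (c.ShelfPrice - c.SPRate) / c.ShelfPrice
}

// sortByPriority orders candidates by steps 3a-3d. Exact float comparison
// is used for ties here; production code may prefer a tolerance.
func sortByPriority(cs []Candidate) {
	sort.Slice(cs, func(i, j int) bool {
		a, b := cs[i], cs[j]
		if sa, sb := savingsPercent(a), savingsPercent(b); sa != sb {
			return sa > sb // (a) highest savings % first
		}
		if a.SPRate != b.SPRate {
			return a.SPRate < b.SPRate // (b) lowest SP rate first
		}
		if a.LaunchUnix != b.LaunchUnix {
			return a.LaunchUnix < b.LaunchUnix // (c) oldest first
		}
		return a.InstanceID < b.InstanceID // (d) deterministic fallback
	})
}

func main() {
	cs := []Candidate{
		{"i-0b", 1.00, 0.72, 200},
		{"i-0a", 1.00, 0.72, 100},
		{"i-0c", 1.00, 0.60, 300},
	}
	sortByPriority(cs)
	for _, c := range cs {
		fmt.Printf("%s savings=%.0f%%\n", c.InstanceID, savingsPercent(c)*100)
	}
}
```

Because the final tie-breaker is the instance ID, two runs over the same inputs always produce the same order, which is the stability property the note in step 3 asks for.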

Full vs Partial Coverage

Full Coverage

Setup: Instance needs $0.72/hr, SP has $60.00/hr remaining

  • SP contributes: $0.72 (full SP rate)
  • Instance pays: $0.72 (fully discounted)
  • SP remaining: $59.28
  • No on-demand spillover

Partial Coverage (SP Exhaustion)

Setup: Instance needs $0.72/hr, but SP only has $0.10/hr remaining

  • SP contributes: $0.10 (all it has left)
  • Instance pays: $0.90 at the on-demand rate ($1.00 ShelfPrice - $0.10 SP contribution)
  • SP remaining: $0.00 (exhausted)

The EffectiveCost metric ($0.90) is higher than the SP contribution ($0.10) because it includes on-demand spillover. This is why sum(ec2_instance_hourly_cost) >= sum(savings_plan_current_utilization_rate). The difference represents real on-demand costs from partial coverage.
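
Steps 4a and 4b of the allocation algorithm can be sketched as one function. It follows the steps as written: the contribution is the smaller of the SP rate and the remaining commitment, and the spillover under partial coverage is ShelfPrice minus that contribution. The function name is hypothetical, the $0.24 in the partial case is an illustrative figure, and how the returned values are combined into the EffectiveCost metric is left out.

```go
package main

import (
	"fmt"
	"math"
)

// applySP returns what the plan contributes toward one instance, the
// commitment left afterwards, and the on-demand spillover. Spillover is
// zero when the plan covers the full SP rate.
func applySP(shelfPrice, spRate, remaining float64) (contribution, remainingAfter, spillover float64) {
	contribution = math.Min(spRate, remaining)
	remainingAfter = remaining - contribution
	if contribution < spRate {
		// Partial coverage: the rest is charged at the on-demand rate.
		spillover = shelfPrice - contribution
	}
	return contribution, remainingAfter, spillover
}

func main() {
	// Full coverage: m5.xlarge ($1.00 ShelfPrice, $0.72 SP rate), $60.00 left.
	c, rem, spill := applySP(1.00, 0.72, 60.00)
	fmt.Printf("full:    SP pays $%.2f, SP remaining $%.2f, spillover $%.2f\n", c, rem, spill)

	// Partial coverage: the same instance when only $0.24 is left.
	c, rem, spill = applySP(1.00, 0.72, 0.24)
	fmt.Printf("partial: SP pays $%.2f, SP remaining $%.2f, spillover $%.2f\n", c, rem, spill)
}
```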

Example: Large-Scale SP Allocation

Scenario: $60/hr Compute SP commitment, 28% discount (0.72 multiplier), 200 m5.xlarge instances ($1.00 OD, $0.72 SP rate)

Allocation breakdown:

  1. Sort instances by savings priority (highest % first)
  2. Cover first 83 instances fully: 83 x $0.72 = $59.76 consumed
  3. Instance #84 gets partial coverage:
    • SP contributes: $0.24 (all that remains)
    • On-demand spillover: $0.76 (remainder at OD rate)
    • EffectiveCost: $1.00 ($0.24 SP + $0.76 OD)
  4. Remaining 116 instances: On-demand ($1.00 each, no SP coverage)

SP Metrics:

  • Commitment: $60.00/hr
  • Utilization: $60.00/hr (100%)
  • Remaining: $0.00/hr

Instance Cost Metrics:

  • Instances 1-83: ec2_instance_hourly_cost{cost_type="compute_savings_plan"} = 0.72 each (83 x $0.72 = $59.76)
  • Instance 84: ec2_instance_hourly_cost{cost_type="compute_savings_plan"} = 1.00 (includes $0.76 OD spillover)
  • Instances 85-200: ec2_instance_hourly_cost{cost_type="on_demand"} = 1.00 each (116 x $1.00 = $116.00)

Total instance costs: $59.76 + $1.00 + $116.00 = $176.76/hr
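
The coverage counts in this example can be checked with a short simulation. The sketch below works in integer cents to avoid floating-point drift and assumes all instances share one SP rate, as in the scenario. The `Result` type and `allocate` function are hypothetical names.

```go
package main

import "fmt"

// Result summarizes one allocation pass over identical instances.
type Result struct {
	FullyCovered  int // instances covered at the full SP rate
	PartialCents  int // SP cents given to the one partially covered instance
	OnDemand      int // instances with no SP coverage
	UtilizedCents int // commitment consumed
}

// allocate walks instances in priority order and consumes commitment
// until it is exhausted.
func allocate(commitmentCents, spRateCents, instanceCount int) Result {
	var r Result
	remaining := commitmentCents
	for n := 0; n < instanceCount; n++ {
		switch {
		case remaining >= spRateCents:
			r.FullyCovered++
			remaining -= spRateCents
		case remaining > 0:
			r.PartialCents = remaining // plan contributes what is left
			remaining = 0
		default:
			r.OnDemand++
		}
	}
	r.UtilizedCents = commitmentCents - remaining
	return r
}

func main() {
	// $60.00/hr commitment, $0.72 SP rate, 200 m5.xlarge instances.
	r := allocate(6000, 72, 200)
	fmt.Printf("fully covered: %d\n", r.FullyCovered)
	fmt.Printf("partial SP contribution: $%.2f\n", float64(r.PartialCents)/100)
	fmt.Printf("on-demand only: %d\n", r.OnDemand)
	fmt.Printf("utilization: %d%%\n", r.UtilizedCents*100/6000)
}
```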

Simplified Model Decisions

Lumina uses a simplified Savings Plans model that differs from AWS’s actual billing in one critical way.

One SP Per Instance Rule

AWS’s actual behavior:

  • Multiple Savings Plans can apply to the same instance
  • Example: EC2 Instance SP covers part, Compute SP covers the rest

Lumina’s simplified model:

  • Once a Savings Plan covers an instance, no other SPs can apply
  • This prevents double-counting and commitment waste

Why this simplification?

  1. Prevents commitment accounting errors – Allowing multiple SPs to apply to the same instance caused bugs in which SP utilization was correct but instance costs were artificially low (double-discounted)
  2. Operational simplicity – Easier to reason about which SP is covering which instance
  3. Rate-based limitation – Lumina’s instantaneous snapshot model makes multi-SP allocation complex
  4. Minimal practical impact – In most AWS organizations, SPs are sized to fully cover instances without overlap

When This Matters

The simplified model over-estimates costs when:

  • You have many small SP commitments
  • Multiple SPs partially cover the same instances
  • Result: Some instances show less SP coverage than they would get from AWS

Example where AWS differs:

Instance: m5.2xlarge, ShelfPrice=$2.00
EC2 Instance SP: Has $0.50 left (not enough for full $1.44 SP rate)
Compute SP: Has $60 left (plenty of capacity)

AWS Behavior:
- EC2 Instance SP contributes: $0.50
- Compute SP contributes: $0.94 (to reach full Compute SP rate of $1.44)
- Instance EffectiveCost: $1.44

Lumina Behavior:
- EC2 Instance SP contributes: $0.50
- Compute SP: BLOCKED (instance already has SP coverage)
- Instance EffectiveCost: $1.50 (on-demand spillover: $2.00 ShelfPrice - $0.50 SP contribution)

Impact: Lumina shows $0.06/hr higher cost for this instance than AWS bills ($1.50 vs $1.44).
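
The difference in SP contributions between the two behaviors can be sketched as follows. The function and its `onePlanPerInstance` flag are hypothetical, and the sketch assumes, as the example does, that both plans use the same $1.44 SP rate for the instance.

```go
package main

import (
	"fmt"
	"math"
)

// contributions returns what the EC2 Instance SP and the Compute SP each
// pay toward one instance. With onePlanPerInstance set, the Compute SP is
// only considered when the EC2 Instance SP contributed nothing.
func contributions(spRate, ec2Remaining, computeRemaining float64, onePlanPerInstance bool) (ec2, compute float64) {
	ec2 = math.Min(spRate, ec2Remaining)
	uncovered := spRate - ec2
	if uncovered > 0 && (ec2 == 0 || !onePlanPerInstance) {
		compute = math.Min(uncovered, computeRemaining)
	}
	return ec2, compute
}

func main() {
	// m5.2xlarge: SP rate $1.44, EC2 Instance SP has $0.50 left, Compute SP has $60.
	ec2, compute := contributions(1.44, 0.50, 60, false)
	fmt.Printf("multiple plans allowed: EC2 SP $%.2f, Compute SP $%.2f\n", ec2, compute)

	ec2, compute = contributions(1.44, 0.50, 60, true)
	fmt.Printf("one plan per instance:  EC2 SP $%.2f, Compute SP $%.2f\n", ec2, compute)
}
```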

Known Limitations

1. Rate-Based vs Cumulative Billing

  • Lumina: Instantaneous snapshot, assumes instances keep running
  • AWS: Cumulative tracking within each billing hour
  • Impact: If instances scale up/down during an hour, Lumina’s costs will not match AWS exactly. Lumina may show higher costs if short-lived instances exhaust SP capacity.

2. Simplified SP Model

  • One SP per instance (see above)
  • Impact: Over-estimates costs in edge cases with multiple partial SPs. Typically less than 5% impact on total costs.

3. Regional vs Zonal RIs

  • Lumina treats all RIs as zonal (tied to specific AZ)
  • AWS has “Regional RIs” that can float across AZs in a region
  • Impact: Lumina may under-utilize Regional RIs. Instances in different AZs will not share Regional RI pool.
  • Status: Low priority – most production RIs are zonal for capacity guarantees.

4. RI Instance Size Flexibility

  • Lumina requires exact instance type match for RIs
  • AWS allows some instance size flexibility within same family (e.g., 2x m5.large = 1x m5.xlarge)
  • Impact: Lumina will not apply RI coverage to differently-sized instances in the same family.
  • Status: Medium priority – common in production, but complex to implement correctly.

5. Capacity Reservations

  • Lumina does not track AWS Capacity Reservations
  • Impact: Capacity Reservation usage is treated as on-demand. No cost impact (same rate), but capacity planning metrics may be affected.
  • Status: Low priority – Capacity Reservations are relatively rare.

Metrics and Invariants

Critical Invariants

These invariants must always hold true. If they do not, there is a bug in the cost calculation logic.

Invariant 1: SP-covered costs >= SP utilization

sum(ec2_instance_hourly_cost{cost_type="compute_savings_plan"}) +
sum(ec2_instance_hourly_cost{cost_type="ec2_instance_savings_plan"})
  >=
sum(savings_plan_current_utilization_rate)

SP-covered instances may have on-demand spillover from partial coverage.

Invariant 2: SP utilization <= SP commitment

sum(savings_plan_current_utilization_rate) <= sum(savings_plan_hourly_commitment)

Cannot consume more SP capacity than exists.

Invariant 3: No negative costs

All ec2_instance_hourly_cost values >= 0
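
As an illustration, the three invariants can be checked against pre-aggregated metric values. The `Snapshot` type, its fields, and the tolerance are assumptions for this sketch; in practice the sums would come from the metrics named above.

```go
package main

import "fmt"

// epsilon absorbs floating-point rounding when comparing dollar sums.
const epsilon = 1e-9

// Snapshot holds aggregated metric values for one point in time.
type Snapshot struct {
	SPCoveredCost float64   // sum of hourly cost for both SP cost_types
	SPUtilization float64   // sum(savings_plan_current_utilization_rate)
	SPCommitment  float64   // sum(savings_plan_hourly_commitment)
	InstanceCosts []float64 // individual ec2_instance_hourly_cost values
}

// violations returns a description of each invariant that does not hold.
func violations(s Snapshot) []string {
	var out []string
	if s.SPCoveredCost+epsilon < s.SPUtilization {
		out = append(out, "invariant 1: SP-covered costs < SP utilization")
	}
	if s.SPUtilization > s.SPCommitment+epsilon {
		out = append(out, "invariant 2: SP utilization > SP commitment")
	}
	for _, c := range s.InstanceCosts {
		if c < 0 {
			out = append(out, "invariant 3: negative instance cost")
			break
		}
	}
	return out
}

func main() {
	// Figures from the large-scale example: $59.76 + $1.00 of SP-covered
	// cost against a fully utilized $60.00/hr commitment.
	s := Snapshot{
		SPCoveredCost: 60.76,
		SPUtilization: 60.00,
		SPCommitment:  60.00,
		InstanceCosts: []float64{0.72, 1.00, 1.00},
	}
	fmt.Println("violations:", len(violations(s)))
}
```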

Useful PromQL Queries

# Total compute cost
sum(ec2_instance_hourly_cost)

# SP utilization (percent of commitment)
sum(savings_plan_current_utilization_rate) / sum(savings_plan_hourly_commitment) * 100

# On-demand spillover from partial SP coverage
(sum(ec2_instance_hourly_cost{cost_type=~".*savings_plan"}) -
 sum(savings_plan_current_utilization_rate))

# Wasted SP capacity
sum(savings_plan_hourly_commitment) - sum(savings_plan_current_utilization_rate)

Test Scenarios

The cost calculation algorithms are validated by comprehensive test scenarios using simple, easy-to-understand pricing.

Test Location: pkg/cost/calculator_comprehensive_test.go

Test Pricing Scheme

Instance Type   On-Demand   Compute SP (28% discount)   Spot Market
m5.2xlarge      $2.00/hr    $1.44/hr                    N/A
m5.xlarge       $1.00/hr    $0.72/hr                    $0.50/hr
c5.xlarge       $1.00/hr    $0.72/hr                    $0.40/hr
t3.medium       $0.50/hr    $0.36/hr                    $0.20/hr
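
For readers writing their own scenarios, one way such a pricing scheme might be expressed is shown below. This is a hypothetical fixture, not the contents of the test file. The type, variable, and function names are invented for illustration, and a zero spot price is used to mean N/A.

```go
package main

import "fmt"

// computeSPMultiplier encodes the 28% Compute SP discount.
const computeSPMultiplier = 0.72

// Price holds the hourly rates for one instance type.
type Price struct {
	OnDemand float64
	Spot     float64 // 0 means no spot price listed (N/A)
}

// testPricing mirrors the pricing scheme described on this page.
var testPricing = map[string]Price{
	"m5.2xlarge": {OnDemand: 2.00},
	"m5.xlarge":  {OnDemand: 1.00, Spot: 0.50},
	"c5.xlarge":  {OnDemand: 1.00, Spot: 0.40},
	"t3.medium":  {OnDemand: 0.50, Spot: 0.20},
}

// computeSPRate derives the Compute SP rate from the on-demand rate.
// The boolean is false for unknown instance types.
func computeSPRate(instanceType string) (float64, bool) {
	p, ok := testPricing[instanceType]
	if !ok {
		return 0, false
	}
	return p.OnDemand * computeSPMultiplier, true
}

func main() {
	for _, t := range []string{"m5.2xlarge", "m5.xlarge", "c5.xlarge", "t3.medium"} {
		rate, _ := computeSPRate(t)
		fmt.Printf("%-11s on-demand $%.2f/hr, Compute SP $%.2f/hr\n", t, testPricing[t].OnDemand, rate)
	}
}
```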

Run the tests:

# Run all comprehensive scenarios
go test -v -run TestCalculatorComprehensiveScenarios ./pkg/cost

# Run specific scenario
go test -v -run "TestCalculatorComprehensiveScenarios/Scenario_1" ./pkg/cost

References