Cost Calculation

How Lumina calculates per-instance costs using Savings Plans and Reserved Instance allocation

This page describes the algorithms Lumina uses to calculate per-instance costs, including how AWS Savings Plans and Reserved Instances are allocated. It also documents known limitations and differences from AWS’s actual billing.

Overview

Lumina follows AWS’s documented allocation order to estimate how Savings Plans (SPs) and Reserved Instances (RIs) are applied to running EC2 instances. The goal is real-time cost visibility for Kubernetes capacity management, not an exact reproduction of AWS billing.

Key Concepts

Rate-Based vs Cumulative:

Lumina uses an instantaneous rate-based model ($/hour snapshot)
AWS billing uses cumulative tracking within each billing hour (AWS Billing documentation)
This means Lumina’s costs are estimates that assume the currently running instances keep running

ShelfPrice vs EffectiveCost:

ShelfPrice: On-demand rate with no discounts (e.g., $1.00/hr for m5.xlarge)
EffectiveCost: Actual estimated cost after all discounts (e.g., $0.34/hr with SP)

Priority Order

AWS applies discounts in strict priority order (AWS Savings Plans application order):

Spot Pricing – Spot instances always pay spot market rate (no RIs/SPs apply)
Reserved Instances (RIs) – Applied first to exact instance type + AZ matches
EC2 Instance Savings Plans – Applied to specific instance family + region
Compute Savings Plans – Applied to any instance family, any region
On-Demand – Remaining uncovered usage pays full on-demand rates

graph TD
    classDef spot fill:#E8F0FE,stroke:#4285F4,color:#333
    classDef ri fill:#E6F4EA,stroke:#34A853,color:#333
    classDef sp fill:#FFF3E0,stroke:#FB8C00,color:#333
    classDef od fill:#FCE4EC,stroke:#E91E63,color:#333
    classDef check fill:#F3E5F5,stroke:#9C27B0,color:#333

    START["EC2 Instance"]:::check
    IS_SPOT{"Spot instance?"}:::check
    SPOT_RATE["Pay spot market rate"]:::spot
    HAS_RI{"Matching RI<br/>available?"}:::check
    RI_RATE["RI-covered<br/>EffectiveCost = $0"]:::ri
    HAS_EC2SP{"EC2 Instance SP<br/>with capacity?"}:::check
    EC2SP_RATE["EC2 Instance SP rate"]:::sp
    HAS_CSP{"Compute SP<br/>with capacity?"}:::check
    CSP_RATE["Compute SP rate"]:::sp
    OD_RATE["On-Demand rate"]:::od

    START --> IS_SPOT
    IS_SPOT -->|Yes| SPOT_RATE
    IS_SPOT -->|No| HAS_RI
    HAS_RI -->|Yes| RI_RATE
    HAS_RI -->|No| HAS_EC2SP
    HAS_EC2SP -->|Yes| EC2SP_RATE
    HAS_EC2SP -->|No| HAS_CSP
    HAS_CSP -->|Yes| CSP_RATE
    HAS_CSP -->|No| OD_RATE

Lumina implements all of these priorities correctly.
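
The decision tree above can be sketched in Go as one ordered switch. This is a minimal illustration, not Lumina's implementation: the `Candidate` struct and `SelectCoverage` function are hypothetical, and the `"spot"` label is an assumed value (the other labels appear elsewhere on this page).

```go
package main

// CoverageType labels the pricing tier chosen for an instance.
type CoverageType string

const (
	CoverageSpot      CoverageType = "spot" // assumed label, not taken from this page
	CoverageRI        CoverageType = "reserved_instance"
	CoverageEC2SP     CoverageType = "ec2_instance_savings_plan"
	CoverageComputeSP CoverageType = "compute_savings_plan"
	CoverageOnDemand  CoverageType = "on_demand"
)

// Candidate is a hypothetical pre-computed view of one instance. Each field
// answers one decision node of the flowchart.
type Candidate struct {
	IsSpot               bool
	HasMatchingRI        bool
	EC2SPHasCapacity     bool
	ComputeSPHasCapacity bool
}

// SelectCoverage walks the priority order top to bottom and returns the
// first tier that applies.
func SelectCoverage(c Candidate) CoverageType {
	switch {
	case c.IsSpot:
		return CoverageSpot
	case c.HasMatchingRI:
		return CoverageRI
	case c.EC2SPHasCapacity:
		return CoverageEC2SP
	case c.ComputeSPHasCapacity:
		return CoverageComputeSP
	default:
		return CoverageOnDemand
	}
}
```

Because the cases are evaluated in order, a spot instance never reaches the RI or SP checks, and an RI match always wins over any Savings Plan.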

Reserved Instances Algorithm

AWS Documentation: How Reserved Instances are applied

Matching Rules

Reserved Instances match based on (AWS RI Matching Rules):

Instance Type: Exact match (e.g., RI for m5.xlarge only covers m5.xlarge)
Availability Zone: Exact match (e.g., RI in us-west-2a only covers us-west-2a)
Account: RIs only apply within the same AWS account
Lifecycle: RIs do NOT apply to spot instances
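
As a rough sketch, the four matching rules reduce to a single predicate. The `ReservedInstance` and `Instance` types and the `RIMatches` function below are hypothetical names chosen for illustration; Lumina's actual types may look different.

```go
package main

// ReservedInstance is a hypothetical, trimmed-down RI record.
type ReservedInstance struct {
	InstanceType     string
	AvailabilityZone string
	AccountID        string
}

// Instance is a hypothetical, trimmed-down EC2 instance record.
type Instance struct {
	InstanceType     string
	AvailabilityZone string
	AccountID        string
	IsSpot           bool
}

// RIMatches reports whether ri may cover inst: same instance type, same AZ,
// same account, and the instance must not be a spot instance.
func RIMatches(ri ReservedInstance, inst Instance) bool {
	return !inst.IsSpot &&
		ri.InstanceType == inst.InstanceType &&
		ri.AvailabilityZone == inst.AvailabilityZone &&
		ri.AccountID == inst.AccountID
}
```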

Allocation Algorithm

1. Group RIs by (instance_type, availability_zone, account_id)
2. For each group:
   a. Find all matching running instances (not spot)
   b. Sort instances by launch time (oldest first)
   c. Apply RI coverage to oldest instances until RI count exhausted
3. Mark covered instances:
   - EffectiveCost = $0 (RIs are pre-paid)
   - RICoverage = ShelfPrice (what the RI contributed)
   - CoverageType = "reserved_instance"

graph TD
    classDef step fill:#E8F0FE,stroke:#4285F4,color:#333
    classDef decision fill:#F3E5F5,stroke:#9C27B0,color:#333
    classDef result fill:#E6F4EA,stroke:#34A853,color:#333

    GROUP["Group RIs by<br/>instance_type + AZ + account"]:::step
    FIND["Find matching running instances<br/>(exclude spot)"]:::step
    SORT["Sort instances by launch time<br/>(oldest first)"]:::step
    HAS_RI{"RI count<br/>remaining?"}:::decision
    APPLY["Apply RI coverage<br/>EffectiveCost = $0"]:::result
    NEXT["Move to next instance"]:::step
    DONE["Remaining instances<br/>eligible for SP coverage"]:::result

    GROUP --> FIND --> SORT --> HAS_RI
    HAS_RI -->|Yes| APPLY --> NEXT --> HAS_RI
    HAS_RI -->|No| DONE

RI coverage is binary: An instance is either fully RI-covered or not covered at all.

Example

Scenario: 5 RIs for m5.xlarge in us-west-2a, 10 running instances

Result:

5 oldest instances: RI-covered (EffectiveCost = $0)
5 newest instances: Not RI-covered (eligible for SP coverage next)
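
The allocation steps and the example above can be sketched as follows. This is an illustrative sketch under stated assumptions: `RIKey`, `Instance`, and `AllocateRIs` are hypothetical names, launch times are simplified to Unix seconds, and the function returns only the set of covered instance IDs rather than full cost records.

```go
package main

import "sort"

// RIKey identifies one RI group (step 1 of the algorithm).
type RIKey struct {
	InstanceType string
	AZ           string
	AccountID    string
}

// Instance is a hypothetical, trimmed-down instance record.
type Instance struct {
	ID         string
	Key        RIKey
	IsSpot     bool
	LaunchUnix int64 // launch time as Unix seconds
}

// AllocateRIs returns the set of instance IDs that receive RI coverage.
// riCounts maps each group to the number of RIs purchased for it.
func AllocateRIs(riCounts map[RIKey]int, instances []Instance) map[string]bool {
	// Sort a copy oldest-first so the caller's slice is left untouched.
	sorted := append([]Instance(nil), instances...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].LaunchUnix < sorted[j].LaunchUnix
	})

	// Track how many RIs are still unassigned in each group.
	remaining := make(map[RIKey]int, len(riCounts))
	for k, n := range riCounts {
		remaining[k] = n
	}

	covered := make(map[string]bool)
	for _, inst := range sorted {
		if inst.IsSpot || remaining[inst.Key] <= 0 {
			continue // spot instances and exhausted groups get no RI coverage
		}
		remaining[inst.Key]--
		covered[inst.ID] = true
	}
	return covered
}
```

With 5 RIs and 10 matching instances, the returned set contains the 5 oldest instance IDs; the other 5 stay eligible for Savings Plan coverage.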

Savings Plans Algorithm

AWS Documentation: How Savings Plans apply to your AWS usage

SP Types

EC2 Instance Savings Plans (docs):

Apply to specific instance family (e.g., “m5”) in specific region (e.g., “us-west-2”)
Higher priority than Compute SPs
Example: SP for “m5 in us-west-2” covers m5.large, m5.xlarge, m5.2xlarge, etc.

Compute Savings Plans (docs):

Apply to ANY instance family in ANY region
Lower priority than EC2 Instance SPs
Example: SP covers m5, c5, r5, across all regions

Matching Rules

Savings Plans match based on (AWS SP application rules):

Instance Family: EC2 Instance SPs require matching family; Compute SPs match all
Region: EC2 Instance SPs require matching region; Compute SPs match all regions
Account: SPs apply within the same AWS account
Lifecycle: SPs do NOT apply to spot instances
Existing Coverage: See Simplified Model Decisions
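
A minimal sketch of these rules as an eligibility check is shown below. The `SavingsPlan` and `Instance` types and the `SPEligible` function are hypothetical; the `RICovered` and `SPCovered` flags reflect the RI exclusion and the one-SP-per-instance rule described elsewhere on this page.

```go
package main

// SPType distinguishes the two Savings Plan kinds described above.
type SPType int

const (
	EC2InstanceSP SPType = iota
	ComputeSP
)

// SavingsPlan is a hypothetical, trimmed-down plan record.
type SavingsPlan struct {
	Type      SPType
	Family    string // only used for EC2InstanceSP, e.g. "m5"
	Region    string // only used for EC2InstanceSP, e.g. "us-west-2"
	AccountID string
}

// Instance is a hypothetical, trimmed-down instance record.
type Instance struct {
	Family    string
	Region    string
	AccountID string
	IsSpot    bool
	RICovered bool
	SPCovered bool
}

// SPEligible reports whether sp may be applied to inst.
func SPEligible(sp SavingsPlan, inst Instance) bool {
	if inst.IsSpot || inst.RICovered || inst.SPCovered {
		return false // lifecycle and existing-coverage rules
	}
	if sp.AccountID != inst.AccountID {
		return false
	}
	if sp.Type == EC2InstanceSP {
		return sp.Family == inst.Family && sp.Region == inst.Region
	}
	return true // Compute SPs match any family in any region
}
```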

Allocation Algorithm

For each Savings Plan (in priority order: EC2 Instance SPs first, then Compute SPs):

1. Find all eligible instances:
   - Match SP criteria (family, region)
   - Not spot instances
   - Not already RI-covered
   - Not already SP-covered (simplified model)

2. Calculate savings for each instance:
   - ShelfPrice (on-demand rate)
   - SP Rate (discounted rate)
   - Savings % = (ShelfPrice - SP Rate) / ShelfPrice

3. Sort instances by priority:
   a. Highest savings % first (maximize cost reduction) - AWS behavior
   b. Tie-breaker: lowest SP rate first (stretch commitment further) - AWS behavior
   c. Tie-breaker: oldest launch time (stability) - Lumina-specific for stable metrics
   d. Tie-breaker: instance ID (determinism) - Lumina-specific for stable metrics

   Note: Tie-breakers (c) and (d) are Lumina's own decisions, not AWS's
   documented behavior. They ensure SP allocation remains consistent
   across reconciliation loops (every 5 minutes).

4. Apply SP coverage in priority order:
   For each instance:
     a. Calculate SP contribution = min(SP rate, remaining commitment)
     b. If commitment exhausted (partial coverage):
        - SP contributes what's left
        - Instance pays: (ShelfPrice - SP contribution) at on-demand rate
     c. Update instance:
        - EffectiveCost = SP rate (if full) OR on-demand spillover (if partial)
        - SavingsPlanCoverage = SP contribution (what SP paid)
        - CoverageType = "compute_savings_plan" or "ec2_instance_savings_plan"
     d. Consume SP commitment

5. Track SP utilization:
   - CurrentUtilizationRate = commitment consumed
   - RemainingCapacity = commitment - utilization
   - UtilizationPercent = (utilization / commitment) * 100
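
The sort order in step 3 can be expressed as a four-level comparator. The sketch below is illustrative only: `Candidate` and `SortByPriority` are hypothetical names, launch time is simplified to Unix seconds, and `ShelfPrice` is assumed to be greater than zero.

```go
package main

import "sort"

// Candidate is a hypothetical record for one SP-eligible instance.
type Candidate struct {
	ID         string
	ShelfPrice float64 // on-demand $/hr, assumed > 0
	SPRate     float64 // discounted $/hr under the Savings Plan
	LaunchUnix int64   // launch time as Unix seconds
}

// savingsPercent returns the fractional saving (0.28 means 28%).
func savingsPercent(c Candidate) float64 {
	return (c.ShelfPrice - c.SPRate) / c.ShelfPrice
}

// SortByPriority orders candidates in place:
// highest savings %, then lowest SP rate, then oldest launch, then instance ID.
func SortByPriority(cs []Candidate) {
	sort.Slice(cs, func(i, j int) bool {
		a, b := cs[i], cs[j]
		if sa, sb := savingsPercent(a), savingsPercent(b); sa != sb {
			return sa > sb
		}
		if a.SPRate != b.SPRate {
			return a.SPRate < b.SPRate
		}
		if a.LaunchUnix != b.LaunchUnix {
			return a.LaunchUnix < b.LaunchUnix
		}
		return a.ID < b.ID
	})
}
```

The final comparison on instance ID makes the order total, so repeated runs over the same inputs produce the same allocation. Exact float equality is used for ties here for brevity; a production comparator might compare within a small tolerance instead.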

Full vs Partial Coverage

Full Coverage

Setup: Instance needs $0.72/hr, SP has $60.00/hr remaining

SP contributes: $0.72 (full SP rate)
Instance pays: $0.72 (fully discounted)
SP remaining: $59.28
No on-demand spillover

Partial Coverage (SP Exhaustion)

Setup: Instance needs $0.72/hr, but SP only has $0.10/hr remaining

SP contributes: $0.10 (all it has left)
Instance pays: $0.90 = $0.10 (from SP) + $0.80 (on-demand spillover)
SP remaining: $0.00 (exhausted)

The EffectiveCost metric ($0.90) is higher than the SP contribution ($0.10) because it includes on-demand spillover. This is why sum(ec2_instance_hourly_cost) >= sum(savings_plan_current_utilization_rate). The difference represents real on-demand costs from partial coverage.
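
Both cases can be reproduced with a small function. This is a sketch, not Lumina's code: `Allocation` and `ApplySavingsPlan` are hypothetical names, a $1.00 ShelfPrice is assumed for both setups, and the partial branch follows step 4b as written (EffectiveCost = ShelfPrice minus the SP contribution), which reproduces the $0.90 figure above.

```go
package main

// Allocation is a hypothetical result record for one instance.
type Allocation struct {
	SPContribution      float64 // $/hr drawn from the SP commitment
	EffectiveCost       float64 // $/hr reported for the instance
	RemainingCommitment float64 // $/hr left on the SP afterwards
	Partial             bool    // true when the SP could not pay the full SP rate
}

// ApplySavingsPlan applies one SP with `remaining` $/hr of commitment to one
// instance priced at shelfPrice (on-demand) and spRate (discounted).
func ApplySavingsPlan(shelfPrice, spRate, remaining float64) Allocation {
	// SP contribution = min(SP rate, remaining commitment).
	contribution := spRate
	if remaining < contribution {
		contribution = remaining // commitment exhausted: SP gives what is left
	}
	a := Allocation{
		SPContribution:      contribution,
		RemainingCommitment: remaining - contribution,
		Partial:             contribution < spRate,
	}
	if a.Partial {
		// Assumption: partial-coverage cost per step 4b of the algorithm.
		a.EffectiveCost = shelfPrice - contribution
	} else {
		a.EffectiveCost = spRate
	}
	return a
}
```

Calling `ApplySavingsPlan(1.00, 0.72, 60.00)` models the full-coverage setup, and `ApplySavingsPlan(1.00, 0.72, 0.10)` models the exhausted-SP setup.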

Example: Large-Scale SP Allocation

Scenario: $60/hr Compute SP commitment, 28% discount (0.72 multiplier), 200 m5.xlarge instances ($1.00 OD, $0.72 SP rate)

Allocation breakdown:

Sort instances by savings priority (highest % first)
Cover first 83 instances fully: 83 x $0.72 = $59.76 consumed
Instance #84 gets partial coverage:
- SP contributes: $0.24 (all that remains)
- On-demand spillover: $0.76 (remainder at OD rate)
- EffectiveCost: $1.00 ($0.24 SP + $0.76 OD)
Remaining 116 instances: On-demand ($1.00 each, no SP coverage)

SP Metrics:

Commitment: $60.00/hr
Utilization: $60.00/hr (100%)
Remaining: $0.00/hr

Instance Cost Metrics:

Instances 1-83: ec2_instance_hourly_cost{cost_type="compute_savings_plan"} = 0.72 each (83 x $0.72 = $59.76)
Instance 84: ec2_instance_hourly_cost{cost_type="compute_savings_plan"} = 1.00 (includes $0.76 OD spillover)
Instances 85-200: ec2_instance_hourly_cost{cost_type="on_demand"} = 1.00 each (116 x $1.00 = $116.00)

Total instance costs: $59.76 + $1.00 + $116.00 = $176.76/hr
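
The instance counts in this breakdown follow from integer division of the commitment by the SP rate. The sketch below works in integer cents to avoid float rounding; `SplitCommitment` is a hypothetical helper, it assumes all instances share the same SP rate, and it covers only the commitment split, not the EffectiveCost of the partially covered instance.

```go
package main

// SplitCommitment reports how an hourly commitment spreads over a fleet of
// identical instances. All money values are integer cents per hour.
// It returns the number of fully covered instances, the commitment left over
// after them, and the number of instances with no SP coverage at all.
func SplitCommitment(commitmentCents, spRateCents, instances int) (full, leftoverCents, onDemand int) {
	if spRateCents <= 0 {
		return 0, commitmentCents, instances
	}
	full = commitmentCents / spRateCents
	if full > instances {
		full = instances // more commitment than the fleet can use
	}
	leftoverCents = commitmentCents - full*spRateCents
	onDemand = instances - full
	if leftoverCents > 0 && onDemand > 0 {
		onDemand-- // one more instance absorbs the leftover as partial coverage
	}
	return full, leftoverCents, onDemand
}
```

For the scenario above, `SplitCommitment(6000, 72, 200)` yields 83 fully covered instances, 24 cents left for instance #84, and 116 on-demand instances.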

Simplified Model Decisions

Lumina uses a simplified Savings Plans model that differs from AWS’s actual billing in one critical way.

One SP Per Instance Rule

AWS’s actual behavior:

Multiple Savings Plans can apply to the same instance
Example: EC2 Instance SP covers part, Compute SP covers the rest

Lumina’s simplified model:

Once a Savings Plan covers an instance, no other SPs can apply
This prevents double-counting and commitment waste

Why this simplification?

Prevents commitment accounting errors – Allowing multiple SPs to apply to the same instance caused bugs where SP utilization was correct but instance costs were artificially low (double-discounted)
Operational simplicity – Easier to reason about which SP is covering which instance
Rate-based limitation – Lumina’s instantaneous snapshot model makes multi-SP allocation complex
Minimal practical impact – In most AWS organizations, SPs are sized to fully cover instances without overlap

When This Matters

The simplified model over-estimates costs (because it under-counts SP coverage) when:

You have many small SP commitments
Multiple SPs partially cover the same instances
Result: Some instances show less SP coverage than they would get from AWS

Example where AWS differs:

Instance: m5.2xlarge, ShelfPrice=$2.00
EC2 Instance SP: Has $0.50 left (not enough for full $1.44 SP rate)
Compute SP: Has $60 left (plenty of capacity)

AWS Behavior:
- EC2 Instance SP contributes: $0.50
- Compute SP contributes: $0.94 (to reach full Compute SP rate of $1.44)
- Instance EffectiveCost: $1.44

Lumina Behavior:
- EC2 Instance SP contributes: $0.50
- Compute SP: BLOCKED (instance already has SP coverage)
- Instance EffectiveCost: $1.50 ($0.50 from SP + $1.00 on-demand spillover)

Impact: Lumina shows $0.06/hr higher cost for this instance than AWS bills ($1.50 vs $1.44).
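
The difference between the two behaviors in this example can be sketched with a single flag. This is an illustration of the example as described on this page, not of AWS's or Lumina's real code: `EffectiveCost` is a hypothetical function, both plans are assumed to offer the same $1.44 SP rate, and partial coverage is costed as ShelfPrice minus the total SP contribution.

```go
package main

// EffectiveCost applies plans in priority order to one instance.
// remaining holds each plan's unused commitment in $/hr.
// With allowStacking=false, the first plan that contributes anything blocks
// all later plans (the one-SP-per-instance rule).
func EffectiveCost(shelfPrice, spRate float64, remaining []float64, allowStacking bool) float64 {
	need := spRate     // SP-rate dollars not yet covered by any plan
	contributed := 0.0 // total drawn from all plans so far
	for _, r := range remaining {
		c := need
		if r < c {
			c = r
		}
		if c <= 0 {
			continue // this plan has nothing left to give
		}
		need -= c
		contributed += c
		if need == 0 || !allowStacking {
			break
		}
	}
	if need == 0 {
		return spRate // fully covered at the SP rate
	}
	// Assumption: partial coverage is costed as ShelfPrice minus contribution.
	return shelfPrice - contributed
}
```

With plans holding $0.50 and $60 of remaining commitment, the stacking call models the AWS column of the example and the non-stacking call models the Lumina column.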

Known Limitations

1. Rate-Based vs Cumulative Billing

Lumina: Instantaneous snapshot, assumes instances keep running
AWS: Cumulative tracking within each billing hour
Impact: If instances scale up/down during an hour, Lumina’s costs will not match AWS exactly. Lumina may show higher costs if short-lived instances exhaust SP capacity.

2. Simplified SP Model

One SP per instance (see above)
Impact: Over-estimates costs in edge cases with multiple partial SPs, because some SP coverage is not applied. Typically less than 5% impact on total costs.

3. Regional vs Zonal RIs

Lumina treats all RIs as zonal (tied to specific AZ)
AWS has “Regional RIs” that can float across AZs in a region
Impact: Lumina may under-utilize Regional RIs. Instances in different AZs will not share a Regional RI pool.
Status: Low priority – most production RIs are zonal for capacity guarantees.

4. RI Instance Size Flexibility

Lumina requires exact instance type match for RIs
AWS allows some instance size flexibility within the same family (e.g., 2x m5.large = 1x m5.xlarge)
Impact: Lumina will not apply RI coverage to differently-sized instances in the same family.
Status: Medium priority – common in production, but complex to implement correctly.

5. Capacity Reservations

Lumina does not track AWS Capacity Reservations
Impact: Capacity Reservation usage is treated as on-demand. No cost impact (same rate), but capacity planning metrics may be affected.
Status: Low priority – Capacity Reservations are relatively rare.

Metrics and Invariants

Critical Invariants

These invariants must always hold true. If they do not, there is a bug in the cost calculation logic.

Invariant 1: SP-covered costs >= SP utilization

sum(ec2_instance_hourly_cost{cost_type="compute_savings_plan"}) +
sum(ec2_instance_hourly_cost{cost_type="ec2_instance_savings_plan"})
  >=
sum(savings_plan_current_utilization_rate)

SP-covered instances may have on-demand spillover from partial coverage.

Invariant 2: SP utilization <= SP commitment

sum(savings_plan_current_utilization_rate) <= sum(savings_plan_hourly_commitment)

Cannot consume more SP capacity than exists.

Invariant 3: No negative costs

All ec2_instance_hourly_cost values >= 0
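
If these sums are available in Go (for example in a test), the three invariants could be checked along the following lines. `Snapshot` and `CheckInvariants` are hypothetical names, and the small tolerance is an assumption added to absorb floating-point rounding.

```go
package main

// Snapshot is a hypothetical bundle of already-aggregated metric values.
type Snapshot struct {
	SPCoveredCost float64   // sum of ec2_instance_hourly_cost for both SP cost types
	SPUtilization float64   // sum(savings_plan_current_utilization_rate)
	SPCommitment  float64   // sum(savings_plan_hourly_commitment)
	InstanceCosts []float64 // every ec2_instance_hourly_cost value
}

// CheckInvariants returns one message per violated invariant.
// An empty result means all three invariants hold.
func CheckInvariants(s Snapshot) []string {
	const eps = 1e-9 // tolerance for float rounding (assumed value)
	var violations []string
	if s.SPCoveredCost+eps < s.SPUtilization {
		violations = append(violations, "invariant 1: SP-covered costs < SP utilization")
	}
	if s.SPUtilization > s.SPCommitment+eps {
		violations = append(violations, "invariant 2: SP utilization > SP commitment")
	}
	for _, c := range s.InstanceCosts {
		if c < 0 {
			violations = append(violations, "invariant 3: negative instance cost")
			break // report the invariant once, not once per instance
		}
	}
	return violations
}
```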

Useful PromQL Queries

# Total compute cost
sum(ec2_instance_hourly_cost)

# SP utilization rate
sum(savings_plan_current_utilization_rate) / sum(savings_plan_hourly_commitment) * 100

# On-demand spillover from partial SP coverage
(sum(ec2_instance_hourly_cost{cost_type=~".*savings_plan"}) -
 sum(savings_plan_current_utilization_rate))

# Wasted SP capacity
sum(savings_plan_hourly_commitment) - sum(savings_plan_current_utilization_rate)

Test Scenarios

The cost calculation algorithms are validated by test scenarios that use simple, round-number pricing.

Test Location: pkg/cost/calculator_comprehensive_test.go

Test Pricing Scheme

Instance Type	On-Demand	Compute SP (28% discount)	Spot Market
m5.2xlarge	$2.00/hr	$1.44/hr	N/A
m5.xlarge	$1.00/hr	$0.72/hr	$0.50/hr
c5.xlarge	$1.00/hr	$0.72/hr	$0.40/hr
t3.medium	$0.50/hr	$0.36/hr	$0.20/hr

Run the tests:

# Run all comprehensive scenarios
go test -v -run TestCalculatorComprehensiveScenarios ./pkg/cost

# Run specific scenario
go test -v -run "TestCalculatorComprehensiveScenarios/Scenario_1" ./pkg/cost