How to Build an AI Chargeback Model Your CFO Will Approve

The CFO's question isn't "how much did we spend on AI?" It's "which business unit spent it, on what, and does it map to revenue?"

Those are different questions. The first has been answerable for a while — add up your provider invoices. The second requires attribution, and most engineering organizations haven't built it yet.

A chargeback model is the mechanism that takes AI spend from "operations cost" to "business unit cost." Here's how to build one that finance will actually use.

What finance needs from an AI chargeback model

Finance needs four things:

Attribution. Which cost center, GL code, or business unit generated this spend? Not "which API key," but which part of the business. An API key is often shared across teams; the business unit is what the chart of accounts tracks.

Period alignment. AI spend in the chargeback model needs to cover the same period as the underlying invoice. If your Anthropic invoice covers April 1–April 30 and your chargeback CSV covers April 3–May 2, reconciliation becomes a headache at every close.

Verifiability. Finance will eventually want to trace a chargeback number back to individual transactions. "The April charge for the search team was $3,200" needs to be supportable by a drill-down into the underlying calls. If the number is a black box, the model won't survive a budget meeting where someone challenges it.

Consistency. The methodology for calculating AI cost needs to be consistent month-over-month. If you switch how you estimate costs mid-year, prior periods become incomparable and the variance analysis breaks.

The three-layer attribution model

The cleanest AI chargeback model has three layers:

Layer 1: Provider cost. The actual USD cost from your provider's invoice. This is your source of truth. Everything else is an allocation of this number.

Layer 2: Gateway allocation. The gateway attributes each call to an org, project, and task class at the time of the call. This is the authoritative per-transaction record — not an estimate, not an after-the-fact analysis.

Layer 3: GL mapping. Finance maintains a rule table that maps (org, project, task_class) tuples to GL codes. This is what converts the gateway's technical attribution to the financial system's language.

The gateway layer plus the GL mapping layer together give you per-GL-code cost, per-period, derived from the same transaction record that generated the actual API call.

Building the rule table

The GL mapping rule table is the piece that requires the most coordination with finance. The structure:

org       | project      | task_class | model     | gl_code       | cost_center
----------|--------------|------------|-----------|---------------|------------
acme      | search       | retrieval  | *         | 7400-AI-PROD  | ENG-SEARCH
acme      | search       | *          | *         | 7400-AI-PROD  | ENG-SEARCH
acme      | *            | *          | *         | 7400-AI-MISC  | ENG
*         | *            | *          | *         | 7400-AI-MISC  | UNALLOC

Matching is most-specific-first. A call from org acme, project search, task class retrieval matches the first rule. A call tagged only with org acme and project search matches the second. A call from any other acme project matches the third, and a call with no recognized org falls through to the last rule.

The wildcard catch-all (7400-AI-MISC, UNALLOC) is important: it captures calls that don't match any specific rule. Finance should be able to see how much spend is unallocated — a high unallocated percentage signals that the rule table needs updating.
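As a rough illustration of most-specific-first matching, here is a minimal Python sketch. It assumes "most specific" means "the matching rule with the most non-wildcard fields," which is one reasonable reading rather than a documented gateway behavior. The `Rule` class, `resolve`, `unallocated_share`, and the `cost_usd` field are hypothetical names introduced for this example.

```python
from dataclasses import dataclass

WILDCARD = "*"
FIELDS = ("org", "project", "task_class", "model")


@dataclass(frozen=True)
class Rule:
    org: str
    project: str
    task_class: str
    model: str
    gl_code: str
    cost_center: str

    def matches(self, call: dict) -> bool:
        # A field matches when the rule holds a wildcard or the call's exact value.
        return all(getattr(self, f) in (WILDCARD, call.get(f)) for f in FIELDS)

    def specificity(self) -> int:
        # Number of non-wildcard fields; a higher count means a more specific rule.
        return sum(getattr(self, f) != WILDCARD for f in FIELDS)


# The same four rows as the rule table above.
RULES = [
    Rule("acme", "search", "retrieval", "*", "7400-AI-PROD", "ENG-SEARCH"),
    Rule("acme", "search", "*", "*", "7400-AI-PROD", "ENG-SEARCH"),
    Rule("acme", "*", "*", "*", "7400-AI-MISC", "ENG"),
    Rule("*", "*", "*", "*", "7400-AI-MISC", "UNALLOC"),
]


def resolve(call: dict, rules=RULES) -> Rule:
    # The all-wildcard catch-all guarantees at least one candidate.
    candidates = [r for r in rules if r.matches(call)]
    return max(candidates, key=Rule.specificity)


def unallocated_share(calls, rules=RULES) -> float:
    # Fraction of spend that fell through to the UNALLOC catch-all.
    total = sum(c["cost_usd"] for c in calls)
    unalloc = sum(
        c["cost_usd"] for c in calls if resolve(c, rules).cost_center == "UNALLOC"
    )
    return unalloc / total if total else 0.0
```

Counting non-wildcard fields leaves ties possible when two rules pin different fields. A real table would need an explicit tie-break, such as a priority column that finance controls.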

Finance maintains this table. Engineering doesn't touch it after the initial setup. When a new project launches, the project's engineering lead provides the org/project metadata and finance writes the rule.

Period close procedure

At the end of each accounting period, the procedure is:

1. Export the gateway's transaction ledger for the period (all calls with timestamps, token counts, estimated costs, org/project/task class, resolved GL code).
2. Cross-check the total estimated cost against the actual provider invoice (small variance expected due to estimation rounding; document the reconciliation).
3. Produce the chargeback CSV: one row per GL code, total cost, call count, for the period.
4. Distribute to finance for posting to the GL.
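The chargeback CSV step can be sketched in a few lines of Python. This assumes the exported ledger is a list of dicts with `gl_code` and `estimated_cost_usd` keys; those field names, the function names, and the period label format are illustrative and not a real gateway export schema.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def chargeback_rows(ledger):
    # Aggregate the per-call ledger into one row per GL code.
    totals = defaultdict(lambda: {"cost": Decimal("0"), "calls": 0})
    for tx in ledger:
        bucket = totals[tx["gl_code"]]
        # Decimal avoids float rounding noise when summing many small amounts.
        bucket["cost"] += Decimal(tx["estimated_cost_usd"])
        bucket["calls"] += 1
    return [
        {"gl_code": gl, "total_cost_usd": v["cost"], "call_count": v["calls"]}
        for gl, v in sorted(totals.items())
    ]


def to_csv(rows, period):
    # Render the aggregated rows as CSV text, tagging each row with the period.
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["period", "gl_code", "total_cost_usd", "call_count"]
    )
    writer.writeheader()
    for row in rows:
        writer.writerow({"period": period, **row})
    return buf.getvalue()


# Hypothetical three-call ledger for illustration only.
ledger = [
    {"gl_code": "7400-AI-PROD", "estimated_cost_usd": "1.25"},
    {"gl_code": "7400-AI-PROD", "estimated_cost_usd": "0.75"},
    {"gl_code": "7400-AI-MISC", "estimated_cost_usd": "0.10"},
]
rows = chargeback_rows(ledger)
report = to_csv(rows, "2025-04")
```

Costs are passed as strings and parsed with `Decimal` so that the totals finance posts are exact sums of the ledger values.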

The cross-check step is the one most teams skip, and it's the one that causes the most problems. Estimated costs at call time and actual billed costs can diverge slightly due to rounding, model version changes, or provider pricing updates. If you don't reconcile, the cumulative chargeback over a year can be materially different from the actual invoice total.
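One way to make the cross-check concrete is sketched below. It computes the variance between the estimated total and the invoice, flags it against a tolerance, and optionally scales each GL bucket pro rata so the chargeback sums to the invoice. The 1% tolerance and the pro-rata true-up are assumptions made for this example, not a prescribed policy, and the dollar figures are made up.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def reconcile(estimated_by_gl, invoice_total, tolerance_pct=Decimal("1.0")):
    estimated_total = sum(estimated_by_gl.values(), Decimal("0"))
    if estimated_total == 0 or invoice_total == 0:
        raise ValueError("need non-zero estimated and invoiced totals to reconcile")

    # Positive variance means the provider billed more than the gateway estimated.
    variance = invoice_total - estimated_total
    variance_pct = abs(variance) / invoice_total * 100

    # Pro-rata true-up: scale each bucket so the total equals the invoice.
    factor = invoice_total / estimated_total
    trued_up = {
        gl: (amount * factor).quantize(CENT, rounding=ROUND_HALF_UP)
        for gl, amount in estimated_by_gl.items()
    }
    # Rounding can leave a residual of a few cents; put it on the largest bucket.
    residual = invoice_total - sum(trued_up.values(), Decimal("0"))
    largest = max(trued_up, key=trued_up.get)
    trued_up[largest] += residual

    return {
        "estimated_total": estimated_total,
        "variance_usd": variance,
        "variance_pct": variance_pct,
        "within_tolerance": variance_pct <= tolerance_pct,
        "trued_up": trued_up,
    }


result = reconcile(
    {"7400-AI-PROD": Decimal("5130.00"), "7400-AI-MISC": Decimal("410.00")},
    invoice_total=Decimal("5595.40"),
)
```

Whether to post the trued-up figures or the raw estimates plus a separate variance line is a finance decision. Either way, the reconciliation record should be kept with the period's export.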

Handling the embedded AI cost allocation

Your direct API costs can be attributed at the call level. Your Copilot and Cursor costs can't, because they're billed per seat.

For per-seat embedded AI costs, the allocation approach is simpler: assign cost to the cost center that employs the seat holder. A developer in the search team whose Copilot seat costs $19/month gets $19 charged to the search team's GL code. Multiply across all seats, segment by team.

This is an approximation — it assumes equal utilization per seat. If you have usage data from the vendor API, you can refine it: allocate based on active days or request count rather than raw seat count. For most organizations, the per-seat approximation is accurate enough for a chargeback model.
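Both approaches fit in a short sketch. The seat records, the `active_days` field, and the function names below are hypothetical; what usage data a vendor actually exposes varies, so treat the weighting key as an assumption.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP


def allocate_by_seat(seats, seat_price):
    # Flat allocation: each seat's full price goes to the holder's cost center.
    totals = defaultdict(Decimal)
    for seat in seats:
        totals[seat["cost_center"]] += seat_price
    return dict(totals)


def allocate_by_usage(seats, total_cost, weight_key="active_days"):
    # Usage-weighted allocation: split the same total in proportion to usage.
    total_weight = sum(seat[weight_key] for seat in seats)
    if total_weight == 0:
        raise ValueError("no recorded usage to weight by")
    totals = defaultdict(Decimal)
    for seat in seats:
        totals[seat["cost_center"]] += total_cost * seat[weight_key] / total_weight
    return {
        cc: amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
        for cc, amount in totals.items()
    }


# Hypothetical seat roster: two search developers, one platform developer.
seats = [
    {"user": "a", "cost_center": "ENG-SEARCH", "active_days": 20},
    {"user": "b", "cost_center": "ENG-SEARCH", "active_days": 10},
    {"user": "c", "cost_center": "ENG-PLATFORM", "active_days": 10},
]
price = Decimal("19")
flat = allocate_by_seat(seats, price)
weighted = allocate_by_usage(seats, price * len(seats))
```

With this roster, the flat method charges the search team $38 and the platform team $19, while weighting by active days shifts the split to $42.75 and $14.25. The total stays at $57 either way.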

The important thing is to include it. A chargeback model that covers direct API costs but excludes Copilot and Cursor understates the engineering team's AI spend by whatever those tools cost.

What the monthly report looks like

The output finance actually wants:

| Business Unit | GL Code | Direct API ($) | Dev Tools ($) | SaaS AI ($) | Total ($) | Budget ($) | Variance ($) |
|--------------|---------|---------------|--------------|------------|----------|----------|---------|
| Engineering — Search | 7400-AI-PROD | 3,240 | 1,520 | — | 4,760 | 5,000 | +240 |
| Engineering — Platform | 7400-AI-PROD | 1,890 | 2,280 | — | 4,170 | 4,000 | -170 |
| Sales | 7400-AI-CRM | — | — | 4,100 | 4,100 | 4,500 | +400 |
| Unallocated | 7400-AI-MISC | 410 | — | — | 410 | — | — |
| Total | | 5,540 | 3,800 | 4,100 | 13,440 | 13,500 | +60 |

This is the conversation finance wants to have. Not "we spent $5,540 on Anthropic" but "the search team spent $4,760 on AI in May, which was $240 under budget."

The platform team is over budget. That's a conversation. The search team is under. That's potentially a sign that the search team's AI feature is underperforming (not generating enough requests to justify the budget allocation) — or that the budget was conservatively set.
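The sign convention in the report is budget minus actual, so a positive variance means under budget. A small sketch that reproduces the table's totals under that convention follows; the dictionary layout and function names are illustrative only.

```python
# (direct_api, dev_tools, saas_ai, budget); a budget of None means none was set.
REPORT = {
    "search": (3240, 1520, 0, 5000),
    "platform": (1890, 2280, 0, 4000),
    "sales": (0, 0, 4100, 4500),
    "unallocated": (410, 0, 0, None),
}


def total(row):
    # Sum of the three spend columns.
    return sum(row[:3])


def variance(row):
    # Budget minus actual: positive is under budget, negative is over.
    budget = row[3]
    return None if budget is None else budget - total(row)


def report_totals(report):
    # Unallocated spend counts toward total spend but carries no budget.
    spend = sum(total(r) for r in report.values())
    budget = sum(r[3] for r in report.values() if r[3] is not None)
    return spend, budget, budget - spend
```

Because unallocated spend has no budget of its own, it eats into the overall variance: the three budgeted units are $470 under in aggregate, but the report total is only $60 under.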

Making the model survive the first year

Three things that kill AI chargeback models in their first year:

Inconsistency in the rule table. Finance updates the GL codes mid-year for a reorg and doesn't backfill the old records. Prior periods now look different from current periods for the same team. Document the effective date of every rule change.

Unallocated spend that nobody investigates. The UNALLOC line grows over time as new projects launch without corresponding rules. Finance notices but doesn't know who to ask. The fix: the engineering lead who ships a new project is responsible for getting a rule added, which means handing finance the org/project metadata before the feature goes live.

Estimation variance that accumulates. If you're not reconciling estimated costs to actual invoices each period, you'll drift. A consistent 2% gap each month is still 2% of annual spend, but in dollar terms it adds up to about 24% of a typical monthly invoice by year end. That is enough to make the chargeback model look unreliable. Monthly reconciliation is a 20-minute procedure. Do it.
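For the first failure mode above, one option is to give every rule an effective date range and resolve GL codes as of the call date, so prior periods keep the mapping that applied at the time. The sketch below assumes that approach. `DatedRule`, `rule_as_of`, the dates, and the post-reorg code `7450-AI-SEARCH` are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DatedRule:
    org: str
    project: str
    gl_code: str
    effective_from: date
    effective_to: Optional[date] = None  # None means the rule is still in force


def rule_as_of(rules, org, project, on):
    """Return the single rule in force for (org, project) on the given date."""
    live = [
        r
        for r in rules
        if (r.org, r.project) == (org, project)
        and r.effective_from <= on
        and (r.effective_to is None or on <= r.effective_to)
    ]
    if len(live) != 1:
        # Gaps and overlaps are both data errors worth surfacing loudly.
        raise LookupError(
            f"expected exactly one rule for {org}/{project} on {on}, found {len(live)}"
        )
    return live[0]


# Hypothetical mid-year reorg: the search project moves to a new GL code on July 1.
DATED_RULES = [
    DatedRule("acme", "search", "7400-AI-PROD", date(2025, 1, 1), date(2025, 6, 30)),
    DatedRule("acme", "search", "7450-AI-SEARCH", date(2025, 7, 1)),
]
```

Re-running an April report after the reorg then still resolves to the old code, and the rule history doubles as the documentation of what changed and when.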

Visionality's gateway records every allocation decision at call time and exports clean chargeback CSVs.