← All articlesCost Control

Why Your AI Bill Is One Line Item (And How to Fix It)

What it actually takes to get per-project cost attribution in place — from someone who's built it.

·7 min read

Every month, the same conversation: finance sends the invoice, engineering gets tagged, nobody knows what $47,000 of OpenAI usage actually bought.

It's not an accounting problem. It's an architecture problem. The provider doesn't know which team, which product feature, or which budget line called the API. It just knows your organization's API key did. And your API key is shared across everything.

This is the most common AI cost situation in 2026, and it's almost completely preventable.

Why attribution is harder than it looks

The naive fix is: "just add a tag or a header." OpenAI and Anthropic both have user metadata fields. You can pass a user ID or project name per request. Some teams do this.

The problem is consistency. In a codebase with five engineers and three AI integrations, someone will forget. Someone will put the wrong tag. Someone will copy-paste an old feature's project code. The metadata is unenforceable at the API key level — it's just a field in a JSON body that the model ignores.

Worse, tagged metadata doesn't solve the embedded AI problem. Your Microsoft 365 Copilot usage doesn't pass through your API key at all. Neither does Cursor. Those tools have their own billing relationships with Microsoft and xAI. You see those bills in entirely different places, if you see them at all.

So you end up with:

  • Some spend you can partially attribute (direct API calls, with inconsistent tagging)
  • Some spend you can't attribute at all (embedded AI in SaaS tools)
  • No unified view of either

What per-project attribution actually requires

To get attribution that finance trusts, you need four things:

1. A single ingress point for all direct API traffic. A gateway that all your API clients route through. Not "most" clients — all of them. The moment any client bypasses the gateway, your attribution is incomplete and finance knows it.

2. Enforcement at the transport layer, not the application layer. Tagging metadata in your application code is a convention. A gateway that requires a project header — and rejects requests without one — is a contract. One approach blocks the request; the other just logs the gap.

3. Allocation rules that finance controls. The mapping from "this project called this model" to "this GL code" shouldn't live in your codebase. Finance will change it. It should be a config that finance can edit without touching a PR.

4. A unified ingestion path for SaaS AI tools. Copilot, Cursor, AgentForce — these need connectors that pull usage from each vendor's admin API and land it in the same schema as your direct traffic. If the schemas are different, you're back to manual reconciliation every month.

The GL code problem

Most engineering teams, when they think about cost attribution, think in terms of projects or features. Finance thinks in GL codes.

The gap matters. A "project" in your system might map to three different GL codes depending on which business unit owns it, whether it's capex or opex, and what the accounting period looks like. Getting that mapping right — and keeping it current as the org changes — is genuinely hard.

The right model: the gateway receives traffic tagged with an org, project, and task class. An allocation rule table maintained by finance maps those tuples to GL codes. Most-specific-match-first: if there's a rule for (org=acme, project=search, model=claude-3-5-sonnet), it wins over the generic (org=acme, project=search) rule. Finance can update the rules without touching the gateway.

This is a small detail that becomes a very large argument preventer.

What the ledger should look like

When you have proper attribution, the end-of-month export looks like this:

| Date | Org | Project | Task Class | Model | Input Tokens | Output Tokens | Cost USD | GL Code | |------|-----|---------|-----------|-------|-------------|--------------|---------|---------| | 2026-06-01 | acme | search | retrieval | claude-3-haiku | 1,240,000 | 48,000 | $4.12 | 7400-AI-OPS | | 2026-06-01 | acme | copilot-assist | generation | gpt-4o | 890,000 | 210,000 | $11.70 | 7400-AI-PROD |

Finance can pivot that by GL code. They can compare it to budget. They can build a chargeback model from it. They can answer the compliance question "show me every AI call made on behalf of this business unit in Q2."

That's the ledger. Not a dashboard screenshot — an actual export you can drop into a spreadsheet.

The embedded AI problem isn't going away

Microsoft is selling Copilot into every enterprise contract renewal. Salesforce baked AgentForce into their platform pricing. ServiceNow has Now Assist. Cursor is in every developer laptop purchase decision.

Each of these has a separate billing relationship. Each one shows up as a line item in a different system. None of them pass through your API gateway.

The only way to get unified visibility is to pull from each vendor's admin API. Microsoft has a Copilot usage report API. Salesforce has Einstein usage data. You write a connector, schedule it nightly, normalize the output to your request log schema, and suddenly the Copilot spend is sitting next to the direct Anthropic spend in the same table.

It's not glamorous work. But it's the only thing that gives finance a single number.

Every token counts

The point isn't just cost savings — it's visibility. An organization that can see its AI spend by project, team, model, and GL code is in a position to make intentional decisions about it. An organization staring at a single invoice line is reacting, always.

The infrastructure to get here isn't complex. A gateway, an allocation rule table, a set of vendor connectors, and a chargeback export. Thirty minutes to deploy. An afternoon to configure. And then finance and engineering are finally looking at the same number.


Visionality is the gateway and dashboard that makes this work. See a demo →

Visionality.AI

See how Visionality handles this.

30-minute demo. Live deployment. Your questions answered directly — no slides, no pitch.