The question most AI spend reviews miss: how much of your AI cost never passes through your own API keys?
For most organizations in 2026, the answer is: a lot. GitHub Copilot is in your Microsoft enterprise agreement. Cursor is on every developer laptop. Salesforce AgentForce is baked into your CRM contract. ServiceNow Now Assist shipped in your last platform upgrade. None of these costs pass through your Anthropic or OpenAI API key. They're invoiced separately, reported in separate dashboards, and nobody has ever tried to add them up.
What "embedded AI" means in practice
When you call the Anthropic API directly, you control the request. You choose the model, the prompt, the max tokens. The cost is visible in your API account.
Embedded AI is different. Microsoft Copilot decides, internally, which model to use for a given task. You don't control the prompt format. You don't choose the model. You don't see the token counts. You just pay the per-seat license fee and get a monthly PDF invoice that says "Microsoft 365 Copilot — 200 seats — $6,000."
That $6,000 is AI spend. It's just invisible to the engineering team, the AI governance dashboard, and the finance team trying to build a chargeback model.
Why the number is bigger than you think
Per-seat pricing hides utilization variance. You're paying the same per seat regardless of whether that seat runs 10 AI requests per day or 1,000.
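To make the variance concrete, here is a minimal sketch of the effective cost per request at the two usage levels just mentioned. The $30 seat price is taken from the invoice example ($6,000 ÷ 200 seats); the 22 working days per month is an assumption for illustration, not a figure from any vendor.

```python
SEAT_PRICE_USD = 6000 / 200   # $30 per seat per month, from the invoice example
WORKING_DAYS = 22             # assumption: working days in a billing month


def cost_per_request(requests_per_day: int) -> float:
    """Spread one seat's flat monthly fee over the requests that seat made."""
    monthly_requests = requests_per_day * WORKING_DAYS
    return SEAT_PRICE_USD / monthly_requests


light = cost_per_request(10)     # a seat running 10 requests per day
heavy = cost_per_request(1000)   # a seat running 1,000 requests per day

print(f"light user: ${light:.4f} per request")
print(f"heavy user: ${heavy:.4f} per request")
print(f"ratio: {light / heavy:.0f}x")
```

Under these assumptions the light seat costs about 100 times more per request than the heavy one, and the invoice shows the same $30 for both.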
But for compliance and governance purposes, the per-seat cost is actually the easier part. The harder question is: what data is that seat sending to Microsoft's AI infrastructure? What prompts? What documents? What customer data?
The $6,000 invoice doesn't tell you. Microsoft's Copilot dashboard tells you usage counts — requests, active users, feature breakdown. It doesn't tell you prompt content or data classification. For a regulated organization, that's an incomplete answer to a compliance question.
The three cost categories you need to unify
A complete view of AI spend for most organizations in 2026 covers:
Direct API costs: Your OpenAI, Anthropic, Bedrock, and Azure OpenAI API keys. These are the most controllable — you choose the model, you see the token counts, you can run a gateway. This is what your AI monitoring dashboard shows today, if you have one.
Embedded AI in developer tools: Copilot (GitHub), Cursor, Tabnine, Codeium. These are usually per-seat, billed monthly, with usage data available through each tool's admin API. If you treat developer hours as a cost center, this spend belongs in your engineering cost allocation.
Embedded AI in business SaaS: AgentForce (Salesforce), Now Assist (ServiceNow), Copilot for Microsoft 365 (the enterprise productivity suite, not GitHub Copilot), Gemini for Google Workspace (formerly Duet AI). These are often line items inside larger platform contracts and may not be broken out clearly in invoices.
A chargeback model that only covers the first category is missing whatever the second and third categories represent for your org. For some organizations, that's 40-60% of total AI spend.
Getting the numbers from vendor APIs
Most enterprise AI SaaS tools have admin APIs that report usage. The data quality varies, but it's better than nothing.
Microsoft 365 Copilot: The Microsoft Graph reports API has a getMicrosoft365CopilotUsageUserDetail endpoint that returns per-user activity broken down by app (Teams, Word, Outlook, etc.). It doesn't give you token counts, but it shows which users are active in which apps and when. Good enough for a per-user attribution model.
GitHub Copilot: GitHub's REST API exposes Copilot seat assignments with last-activity timestamps, plus usage metrics: suggestions shown, acceptances, languages used. Returned as JSON. Clean data.
Salesforce AgentForce: Einstein Platform credits track agent usage. The Einstein Analytics API surfaces consumption data. Getting it normalized requires more work than GitHub's API but it's accessible.
Cursor: Less structured. Cursor's admin panel has usage data, but the API surface for programmatic export is thinner. Some organizations just pull the monthly invoice total and allocate it by seat to the relevant teams.
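The invoice-by-seat fallback is simple enough to sketch. The function below is an illustration, not anything Cursor provides: `allocate_by_seat`, the team names, and the invoice amount are all hypothetical. It works in integer cents so the team shares always add back up to the invoice total.

```python
def allocate_by_seat(invoice_cents: int, seats_by_team: dict[str, int]) -> dict[str, int]:
    """Split an invoice across teams in proportion to seat count.

    Leftover cents from integer division go to the teams with the largest
    remainders, so the shares sum exactly to the invoice.
    """
    total_seats = sum(seats_by_team.values())
    if total_seats == 0:
        raise ValueError("no seats to allocate against")

    shares = {}
    remainders = []
    for team, seats in seats_by_team.items():
        exact, remainder = divmod(invoice_cents * seats, total_seats)
        shares[team] = exact
        remainders.append((remainder, team))

    leftover = invoice_cents - sum(shares.values())
    for _, team in sorted(remainders, reverse=True)[:leftover]:
        shares[team] += 1
    return shares


# Hypothetical $1,000.00 invoice and team sizes, for illustration only.
cursor_shares = allocate_by_seat(100_000, {"platform": 3, "web": 3, "data": 1})
print(cursor_shares)
```

Allocating by seat ignores how heavily each team uses the tool, which is the trade-off you accept when the vendor does not export per-user usage.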
The pattern for each connector: authenticate with an admin credential, call the usage endpoint on a schedule (daily or weekly), normalize the output to your standard request log schema (org, project, tokens, cost, model_family, date), insert into your AI cost ledger. The vendor's per-call cost isn't always visible, so you often work backward from invoice total ÷ usage count to get a per-unit cost.
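The normalize step, including the invoice total ÷ usage count calculation, might look like the sketch below. The field names follow the schema listed above; everything else (`LedgerRow`, `normalize_usage`, the input row shape, the org and project names) is hypothetical, and a real connector would first map each vendor's response into that shape.

```python
import datetime as dt
from dataclasses import dataclass
from typing import Optional


@dataclass
class LedgerRow:
    # tokens is optional because embedded tools generally do not expose
    # token counts.
    org: str
    project: str
    tokens: Optional[int]
    cost: float
    model_family: str
    date: dt.date


def normalize_usage(usage_rows, invoice_total, org, model_family, billing_date):
    """Turn vendor usage rows into ledger rows.

    usage_rows: list of {"project": str, "usage_count": int} dicts
    (a hypothetical intermediate shape). Per-unit cost is derived as
    invoice total / total usage count.
    """
    total_usage = sum(row["usage_count"] for row in usage_rows)
    if total_usage == 0:
        return []
    unit_cost = invoice_total / total_usage
    return [
        LedgerRow(
            org=org,
            project=row["project"],
            tokens=None,
            cost=round(row["usage_count"] * unit_cost, 2),
            model_family=model_family,
            date=billing_date,
        )
        for row in usage_rows
    ]


# Illustrative numbers: a $340 invoice covering 12,000 agent calls.
rows = normalize_usage(
    usage_rows=[
        {"project": "sales-emea", "usage_count": 9000},
        {"project": "sales-amer", "usage_count": 3000},
    ],
    invoice_total=340.00,
    org="acme",
    model_family="vendor-managed",
    billing_date=dt.date(2026, 6, 1),
)
for row in rows:
    print(row.project, row.cost)
```

Rounding each row to cents can leave the rows a cent or two off the invoice total, so a production ledger would likely store integer cents and reconcile the remainder.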
The unified ledger view
Once your connectors are running, the ledger looks like:
| Date | Source | Vendor | Seats / Calls | Attributed Cost | Team / GL Code |
|------|--------|--------|---------------|-----------------|----------------|
| 2026-06-01 | direct | Anthropic | 24,000 calls | $182 | Engineering / 7400-AI |
| 2026-06-01 | connector | GitHub Copilot | 85 active seats | $765 | Engineering / 7400-DEV |
| 2026-06-01 | connector | M365 Copilot | 200 seats | $6,000 | All BUs / 7400-PROD |
| 2026-06-01 | connector | AgentForce | 12,000 agent calls | $340 | Sales / 7400-CRM |
Total June 1 AI spend: $7,287 — not $182.
Finance sees a real number. The chargeback model has something to work with. The compliance team can ask "what AI systems handled data in our Sales org?" and get an answer that includes AgentForce.
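As a quick check on that arithmetic, here is a small sketch that totals the example ledger rows by source. The tuple layout is an assumption made for illustration; a real ledger would live in a database or warehouse table.

```python
from collections import defaultdict

# Rows copied from the example ledger: (source, vendor, attributed cost in USD).
ledger = [
    ("direct", "Anthropic", 182),
    ("connector", "GitHub Copilot", 765),
    ("connector", "M365 Copilot", 6000),
    ("connector", "AgentForce", 340),
]

by_source = defaultdict(int)
for source, _vendor, cost in ledger:
    by_source[source] += cost

total = sum(by_source.values())
direct_share = by_source["direct"] / total

print(f"total: ${total:,}")                      # total: $7,287
print(f"direct API share: {direct_share:.1%}")   # direct API share: 2.5%
```

In this example, a dashboard that only sees direct API traffic is reporting about one fortieth of the spend.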
The governance gap in embedded tools
Direct API traffic is governable: you put a gateway in front of it. Embedded AI in SaaS tools isn't governable in the same way — you can't put a gateway in front of Copilot. Microsoft controls the request path.
What you can govern: who has access, what data sources they can connect to the AI features, and what your data processing agreement says about training use. For compliance purposes, documenting your configuration choices and the applicable DPA is the evidence of governance.
This is a weaker control than a gateway with PII detection. It's the right conversation to have with your compliance team: "For embedded tools, our governance is access control + DPA documentation. For direct API traffic, we have a technical control that inspects every request."
Both should be in your control inventory. They're different tiers of control, and that's okay to document explicitly.
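One way to document the two tiers side by side is a small inventory structure like the sketch below. The class names, tier labels, and evidence items are hypothetical; adapt them to whatever your compliance team already tracks.

```python
from dataclasses import dataclass
from enum import Enum


class ControlTier(Enum):
    TECHNICAL = "gateway inspects every request"
    CONTRACTUAL = "access control + DPA documentation"


@dataclass(frozen=True)
class ControlEntry:
    system: str
    tier: ControlTier
    evidence: tuple  # artifacts an auditor could be shown


inventory = [
    ControlEntry(
        "Direct API traffic",
        ControlTier.TECHNICAL,
        ("gateway request logs", "PII detection policy"),
    ),
    ControlEntry(
        "Microsoft 365 Copilot",
        ControlTier.CONTRACTUAL,
        ("seat assignment list", "connected data sources", "DPA"),
    ),
]


def systems_without_technical_control(entries):
    """List the systems governed only by access control and contract terms."""
    return [e.system for e in entries if e.tier is not ControlTier.TECHNICAL]


print(systems_without_technical_control(inventory))
```

Keeping the tier explicit on every entry lets the inventory answer "which systems rely on contractual controls only?" without anyone having to remember.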
Starting with what you have
You don't need all the connectors on day one. Start with:
- Get your direct API traffic into the gateway. That's the baseline.
- Pull GitHub Copilot data. It's the cleanest of the vendor APIs covered above.
- Add M365 Copilot if that's a significant line item.
- Add the rest as they become worth the connector effort.
The goal isn't perfection on day one. It's a meaningful improvement over "we have no idea." A view that covers your top two or three AI spend sources, unified into one number, is already a different kind of conversation with finance.
Visionality's SaaS connectors bring Copilot, Cursor, AgentForce, and ServiceNow spend into the same ledger as your direct API costs.