The monthly "where did the AI spend go" meeting, and the gateway pattern that ends it
Why provider dashboards never answer procurement's question, and the metadata-enforcement pattern that closes the month in a day instead of a week.
By The Nuviax team
The first OpenAI invoice at a Fortune 500 is a novelty. The procurement team forwards it to finance with a smile. By the twentieth invoice, the smile is gone, the finance partner for the AI platform team has taken over the review, and the CIO is asking for a breakdown by team and by feature before the monthly close.
The AI platform lead does not have the breakdown. They have provider dashboards. The dashboards show spend per API key. That is not per team. That is not per feature. One API key serves five applications. One application calls three providers. The billing data exists. The attribution does not.
Everyone loses a week every month. The close runs late. The CIO starts asking harder questions. By the third quarter of this pattern, procurement freezes new AI contracts until the attribution story is clean.
Why provider dashboards cannot solve this
Every major LLM provider offers a spend dashboard. Usage per API key. Usage per model. Time-series charts. Export to CSV. At the engineering-team level, these dashboards are perfectly adequate for debugging a cost spike.
At the enterprise level, they fail in two ways.
The first is dimensional mismatch. The dimensions the provider records are the provider's dimensions. API key, model name, token counts, timestamps. The dimensions procurement and finance need are the enterprise's dimensions. Business unit, team, application, feature, cost center, environment. The enterprise dimensions are not in the provider's data. They were never captured.
The second is cross-provider sprawl. An enterprise with four pilot teams usually has three providers, four model families, and eight API keys by the end of year one. Unifying spend across providers requires reconciling four different invoice formats, four different usage APIs, and four different definitions of what a token costs. The reconciliation job, when it exists, is an intern's quarterly project. When the provider changes the invoice format, the reconciliation breaks. The CIO does not learn about the break until the close slips.
The gap is not the quality of the provider dashboards. The gap is that the provider data is missing the dimensions the enterprise needs, and no amount of dashboard design fixes that.
Enforce attribution at the gateway
The pattern that works is to move attribution upstream. Every LLM call in the enterprise flows through a gateway. The gateway enforces structured metadata on every call. Without metadata, the call does not go out.
The metadata is simple but non-optional. The team owning the call. The application the call is coming from. The feature inside the application. The environment. The caller identity. The purpose. Each of these is a dimension the gateway can slice spend by.
The metadata is tied to the authentication token the application uses to call the gateway. A team onboards its applications once. The onboarding attaches the structured metadata to the token. The application's calls carry the metadata automatically. The gateway validates the metadata on every request. A missing or malformed field is a 400, not a silently-attributed call.
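The validation step can be sketched in a few lines. This is a minimal illustration, not a real gateway: the required field names mirror the dimensions listed above and are assumptions, and a real deployment would also validate allowed values per field.

```python
# Gateway-side metadata validation sketch. The field names are illustrative;
# an enterprise would define its own required-dimension list.
REQUIRED_FIELDS = ("team", "application", "feature", "environment", "caller", "purpose")

def validate_metadata(metadata: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the call may proceed."""
    errors = []
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"missing or empty field: {field}")
    return errors

# A request with missing dimensions is rejected with a 400-style error,
# never silently attributed to a default bucket.
errors = validate_metadata({"team": "checkout", "application": "support-bot"})
assert errors  # non-empty -> gateway returns HTTP 400 before the provider is called
```

The important design choice is the hard failure: a default "unattributed" bucket always grows until it dominates the report, so the gateway refuses the call instead.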
Every token consumed, every call made, every policy applied writes a row into a cost ledger. The row carries the provider, the model, the tokens in, the tokens out, the computed cost at current provider rates, the policy applied, and all the enterprise dimensions. The ledger is the source of truth for AI spend inside the enterprise.
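A ledger row might look like the following sketch. The rate table, field names, and cost formula are assumptions for illustration; real gateways would refresh rates from each provider's published price card rather than hard-coding them.

```python
from dataclasses import dataclass

# Illustrative pricing in USD per 1M tokens (input_rate, output_rate);
# a real gateway would sync this table from provider rate cards.
PRICES = {("openai", "gpt-4o"): (2.50, 10.00)}

@dataclass
class LedgerRow:
    provider: str
    model: str
    tokens_in: int
    tokens_out: int
    cost_usd: float
    policy: str
    team: str
    application: str
    feature: str
    environment: str

def record_call(provider, model, tokens_in, tokens_out, policy, dims) -> LedgerRow:
    """Compute cost at current rates and return the row to append to the ledger."""
    in_rate, out_rate = PRICES[(provider, model)]
    cost = tokens_in / 1e6 * in_rate + tokens_out / 1e6 * out_rate
    return LedgerRow(provider, model, tokens_in, tokens_out,
                     round(cost, 6), policy, **dims)
```

Because the enterprise dimensions ride along on every row, no join against provider invoices is ever needed to answer a per-team or per-feature question.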
What the ledger unlocks
With the ledger in place, the monthly close is a query.
Procurement queries "total AI spend by business unit for March." The query returns in seconds. Finance queries "forecasted cost for the Q2 feature rollout, assuming 20% traffic growth." The query runs. The CIO queries "top ten teams by AI spend growth month-over-month, with commentary on what changed." The query runs, with the commentary coming from tags the platform team attaches to notable cost events.
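The "close is a query" claim is concrete: with the dimensions in the ledger, each ask above collapses to a single GROUP BY. A minimal sketch against an in-memory table, with illustrative column names and numbers:

```python
import sqlite3

# Toy ledger with the enterprise dimensions as columns.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ledger (month TEXT, business_unit TEXT, team TEXT, cost_usd REAL)")
con.executemany("INSERT INTO ledger VALUES (?, ?, ?, ?)", [
    ("2024-03", "retail",    "checkout", 1200.0),
    ("2024-03", "retail",    "search",    800.0),
    ("2024-03", "logistics", "routing",   500.0),
])

# "Total AI spend by business unit for March" is one aggregation, not a
# week of invoice reconciliation.
rows = con.execute(
    "SELECT business_unit, SUM(cost_usd) FROM ledger "
    "WHERE month = '2024-03' GROUP BY business_unit ORDER BY 2 DESC"
).fetchall()
```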
Beyond the close, three more surfaces open up.
Budget enforcement. The gateway can cap each tenant's monthly spend. When a team hits 80% of its budget, the gateway sends a warning event. At 100%, it returns a structured "budget exceeded" error that the application handles gracefully. The application degrades rather than surprising finance with a 40% overage on the next invoice.
Rate limit governance. Provider rate limits are a shared resource. A misbehaving batch job in one team can starve the provider quota for the rest of the enterprise. The gateway enforces per-tenant rate limits separately from the per-provider rate limits. The batch job slows down. The customer-facing application does not see a 429.
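One common way to implement the per-tenant limit is a token bucket per tenant, independent of the per-provider limits. A minimal sketch with illustrative capacities:

```python
import time

class TokenBucket:
    """Per-tenant request budget; capacity and refill rate are illustrative."""
    def __init__(self, capacity: float, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = capacity
        self.refill = refill_per_sec
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller gets a gateway-level 429; provider quota untouched

# The batch job gets a small bucket, the customer-facing app a large one.
buckets = {"batch-team": TokenBucket(2, 0.1), "checkout": TokenBucket(100, 50)}
```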
Model-list governance. The gateway enforces which models each tenant is allowed to call. A prototype team cannot silently migrate to the newest, most expensive frontier model mid-quarter without procurement review. The gateway returns a structured "model not allowed" error. The team asks for the upgrade through the documented path, procurement reviews the budget impact, and the upgrade lands with a paper trail.
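The allow-list check is the smallest of the three controls. A sketch with hypothetical tenant names and model IDs, returning the structured error described above:

```python
# Per-tenant model allow-lists; tenants and model IDs are illustrative.
ALLOWED_MODELS = {
    "prototype-team": {"gpt-4o-mini"},
    "platform-team": {"gpt-4o-mini", "gpt-4o"},
}

def authorize_model(tenant: str, model: str) -> dict:
    if model in ALLOWED_MODELS.get(tenant, set()):
        return {"allowed": True}
    # Structured error the application can surface. The upgrade path is a
    # procurement review and a config change here, with a paper trail.
    return {"allowed": False, "error": "model_not_allowed", "model": model}
```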
What the CIO's ask becomes
Before the gateway pattern, the CIO's ask is "give me a breakdown of AI spend by team and by feature." The platform lead's answer is "give us a week, we need to reconcile the invoices." The close slips.
After the gateway pattern, the CIO's ask is "give me a breakdown of AI spend by team and by feature." The platform lead's answer is a dashboard URL. The CIO gets the breakdown in the meeting. The close closes on time.
This is not a massive engineering lift. It is a single architecture decision made at the right time: all LLM calls in the enterprise go through a gateway, and the gateway enforces structured metadata. Once that decision is made and implemented, every downstream question becomes answerable.
The pattern scales with provider volatility
The most underrated benefit of the gateway pattern is that it survives provider volatility.
When a provider deprecates a model, the gateway swaps the underlying endpoint without a code change in any application. When a provider announces a price increase, the gateway's cost ledger updates and the finance team sees the forward projection immediately. When a new provider launches a model the enterprise wants to try, the platform team onboards the provider at the gateway, the ledger picks up the new provider's pricing, and usage flows through with the same attribution as every other provider.
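The "no code change in any application" claim rests on applications calling logical model names that the gateway resolves. A sketch of that indirection, with hypothetical route names and models:

```python
# Applications request logical names; the gateway owns the mapping to
# (provider, model). Route and model names here are illustrative.
ROUTES = {"chat-default": ("openai", "gpt-4o")}

def resolve(logical_model: str) -> tuple[str, str]:
    return ROUTES[logical_model]

# Provider deprecates the backing model: one entry changes at the gateway,
# and every application picks up the new backend on its next call.
ROUTES["chat-default"] = ("anthropic", "claude-sonnet-4")
```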
The applications do not change. The attribution does not change. The close does not change. The only thing that changes is the mix of providers underneath, which was going to shift every six months regardless.
Next step
If your AI spend is over six figures a month and the monthly close takes longer than a day, an architecture review is the next step. It takes your current provider footprint, your application inventory, and your existing attribution gaps as inputs, and produces a findings document your CIO and CFO can act on together.