Key takeaways:
- Tokenomics is now a named discipline, with the token as the atomic unit of AI. Every AI cost question now traces back to the token, and each party sees it differently: the data center sees compute, the model sees cognition, the lab sees price, the enterprise sees value.
- The token bill is one of nine cost buckets. One is metered, eight are not, and any forecast that anchors on the invoice alone is wrong.
- The metric to chase is value per token, not cost per token, and you earn it only by optimizing across every layer. Miss one and the others can’t make up the difference.
- The FinOps Foundation expanded its mission from cloud value to technology value: from shift-left to shift-wild, reported in executive units rather than FinOps ones. FinOps is now a strategic function in the age of AI.
Best for: FinOps practitioners, cloud engineers, finance and FP&A, leadership.
FinOps X 2026 Recap
FinOps X 2026 opened with a chart that did most of the arguing for the week: token consumption climbing from single-digit trillions to tens of trillions in under a year. Against that backdrop, the FinOps Foundation closed a debate that started at FinOps X 2025, carried through a session in Washington, DC, and ended in a vote. The community’s mission expanded from “cloud value” to “technology value.” The practical signal for practitioners is that AI cost sat at the center of the conference this year, and the week was spent building the vocabulary to manage it.
Tokenomics Became a Discipline With a Name
Tokenomics, or AI token economics, is the discipline of converting energy and capital into AI tokens, consuming those tokens efficiently to produce intelligence, and driving value from what they produce. The organizing idea is that the token is the atomic unit of AI, and every cost or value question resolves to it. The same token means different things to different people: the data center sees compute, the model sees cognition, the lab sees price, the enterprise sees value. Misalignment between those four lenses is where economic problems start, and where FinOps has room to work.
The mental model that held the discipline together was “it’s tokens all the way down”: the token sits at the center, wrapped by the model layer, the platform and API layer, the application layer, and a sovereign layer. Fully loaded cost means accounting for every layer.
Your Token Bill Is One of Nine Cost Buckets
The clearest reframe of the week was that the metered token bill is a single bucket, and eight more sit next to it unmetered. You pay all nine. The other eight are:
- Retrieval and data, driven by corpus size and query volume
- Orchestration, scaling with the number of agents per task
- Inference infrastructure, from model size and idle GPUs, including the KV cache the invoice never shows
- Eval and monitoring, the cost of test runs and tracing
- Governance, set by regulatory exposure
- Human labor, the prompt engineering and review behind the system
- Failure and waste, from retries and shadow AI
- Integration, the cost of model churn and rebuilds
A forecast built on the token bill alone leaves out eight of the nine costs, so it’s wrong.
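To make the nine-bucket arithmetic concrete, here is a minimal sketch that sums one metered bucket and eight unmetered ones and reports how much of the fully loaded cost the invoice actually covers. The bucket names follow the list above; every dollar figure and both function names are invented for illustration.

```python
# Illustrative monthly figures in USD. Every number here is hypothetical.
METERED = {"tokens": 40_000}
UNMETERED = {
    "retrieval_and_data": 12_000,
    "orchestration": 6_000,
    "inference_infrastructure": 18_000,
    "eval_and_monitoring": 5_000,
    "governance": 4_000,
    "human_labor": 25_000,
    "failure_and_waste": 7_000,
    "integration": 8_000,
}


def fully_loaded_cost(metered, unmetered):
    """Sum all nine buckets, metered and unmetered."""
    return sum(metered.values()) + sum(unmetered.values())


def invoice_share(metered, unmetered):
    """Fraction of the fully loaded cost that shows up on the token bill."""
    return sum(metered.values()) / fully_loaded_cost(metered, unmetered)


total = fully_loaded_cost(METERED, UNMETERED)
share = invoice_share(METERED, UNMETERED)
print(f"Fully loaded: ${total:,}, token bill share: {share:.0%}")
```

With these made-up inputs the token bill is roughly a third of the total, so a forecast anchored on the invoice alone would miss most of the spend.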
Optimization Compounds, or It Doesn’t: The Consumption Layer Cake
Pinterest’s tokenomics consumption layer cake gave practitioners a model for where optimization lives, organized around value per token. Five layers, top to bottom:
- Routing and governance: send each request to the right-sized model
- Model and quantization: right model, right precision
- Inference stack: right engine and caching strategy
- Capacity: right hardware, right region
- Silicon: the newest chip generation, roughly a 30x improvement per generation
The wins compound across layers, but if you miss one, you can’t optimize your way out at the others. Silicon gives the biggest single gain, but you only capture it if the upper layers are working too.
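One simple way to picture the compounding is to treat each layer as a multiplier on value per token. The sketch below assumes gains multiply, which is a modeling choice rather than something the session stated. Only the silicon figure (roughly 30x per generation) comes from the text; the other multipliers and the function name are hypothetical.

```python
from math import prod

# Hypothetical multipliers on value per token. Only the silicon figure
# (roughly 30x per generation) comes from the text; the rest are invented.
LAYER_GAINS = {
    "routing_and_governance": 3.0,
    "model_and_quantization": 2.0,
    "inference_stack": 2.0,
    "capacity": 1.5,
    "silicon": 30.0,
}


def compounded_gain(gains, missed=()):
    """Multiply per-layer gains; a missed layer contributes 1.0 (no gain)."""
    return prod(1.0 if layer in missed else gain for layer, gain in gains.items())


all_layers = compounded_gain(LAYER_GAINS)
no_routing = compounded_gain(LAYER_GAINS, missed={"routing_and_governance"})
print(f"All layers: {all_layers:.0f}x, routing missed: {no_routing:.0f}x")
```

Under this assumption, skipping a single layer removes its factor from the whole product, and the remaining layers cannot recover it however well they are tuned.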
Measuring AI: From Spend Visibility to Value Per Token
The AI Unit Economics maturity board laid out a progression in three stages. Spend visibility comes first: total AI spend, average daily cost. Economics comes next: normalized cost per token, cost per model, token input/output ratio, cached token ratio, token-to-spend drift, GPU utilization. Value is the destination: cost per AI use case, inference cost against AI revenue.
Five candidate consumption metrics gave that progression teeth, framed openly as the first draft of many to come:
- Cost per verified outcome: fully loaded AI cost divided by outcomes that were verified.
- Direct-allocation %: the share of AI spend that has a named domain owner.
- Route win rate: the share of requests sent to a cheaper utility-tier model without quality loss, targeted around 70% and up.
- Cache and batch leverage: how much repeated context is served from cache and how much non-urgent work runs in batch rather than real time.
- Sovereignty hit rate: the share of regulated workloads running in the correct data-residency zone, enforced by policy.
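Four of these metrics reduce to simple ratios, sketched below. The record shapes (keys such as `owner`, `tier`, `quality_ok`, `zone`, and `required_zone`), the function names, and the sample data are all assumptions made for illustration; the conference draft did not specify a schema, and how "verified" or "without quality loss" is decided is left to the caller.

```python
def cost_per_verified_outcome(fully_loaded_cost, verified_outcomes):
    """Fully loaded AI cost divided by the number of verified outcomes."""
    if verified_outcomes == 0:
        raise ValueError("no verified outcomes to divide by")
    return fully_loaded_cost / verified_outcomes


def direct_allocation_pct(spend_rows):
    """Percentage of spend whose row carries a named owner."""
    total = sum(row["cost"] for row in spend_rows)
    owned = sum(row["cost"] for row in spend_rows if row["owner"])
    return 100 * owned / total


def route_win_rate(requests):
    """Share of all requests served by the utility tier without quality loss."""
    wins = sum(1 for r in requests if r["tier"] == "utility" and r["quality_ok"])
    return wins / len(requests)


def sovereignty_hit_rate(workloads):
    """Share of regulated workloads running in their required zone."""
    regulated = [w for w in workloads if w["regulated"]]
    hits = sum(1 for w in regulated if w["zone"] == w["required_zone"])
    return hits / len(regulated)


# Hypothetical sample data.
spend = [
    {"owner": "search", "cost": 600},
    {"owner": "support", "cost": 200},
    {"owner": None, "cost": 200},  # no named owner, so unallocated
]
requests = (
    [{"tier": "utility", "quality_ok": True}] * 7
    + [{"tier": "utility", "quality_ok": False}]
    + [{"tier": "frontier", "quality_ok": True}] * 2
)
workloads = [
    {"regulated": True, "zone": "eu", "required_zone": "eu"},
    {"regulated": True, "zone": "us", "required_zone": "eu"},
    {"regulated": False, "zone": "us", "required_zone": None},
]

print(cost_per_verified_outcome(125_000, 5_000))
print(direct_allocation_pct(spend))
print(route_win_rate(requests))
print(sovereignty_hit_rate(workloads))
```

Cache and batch leverage is omitted here because it depends on how a given provider reports cached and batched tokens.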
FinOps Moved From Shift-Left to Shift-Wild
The scope of FinOps was mapped as four stages:
- FinOps controls (shift-left): cost answers in the developer’s flow
- Workload pricing (shift-left): forecasting total cost of ownership at planning time
- FinOps for tech (shift-up and across): SaaS sprawl, colocation, and the line items finance never tracked
- FinOps AI and AI for FinOps (shift-wild): using AI to govern AI, and to run FinOps itself
The first two are table stakes now; the last two are becoming the strategic bet.
Because none of this has a settled playbook yet, the Tokenomics Foundation is building one in the open. It pairs two groups:
- Practitioner minds: the large token consumers and AI-native enterprises solving this at scale
- Supplier minds: the hardware vendors and neoclouds, frontier model providers, and inference platforms behind the spend
Together they produce shared best practices, frameworks, and guidance, aimed at intelligent outcomes.
Speak in the Executive’s Units, Not FinOps Units
The communication point ran through every practitioner session: when you walk into the room, translate. A CEO, CRO, CMO, CPO, or CIO doesn’t think in tokens or chargeback. They think in unit margins, basis points, cost per room booking, cost per call, cost per transaction, cost per dollar of profit, legal hold.
The number you report should match the listener’s P&L vocabulary. A metric that doesn’t map to the executive’s P&L numbers can’t support a budget decision, regardless of how accurate it is.
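As a rough illustration of that translation, the sketch below restates one monthly AI cost figure as cost per transaction and as basis points of revenue. All inputs and both function names are hypothetical.

```python
def cost_per_transaction(ai_cost, transactions):
    """AI cost expressed per business transaction."""
    return ai_cost / transactions


def basis_points_of_revenue(ai_cost, revenue):
    """AI cost as basis points of revenue (1 bp = 0.01%)."""
    return ai_cost * 10_000 / revenue


# Hypothetical monthly figures in USD.
ai_cost = 125_000
transactions = 2_500_000
revenue = 50_000_000

per_txn = cost_per_transaction(ai_cost, transactions)
bps = basis_points_of_revenue(ai_cost, revenue)
print(f"${per_txn:.2f} per transaction, {bps:.0f} bps of revenue")
```

The underlying spend is identical in both outputs; only the unit changes to match the listener.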
How Four Roles Are Shifting
Finance and FP&A: The job moves from approving an AI line item to owning its margin impact. With cost per AI use case and inference cost against revenue on the table, finance can hold a defensible number.
Engineering: Cost answers move into the developer’s flow. Routing decisions, quantization, and caching strategies become engineering levers with a dollar value attached.
FinOps practitioners: The remit widens from cloud to the full technology stack, including the eight unmetered buckets. The work shifts toward attribution and outcome measurement, and toward building the playbook the discipline doesn’t have yet.
Leadership: AI cost becomes a board-level number. Leaders set the mandate that everything has a named owner, and they ask for value per token.
Where Mavvrik Fits
The conference named direct-allocation %, the share of AI spend with a named owner, as a maturity marker. That’s a visibility problem before it’s an optimization problem, and it’s exactly where AI-assisted development tends to go dark. Claude Code and Cowork spend can grow across teams with no clear line back to who used it, which sessions drove it, or which models it ran on.
Mavvrik ingests OpenTelemetry from Claude Code and Cowork and attributes that spend down to agent, user, session, and model, surfaced in its Agentic Cost and Sessions views. That turns Claude spend into an allocated, owned number instead of a lump in the AI bucket, which is direct-allocation % moving in the right direction.
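The attribution idea itself can be sketched generically as a group-by over cost records. This is not Mavvrik's implementation, and the field names below are hypothetical rather than actual OpenTelemetry attribute names from any tool; the records are assumed to be already flattened with a cost attached.

```python
from collections import defaultdict

# Hypothetical, already-flattened telemetry records. Field names are
# illustrative and do not reflect any real OpenTelemetry schema.
RECORDS = [
    {"agent": "code-review", "user": "ana", "session": "s1", "model": "model-a", "cost_usd": 1.25},
    {"agent": "code-review", "user": "ana", "session": "s2", "model": "model-b", "cost_usd": 0.75},
    {"agent": "refactor", "user": "ben", "session": "s3", "model": "model-a", "cost_usd": 2.5},
    {"agent": "refactor", "user": None, "session": "s4", "model": "model-a", "cost_usd": 0.5},
]


def attribute_spend(records, dimension):
    """Total cost per value of one dimension; missing values go to 'unallocated'."""
    totals = defaultdict(float)
    for record in records:
        key = record.get(dimension) or "unallocated"
        totals[key] += record["cost_usd"]
    return dict(totals)


by_user = attribute_spend(RECORDS, "user")
by_model = attribute_spend(RECORDS, "model")
print(by_user)
print(by_model)
```

Whatever lands under "unallocated" is the spend that keeps direct-allocation % below 100.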
Where FinOps Goes Next
Two signals from this year point to where the field is heading:
- AI cost is becoming its own discipline rather than a subset of cloud cost. It has dedicated vocabulary, dedicated metrics, and, starting next year, a dedicated event, with Tokenomicon running alongside FinOps X.
- The emphasis is shifting toward value. Spend visibility is now assumed, and the harder questions are about what each outcome costs to produce and whether that cost is justified.
The granularity of measurement keeps increasing alongside it. Cost attribution has moved from the cloud bill down to the token, and now down to the individual agent run and session. Each step makes spend easier to assign to an owner and to a result.
None of this changes what FinOps is for. The Foundation expanded its scope from cloud value to technology value, but the underlying work is the same: make spend visible, give it an owner, and tie it to the value it produces.

