Key takeaways:
- AI workloads multiply existing data platform costs through repeated compute, indexing, serving, storage, and retrieval.
- Databricks AI costs are driven by DBU consumption across jobs, SQL analytics, Vector Search, model serving, Delta Lake workloads, and storage.
- Snowflake AI costs extend beyond warehouse credits into Cortex AI token consumption, Cortex Search indexing, serving, storage, and data sharing activity.
- The core FinOps challenge is attribution: connecting Databricks DBUs and Snowflake credits to the products, teams, customers, and agents that generated them.
- AI cost control starts by separating batch from serving workloads, assigning ownership, and tracking unit costs such as cost per embedded document, retrieval, response, agent run, or customer interaction.
Best For: Finance leaders & FP&A | FinOps Practitioners | IT & Engineering leaders | MSP and cloud partners
How AI Workloads Multiply Data Platform Costs
A standard analytics query combines the relevant data and returns a result.
An AI workload:
- Prepares data
- Generates embeddings
- Retrieves context
- Calls a model
- Stores the output
- May loop back through an agent
- Delivers an answer to the user
That is why AI workloads amplify data platform costs. The data may be the same, but the number of times it gets processed goes up.
A helpful way to explain it:
| Traditional Analytics | AI Workload |
|---|---|
| Query the data | Query, chunk, embed, retrieve, infer, log, refresh |
| Usually scheduled | Often triggered by users or agents |
| Cost tied mainly to compute time | Cost tied to compute, tokens, indexes, serving, storage, and refreshes |
| Easier to allocate by dashboard or team | Harder to allocate without workload-level context |
This is why a finance team can look at Snowflake or Databricks spend and struggle to explain what changed. The platform bill increased, but the business activity behind that increase is hidden unless the organization tracks it carefully.
Databricks Cost Drivers
Databricks is built on a lakehouse architecture and measures compute consumption in Databricks Units (DBUs). DBUs show up across jobs, notebooks, SQL warehouses, serverless workloads, model training, serving, and other workloads. Databricks describes the DBU as a unit of processing power, with pricing that varies by workload type, cloud, region, and product.
AI workloads also expand how enterprises use the Databricks ecosystem. A workload may touch Delta Lake as the storage layer, Unity Catalog for governance, SQL analytics for exploration, open source Spark jobs for transformation, open data formats for interoperability, and serving infrastructure for model responses. At large scale, those layers create a wider cost surface than a standard data pipeline.
The market is pricing that role accordingly. Data Centre Magazine reported that Databricks raised $4B at a $134B valuation, while Sacra’s analysis of Databricks’ ARR frames the company as a major enterprise data and AI platform. That context matters because Databricks is not sitting at the edge of AI infrastructure. It is becoming part of the cost base for how AI gets built, served, and governed.
A DBU total from an AI workload could mean a batch embedding job that ran overnight, a GPU serving endpoint that stayed warm all day, or a fine-tuning run that nobody scheduled to stop. Each one has a different owner, a different optimization path, and a different conversation with finance.
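A minimal sketch of that breakdown, assuming usage has already been exported into simple records. The field names (`workload`, `owner`, `dbus`) and all values are hypothetical and do not reflect a Databricks schema:

```python
from collections import defaultdict

# Hypothetical usage records. Field names and values are illustrative only.
usage_records = [
    {"workload": "batch_embedding", "owner": "search-team", "dbus": 120.0},
    {"workload": "gpu_serving", "owner": "assistant-team", "dbus": 300.0},
    {"workload": "fine_tuning", "owner": "ml-research", "dbus": 80.0},
    {"workload": "gpu_serving", "owner": "assistant-team", "dbus": 150.0},
]


def dbus_by_owner_and_workload(records):
    """Sum DBUs per (owner, workload) pair so one total becomes several."""
    totals = defaultdict(float)
    for record in records:
        totals[(record["owner"], record["workload"])] += record["dbus"]
    return dict(totals)


totals = dbus_by_owner_and_workload(usage_records)
for (owner, workload), dbus in sorted(totals.items()):
    print(f"{owner:15s} {workload:16s} {dbus:8.1f} DBUs")
```

The same platform total now reads as three separate line items, each with a team that can explain it.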
| Cost Driver | Simple Explanation | Why It’s Expensive | What to Watch |
|---|---|---|---|
| DBU compute | The processing meter behind Databricks workloads | AI jobs often process large datasets multiple times | DBUs by job, cluster, notebook, endpoint, and workspace |
| Compute type | The type of Databricks compute used for the workload | Interactive compute can stay attached to production work after a prototype grows up, often because data scientists are still iterating | All-purpose vs. jobs compute |
| Embedding jobs | Turning text into numerical representations for AI search | Large document sets can require repeated processing when data changes | Refresh frequency and data volume |
| Vector Search | Keeping searchable vector indexes available | Endpoints can create an ongoing cost floor, even before query volume grows | Endpoint count, index size, unused endpoints |
| Model serving | Keeping models available for users or applications | Low-latency serving can require always-on capacity | Idle endpoints, traffic patterns, GPU usage |
| Foundation model inference | Sending prompts to models and receiving generated outputs | Long prompts, long answers, and high usage add up quickly | Input tokens, output tokens, model choice |
| Storage | Keeping embeddings, artifacts, logs, outputs, and derived datasets | AI creates many copies and byproducts of data | Stored data, artifacts, checkpoints, logs, data lakes |
Databricks Vector Search is a good example of how AI changes the cost model. Databricks explains that vector search endpoints host indexes for serving queries, have a base price, and scale based on index size. Endpoints with indexes can still create serving costs. Databricks recommends identifying unused endpoints because they can continue to incur costs.
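One way to act on that recommendation is a periodic idle-endpoint check. The sketch below assumes endpoint inventory and query counts have been collected elsewhere; the `EndpointUsage` record, its fields, and the sample values are hypothetical and not a Databricks API:

```python
from dataclasses import dataclass


@dataclass
class EndpointUsage:
    """Hypothetical inventory record for one vector search endpoint."""
    name: str
    owner: str
    index_count: int
    queries_last_30d: int


def find_idle_endpoints(endpoints, min_queries=1):
    """Return endpoints that served fewer than min_queries in the window.

    Endpoints with zero indexes are included too, since they serve nothing.
    """
    return [e for e in endpoints if e.queries_last_30d < min_queries]


endpoints = [
    EndpointUsage("docs-search", "search-team", 3, 12000),
    EndpointUsage("poc-demo", "unknown", 1, 0),
    EndpointUsage("empty-endpoint", "platform", 0, 0),
]

idle = find_idle_endpoints(endpoints)
for e in idle:
    print(f"review: {e.name} (owner={e.owner}, indexes={e.index_count})")
```

The output is a review list, not a delete list. An owner should confirm an endpoint is unused before it is removed.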
Databricks also recommends sizing compute based on workload needs, including data consumed, computational complexity, source location, partitioning, and required parallelism. For machine learning workloads, Databricks calls out GPU instances sized by model and data volume. The guidance is clear, but it also shows why AI cost control cannot be left to platform totals.
The cost exposure is a series of small decisions:
- A prototype stays on interactive compute
- An embedding job refreshes too often
- A vector endpoint gets forgotten
- A serving endpoint stays warm for a feature with low usage
No villain is necessary, only a meter that has gone unnoticed.
Snowflake Cost Drivers
Snowflake uses credits for compute. Virtual warehouses use credits when they run queries, load data, or handle other data operations. Snowflake also uses serverless compute for features that Snowflake manages directly, instead of using a warehouse you manage yourself.
AI adds more ways to consume credits. A Snowflake AI workload can involve warehouse compute, Cortex AI usage, Cortex Search indexing, embedding refreshes, serving compute, storage, data sharing, and cloud services activity. A data warehouse that once supported analytics may now support retrieval, semantic search, product features, and customer-facing AI workflows.
Case Study: A B2B fintech company connected cloud, SaaS, Snowflake, MongoDB, AI, and agentic workload costs into a cost-to-serve model. The result was $1.75M in savings over 20 months and a cost per payment of $0.08, giving finance and product teams a number they could use for pricing, margin management, and investment decisions.
| Cost Driver | Simple Explanation | Why It’s Expensive | What to Watch |
|---|---|---|---|
| Warehouse credits | Credits used by Snowflake warehouses | AI pipelines still need data prep, joins, filtering, and post-processing, often at high concurrency | Warehouse size, runtime, concurrency |
| Serverless compute | Snowflake-managed compute for managed services | AI services may run outside the warehouse model teams are used to tracking | Serverless usage by service |
| Cortex AI | Snowflake AI features that use models, tools, messages, or tokens | Usage can scale with prompt length, output length, users, and agents | Token usage, tools, messages, feature usage |
| Cortex Search | Snowflake’s search service for AI and semantic search | It includes embedding, indexing, serving, and storage costs | Indexed data size, refresh behavior, service status |
| Embedding refreshes | Updating vector representations when data changes | Frequent small updates can cost more than batched changes | Target lag, change rate, primary keys |
| Serving compute | Keeping search services available for low-latency queries | Costs can continue while the service is available, even during quiet periods | Running services with low or no query traffic |
| Storage | Storing search tables, indexes, embeddings, and outputs | AI creates derived data that lives beyond the first job | Indexed data, materialized tables, retained outputs |
Snowflake also notes that changing the schema of the source query can trigger a full refresh of embeddings and indexes, and recommends bundling changes together since each update carries a fixed cost component. Snowflake provides usage views across Cortex features to give finance and engineering a shared reporting foundation. What those views do not answer is who owns the usage.
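The effect of a fixed cost per update can be shown with simple arithmetic. This sketch assumes a two-part cost model (a fixed charge per refresh plus a per-row charge). The rates are made-up numbers for illustration, not Snowflake pricing:

```python
def refresh_cost(refreshes, total_changed_rows, fixed_cost_per_refresh, cost_per_row):
    """Cost of applying the same row changes spread over N refreshes.

    Assumed model: every refresh pays a fixed charge, and every changed row
    pays a per-row charge regardless of how the changes are grouped.
    """
    return refreshes * fixed_cost_per_refresh + total_changed_rows * cost_per_row


# Hypothetical rates in credits. Real values depend on the service and account.
FIXED = 0.5
PER_ROW = 0.001
ROWS_PER_DAY = 10_000

hourly = refresh_cost(24, ROWS_PER_DAY, FIXED, PER_ROW)  # 24 small refreshes
daily = refresh_cost(1, ROWS_PER_DAY, FIXED, PER_ROW)    # 1 bundled refresh

print(f"hourly refreshes: {hourly:.2f} credits/day")
print(f"daily refresh:    {daily:.2f} credits/day")
```

The per-row portion is identical in both cases. Only the fixed component grows with refresh count, which is why bundling helps under this model, at the price of staler search results.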
The Cost Is in the Repetition
The easiest way to explain AI cost growth is this: AI makes the platform do the same expensive work again and again. The default state of an AI workload is inefficiency. That changes when the operating model is designed to account for how these workloads run.
TechTarget’s analysis of Nvidia’s $78B quarter points to the same market pattern. Enterprise AI is becoming an infrastructure buildout. The costs do not stop at the model. They move through GPUs, networking, data platforms, storage, orchestration, and serving systems.
What AI Adds to the Bill
| AI Operation | What Is Happening | Cost Behavior |
|---|---|---|
| Embedding generation | Text is converted into vectors | Scales with data volume and refresh frequency |
| Vector search | The system searches for similar or relevant records | Scales with index size, endpoint setup, and query demand |
| LLM inference | A model reads prompts and writes responses | Scales with prompt size, output length, and request volume |
| Fine-tuning | A model is customized with company data | Adds training, evaluation, storage, and deployment costs |
| Model serving | A model is kept available for applications | Creates capacity and idle-time considerations |
| Agent workflows | AI systems plan, call tools, retry, and branch | Adds variable and harder-to-predict cost paths |
| Logs and traces | The system records what happened | Adds storage and observability costs |
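The LLM inference row can be made concrete with a rough estimator. This is a sketch under the common assumption that input and output tokens are priced separately per thousand tokens. The prices and volumes below are hypothetical:

```python
def inference_cost(requests, input_tokens, output_tokens,
                   input_price_per_1k, output_price_per_1k):
    """Estimate spend for a batch of similar requests.

    Assumes per-1k-token pricing with separate input and output rates.
    """
    per_request = ((input_tokens / 1000) * input_price_per_1k
                   + (output_tokens / 1000) * output_price_per_1k)
    return requests * per_request


# Hypothetical prices per 1k tokens and a hypothetical monthly volume.
IN_PRICE, OUT_PRICE = 0.002, 0.006
MONTHLY_REQUESTS = 100_000

baseline = inference_cost(MONTHLY_REQUESTS, 2000, 500, IN_PRICE, OUT_PRICE)
longer_prompt = inference_cost(MONTHLY_REQUESTS, 4000, 500, IN_PRICE, OUT_PRICE)

print(f"baseline:      {baseline:,.2f}")
print(f"longer prompt: {longer_prompt:,.2f}")
```

Doubling the retrieved context in the prompt raises spend with no change in request volume, which is the kind of shift a platform total hides.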
Think about it: The model is the performer on stage. The data platform is the venue, lighting crew, sound system, ticket scanner, security team, and cleanup crew. The invoice includes more than the singer.
Batch vs. Serving: The Core AI Cost Divide
One of the fastest ways to control AI costs is to separate batch workloads from serving workloads.
Batch workloads run on a schedule. This includes embedding refreshes, offline scoring, summarization, data enrichment, and model evaluation. These are easier to plan, tune, and shut down when the work is done.
Serving workloads stay available to answer users or applications in real time. This includes AI assistants, semantic search, copilots, fraud workflows, and customer-facing AI features. These workloads need low latency, uptime, and capacity, which usually makes them more expensive to run.
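A rough comparison shows why the split matters. The sketch below assumes both workloads run on compute billed at the same hourly rate. The rate, runtime, and request counts are hypothetical:

```python
HOURS_PER_DAY = 24


def daily_batch_cost(hourly_rate, runtime_hours):
    """Batch compute is billed only while the job runs."""
    return hourly_rate * runtime_hours


def daily_serving_cost(hourly_rate, hours_available=HOURS_PER_DAY):
    """Always-on serving is billed for every hour it stays available."""
    return hourly_rate * hours_available


def cost_per_request(daily_cost, daily_requests):
    """Spread a day's serving cost over the requests actually served."""
    if daily_requests <= 0:
        return float("inf")  # capacity was paid for but nothing was served
    return daily_cost / daily_requests


RATE = 4.0  # hypothetical cost per compute hour

batch = daily_batch_cost(RATE, runtime_hours=2)
serving = daily_serving_cost(RATE)

print(f"batch job:        {batch:.2f} per day")
print(f"serving endpoint: {serving:.2f} per day")
print(f"per request at 200/day:    {cost_per_request(serving, 200):.4f}")
print(f"per request at 20,000/day: {cost_per_request(serving, 20_000):.4f}")
```

The serving endpoint costs the same whether it handles 200 or 20,000 requests, so low-traffic features carry a much higher cost per request.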
What To Do About The Cost Drivers
Databricks
- All-purpose compute used for production jobs → Move repeatable workloads to Jobs compute instead of interactive clusters.
- Embedding jobs → Batch embeddings, reduce unnecessary refresh cycles, and track cost per embedded document.
- Vector Search endpoints → Review unused endpoints and stale indexes; avoid paying for idle serving capacity.
- Model serving → Validate traffic patterns and GPU utilization before keeping endpoints always-on.
Snowflake
- Warehouse credits → Right-size warehouses, enable auto-suspend, and separate AI prep from analytics workloads.
- Cortex AI usage → Track usage by model, feature, and token volume to detect spikes from prompts or agents.
- Cortex Search → Separate embedding, indexing, and serving costs for clearer attribution.
- Embedding refreshes → Batch source-data changes; frequent updates can trigger expensive re-embedding cycles.
Key Insight: Break spend into workload types first, then decide which costs support value and which ones are drifting.
Build the FinOps Control Model
AI cost management does not need to start with a large transformation project. It can start with a simple operating model.
- Classify each workload. Is it batch, serving, search, inference, training, or experimentation?
- Assign an owner. Every warehouse, job, endpoint, index, and search service should belong to a team that understands why it exists.
- Track a unit cost. Platform totals help with reconciliation, but business decisions need metrics like cost per embedded document, cost per retrieval, cost per model response, cost per agent run, or cost per customer interaction. The FinOps Foundation’s work on cloud unit economics is a useful reference for this approach.
- Review costs when workload behavior changes. A new model, larger prompt, schema change, serving endpoint, search index, or refresh schedule can all change the cost profile.
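The steps above can be sketched as a small workload registry. Everything here is an assumption for illustration: the `Workload` record, its fields, and the sample entries are hypothetical, and the classes simply mirror the categories named in the first step:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class WorkloadClass(Enum):
    BATCH = "batch"
    SERVING = "serving"
    SEARCH = "search"
    INFERENCE = "inference"
    TRAINING = "training"
    EXPERIMENTATION = "experimentation"


@dataclass
class Workload:
    """Hypothetical registry entry: class, owner, and unit-cost inputs."""
    name: str
    workload_class: WorkloadClass
    owner: Optional[str]
    monthly_cost: float
    monthly_units: int
    unit_name: str

    def unit_cost(self):
        """Cost per business unit, or None when no units were produced."""
        if self.monthly_units <= 0:
            return None
        return self.monthly_cost / self.monthly_units


def unowned(workloads):
    """Names of workloads that still need an owner assigned."""
    return [w.name for w in workloads if not w.owner]


registry = [
    Workload("kb-embeddings", WorkloadClass.BATCH, "search-team",
             1200.0, 400_000, "embedded document"),
    Workload("support-assistant", WorkloadClass.SERVING, "assistant-team",
             9000.0, 60_000, "model response"),
    Workload("old-poc-index", WorkloadClass.SEARCH, None,
             300.0, 0, "retrieval"),
]

print("needs an owner:", unowned(registry))
for w in registry:
    cost = w.unit_cost()
    label = "n/a (no usage)" if cost is None else f"{cost:.4f}"
    print(f"{w.name}: cost per {w.unit_name} = {label}")
```

A workload with spend but no owner or no units, like the third entry, is usually the first candidate for review.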
Final Note
AI workloads make Databricks and Snowflake costs harder to manage because they add more compute, serving, search, storage, token, and refresh activity to data platforms that were already complex.
The path is straightforward:
- Classify the workload
- Assign ownership
- Track unit costs
- Connect spend back to the product, customer, team, or workflow it supports
The same pattern holds more broadly. As hybrid AI infrastructure moves from owned data centers to consumption-based platforms like Databricks and Snowflake, every layer adds cost complexity.
How Mavvrik Approaches AI Cost Drivers
Mavvrik treats AI cost drivers as an attribution challenge across Databricks, Snowflake, cloud, SaaS, GPU, LLM, and agentic workloads. The platform brings these signals into a single cost model so teams can understand where spend is increasing, who owns it, and how it connects to products, customers, features, workflows, or agents.
For Databricks, Mavvrik tracks costs by workspace, cluster, service type, and time. For Snowflake, Mavvrik tracks costs by account, service, region, and trend. These signals can then support showback, chargeback, cost-to-serve, forecasting, anomaly detection, and business-level allocation.
Three ways to continue from here:
- Get visibility into AI data platform costs. Start by identifying which workloads are driving Databricks and Snowflake spend across compute, storage, indexing, serving, and refresh activity.
- Connect spend to business ownership. Map DBUs, credits, and AI-related usage to the teams, products, customers, tenants, features, workflows, or agents that generated them.
- See how Mavvrik supports this in practice. Explore Mavvrik’s AI Cost Governance platform, take the product tour, or book a demo to see how workload-level tracking and cost-to-serve visibility apply to your environment.
FAQs
Why do AI workloads increase Databricks costs?
AI workloads increase Databricks costs because they use compute for more than analytics. Embedding jobs, model serving, Vector Search, fine-tuning, batch inference, and storage all consume resources. Costs rise further when production workloads keep using compute patterns that were originally meant for experimentation.
Why do AI workloads increase Snowflake costs?
AI workloads increase Snowflake costs because they add Cortex AI usage, Cortex Search, embedding refreshes, serverless compute, storage, and serving costs on top of warehouse credits. A search or AI assistant workflow can create cost even when it does not look like a traditional SQL workload.
Are tokens the biggest AI cost driver?
Tokens are important, but they are not always the largest surprise. The State of AI Cost Governance report found that data platform usage was cited more often as an unexpected AI cost than LLM token costs.
What is the simplest way to control these costs?
Start by separating batch and serving workloads, assigning owners, and tracking cost-to-serve. From there, teams can tune warehouse size, compute type, refresh frequency, endpoint usage, and search service availability.
How should finance measure AI costs in Databricks and Snowflake?
Finance should measure cost by business unit, product, feature, customer, tenant, workload, and agent. Platform totals are useful for reconciliation, but cost-to-serve is the metric that supports pricing, margin management, showback, and chargeback.

