What is a data platform? (And do you need one?)

A primer for non-technical decision-makers: what a data platform actually is, the four layers it consists of, when you need one and when a spreadsheet beats a Snowflake bill.

18 Mar 2026 · 9 min read · Productized Team

"Data platform" is one of those phrases that sounds technical enough that nobody asks what you mean by it. Vendors love it. So do consultants. As a result, every mid-market company in 2026 has been pitched a data platform — usually built on Snowflake, Databricks or some custom assembly — without a clear picture of what they'd actually get. This article is for non-technical decision-makers who want to know what a data platform is, what it isn't, and whether they need one.

We write this as a software vendor that builds data platforms in production for mid-market companies. We've shipped platforms that genuinely changed how teams work. We've also walked away from data platform projects where a tidy set of dashboards on top of the existing systems would have done 90% of the job at a tenth of the cost.

What a data platform actually is

A data platform is a layered system that collects data from multiple source systems, cleans and combines it, and serves it to the people and applications that need it — reliably, repeatably, and with someone responsible when it breaks.

Critically, it is not:

  • A fancy spreadsheet. A spreadsheet is one person's view; a data platform is the company's shared truth.
  • Just a dashboard. A dashboard is the visible tip; the platform is everything underneath that makes the numbers correct.
  • A database with extra steps. A database stores transactional data; a platform stores analytical, modelled, cross-source data.
  • A magic AI feature. A platform is plumbing — boring, important, and the prerequisite for serious AI work, not a substitute for it.

The four layers in practice

Every serious data platform has the same four layers. Different vendors put them in different boxes, but the responsibilities don't change.

1. Extract & load

Pull data from source systems — your CRM, ERP, finance system, marketing tools, log files, IoT devices — and land it in a central store. Tools: Fivetran, Airbyte, custom connectors, or platform-native loaders. Boring but critical: 60% of data platform pain comes from broken extracts.
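
As a rough illustration of what "extract and load" means, the sketch below lands rows from two source exports in raw tables, unchanged apart from a load timestamp. SQLite stands in for the central store, and the source names, columns and the `load_raw` helper are hypothetical; in a real platform a tool such as Fivetran or Airbyte does this job.

```python
import csv
import io
import sqlite3
from datetime import datetime, timezone

# Hypothetical exports from two source systems (normally files or API responses).
CRM_EXPORT = "customer_id,name\n1,Acme BV\n2,Globex NV\n"
FINANCE_EXPORT = "invoice_id,customer_id,amount_eur\n10,1,1200\n11,1,800\n12,2,450\n"


def load_raw(conn, table, csv_text):
    """Land a non-empty CSV export as-is in a raw table, adding a load timestamp."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys())
    loaded_at = datetime.now(timezone.utc).isoformat()
    col_defs = ", ".join(f"{c} TEXT" for c in columns)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_defs}, loaded_at TEXT)")
    placeholders = ", ".join("?" for _ in range(len(columns) + 1))
    conn.executemany(
        f"INSERT INTO {table} VALUES ({placeholders})",
        [[row[c] for c in columns] + [loaded_at] for row in rows],
    )
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for the central store
crm_rows = load_raw(conn, "raw_crm_customers", CRM_EXPORT)
finance_rows = load_raw(conn, "raw_finance_invoices", FINANCE_EXPORT)
print(crm_rows, finance_rows)
```

Nothing is cleaned or joined at this stage on purpose: the raw tables stay a faithful copy of the source, so a broken extract is easy to spot and re-run.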

2. Transform & model

Raw data is rarely useful. Transformation cleans it, joins related sources, applies business rules and produces "models" — well-named, documented tables that match how the business thinks. Tools: dbt is the de facto standard. Without modelling, you're stuck with raw data and a thousand contradictory dashboards.
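
To make "model" less abstract, here is a minimal sketch of one. It uses plain SQL in SQLite rather than dbt, and the table names, columns and the business rule (only paid invoices count as revenue) are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical raw tables, as they might arrive from a CRM and a finance system.
conn.executescript("""
CREATE TABLE raw_crm_customers (customer_id TEXT, name TEXT);
CREATE TABLE raw_finance_invoices (
  invoice_id TEXT, customer_id TEXT, amount_eur TEXT, status TEXT
);
INSERT INTO raw_crm_customers VALUES ('1', ' Acme BV '), ('2', 'Globex NV');
INSERT INTO raw_finance_invoices VALUES
  ('10', '1', '1200', 'paid'),
  ('11', '1', '800', 'paid'),
  ('12', '2', '450', 'cancelled');
""")

# The model: cleaned names, typed amounts, one business rule
# (assumed here: only paid invoices count as revenue), one row per customer.
conn.execute("""
CREATE VIEW customer_revenue AS
SELECT
  c.customer_id,
  TRIM(c.name) AS customer_name,
  COALESCE(SUM(CAST(i.amount_eur AS INTEGER)), 0) AS revenue_eur
FROM raw_crm_customers AS c
LEFT JOIN raw_finance_invoices AS i
  ON i.customer_id = c.customer_id AND i.status = 'paid'
GROUP BY c.customer_id, c.name
""")

model_rows = conn.execute(
    "SELECT customer_name, revenue_eur FROM customer_revenue ORDER BY customer_id"
).fetchall()
print(model_rows)
```

The point is that the definition of "revenue" lives in one documented place. Every dashboard that reads `customer_revenue` gets the same answer, instead of each one re-implementing the rule slightly differently.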

3. Serve & visualise

Make the modelled data available to humans (dashboards, reports) and applications (APIs, AI agents, downstream systems). Tools: Metabase, Looker, Power BI, embedded analytics. This is the layer most stakeholders see; it's also the easiest to over-invest in before the layers below are right.

4. Govern & monitor

Lineage (where does this number come from?), access control (who can see what?), monitoring (did the pipeline run last night?), retention (what do we keep, for how long?), and ownership (who owns this dataset?). Often skipped in the first version, often the reason version two is needed.
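
Much of this layer comes down to keeping a record per dataset and checking it for gaps. The sketch below shows one possible shape for that record; the `DatasetEntry` fields, the `governance_gaps` helper and the example entries are hypothetical, not a reference to any particular catalog tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetEntry:
    name: str
    owner: Optional[str]           # a named person or team
    retention_days: Optional[int]  # how long rows are kept
    allowed_roles: tuple           # who may read the dataset
    upstream: tuple                # lineage: datasets this one is built from


def governance_gaps(catalog):
    """Return readable gaps: missing owner, retention rule or access rule."""
    gaps = []
    for entry in catalog:
        if not entry.owner:
            gaps.append(f"{entry.name}: no owner")
        if entry.retention_days is None:
            gaps.append(f"{entry.name}: no retention rule")
        if not entry.allowed_roles:
            gaps.append(f"{entry.name}: no access rule")
    return gaps


catalog = [
    DatasetEntry("raw_crm_customers", "sales-ops", 730, ("analyst",), ()),
    DatasetEntry(
        "customer_revenue", None, None, ("analyst", "finance"),
        ("raw_crm_customers", "raw_finance_invoices"),
    ),
]
gaps = governance_gaps(catalog)
print(gaps)
```

A check like this can run alongside the pipeline, so a dataset without an owner or a retention rule is flagged when it is created rather than discovered during an audit.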

When you need a data platform

Four signals that point to a real platform need:

| Signal | What it looks like |
| --- | --- |
| Multiple data sources you can't combine | CRM says one thing, finance says another, marketing has its own numbers; nobody trusts the dashboards |
| AI ambitions | You want to build RAG, agents, or analytics, and they need clean, modelled data |
| Compliance and audit needs | Regulators or auditors want lineage, retention, access control |
| Cross-team analytics | Multiple teams need shared metrics with one definition of "customer", "order", "revenue" |

When you don't need one

If you have one source system, occasional reports, and a small team, you probably don't need a data platform. A weekly export to a spreadsheet, or a few SQL queries against your operational database, will get you 90% of the way at 1% of the cost. Skip the platform. Build it later, when the pain is real.

We've talked plenty of clients out of data platform projects. The signs that you don't need one yet:

  • Your reporting fits in one tool (Stripe dashboard, Shopify reports, HubSpot dashboards). Adding a platform layer just creates a copy that drifts.
  • Your team is small (under ~30 people). The shared truth fits in a few spreadsheets and weekly Slack updates.
  • You don't have AI or analytics ambitions yet. Building a platform "for the future" usually means building the wrong thing — wait until the use case is real.
  • Your data is genuinely simple. One ERP, one CRM, no behavioural data, no IoT, no log streams.

Build vs buy: the real choices

Once you've decided you do need a platform, there are three meaningful architectural options:

Snowflake (or BigQuery, Redshift) + dbt + Metabase

Cloud data warehouse, open-source modelling layer, off-the-shelf BI tool. Our default for mid-market companies in 2026. Predictable cost, broad ecosystem, easy to hire for. Boring, in the good way.

Lakehouse (Databricks, Iceberg)

Worth considering when data volumes are large enough that warehouse pricing hurts, or when you have heavy ML/AI workloads. Genuinely useful at scale; usually overkill below tens of terabytes. Don't pick this because it sounds modern.

Custom on cloud primitives

Postgres + an orchestrator + custom code. Cheapest at very small scale, painful as it grows. We sometimes use this when a client has unusual privacy requirements or an existing footprint that constrains tool choices. Rarely the right default.

What makes a data platform "production"

The difference between a demo platform and a production one is what happens at 03:00 on a Tuesday when something breaks. A production platform has:

  • Lineage: you can trace every metric back to its source columns and the transformations applied.
  • Retention rules: what's kept, what's deleted, when, why — documented and enforced.
  • Monitoring: pipeline failures alert someone within minutes, not when a stakeholder notices the dashboard is stale.
  • Documented ownership: every dataset has an owner. "It's the data team's" is not an answer.
  • A handover plan: someone outside the original build team can maintain it. If only the original vendor can debug it, you don't have a platform — you have a dependency.

Eval discipline applies here too: a platform without monitoring is a platform that's broken half the time without anyone knowing. Build the monitoring before you build the dashboards.
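
The simplest useful monitor is a freshness check: for each dataset, compare the time of the last successful load with an agreed maximum age. The sketch below shows the idea; the `stale_datasets` helper, the dataset names and the 24-hour threshold are assumptions, and the alert is just a printed line where a real setup would notify the dataset's owner.

```python
from datetime import datetime, timedelta, timezone


def stale_datasets(last_loaded, max_age, now):
    """Return names of datasets whose latest successful load is older than max_age."""
    return sorted(
        name for name, loaded_at in last_loaded.items()
        if now - loaded_at > max_age
    )


now = datetime(2026, 3, 17, 3, 0, tzinfo=timezone.utc)  # 03:00 on a Tuesday
last_loaded = {
    "raw_crm_customers": now - timedelta(hours=2),
    "raw_finance_invoices": now - timedelta(hours=30),  # last night's run failed
}
stale = stale_datasets(last_loaded, max_age=timedelta(hours=24), now=now)
if stale:
    # In production this would page or message the dataset owner.
    print("ALERT: stale datasets:", ", ".join(stale))
```

Run on a schedule, a check like this means the team hears about a failed load from the monitor, not from a stakeholder looking at a stale dashboard.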

Cost ranges

Realistic ranges for mid-market data platform builds in 2026:

  • Light platform — 2–3 sources, basic dbt models, one BI tool: €30K–€60K, 6–10 weeks. Plus around €500–€1.5K/month in cloud and tooling.
  • Production platform — 5–10 sources, monitoring, lineage, multiple data marts: €80K–€150K, 12–20 weeks. Plus €1.5K–€5K/month in cloud and tooling.
  • Strategic platform — bespoke modelling, data products, ML serving, advanced governance: €150K–€250K+, 4–8 months. Plus €5K+/month in cloud.

Plus 15–25% per year in maintenance — sources change, business rules change, retention rules change, models drift. A platform without ongoing investment becomes a liability within 18 months.
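
To see what these ranges add up to, the sketch below totals build, running and maintenance costs over several years. It assumes the 15–25% maintenance figure is a yearly percentage of the build cost and that it applies every year; both are our reading for the sake of the arithmetic, and `total_cost_range` is a hypothetical helper.

```python
def total_cost_range(build_eur, monthly_eur, maintenance_pct, years):
    """Return (low, high) total cost in euros over `years`.

    Each argument except `years` is a (low, high) pair. Maintenance is assumed
    to be a yearly percentage of the build cost, charged every year.
    """
    totals = []
    for build, monthly, pct in zip(build_eur, monthly_eur, maintenance_pct):
        running = monthly * 12 * years
        maintenance = build * pct // 100 * years
        totals.append(build + running + maintenance)
    return tuple(totals)


# "Production platform" tier from the list above, over three years.
low, high = total_cost_range(
    build_eur=(80_000, 150_000),
    monthly_eur=(1_500, 5_000),
    maintenance_pct=(15, 25),
    years=3,
)
print(f"€{low:,} to €{high:,}")
```

Under these assumptions the three-year total for a production platform lands between roughly €170K and €443K. At both ends of the range, running and maintenance costs together exceed the initial build, which is why the build quote alone is a poor basis for a budget.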

An example: Energy Data Xchange

We built Energy Data Xchange as a data platform for the Dutch energy sector — combining grid data, ESDL energy models, and forecasting outputs into a shared, governed source of truth. Multiple sources, regulatory constraints, AI workloads on top, and several stakeholder organisations needing one trustworthy view. That's the canonical case for a data platform. A spreadsheet would have died in week one.

How we work

We build data platforms for mid-market companies. We start with a 1–2 week discovery to confirm the four signals are present and to size the right shape. We're equally happy telling a client they don't need a platform yet. More about our approach is on our service page for data.

Suspect you do (or don't) need a data platform? Describe your situation in a few sentences via our contact form — we'll respond within one working day with an honest read.
