Most companies don’t know what data they have.
Even fewer know what data they need next.

Data is everywhere. In tools, in spreadsheets, in systems that don’t talk to each other.
People work around it. They guess. They rebuild things.

The problem is not lack of data.
The problem is that no one can see it clearly.

The real gap

Data exists in most organisations.
But it is hard to find and harder to trust.

Teams depend on memory.
“Someone in finance might have that.”
“Check with ops, they built something last year.”

There is no shared view.
No clear direction.

That slows everything down.

What is a data catalog

A data catalog is a simple idea.

It is a place where you list what data you have.

Not just names.
But also:

  • what the data is
  • where it lives
  • who owns it
  • how to access it

It helps people stop guessing.

Instead of asking around, they can look things up.

Tools like Atlan and Amundsen are built for this.
But even a simple document is a good start.
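
Here is a minimal sketch of that idea in Python. It is not how Atlan or Amundsen work. It just shows the four fields from the list above as a record you can search. The dataset names, owners and locations are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    name: str         # short dataset name people search for
    description: str  # what the data is
    location: str     # where it lives
    owner: str        # who owns it
    access: str       # how to access it


# Hypothetical entries, for illustration only.
CATALOG = [
    CatalogEntry(
        name="invoices",
        description="Sales invoices exported from the finance system",
        location="finance system, monthly export",
        owner="finance team",
        access="ask the finance team for read access",
    ),
    CatalogEntry(
        name="customers",
        description="Customer accounts and contact details",
        location="CRM",
        owner="sales team",
        access="CRM login",
    ),
]


def search(catalog, term):
    """Return entries whose name or description mentions the term."""
    term = term.lower()
    return [
        e for e in catalog
        if term in e.name.lower() or term in e.description.lower()
    ]


for entry in search(CATALOG, "invoice"):
    print(f"{entry.name}: owned by {entry.owner}, lives in {entry.location}")
```

The same fields work just as well as columns in a spreadsheet. The point is that every dataset gets the same four answers.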

What is data lineage

Knowing what exists is not enough.

You also need to know where it came from.

That is data lineage.

It shows how data moves.
From source systems to reports.
From raw inputs to final numbers.

It helps answer simple but important questions.

What happened to it along the way?
Where did this data come from?
Why does it look like this?

Without lineage, trust is low.
With lineage, people can follow the path.
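
One way to picture this is as a small graph. The sketch below assumes each dataset simply lists the datasets it is built from, and walks back to the raw sources. The dataset names are hypothetical, and real lineage tools capture far more detail than this.

```python
# Each dataset maps to the datasets it is built from (hypothetical names).
UPSTREAM = {
    "revenue_report": ["monthly_revenue"],
    "monthly_revenue": ["invoices", "customers"],
    "invoices": [],   # raw input from a source system
    "customers": [],  # raw input from a source system
}


def trace_sources(dataset, upstream):
    """Return the set of raw source datasets that feed into `dataset`."""
    sources = set()
    seen = set()
    stack = [dataset]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited, avoids loops and repeated work
        seen.add(current)
        parents = upstream.get(current, [])
        if not parents:
            sources.add(current)  # nothing upstream, so this is a source
        stack.extend(parents)
    return sources


print(sorted(trace_sources("revenue_report", UPSTREAM)))
```

When a number looks wrong, this is the path people follow: from the report, back through each step, to the source.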

What is a data roadmap

A data roadmap is about direction.

It answers a different question.

What should we build next?

It helps teams:

  • prioritise important datasets
  • plan pipelines
  • align with business needs

Without a roadmap, data work becomes random.
Teams build what feels urgent, not what matters most.
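
A roadmap can start as a ranked list. The sketch below assumes a simple rule: score each candidate by business value divided by effort. That rule, the 1 to 5 scales and the candidate names are all assumptions for illustration, not a standard method.

```python
from dataclasses import dataclass


@dataclass
class RoadmapItem:
    name: str
    business_value: int  # 1 (low) to 5 (high), agreed with the business
    effort: int          # 1 (small) to 5 (large), estimated by the data team


def rank(items):
    """Order items by value per unit of effort, highest first."""
    return sorted(items, key=lambda i: i.business_value / i.effort, reverse=True)


# Hypothetical candidates, for illustration only.
CANDIDATES = [
    RoadmapItem("single revenue dataset", business_value=5, effort=2),
    RoadmapItem("product usage pipeline", business_value=3, effort=4),
    RoadmapItem("customer churn dashboard", business_value=4, effort=4),
]

for item in rank(CANDIDATES):
    print(f"{item.name}: value {item.business_value}, effort {item.effort}")
```

The scores matter less than the conversation. Writing them down forces the team to say why one thing comes before another.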

Why you need all three

Each part solves a different problem.

  • The catalog shows what exists
  • Lineage shows how it got there
  • The roadmap shows where to go next

If one is missing, the other two lose much of their value.

You might know what data you have, but not trust it.
You might trust it, but not know what to build next.

Together, they create clarity.

A simple example

Think of a mid-sized New Zealand company.

They use Xero for finance.
A CRM for customers.
Some product data in another system.

Now the problems start.

Two teams report different revenue numbers.
No one is sure which dataset is correct.
A new dashboard gets built from scratch again.

This is common.

Not because teams are bad.
Because there is no clear system.

No catalog.
No lineage.
No roadmap.

How to start small

You do not need a big setup.

Start simple.

List your key datasets.
Write down who owns them.
Add a short description.

Then map basic flows.
Where does this data come from?
Where does it go?

Finally, pick one or two priorities.
Focus on what will help the business most.

Keep it small. Keep it useful.

Why this matters

Clear data leads to better decisions.

People spend less time searching.
Less time fixing numbers.
More time using them.

It also lays the foundation for AI.

AI works best when data is clear and trusted.
Without that, it just adds noise faster.

The goal is not more data.

The goal is better understanding of the data you already have.


Written for KiwiGPT.co.nz — Generated, Published and Tinkered with AI by a Kiwi