A Quick Guide to Data Relationship Diagrams
Before reading this page, read about Data Catalogue Prototyping. It’ll give you a preliminary understanding of how data relationships correspond to data flows and data lineages, and how they all come together in a prototype data catalogue.
What Is a Data Relationship Diagram?
A data relationship diagram is a visual representation of the data links that exist between the various systems and processes in the business. For example, we may know or see that data moves between our website and CRM system. That’s a data relationship.
If you’re non-technical, think of a data relationship diagram as a map of all the ‘data pipes’ in the business. If you’re in a technical role or have a technical background, then you might prefer to think of a data relationship diagram as an enterprise conceptual data model.
Why Do We Need a Data Relationship Diagram?
A data relationship diagram is a visual tool which reveals the ‘anatomy’ of our data infrastructure, to enhance our understanding of how and why data moves through our business. A data relationship diagram can:
- Inform decisions about changes or additions to our data infrastructure.
- Highlight potential opportunities for improving our data infrastructure.
- Reveal potential issues such as duplicated or conflicting data integrations.
- Support data discovery, by guiding investigations into where to look for particular data.
- Support data literacy, by providing contextual awareness of the source and purpose of data.
- Support data compliance, by triggering investigation into poorly governed branches of data infrastructure.
One way of understanding the role of a data relationship diagram is to think of it as a metaphorical ‘MRI scan’ of the business. Prior to operating on a patient, a surgeon will use the visibility provided by an MRI to decide and plan a course of action. Similarly (but thankfully not literally!) a data relationship diagram shows where data is in a business and how everything is connected, so that the business can ‘operate’ on its data with clarity and confidence, not trial and error.
How Do I Use It?
As soon as you look at a data relationship diagram, you’ll see that it’s formed of shapes and lines.
Each shape in the data relationship diagram represents a data ‘component’. A data component is anything that interacts with data in any way, be that storing, transacting, creating, reading, updating or deleting data. A data component is typically an IT system, but it can equally be a process, a team, or even a person. Quite simply, if we see that any data moves to or from a particular person, place or thing, then we record it as a data component.
Each line in the data relationship diagram is a data relationship. A data relationship is the movement of any type or volume of data, at any frequency and by any method, between two components in either direction. So a data relationship doesn’t specify what data is moving, why, how or when. It solely indicates that some data moves for some reason. It’s this non-specific nature of data relationship analysis that allows us to build a high-level picture without getting consumed by too much detail. Each data relationship is labelled with a number which corresponds to an entry in the prototype data catalogue. It’s from there that we are able to dig into and record details about the data flows inside each data relationship.
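If you’re in a technical role, it may help to see the shapes and lines described above as a small data structure. The Python below is only a rough sketch under our own assumptions, not a prescribed implementation: the names Component, Relationship, Diagram, relate and neighbours are invented for illustration, as is the ‘Sales team’ component and the empty catalogue stub. It treats each relationship as an unordered pair of components, so direction, content, method and frequency are deliberately left out, and it numbers each relationship so the number can point at a catalogue entry.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    """A shape on the diagram: anything that interacts with data."""
    name: str
    kind: str = "system"  # hypothetical labels: system, process, team, person


@dataclass(frozen=True)
class Relationship:
    """A line on the diagram. It records only that some data moves."""
    number: int        # label on the line; also the catalogue entry number
    ends: frozenset    # the two component names; direction is not recorded


@dataclass
class Diagram:
    components: dict = field(default_factory=dict)
    relationships: dict = field(default_factory=dict)

    def add_component(self, name, kind="system"):
        self.components[name] = Component(name, kind)

    def relate(self, a, b):
        """Record that data moves between a and b; return the line's number."""
        if a == b:
            raise ValueError("a relationship joins two different components")
        for name in (a, b):
            if name not in self.components:
                raise KeyError(f"unknown component: {name}")
        ends = frozenset((a, b))
        for rel in self.relationships.values():
            if rel.ends == ends:
                return rel.number  # already drawn, in either direction
        number = len(self.relationships) + 1
        self.relationships[number] = Relationship(number, ends)
        return number

    def neighbours(self, name):
        """Names of every component that shares a line with `name`."""
        found = set()
        for rel in self.relationships.values():
            if name in rel.ends:
                found |= rel.ends - {name}
        return found


diagram = Diagram()
diagram.add_component("Website")
diagram.add_component("CRM")
diagram.add_component("Sales team", kind="team")  # illustrative component

first = diagram.relate("Website", "CRM")
same = diagram.relate("CRM", "Website")   # same line, so same number
second = diagram.relate("CRM", "Sales team")

# Hypothetical catalogue stub: details of each data flow would be recorded
# against the relationship number, outside the diagram itself.
catalogue = {number: [] for number in diagram.relationships}
```

Because the two ends are stored as an unordered pair, relating the website to the CRM and the CRM to the website yields the same numbered line. That mirrors the idea that a data relationship covers movement in either direction, with the specifics held in the catalogue rather than on the diagram.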