How to Identify Your Business-Critical Data
Why you should identify your business-critical data
What data is business-critical
Identifying your business-critical dashboards
Importance based on business-critical use case
Importance based on dashboard usage
Importance based on dashboard C-suite usage
Identifying your business-critical data models
Data models with many downstream dependencies
Data models on the critical path
How to keep your critical data model definitions updated
Defining criticality labels
Where to define criticality
Defining criticality within the tool where you create the data asset
Defining criticality in a data catalog
Acting based on criticality
Summary

Practical steps to identify business-critical data models and dashboards and drive confidence in your data

Towards Data Science
Source: synq.io

This article was co-written with Lindsay Murphy

Not all data is created equal. If you work in a data team you know that if a certain dashboard breaks you drop everything and jump on it, whereas other issues can wait until the end of the week. There’s good reason for this. The former may mean that your entire company is missing data whereas the latter may not have any significant impact.

However, keeping track of all your business-critical data as you scale your team and grow the number of data models and dashboards can be difficult. This is why situations such as these occur:

“I had no idea finance was relying on this dashboard for their monthly audit report”

or

“What the heck, did our CEO bookmark this dashboard that I made in a rush as a one-off request six months ago?”

In this article we’ll look into

  • Why you should identify your critical data assets
  • How to identify critical dashboards and data models
  • Creating a culture of uptime for critical data

When you’ve mapped out your business-critical assets you can have an end-to-end overview across your stack that shows which data models or dashboards are business-critical, where they’re used, and what their latest status is.

This can be really useful in a number of different ways:

  • It can become an important piece of documentation that helps drive alignment across the business on the most important data assets
  • It breeds confidence in the data team to make changes and updates to existing models or features, without fear of breaking something critical downstream
  • It enables better decision making, speed, and prioritisation when issues arise
  • It gives your team permission to focus more of your energy on the highly critical assets, and let some less important things slide

Example of seeing important impacted data models and dashboards for an incident. Source: synq.io

In this article we’ll look at how to identify your business-critical data models and dashboards. You can apply most of the same principles to other types of data assets that may be critical to your business.

Data used for decision-making is important, and if data is incorrect it may lead to wrong decisions and, over time, a loss of trust in data. But data-forward businesses have data that is truly business-critical. If this data is wrong or stale you have a hair-on-fire moment and there’s an immediate business impact if you don’t fix it, such as…

  • Tens of thousands of customers may get the wrong email because the reverse ETL tool is reading from a stale data model
  • You’re reporting incorrect data to regulators and your C-suite can be held personally liable
  • Your forecasting model isn’t running and hundreds of employees in customer support can’t get their next shift schedules before the holidays
Source: synq.io

Mapping out these use cases requires you to have a deep understanding of how your company works, what’s most important to your stakeholders and what the potential implications of issues are.

Looker exposes metadata about content usage in pre-built Explores that you can enrich with your own data to make it more useful. In the following examples, we’ll be using Looker, but most modern BI tools enable usage-based reporting in some form (Lightdash also has built-in Usage Analytics, Tableau Cloud offers Admin Insights, and Mode’s Discovery Database offers access to usage data, just to name a few).

When you speak with your business leaders you can ask questions such as:

  • What are your top priorities for the next three months?
  • How do you measure success for your area?
  • What are the most critical issues you’ve had in the past year?

Your business leaders may not know that the reason average customer support response times jumped from two hours to 24 hours over Christmas was a forecasting error caused by stale upstream data, but they’ll describe the painful experience to you. If you can map out the most critical operations and workflows and understand how data is used, you’ll start uncovering the truly business-critical data.

The most obvious important dashboards are ones that everyone in the company uses. Most of these you may already be aware of, such as “Company wide KPIs”, “Product usage dashboards”, or “Customer support metrics”. But you’ll sometimes be surprised to discover that dozens of people are relying on dashboards you had no idea existed.

Source: synq.io

In most cases you should filter for recent usage so that you exclude dashboards that had a lot of users six months ago but no usage in the last month. There are exceptions to this, such as a quarterly OKR dashboard that’s only used every three months.
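As a rough illustration of the recency filter, here is a minimal Python sketch that counts distinct users per dashboard within a time window. The event fields (`dashboard`, `user`, `viewed_on`) and the 30-day default are assumptions made for this example, not the schema of Looker or any other BI tool.

```python
from datetime import date, timedelta


def recent_user_counts(usage_events, today, window_days=30):
    """Count distinct users per dashboard within the last `window_days` days."""
    cutoff = today - timedelta(days=window_days)
    users_by_dashboard = {}
    for event in usage_events:
        # Ignore views older than the cutoff so stale dashboards drop out.
        if event["viewed_on"] >= cutoff:
            users_by_dashboard.setdefault(event["dashboard"], set()).add(event["user"])
    return {name: len(users) for name, users in users_by_dashboard.items()}


# Hypothetical usage export; field names are illustrative only.
events = [
    {"dashboard": "Company wide KPIs", "user": "ana", "viewed_on": date(2024, 5, 28)},
    {"dashboard": "Company wide KPIs", "user": "ben", "viewed_on": date(2024, 5, 30)},
    {"dashboard": "Company wide KPIs", "user": "ana", "viewed_on": date(2024, 5, 31)},
    {"dashboard": "Old launch report", "user": "cara", "viewed_on": date(2023, 11, 2)},
]

counts = recent_user_counts(events, today=date(2024, 6, 1))
print(counts)
```

For dashboards with a known longer cadence, such as a quarterly OKR review, a wider `window_days` keeps them from being filtered out.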

Like it or not, if your CEO uses a dashboard regularly it’s important, even if there’s only a handful of other users. In the worst case scenario you realise that a member of the C-suite has been using a dashboard for months with incorrect data without you having any idea this dashboard existed.

“We discovered that our CEO was religiously looking at a daily email delivered with a report on revenue, but it was incorrectly filtered to include a specific segment, so it didn’t match the centralised company KPI dashboard.” (Canadian healthcare startup)

If you have an employee system of record, you may be able to easily get identifiers for people’s titles and enrich your usage data with this. If not, you can maintain a manual mapping of these and update it when the executive team changes.
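The enrichment step could look something like the sketch below, which joins usage events to a manually maintained title mapping and flags dashboards with at least one C-suite viewer. The set of titles, the mapping, and the event fields are all hypothetical.

```python
# Assumption: which titles count as C-suite is defined by you, not by the BI tool.
C_SUITE_TITLES = {"CEO", "CFO", "CMO", "COO", "CTO"}


def dashboards_with_c_suite_users(usage_events, title_by_user):
    """Map each dashboard to the set of C-suite titles that have viewed it."""
    flagged = {}
    for event in usage_events:
        title = title_by_user.get(event["user"])  # None if the user is unmapped
        if title in C_SUITE_TITLES:
            flagged.setdefault(event["dashboard"], set()).add(title)
    return flagged


# Hypothetical manual mapping, updated when the executive team changes.
title_by_user = {"dana": "CEO", "eli": "Analyst"}

usage = [
    {"dashboard": "Daily revenue report", "user": "dana"},
    {"dashboard": "Daily revenue report", "user": "eli"},
    {"dashboard": "Support metrics", "user": "eli"},
]

c_suite_dashboards = dashboards_with_c_suite_users(usage, title_by_user)
print(c_suite_dashboards)
```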

Source: synq.io

While usage by seniority is highly correlated with importance, your first priority should be mapping out the business-critical use cases. For example, a larger fintech company has a dashboard used by the Head of Regulatory Reporting to share critical information with regulators. The accuracy of this data may be of higher importance to your CEO than the dashboard they look at every day.

With many dbt projects exceeding hundreds or thousands of data models, it’s important to know which ones are business-critical so you know when you should prioritise a run or test failure, or build extra robust tests.

You likely have a set of data models where, if they break, everything else is delayed or impacted. These are typically models that everything else depends on, such as users, orders or transactions.

You may already know which ones these are. If not, you can use the manifest.json file that dbt produces as part of the artifacts at each invocation, and the depends_on property for each node, to loop through all your models and count the total number of models that depend on them.

In most cases you’ll find a handful of models with disproportionately many dependencies. These should be marked as critical.
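A minimal sketch of that count is shown below. It assumes the usual manifest layout, where each entry under `nodes` has a `resource_type` and a `depends_on.nodes` list, and it counts downstream models transitively. The tiny inline manifest is made up for illustration; in a real project you would load the file with `json.load`, typically from `target/manifest.json`.

```python
from collections import defaultdict


def downstream_model_counts(manifest):
    """Count, for every model, how many models depend on it directly or indirectly."""
    nodes = manifest["nodes"]
    models = {uid for uid, node in nodes.items() if node.get("resource_type") == "model"}

    # Invert depends_on (child -> parents) into parent -> children.
    children = defaultdict(set)
    for uid in models:
        for parent in nodes[uid].get("depends_on", {}).get("nodes", []):
            if parent in models:
                children[parent].add(uid)

    counts = {}
    for uid in models:
        seen, stack = set(), [uid]
        while stack:
            for child in children[stack.pop()]:
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        counts[uid] = len(seen)
    return counts


# Hypothetical, heavily trimmed manifest with only the fields used above.
manifest = {
    "nodes": {
        "model.shop.users": {"resource_type": "model", "depends_on": {"nodes": []}},
        "model.shop.orders": {"resource_type": "model", "depends_on": {"nodes": ["model.shop.users"]}},
        "model.shop.fct_revenue": {"resource_type": "model", "depends_on": {"nodes": ["model.shop.orders"]}},
        "model.shop.dim_customers": {"resource_type": "model", "depends_on": {"nodes": ["model.shop.users"]}},
        "test.shop.not_null_users_id": {"resource_type": "test", "depends_on": {"nodes": ["model.shop.users"]}},
    }
}

counts = downstream_model_counts(manifest)
for uid, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(n, uid)
```

Sorting by the count makes the handful of heavily depended-on models stand out at the top of the list.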

Data models are rarely critical on their own; most often they are critical because of the importance of a downstream dependency, such as an important dashboard or a machine learning model used to serve recommendations to users on your website.

All data models upstream of a business-critical dashboard are on the critical path. Source: synq.io

Once you’ve gone through the hard work of identifying your business-critical downstream dependencies and use cases, you can use exposures in dbt to manually map these, or use a tool that automatically connects your lineage across tools.

Anything upstream of a critical asset should be marked as critical or as on the critical path.

Automate as much as possible around tagging your critical data models. For instance:

  • Use check-model-tags from the pre-commit dbt package to enforce that each data model has a criticality tag
  • Build a script, or use a tool, that automatically adds a critical-path tag to all models that are upstream of a business-critical asset
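The second bullet could be sketched roughly as follows: starting from the assets you have marked as business-critical (here a dbt exposure), walk the `depends_on` links upwards and collect every model you reach. The manifest below is a made-up, trimmed example, and how you then write the critical-path tag back to your project is left open.

```python
def critical_path_models(manifest, critical_ids):
    """Return all models upstream of the given critical nodes or exposures."""
    # Exposures live in their own section of the manifest, next to nodes.
    graph = {**manifest.get("nodes", {}), **manifest.get("exposures", {})}
    seen, stack = set(critical_ids), list(critical_ids)
    while stack:
        current = graph.get(stack.pop(), {})  # sources etc. are not in graph
        for parent in current.get("depends_on", {}).get("nodes", []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return {uid for uid in seen if uid.startswith("model.") and uid not in critical_ids}


# Hypothetical manifest: one critical dashboard exposure and four models.
manifest = {
    "nodes": {
        "model.shop.users": {"depends_on": {"nodes": []}},
        "model.shop.orders": {"depends_on": {"nodes": ["model.shop.users", "source.shop.raw_orders"]}},
        "model.shop.fct_revenue": {"depends_on": {"nodes": ["model.shop.orders"]}},
        "model.shop.dim_marketing": {"depends_on": {"nodes": []}},
    },
    "exposures": {
        "exposure.shop.revenue_dashboard": {"depends_on": {"nodes": ["model.shop.fct_revenue"]}},
    },
}

on_critical_path = critical_path_models(manifest, {"exposure.shop.revenue_dashboard"})
print(sorted(on_critical_path))
```

Running a script like this on a schedule or in CI keeps the critical-path set in step with the lineage as models are added or rewired.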

There’s no one right answer to how to define criticality, but you should ask yourself two questions

  1. What are your plans for how you treat critical data assets differently
  2. How do you maintain a consistent definition of what’s critical so that everyone is on the same page

Most companies use a tiered approach (e.g. bronze, silver, gold) or a binary approach (e.g. critical, non-critical). Both options can work and the best solution depends on your situation.

Source: synq.io

You should be consistent in how you define criticality, write the definitions up as part of your onboarding for new joiners, and avoid putting this off. For example, the definition of tiering could be

  • Tier 1: Data model used by a machine learning system to determine which users are allowed to sign up for your product
  • Tier 2: Dashboard used by the CMO for the weekly marketing review
  • Tier 3: Dashboard used by your product manager to track monthly product engagement

If you’re not consistently updating and tagging your assets it leads to a lack of trust and an assumption that you can’t rely on the definition.

There’s no one right place to define criticality, but it’s most commonly done either in the tool where you create the data asset, or in a data catalog, such as Secoda.

In dbt you can keep your criticality definitions in your .yml file alongside your data model definition. This has several benefits, such as being able to enforce criticality when merging a PR or easily carrying this information over to other tools such as a data catalog or observability tool

models:
  - name: fct_orders
    description: All orders
    meta:
      criticality: high

Example of defining criticality in a .yml file
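One way to enforce this when merging a PR is a small check over the compiled manifest that lists models without a valid `criticality` value. The sketch below is an assumption-laden illustration: the allowed values are invented (the example above only shows `high`), and it looks for `meta` both at the top level of a node and under `config`, since where it appears can depend on how it was declared.

```python
# Assumption: the set of allowed labels is something your team defines.
ALLOWED_CRITICALITY = {"high", "medium", "low"}


def models_missing_criticality(manifest):
    """Return the sorted ids of models without a valid meta.criticality value."""
    missing = []
    for uid, node in manifest["nodes"].items():
        if node.get("resource_type") != "model":
            continue
        meta = node.get("meta") or node.get("config", {}).get("meta") or {}
        if meta.get("criticality") not in ALLOWED_CRITICALITY:
            missing.append(uid)
    return sorted(missing)


# Hypothetical trimmed manifest entries.
manifest = {
    "nodes": {
        "model.shop.fct_orders": {"resource_type": "model", "meta": {"criticality": "high"}},
        "model.shop.stg_payments": {"resource_type": "model", "meta": {}},
        "model.shop.dim_users": {"resource_type": "model", "config": {"meta": {"criticality": "urgent"}}},
        "test.shop.unique_orders": {"resource_type": "test", "meta": {}},
    }
}

missing = models_missing_criticality(manifest)
print(missing)
```

A CI job could fail the build whenever the returned list is not empty.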

In BI tools, one option that makes it transparent to everyone is to label the title of a dashboard with e.g. “Tier 1” to indicate that it’s critical. This data can typically be extracted and used in other tools.
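If tiers are encoded in titles, extracting them can be as simple as the sketch below. The title convention (the word "Tier" followed by a digit from 1 to 3) is an assumption for this example; adjust the pattern to whatever convention your team settles on.

```python
import re

# Assumption: titles contain "Tier N" with N between 1 and 3, in any letter case.
TIER_PATTERN = re.compile(r"\bTier\s*([1-3])\b", re.IGNORECASE)


def tier_from_title(title):
    """Return the tier number found in a dashboard title, or None if absent."""
    match = TIER_PATTERN.search(title)
    return int(match.group(1)) if match else None


titles = ["[Tier 1] Company wide KPIs", "Weekly marketing review (tier 2)", "Ad hoc churn deep dive"]
tiers = {title: tier_from_title(title) for title in titles}
print(tiers)
```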

Source: synq.io

In a data catalog you can easily access all your company data and find answers to common questions by searching across your stack, which makes it easier to align on metrics and models

Tagging critical data. Source: secoda.co

Mapping your business-critical assets will only pay off if you act differently because of it. Here are some processes to build in quality by design.

Dashboards:

  • Tier 1 dashboards need a code reviewer before being pushed to production
  • Tier 1 dashboards should adhere to specific performance metrics around load time and have a consistent visual layout
  • Usage of Tier 1 dashboards should be monitored monthly by the owner

Data models:

  • Test or run failures on critical data models should be acted on within the same day
  • Issues on critical data models should be sent to PagerDuty (an on-call team member) so that they can be quickly actioned
  • Critical data models should have at least unique and not null tests as well as an owner defined
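The data model rules above can be expressed as a small routing function, sketched below. The handling of non-critical issues (a team channel, end of week) is an assumption added to make the contrast visible, and no real PagerDuty call is made.

```python
def route_issue(criticality):
    """Return a hypothetical handling policy for a failed run or test."""
    if criticality == "critical":
        # Mirrors the rules above: page the on-call person, act the same day.
        return {"notify": "pagerduty", "respond_by": "same day"}
    # Assumption: everything else goes to a shared channel and can wait.
    return {"notify": "team channel", "respond_by": "end of week"}


critical_policy = route_issue("critical")
other_policy = route_issue("non-critical")
print(critical_policy, other_policy)
```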

You can read more about how to act on data issues in our guide Designing severity levels for data issues

If you identify and map out your business-critical data assets you can act faster on issues that are important and be intentional about where you build high-quality data assets.

  • To identify dashboards that are business-critical, start by looking at your business use cases. Then consider usage data such as the number of users or whether anyone from the C-suite is using a dashboard
  • Data models that are business-critical often have many downstream dependencies and/or critical downstream dependencies
  • Define criticality, either directly in the tools where you create the data assets, or in a data catalog
  • Be explicit about how you act on issues in business-critical assets and put procedures in place for building quality by design
