Moving from Reactive to Proactive Data Observability

I spoke with the CTO of a unicorn data startup, who said, “We are really good at gathering data, but we are not the best at knowing what to do with it.”

Jan 07, 2025

This observation rings true for data platforms. After speaking with over 200 data teams, we found that scaling platforms face a common challenge: it is incredibly difficult to cut through the noise and prioritize the work that keeps the platform stable and costs under control. That holds whether you are untangling complex dbt models to fix a schema change or determining which of 15 Airflow pipelines needs immediate attention.

At enterprise scale, it's more like thousands of Airflow pipelines a day. The resulting brittleness is only made worse by the fact that teams are stuck in a reactive state. It is really hard to future-proof your data stack.

The first step teams take to solve this problem is to purchase a data observability tool, so that at least you know what is happening in your stack. At first, you pour time into setting up all the alerts, funnels and tables to monitor, and it's fantastic. But what typically happens as your dbt models and warehouse spending scale?

Notification Overload: Your Slack channel gets spammed, and you stop responding or caring as false positives appear.
Lack of Context: Alerts arrive without context. They provide no information on why a problem occurred, how to fix it, or how it impacts other parts of your stack.
Capacity Shortage: As you get more alerts, your backlog expands, your stress heightens, and you feel that you can't solve all the problems you have. Each alert means you will get distracted and spend 3 to 4 hours figuring out what went wrong and solving the issue, taking you away from delivering new objectives.

The rigid structure of current observability tools leaves little room for custom context or flexibility, which limits the real value they generate. Because teams have either struggled to implement these tools or implemented them and run into the pain points above, we hear this a lot:

"dbt is out of control."
"We don't know why our Airflow jobs keep failing."
"We are swamped with new objectives, so we can't dedicate time to refactoring our current setup."

These are all reactive responses to a challenging problem for teams. How do you know if something will break until it breaks? Often the root cause lies in the architecture, or in the fact that years of different engineers have done things in different ways.

This is why we are building Artemis: the next generation of observability, with context-relevant insights that drive automated resolution. We help you turn your dbt mess into a manageable model layer by moving your platform from reactive to proactive.

Surface Tasks, Not Alerts: The insights we surface leverage context from your stack to give you a task to solve, so you reach a solution faster instead of taking on more work.
Personalized Context: Our personalized knowledge graph provides deep context and root cause analysis to explain the issues, what caused them, how to fix them, and their impacts on downstream or upstream models and tables.
Auto-Resolve: Our action engine takes your tasks, makes the changes, updates pipelines, and refactors your models for you. All you have to do is review the PR and push it to production! You can be as little or as much involved as you want. Interested in setting and forgetting? You can do that! Want to be in the nitty gritty? That's also not a problem. You are always in control.
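
To make the idea of "tasks, not alerts" concrete, here is a minimal sketch of how several alerts might collapse into one task once lineage is taken into account. It is not Artemis's implementation: the `UPSTREAM` mapping, the model names, and both functions are hypothetical, and it assumes the lineage is an acyclic graph in which a failing model with a failing parent is a symptom rather than a cause.

```python
from collections import defaultdict

# Hypothetical lineage: each model maps to the models it reads from.
UPSTREAM = {
    "stg_orders": [],
    "stg_customers": [],
    "fct_orders": ["stg_orders", "stg_customers"],
    "dash_revenue": ["fct_orders"],
}


def failing_roots(model, failing, upstream):
    """Return the furthest-upstream failing models that `model` depends on.

    Assumes the lineage has no cycles.
    """
    failing_parents = [p for p in upstream.get(model, []) if p in failing]
    if not failing_parents:
        # No failing parent: this model is treated as a root cause.
        return {model}
    roots = set()
    for parent in failing_parents:
        roots |= failing_roots(parent, failing, upstream)
    return roots


def alerts_to_tasks(alerts, upstream):
    """Group alerting models under their root cause, one task per root."""
    failing = set(alerts)
    grouped = defaultdict(set)
    for model in failing:
        for root in failing_roots(model, failing, upstream):
            grouped[root].add(model)
    return [
        {"fix": root, "impacted": sorted(models - {root})}
        for root, models in sorted(grouped.items())
    ]


# Three alerts arrive, but they share a single upstream cause.
tasks = alerts_to_tasks(["dash_revenue", "fct_orders", "stg_orders"], UPSTREAM)
print(tasks)
```

In this toy example, three separate alerts become a single task: fix `stg_orders`, with `fct_orders` and `dash_revenue` listed as impacted downstream models.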

We give you total control, context, and solutions, so your Slack channel isn't annoying and Jira doesn't get bloated. Instead, Jira shows your team and managers how much you are getting done.
