The Bloated dbt Repo
As we shifted from ETL to ELT, transformation workloads became the main driver of data warehouse costs, and teams are more bloated than ever. How did it get this bad?
Costs of transformation workloads do add up quickly. Data teams consistently tell us that dbt-orchestrated workloads drive the vast majority of their data warehouse spend.
This is a quote from an article I wrote with Tino. There are a lot of reasons why Snowflake, Fivetran, and dbt are worth billions, but the explosion of SaaS is a big one. Every tool an organization adopts brings its own data and its own schema, which forces teams to transform hundreds of datasets. With the shift from ETL to ELT, those transformation workloads have become the main driver of data warehouse costs, and Fivetran has the data to back it up. So how does this happen?
The Model Bloat
When teams rolled out dbt Core in their organizations, it was fantastic. They finally had a central place to write transformations, use version control, and more. As Chris Riccomini referenced in his interview, a lot of large organizations had already built a simple version of dbt internally, so when dbt Labs came knocking, it made sense to adopt.
Slowly but surely, though, as teams added contributors, wrote more models, and leaned harder on their data platform, the dbt model sprawl became unmanageable. It's a familiar story that we hear all the time: “We started with 30 dbt models, and now we have over 400.” For what? Sometimes it's little mistakes, configurations set incorrectly, or dated logic that has yet to be updated. Sometimes it’s because the original architect of the models left and the new one has a different way of building.
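One rough way to start sizing up model sprawl is to look for models that nothing else builds on. The sketch below is only an illustration, not a description of how any particular tool works. It assumes a dict shaped like dbt's manifest.json artifact (a `nodes` map and an `exposures` map whose entries carry `resource_type` and `depends_on.nodes`). The function name, project name, and model names are all made up.

```python
def find_leaf_models(manifest: dict) -> list[str]:
    """Return names of models that no other model or exposure depends on.

    Assumes `manifest` follows the general layout of dbt's manifest.json.
    Tests are ignored as referrers, so a model that is only tested still
    counts as a leaf.
    """
    nodes = manifest.get("nodes", {})
    exposures = manifest.get("exposures", {})

    # Collect every node id that a model or exposure lists as an upstream.
    referenced = set()
    for node in list(nodes.values()) + list(exposures.values()):
        if node.get("resource_type") in {"model", "exposure"}:
            referenced.update(node.get("depends_on", {}).get("nodes", []))

    return sorted(
        node["name"]
        for unique_id, node in nodes.items()
        if node.get("resource_type") == "model" and unique_id not in referenced
    )


# Hypothetical, hand-written manifest fragment for illustration only.
sample_manifest = {
    "nodes": {
        "model.shop.stg_orders": {
            "name": "stg_orders",
            "resource_type": "model",
            "depends_on": {"nodes": []},
        },
        "model.shop.orders": {
            "name": "orders",
            "resource_type": "model",
            "depends_on": {"nodes": ["model.shop.stg_orders"]},
        },
        "model.shop.orders_v1_old": {
            "name": "orders_v1_old",
            "resource_type": "model",
            "depends_on": {"nodes": ["model.shop.stg_orders"]},
        },
        "test.shop.not_null_orders_v1_old_id": {
            "name": "not_null_orders_v1_old_id",
            "resource_type": "test",
            "depends_on": {"nodes": ["model.shop.orders_v1_old"]},
        },
    },
    "exposures": {
        "exposure.shop.revenue_dashboard": {
            "name": "revenue_dashboard",
            "resource_type": "exposure",
            "depends_on": {"nodes": ["model.shop.orders"]},
        },
    },
}

print(find_leaf_models(sample_manifest))
```

A model with no downstream model or exposure may still be queried directly by a BI tool or an analyst, so the output is best treated as a list of candidates to review, not a delete list.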
It's important to remember this is happening across the stack. This post from Ryan Janssen, while funny, is quite accurate. The modern data stack (MDS) was incredible at lowering the barrier to working with data. In the process, though, we lost a lot of fundamentals, and teams were not disciplined in their approach.
The Alert Crisis
Alongside the bloat issue, data observability tools constantly ping you when things break down. The more junk in the system, the more likely it is to fail. Most of these alerts aren't helpful, for two key reasons. First, they lack context: you only get the issue, without information about why it occurred, how to fix it, or how it affects other parts of your stack. Second, as alerts pile up, your backlog grows and stress mounts until you feel overwhelmed by unsolvable problems. Each alert disrupts your work, forcing you to spend 3 to 4 hours diagnosing and fixing issues instead of focusing on your primary tasks. Monitoring the stack is important, but it's time for tools that not only detect problems but also resolve them automatically, so teams can focus on strategic initiatives rather than constant optimization and maintenance.
Insights & Resolution
Engineers should focus on building solutions that help businesses grow and scale, not on optimizing tools like Snowflake, dbt, or Airflow. Yet data teams today fall into two categories: those spending over a quarter of their week on maintenance tasks with limited context, and those avoiding maintenance altogether. The latter group burns through its budget, accumulates technical debt, and moves at a snail’s pace.
Teams crave insight and visibility into what's going wrong in their data platforms. The exciting part is that uncovering these issues gives us the opportunity to automate the work in one experience: insights lead to tasks, tasks are resolved automatically, and each resolution ends in a merged PR. These automated workflows aren't built on rigid rules. They're driven by customized insights from your environment! This means less time on maintenance and more time on work that makes an impact.
This is the world we are building. Come join us on the ride!
About Artemis
Artemis monitors your data stack, finds issues, and automatically resolves them. Our users approve over 120 insights, merge 60+ PRs, and save over 20 hours a week. There is no need to migrate; our platform integrates with your data stack within 15 minutes!