<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Artemis Blog]]></title><description><![CDATA[Behind the Scenes — Updates from the Artemis team]]></description><link>https://blog.artemisdata.io</link><image><url>https://substackcdn.com/image/fetch/$s_!U4NJ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd43dd7d7-1e8c-4302-a9df-12aecfac7b99_408x408.png</url><title>Artemis Blog</title><link>https://blog.artemisdata.io</link></image><generator>Substack</generator><lastBuildDate>Mon, 20 Apr 2026 15:12:12 GMT</lastBuildDate><atom:link href="https://blog.artemisdata.io/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Josh Gray]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[artemisdata@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[artemisdata@substack.com]]></itunes:email><itunes:name><![CDATA[Josh Gray]]></itunes:name></itunes:owner><itunes:author><![CDATA[Josh Gray]]></itunes:author><googleplay:owner><![CDATA[artemisdata@substack.com]]></googleplay:owner><googleplay:email><![CDATA[artemisdata@substack.com]]></googleplay:email><googleplay:author><![CDATA[Josh Gray]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[The Future of Data Engineering]]></title><description><![CDATA[Will AI replace data engineers? No. 
Will AI fundamentally change data engineering work? Yes.]]></description><link>https://blog.artemisdata.io/p/the-future-of-data-engineering</link><guid isPermaLink="false">https://blog.artemisdata.io/p/the-future-of-data-engineering</guid><dc:creator><![CDATA[Josh Gray]]></dc:creator><pubDate>Fri, 17 Jan 2025 18:15:43 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!dNVD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526a7e1a-07bf-45b2-b722-b779e9bf9d3a_964x886.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A lot of LinkedIn chatter is about whether AI will replace data engineers.</p><p>I believe the answer to that question is no.</p><p>Will AI fundamentally change data engineering work and how it is done? Yes.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dNVD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526a7e1a-07bf-45b2-b722-b779e9bf9d3a_964x886.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dNVD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526a7e1a-07bf-45b2-b722-b779e9bf9d3a_964x886.png 424w, https://substackcdn.com/image/fetch/$s_!dNVD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526a7e1a-07bf-45b2-b722-b779e9bf9d3a_964x886.png 848w, https://substackcdn.com/image/fetch/$s_!dNVD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526a7e1a-07bf-45b2-b722-b779e9bf9d3a_964x886.png 1272w, 
https://substackcdn.com/image/fetch/$s_!dNVD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526a7e1a-07bf-45b2-b722-b779e9bf9d3a_964x886.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dNVD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526a7e1a-07bf-45b2-b722-b779e9bf9d3a_964x886.png" width="392" height="360.28215767634856" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/526a7e1a-07bf-45b2-b722-b779e9bf9d3a_964x886.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:886,&quot;width&quot;:964,&quot;resizeWidth&quot;:392,&quot;bytes&quot;:426063,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dNVD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526a7e1a-07bf-45b2-b722-b779e9bf9d3a_964x886.png 424w, https://substackcdn.com/image/fetch/$s_!dNVD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526a7e1a-07bf-45b2-b722-b779e9bf9d3a_964x886.png 848w, https://substackcdn.com/image/fetch/$s_!dNVD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526a7e1a-07bf-45b2-b722-b779e9bf9d3a_964x886.png 1272w, 
https://substackcdn.com/image/fetch/$s_!dNVD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F526a7e1a-07bf-45b2-b722-b779e9bf9d3a_964x886.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h3>The Context</h3><p>I spoke with the CTO of a unicorn data startup, who said, &#8220;We are really good at gathering data, but we are not the best at knowing what to do with it.&#8221; <a href="https://x.com/bennstancil">Benn Stancil</a> hints at this weekly in his blog, which is both sarcastically realistic and pessimistic. 
Based on <a href="https://benn.substack.com/p/searching-for-insight">his posts</a>, he would agree that we are great at gathering data, not creating insights.</p><p>We all saw the promise of <a href="https://www.youtube.com/watch?v=Tzin1DgexlE">Moneyball</a> and wanted to re-enact the concept of &#8220;winning the game&#8221; through data within our businesses to reap massive rewards. I started my data journey analyzing alternative data in finance, where my job was to find Moneyball-type insights for alpha-generating investment decisions. I was naive to think every data organization operated like this.</p><p>The first step in that journey was to capture every data point. This thinking aligned with "what you can't measure, you can't manage" (fun fact: this famous quote is misattributed and was part of a longer quote <a href="https://deming.org/myth-if-you-cant-measure-it-you-cant-manage-it/">which states the opposite</a>). The belief was that small teams could quantify everything through sheer force of will to find that needle in the haystack. Data was supposed to be the deciding factor in whether you became the next Netflix or the next Blockbuster.</p><p>Data tools profited immensely from automating the nuts-and-bolts process of finding alpha, eliminating the need for complex technical skills in working with and manipulating data points and turning the process into a seamless, automated system. Redshift may have accelerated this in the cloud, but billions of dollars have been created around this idea. I mean, Databricks just raised $10 billion based on this concept.</p><p>However, working with data is still incredibly manual, even all these years later. 
Data platforms are fragile and nuanced and are only becoming more vital to teams (<a href="https://blog.artemisdata.io/p/why-the-data-platform-is-the-most">check out why I think the data platform is the most important internal tool</a>).</p><p>What is the ultimate destination for data practitioners if we aim to enable everyone in our organizations to work with data effortlessly through automation and AI? And in the future, what role will we data engineers play?</p><h3>We Are Becoming Pilots</h3><p>Data engineers are trending toward a role similar to that of commercial airline pilots. Like pilots, data engineers will manage complex systems (i.e., data platforms) that automatically handle hundreds of tasks. As the pilot of this data platform, you&#8217;ll be required to have the technical chops to deal with any issues that pop up while also being responsible for ensuring the platform and its insights get to their destination.</p><p>We already see this transition with orchestrators taking on more complex work; however, AI will enable tools to go two or three layers deeper. It will fix pipelines on the fly, optimize workloads, and even provide analysis for end users. Your job will be to manage the work and ensure it moves smoothly, in the correct order, and to the right people.</p><p>On top of this, once these systems are set up, you won&#8217;t need a team of 20 engineers. You&#8217;ll need two or three. They will all understand the platform and ensure that it keeps performing optimally.</p><h3>How Do We Get There?</h3><p>Tools like&nbsp;<a href="https://www.artemisdata.io/?utm_source=linkedin&amp;utm_medium=social&amp;utm_campaign=post&amp;utm_content=future-of-de">Artemis</a>&nbsp;help you experience this future by isolating issues within your stack, prioritizing what needs to be solved, and surfacing insights. Once you approve an issue, our AI agents will resolve it. 
Artemis is the automated mechanic in your plane, ensuring you move as necessary.</p><p>A ton of innovation is still needed around the data stack. Automating the flow and cohesion of these systems is still tricky, but there are many opportunities to solve this.</p><p>The exciting part about this future is that with autopilot, you will have time to work on the Moneyball questions of the world. You have collected a lot of data. It's time to see what it says.</p><p>Where will you and your platform fly?</p>]]></content:encoded></item><item><title><![CDATA[The Natural Evolution of Data Platforms]]></title><description><![CDATA[Why do data platforms always get bloated and chaotic at scale? 
Are processes and tools the right solution to fix this problem?]]></description><link>https://blog.artemisdata.io/p/the-natural-evolution-of-data-platforms</link><guid isPermaLink="false">https://blog.artemisdata.io/p/the-natural-evolution-of-data-platforms</guid><dc:creator><![CDATA[Josh Gray]]></dc:creator><pubDate>Tue, 14 Jan 2025 21:05:09 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!6FlI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb74a0edb-4af6-474c-b229-0f62ec732329_918x1110.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6FlI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb74a0edb-4af6-474c-b229-0f62ec732329_918x1110.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6FlI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb74a0edb-4af6-474c-b229-0f62ec732329_918x1110.png 424w, https://substackcdn.com/image/fetch/$s_!6FlI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb74a0edb-4af6-474c-b229-0f62ec732329_918x1110.png 848w, https://substackcdn.com/image/fetch/$s_!6FlI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb74a0edb-4af6-474c-b229-0f62ec732329_918x1110.png 1272w, https://substackcdn.com/image/fetch/$s_!6FlI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb74a0edb-4af6-474c-b229-0f62ec732329_918x1110.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6FlI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb74a0edb-4af6-474c-b229-0f62ec732329_918x1110.png" width="512" height="619.0849673202614" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b74a0edb-4af6-474c-b229-0f62ec732329_918x1110.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1110,&quot;width&quot;:918,&quot;resizeWidth&quot;:512,&quot;bytes&quot;:870982,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6FlI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb74a0edb-4af6-474c-b229-0f62ec732329_918x1110.png 424w, https://substackcdn.com/image/fetch/$s_!6FlI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb74a0edb-4af6-474c-b229-0f62ec732329_918x1110.png 848w, https://substackcdn.com/image/fetch/$s_!6FlI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb74a0edb-4af6-474c-b229-0f62ec732329_918x1110.png 1272w, https://substackcdn.com/image/fetch/$s_!6FlI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb74a0edb-4af6-474c-b229-0f62ec732329_918x1110.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A familiar picture is drawn whenever I talk with data engineers or managers. They talk about how they have multiple BI tools, years of accumulated dbt models, and warehouse costs steadily climbing to reflect the bloat. What strikes me is how this happens so predictably across organizations. 
As data teams grow and evolve, they often end up with:</p><ul><li><p>Multiple BI tools serving different purposes.</p></li><li><p>Years of dbt models written by different team members, often solving similar problems differently.</p></li><li><p>A steady stream of ad-hoc queries and analysis, some of which could be consolidated into proper data models.</p></li><li><p>Significant compute costs that are not necessary.</p></li></ul><p>I posted about this roughly a month ago, and the typical response was that this could be solved with the proper foundation and process.</p><p>While there is some truth in those statements, why does this still happen to almost every team that has scaled a data platform? How do data platforms become so messy in the first place?</p><p>After thinking about this for a while, the answer, annoyingly, could be that it is <em><strong>human nature</strong></em>. Looking beyond data engineering to other fields like software development and project management, or even our own homes, we see how clutter, mess, and disorganization naturally accumulate over time.</p><p>A large part of the reason is that it takes focus and energy to stay on top of the mess. It&#8217;s also not fun work. It is not fun taking the trash out, just as it&#8217;s not fun combing through a model someone else wrote to find and fix bugs. The benefit (and curse) is that, unlike in real life, the trash can continue to pile up on our data platforms without many people noticing.</p><p>As people, we almost always take the path of least resistance. We don&#8217;t finish the documentation because it's annoying. We don&#8217;t spend the time to see if a model already accomplishes what we need, so instead, we build a new one. This message has also been broadcast across the entire modern data stack. The answer for the longest time has been to add. 
Add another model, buy another solution, hire more analytics engineers.</p><p>So, sure, having a good model foundation, principles, and processes helps reduce or alleviate the chaos within a data platform. However, tech debt and bloat are going to happen.</p><p>Now that we know this, how can data teams stay on top of these issues and reduce them over time? This is a question a lot of data teams are asking themselves. From the teams we have spoken to, there are a few outcomes.</p><ol><li><p>They do nothing and hope it&#8217;ll figure itself out one day. (Spoiler: it won&#8217;t.)</p></li><li><p>They commission a team of 1-2 engineers to focus on this project. Those engineers spend 6-9 months on it and are pulled from other projects.</p></li><li><p>They do the opposite and commission 1-2 engineers to focus on new work while the rest of the team is set on fixing their foundation. This is a costly outcome.</p></li><li><p>They use <a href="http://www.artemisdata.io/">Artemis</a>.</p></li></ol><p>Teams need a maniacal focus on maintaining a simple yet effective data platform. They must be diligent in their work to ensure the platform scales appropriately. This is the hard part of the modern data stack, and with the new trends around cost cutting, budget realignment, and vendors being cut, this complexity issue is brought to the forefront.</p><p>However, the issue for many teams is that they are being asked to trim a platform they did not build themselves while fulfilling all the work they had to do regardless. This lack of context makes fixing issues and cutting tech debt extremely challenging. You need either a person to do this or a tool that monitors your platform, surfaces issues, and resolves them.</p><p>The reality is that this isn't about disorganization; it's about the natural evolution of data teams and platforms. 
As teams scale, "accidental complexity" appears, where solving immediate business needs gradually creates a more complicated system than anyone intended.</p><p>If your platform is bloated, <a href="https://cal.com/joshgray/15min">reach out!</a></p><h3>About Artemis</h3><p><a href="http://www.artemisdata.io/">Artemis</a> monitors your data stack, finds issues, and automatically resolves them. Our users approve over 120 insights, merge 60+ PRs, and save over 20 hours a week. There is no need to migrate; our platform integrates with your data stack within 15 minutes!</p>]]></content:encoded></item><item><title><![CDATA[The Pain of Research]]></title><description><![CDATA[At the end of last year, I asked one of our customers, &#8220;What was your aha moment when using Artemis?&#8221;]]></description><link>https://blog.artemisdata.io/p/the-pain-of-research</link><guid isPermaLink="false">https://blog.artemisdata.io/p/the-pain-of-research</guid><dc:creator><![CDATA[Josh Gray]]></dc:creator><pubDate>Wed, 08 Jan 2025 18:05:15 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!U_vV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910a84f3-bb08-4362-b041-90be93469f34_561x494.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!U_vV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910a84f3-bb08-4362-b041-90be93469f34_561x494.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!U_vV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910a84f3-bb08-4362-b041-90be93469f34_561x494.png 424w, https://substackcdn.com/image/fetch/$s_!U_vV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910a84f3-bb08-4362-b041-90be93469f34_561x494.png 848w, https://substackcdn.com/image/fetch/$s_!U_vV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910a84f3-bb08-4362-b041-90be93469f34_561x494.png 1272w, https://substackcdn.com/image/fetch/$s_!U_vV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910a84f3-bb08-4362-b041-90be93469f34_561x494.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!U_vV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910a84f3-bb08-4362-b041-90be93469f34_561x494.png" width="561" height="494" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/910a84f3-bb08-4362-b041-90be93469f34_561x494.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:494,&quot;width&quot;:561,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:322692,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!U_vV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910a84f3-bb08-4362-b041-90be93469f34_561x494.png 424w, https://substackcdn.com/image/fetch/$s_!U_vV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910a84f3-bb08-4362-b041-90be93469f34_561x494.png 848w, https://substackcdn.com/image/fetch/$s_!U_vV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910a84f3-bb08-4362-b041-90be93469f34_561x494.png 1272w, https://substackcdn.com/image/fetch/$s_!U_vV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F910a84f3-bb08-4362-b041-90be93469f34_561x494.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>At the end of last year, I asked one of our customers, &#8220;What was your aha moment when using Artemis?&#8221;</p><p>Without skipping a beat, he said, &#8220;The moment I combed through the insights and realized I no longer needed to keep all this context in the back of my mind, Artemis does the research and then surfaces issues for me.&#8221;</p><p>This was not what I was expecting. I thought he would highlight all the cool AI agent tech we had built that takes the insights we surface and then resolves the tasks for him. Instead, what won him over was that we did the research for him.</p><p>In this context, research refers to the work needed to understand what problems were occurring in his data platform, why they were happening, and what a suitable fix would be.</p><p>This pain of research runs deep, and the more we dig into it, the more it looks like a huge cost driver for data teams. 
A <a href="https://www.getdbt.com/resources/reports/state-of-analytics-engineering-2024">dbt Labs survey</a> in 2024 found that data teams spend 26% of their week fixing and maintaining their data platform. This is in addition to the fact that engineers spend 55% of their time maintaining or organizing data sets; combined, roughly 80% of their time is spent on some form of maintenance.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!T_XA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d40f154-55c6-4fc5-96ec-92904eed8f93_1043x547.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!T_XA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d40f154-55c6-4fc5-96ec-92904eed8f93_1043x547.png 424w, https://substackcdn.com/image/fetch/$s_!T_XA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d40f154-55c6-4fc5-96ec-92904eed8f93_1043x547.png 848w, https://substackcdn.com/image/fetch/$s_!T_XA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d40f154-55c6-4fc5-96ec-92904eed8f93_1043x547.png 1272w, https://substackcdn.com/image/fetch/$s_!T_XA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d40f154-55c6-4fc5-96ec-92904eed8f93_1043x547.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!T_XA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d40f154-55c6-4fc5-96ec-92904eed8f93_1043x547.png" width="1043" height="547" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4d40f154-55c6-4fc5-96ec-92904eed8f93_1043x547.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:547,&quot;width&quot;:1043,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:172769,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!T_XA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d40f154-55c6-4fc5-96ec-92904eed8f93_1043x547.png 424w, https://substackcdn.com/image/fetch/$s_!T_XA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d40f154-55c6-4fc5-96ec-92904eed8f93_1043x547.png 848w, https://substackcdn.com/image/fetch/$s_!T_XA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d40f154-55c6-4fc5-96ec-92904eed8f93_1043x547.png 1272w, https://substackcdn.com/image/fetch/$s_!T_XA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d40f154-55c6-4fc5-96ec-92904eed8f93_1043x547.png 1456w" sizes="100vw"></picture></div></a></figure></div><p>If teams spend 80% of their week on maintenance, a considerable share of that goes to research. Why are teams spending all this time on research?</p><h3>A Few Possible Reasons for So Much Research</h3><ul><li><p><strong>You didn&#8217;t write it:</strong> Most practitioners did not build the platform they hold the keys to. Data engineers spend their time understanding, reverse engineering, and dissecting the work of others to fix, maintain, and improve data platforms.</p></li><li><p><strong>Fragmentation means no one is a master:</strong> When you build and maintain a platform of 8-12 tools, each with its own quirks and its own ways of being used efficiently, there is much to learn. As an engineer, you won&#8217;t know every tool and won&#8217;t have the time to understand the intricacies of each. 
So, you learn enough to get by and take little refresher courses when you need to solve a specific problem.</p></li></ul><p>This convoluted web of work leads to tech debt, but it's also the environment most data practitioners work in daily.</p><p>The other side of this is the mental drain on you. We like the path of least resistance, and it&#8217;s discouraging when a ticket that should take 15 minutes to resolve takes three or more hours because of the added complexity and research.</p><p>The emotional side of this problem is deep-rooted. It can crush momentum and motivation for a team performing at a high level and seriously derail progress.</p><p>Research is necessary for solving problems, and data work is no different. However, in the long run, will we continue to do the research ourselves as practitioners, or will AI do it for us?</p>]]></content:encoded></item><item><title><![CDATA[Moving from Reactive to Proactive Data Observability ]]></title><description><![CDATA[I spoke with the CTO of a unicorn data startup, who said, &#8220;We are really good at gathering data, but we are not the best at knowing what to do with it.&#8221;]]></description><link>https://blog.artemisdata.io/p/moving-from-reactive-to-proactive</link><guid isPermaLink="false">https://blog.artemisdata.io/p/moving-from-reactive-to-proactive</guid><dc:creator><![CDATA[Josh Gray]]></dc:creator><pubDate>Tue, 07 Jan 2025 16:02:54 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!NmB4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b96119-2355-4473-9b0f-91907ec0a90c_559x448.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!NmB4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b96119-2355-4473-9b0f-91907ec0a90c_559x448.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NmB4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b96119-2355-4473-9b0f-91907ec0a90c_559x448.png 424w, https://substackcdn.com/image/fetch/$s_!NmB4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b96119-2355-4473-9b0f-91907ec0a90c_559x448.png 848w, https://substackcdn.com/image/fetch/$s_!NmB4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b96119-2355-4473-9b0f-91907ec0a90c_559x448.png 1272w, https://substackcdn.com/image/fetch/$s_!NmB4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b96119-2355-4473-9b0f-91907ec0a90c_559x448.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NmB4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b96119-2355-4473-9b0f-91907ec0a90c_559x448.png" width="559" height="448" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85b96119-2355-4473-9b0f-91907ec0a90c_559x448.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:448,&quot;width&quot;:559,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:368244,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NmB4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b96119-2355-4473-9b0f-91907ec0a90c_559x448.png 424w, https://substackcdn.com/image/fetch/$s_!NmB4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b96119-2355-4473-9b0f-91907ec0a90c_559x448.png 848w, https://substackcdn.com/image/fetch/$s_!NmB4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b96119-2355-4473-9b0f-91907ec0a90c_559x448.png 1272w, https://substackcdn.com/image/fetch/$s_!NmB4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85b96119-2355-4473-9b0f-91907ec0a90c_559x448.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>I spoke with the CTO of a unicorn data startup, who said, &#8220;We are really good at gathering data, but we are not the best at knowing what to do with it.&#8221;</p><p>This observation rings true for data platforms. After speaking with over 200 data teams, I found that scaling platforms face a common challenge: cutting through the noise to prioritize the work that keeps the platform stable and costs under control is incredibly difficult, whether that means untangling complex dbt models to fix a schema change or determining which of 15 Airflow pipelines needs immediate attention.</p><p>At enterprise scale, it's more like thousands of Airflow pipelines a day. The brittleness gets worse because teams are stuck in a reactive state. It is really hard to future-proof your data stack.</p><p>The first step teams take to solve this problem is to purchase a data observability tool. At least then you know what is happening in your stack. 
At first, you pour time into setting up all the alerts, funnels, and tables to monitor, and it's fantastic. However, what typically happens as you scale your dbt models and warehouse spending?</p><ol><li><p><strong>Notification Overload:</strong> Your Slack channel gets spammed, and you stop responding or caring as false positives appear.</p></li><li><p><strong>Lack of Context:</strong> You notice the alerts arrive without context: they give no information on why a problem has occurred, how to fix it, or how it impacts other parts of your stack.</p></li><li><p><strong>Capacity Shortage:</strong> As you get more alerts, your backlog expands, your stress heightens, and you feel that you can't solve all the problems you have. Each alert means you will get distracted and spend 3 to 4 hours figuring out what went wrong and solving the issue, taking you away from delivering new objectives.</p></li></ol><p>The current rigid structure of observability tools doesn't allow for custom context, flexibility, or real value. Because teams have either struggled to implement these tools or implemented them and hit the pain points above, we hear this a lot:</p><blockquote><p>"dbt is out of control."</p><p>"We don&#8217;t know why our Airflow jobs keep failing."</p><p>"We are swamped with new objectives, so we can't dedicate time to refactoring our current setup."</p></blockquote><p>These are all reactive responses to a challenging problem for teams. How do you know something will break before it breaks? 
Often the cause lies in the architecture, or in the fact that you have years of different engineers doing different things.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!c3Os!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3a5ddc-15f6-4f3e-b875-eacb98f98ed7_689x415.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!c3Os!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3a5ddc-15f6-4f3e-b875-eacb98f98ed7_689x415.png 424w, https://substackcdn.com/image/fetch/$s_!c3Os!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3a5ddc-15f6-4f3e-b875-eacb98f98ed7_689x415.png 848w, https://substackcdn.com/image/fetch/$s_!c3Os!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3a5ddc-15f6-4f3e-b875-eacb98f98ed7_689x415.png 1272w, https://substackcdn.com/image/fetch/$s_!c3Os!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3a5ddc-15f6-4f3e-b875-eacb98f98ed7_689x415.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!c3Os!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3a5ddc-15f6-4f3e-b875-eacb98f98ed7_689x415.png" width="689" height="415" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0f3a5ddc-15f6-4f3e-b875-eacb98f98ed7_689x415.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:415,&quot;width&quot;:689,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:416106,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!c3Os!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3a5ddc-15f6-4f3e-b875-eacb98f98ed7_689x415.png 424w, https://substackcdn.com/image/fetch/$s_!c3Os!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3a5ddc-15f6-4f3e-b875-eacb98f98ed7_689x415.png 848w, https://substackcdn.com/image/fetch/$s_!c3Os!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3a5ddc-15f6-4f3e-b875-eacb98f98ed7_689x415.png 1272w, https://substackcdn.com/image/fetch/$s_!c3Os!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3a5ddc-15f6-4f3e-b875-eacb98f98ed7_689x415.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>This is why we are building <a href="https://www.linkedin.com/company/artemis-data/">Artemis</a>. We are building the next generation of observability: context-relevant insights that drive automated resolution. We help you turn your dbt mess into a manageable model layer. We accomplish this by turning your platform from reactive to proactive.</p><ol><li><p><strong>Surface Tasks, Not Alerts:</strong> The insights we surface leverage context from your stack to give you a task to solve, leading you to a solution faster instead of creating more work.</p></li><li><p><strong>Personalized Context:</strong> Our personalized knowledge graph provides deep context and root cause analysis to explain the issues, what caused them, how to fix them, and their impacts on downstream or upstream models and tables.</p></li><li><p><strong>Auto-Resolve:</strong> Our action engine takes your tasks, makes the changes, updates pipelines, and refactors your models for you. 
All you have to do is review the PR and push it to production! You can be as involved as you want, a little or a lot. Interested in setting and forgetting? You can do that! Want to be in the nitty-gritty? That's also not a problem. You are always in control.</p></li></ol><p>We give you total control, context, and solutions, so your Slack channel isn't annoying and Jira doesn&#8217;t get bloated; instead, it shows your team and managers how much you are getting done.</p>]]></content:encoded></item><item><title><![CDATA[Fragmentation Hell]]></title><description><![CDATA[I had a call with a data engineer who talked about how the fragmentation in the data stack is crushing his team.]]></description><link>https://blog.artemisdata.io/p/fragmentation-hell</link><guid isPermaLink="false">https://blog.artemisdata.io/p/fragmentation-hell</guid><dc:creator><![CDATA[Josh Gray]]></dc:creator><pubDate>Fri, 13 Dec 2024 15:15:40 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!bxmb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db7e7aa-7896-40f4-ba20-f0d995e5cf4e_611x520.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bxmb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db7e7aa-7896-40f4-ba20-f0d995e5cf4e_611x520.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bxmb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db7e7aa-7896-40f4-ba20-f0d995e5cf4e_611x520.png 424w, 
https://substackcdn.com/image/fetch/$s_!bxmb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db7e7aa-7896-40f4-ba20-f0d995e5cf4e_611x520.png 848w, https://substackcdn.com/image/fetch/$s_!bxmb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db7e7aa-7896-40f4-ba20-f0d995e5cf4e_611x520.png 1272w, https://substackcdn.com/image/fetch/$s_!bxmb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db7e7aa-7896-40f4-ba20-f0d995e5cf4e_611x520.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bxmb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db7e7aa-7896-40f4-ba20-f0d995e5cf4e_611x520.png" width="383" height="325.9574468085106" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6db7e7aa-7896-40f4-ba20-f0d995e5cf4e_611x520.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:520,&quot;width&quot;:611,&quot;resizeWidth&quot;:383,&quot;bytes&quot;:408424,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bxmb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db7e7aa-7896-40f4-ba20-f0d995e5cf4e_611x520.png 424w, 
https://substackcdn.com/image/fetch/$s_!bxmb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db7e7aa-7896-40f4-ba20-f0d995e5cf4e_611x520.png 848w, https://substackcdn.com/image/fetch/$s_!bxmb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db7e7aa-7896-40f4-ba20-f0d995e5cf4e_611x520.png 1272w, https://substackcdn.com/image/fetch/$s_!bxmb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6db7e7aa-7896-40f4-ba20-f0d995e5cf4e_611x520.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>On Tuesday, I had a call with a data engineer who talked about how the fragmentation in the data stack is crushing his team. His team of six engineers uses 13 different tools; the majority are point solutions with overlapping feature sets. Over the years, there has been a lack of consistency in how work is done (e.g., scheduling some jobs in Airflow and others with in-tool scheduling).</p><p>The chaos comes from the platform as a whole, not from individual models or tools. So why is this a problem for teams, and what are the ripple effects?</p><h3>Overhead</h3><p>For starters, there's the obvious operational overhead. When different parts of your platform operate independently, you're essentially maintaining multiple mini-platforms rather than one cohesive system. Each requires monitoring, maintenance, and expertise. The expectation is that you become an expert in Snowflake, dbt, Airflow, Looker, and many other tools so you can run your stack perfectly. That doesn&#8217;t happen, so over the years, little decisions add up to create larger issues and cracks. The constant context switching also means tasks take longer overall.</p><h3>Cross Workload Optimization</h3><p>Since most engineers aren&#8217;t experts in every tool, there's a hidden cost of missed optimization opportunities. When workloads run in isolation, you could have a metric in dbt that is already calculated elsewhere, or run transformations on data that hasn't been updated since the last run. These issues multiply across your platform, which leads to massive increases in compute costs and inefficient processing times. The other problem is that error logs and alerts are siloed. 
An Airflow job might fail and cause a dashboard to break. With every tool providing vague error logs, it can be difficult to find the real source of the problem.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xWdY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe21529-ded6-4385-9794-d1bc4dae3ac6_605x876.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xWdY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe21529-ded6-4385-9794-d1bc4dae3ac6_605x876.png 424w, https://substackcdn.com/image/fetch/$s_!xWdY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe21529-ded6-4385-9794-d1bc4dae3ac6_605x876.png 848w, https://substackcdn.com/image/fetch/$s_!xWdY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe21529-ded6-4385-9794-d1bc4dae3ac6_605x876.png 1272w, https://substackcdn.com/image/fetch/$s_!xWdY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe21529-ded6-4385-9794-d1bc4dae3ac6_605x876.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xWdY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe21529-ded6-4385-9794-d1bc4dae3ac6_605x876.png" width="377" height="545.8710743801653" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/abe21529-ded6-4385-9794-d1bc4dae3ac6_605x876.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:876,&quot;width&quot;:605,&quot;resizeWidth&quot;:377,&quot;bytes&quot;:709576,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xWdY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe21529-ded6-4385-9794-d1bc4dae3ac6_605x876.png 424w, https://substackcdn.com/image/fetch/$s_!xWdY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe21529-ded6-4385-9794-d1bc4dae3ac6_605x876.png 848w, https://substackcdn.com/image/fetch/$s_!xWdY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe21529-ded6-4385-9794-d1bc4dae3ac6_605x876.png 1272w, https://substackcdn.com/image/fetch/$s_!xWdY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabe21529-ded6-4385-9794-d1bc4dae3ac6_605x876.png 1456w" sizes="100vw"></picture></div></a></figure></div><h3>How to Fix It?</h3><p>The solution isn't necessarily to force everything into a single, monolithic platform. Rather, as a data industry, we need to adopt a mentality of &#8216;doing more with less&#8217;. Since AI came along, data teams have moved from R&amp;D functions with large budgets to cost centers that need to show ROI to the business. <a href="https://substack.com/home/post/p-152996776?source=queue&amp;autoPlay=false">The benefit is that data platforms</a> are the most important tool for companies to win in AI.</p><p>So, what is the solution? There are several. One key is to develop a platform-wide understanding of how your various tools interact and affect each other. The goal isn&#8217;t to eliminate all complexity; it's to make it manageable through better visibility and understanding.</p><p>In a world where data platforms are becoming increasingly complex and distributed, this holistic understanding isn't just nice to have; it's essential. 
For 2025, we are seeing a trend toward building systems that can scale efficiently and cost-effectively. The alternative is continuing to optimize locally while missing the global picture.</p><h3>Artemis Comes To You</h3><p><a href="http://www.artemisdata.io">We work</a> with teams on various stacks; the best part is that Artemis meets organizations where they are. We consolidate all your metadata and logs into one place and then surface insights into what is broken or could be improved. This lets you focus on delivering value rather than fixing your stack.</p><blockquote><p>&#8220;That&#8217;s what I like about Artemis. You don&#8217;t need to move off your data warehouse, you don&#8217;t need to move off of dbt, and you don&#8217;t need to adopt some vertically integrated solution. We&#8217;ve dug ourselves into a pretty big hole with the MDS tech stack, and tools like Artemis will play a role in helping us wrangle all this stuff.&#8221;<br> <a href="https://www.linkedin.com/in/riccomini/">Chris Riccomini</a>, contributor to Apache Samza and Airflow</p></blockquote><p>Our users, on average, resolve 120 insights a week, merge 60+ PRs, and save 20 hours! If you want to simplify your stack and get dbt under control, reach out!</p>]]></content:encoded></item><item><title><![CDATA[Why The Data Platform is The Most Important Internal Tool]]></title><description><![CDATA[I was inspired to write this post while reading Packy McCormick's Not Boring article on Rox, a new investment of his. Here is my case for why the data platform is the most important internal tool.]]></description><link>https://blog.artemisdata.io/p/why-the-data-platform-is-the-most</link><guid isPermaLink="false">https://blog.artemisdata.io/p/why-the-data-platform-is-the-most</guid><dc:creator><![CDATA[Josh Gray]]></dc:creator><pubDate>Wed, 11 Dec 2024 23:47:32 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ujB6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3511aa-9b98-4462-ac20-4e5d5e13c1b8_916x817.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I was inspired to write this post while reading <a href="https://www.linkedin.com/in/packym/">Packy McCormick</a>'s <a href="https://www.linkedin.com/company/not-boring-co/">Not Boring</a> article on <a href="https://www.linkedin.com/company/rox-data-corp/">Rox</a>, his new investment. A few stats in the <a href="https://www.notboring.co/p/rox?publication_id=10025&amp;post_id=152147356&amp;isFreemail=true&amp;r=1wgam4&amp;triedRedirect=true">article</a> stuck out to me:</p><div class="pullquote"><p>&#8220;The underlying data that drives Salesforce left Salesforce five years back,&#8221; Ishan explained. 
&#8220;It moved to the data warehouse, provided by companies like Snowflake. In fact, <strong>40% of all data in data warehouses is customer data.</strong>&#8221;</p></div><p>I didn&#8217;t realize the amount of customer data was so high. Additionally, the statistic that 90% of 300 companies have built internal CRM tools makes a lot of sense. The first project a lot of data teams have is financial reporting, which creates more insight into sales and CRM. Use Fivetran and dump all 800 of those Salesforce tables into Snowflake to get an accurate picture of what is truly going on in your business. The question I then ask myself is, does this mean the Modern Data Stack won? <a href="https://news.ycombinator.com/item?id=39338626">I thought it was dead.</a> While it was a marketing term that inspired a &#8216;movement,&#8217; a good test of whether it is valuable is whether the system of record is moving to the data warehouse. This shift makes the case that the data platform will be the most critical tool for companies to win in AI.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ujB6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3511aa-9b98-4462-ac20-4e5d5e13c1b8_916x817.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ujB6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3511aa-9b98-4462-ac20-4e5d5e13c1b8_916x817.png 424w, https://substackcdn.com/image/fetch/$s_!ujB6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3511aa-9b98-4462-ac20-4e5d5e13c1b8_916x817.png 848w, 
https://substackcdn.com/image/fetch/$s_!ujB6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3511aa-9b98-4462-ac20-4e5d5e13c1b8_916x817.png 1272w, https://substackcdn.com/image/fetch/$s_!ujB6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3511aa-9b98-4462-ac20-4e5d5e13c1b8_916x817.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ujB6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3511aa-9b98-4462-ac20-4e5d5e13c1b8_916x817.png" width="468" height="417.4192139737991" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0f3511aa-9b98-4462-ac20-4e5d5e13c1b8_916x817.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:817,&quot;width&quot;:916,&quot;resizeWidth&quot;:468,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!ujB6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3511aa-9b98-4462-ac20-4e5d5e13c1b8_916x817.png 424w, https://substackcdn.com/image/fetch/$s_!ujB6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3511aa-9b98-4462-ac20-4e5d5e13c1b8_916x817.png 848w, 
https://substackcdn.com/image/fetch/$s_!ujB6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3511aa-9b98-4462-ac20-4e5d5e13c1b8_916x817.png 1272w, https://substackcdn.com/image/fetch/$s_!ujB6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f3511aa-9b98-4462-ac20-4e5d5e13c1b8_916x817.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>As teams go to build AI products and internal tools, they want and need a centralized data source. 
That is the glorified single source of truth that we are all still trying to find. The issue with most AI tools that we purchase today is that they are siloed. They can only automate or add value within the small point-solution workflow they own. This is why SaaS will struggle; it fought hard to get your data, but with only a small sliver of the workflow, that data can only help so much. A HubSpot copilot is locked to HubSpot, a dbt copilot is locked to dbt, and a text-to-SQL tool in Snowflake is locked to that ecosystem. With every data platform wanting to own <a href="https://www.getdbt.com/blog/coalesce-2024-product-announcements">the entire stack</a>, those tools will stay that way.</p><p>Where does this lead us? It means that teams need to centralize their tools around the warehouse. The incentive to build a well-governed, holistic data platform to leverage AI has never been higher. Centralize, clean, and use all that data to empower the tools and people to move mountains. This is where data teams can bring huge ROI to businesses. This is what makes the data platform the most important internal tool. Those who can deliver on this vision will have a competitive advantage. Mixing external data with clean and trustworthy internal datasets will create billions of dollars in opportunities for teams.</p><p>The same is true of the tools across your data stack, which are fragmented. A large tailwind we see with Artemis is that data teams are hungry for tools that work across the stack. With Artemis, you centralize the metadata and error logs siloed across individual tools and combine them in one place to automate workflows across the stack. We don&#8217;t look at your tools in isolation. 
We take a platform approach.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9z8t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91b56e0b-1163-476e-bdcb-214b50a2f860_666x500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9z8t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91b56e0b-1163-476e-bdcb-214b50a2f860_666x500.png 424w, https://substackcdn.com/image/fetch/$s_!9z8t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91b56e0b-1163-476e-bdcb-214b50a2f860_666x500.png 848w, https://substackcdn.com/image/fetch/$s_!9z8t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91b56e0b-1163-476e-bdcb-214b50a2f860_666x500.png 1272w, https://substackcdn.com/image/fetch/$s_!9z8t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91b56e0b-1163-476e-bdcb-214b50a2f860_666x500.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9z8t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91b56e0b-1163-476e-bdcb-214b50a2f860_666x500.png" width="548" height="411.4114114114114" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/91b56e0b-1163-476e-bdcb-214b50a2f860_666x500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:500,&quot;width&quot;:666,&quot;resizeWidth&quot;:548,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!9z8t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91b56e0b-1163-476e-bdcb-214b50a2f860_666x500.png 424w, https://substackcdn.com/image/fetch/$s_!9z8t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91b56e0b-1163-476e-bdcb-214b50a2f860_666x500.png 848w, https://substackcdn.com/image/fetch/$s_!9z8t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91b56e0b-1163-476e-bdcb-214b50a2f860_666x500.png 1272w, https://substackcdn.com/image/fetch/$s_!9z8t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91b56e0b-1163-476e-bdcb-214b50a2f860_666x500.png 1456w" sizes="100vw"></picture></div></a></figure></div><p>When teams built on the MDS, no guardrails were put in place. The amount of bloat is staggering. We speak with teams that have 500+ data models; in reality, they only need 60. The average team we speak with spends 1.5 to 3 times their warehouse budget. This bloat not only makes platforms more expensive but also adds complexity. This complexity leads to teams not understanding what their data is doing and how it impacts the organization. It means no one trusts the data. Our motto: simple is best. The less you have to do, the better. Clearing out the junk and getting data platforms in shape should be step 1 in using AI.</p>]]></content:encoded></item><item><title><![CDATA[BigQuery Slots: What You Need to Know]]></title><description><![CDATA[A few weeks ago, I posted that Artemis picked up an insight that saved a customer $11k annually in BigQuery costs. A few people asked how we did it, and the answer was optimizing BigQuery Slots.]]></description><link>https://blog.artemisdata.io/p/bigquery-slots-what-you-need-to-know</link><guid isPermaLink="false">https://blog.artemisdata.io/p/bigquery-slots-what-you-need-to-know</guid><dc:creator><![CDATA[Josh Gray]]></dc:creator><pubDate>Mon, 09 Dec 2024 15:05:06 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!U4NJ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd43dd7d7-1e8c-4302-a9df-12aecfac7b99_408x408.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A few weeks ago, I posted that Artemis picked up an insight that saved a customer $11k annually in BigQuery costs. One insight, implemented in less than an hour, saved $11k&#8212;not bad ROI for a Tuesday morning.</p><p>We work with a lot of data teams that use BigQuery, and we've noticed that most teams only use 40-60% of the compute they pay for because they misuse BigQuery slots. Sometimes, there is a business use case for overspending, but teams are often unaware of what they are spending.</p><h3>What is a Slot?</h3><p>BigQuery slots are virtual CPUs that power your SQL queries by providing compute resources. 
When you run a query, BigQuery calculates how many slots the query needs based on its size and complexity.</p><p>So what happens? When you run a query in BigQuery, the system first analyzes your query and creates a plan. Then, it assigns slots based on the query's complexity, available resources, and priorities. These slots work together to process your query efficiently.</p><h3>Contention</h3><p>There are a few cases where you can hit slot contention, which occurs when more queries are waiting to run than there are available slots. This typically happens when too many queries run at once, a particularly heavy query needs extra horsepower, or everyone's trying to run reports at the same time. BigQuery reacts to this automatically and changes the number of slots each query gets based on fair scheduling. What&#8217;s fair scheduling?</p><h3>Fair scheduling</h3><p>Fair scheduling is the mechanism BigQuery uses to distribute slots among multiple queries and projects. Essentially, it ensures that all queries have equal access to available slots, regardless of the project or user. When a query is run, it is placed in a queue. The scheduler then fairly allocates slots to the queries in the queue, ensuring that no query hogs resources. While this means queries are treated the same, if the system is overloaded, the scheduler can pause some queries or prioritize the larger queries in the queue, which impacts performance. It is a tradeoff teams must wrestle with.</p><h3>What goes wrong?</h3><p>So now that you have a sense of how slots work, how do teams misuse them? In most cases, teams are unaware that they have over-provisioned slots and are paying for more than they need. Slot usage also goes hand in hand with poor query design. If a query has a lot of joins, doesn&#8217;t filter the data early, or has a whole host of other minor issues, it will take more resources and require more slots.</p><p>So, to start, teams first need to examine their queries. 
Whether you split your tables into smaller chunks, filter early to reduce unnecessary work, or keep your joins efficient, all of this impacts your query performance, which in turn affects your slot utilization. Once you have fixed and improved your queries, you can start to work on better scheduling. Scheduling larger jobs around the times of day when teams use BI tools, dedicating resources to specific teams or workloads, and keeping an eye on when and what is using the most resources will help you use less while moving faster.</p><h3>Final thoughts</h3><p>While you might think that adding more slots will mean more queries get executed faster, it can mask significant performance issues. Like I said at the beginning, most teams we work with who use BigQuery only use 40-60% of the compute they pay for. That is a ton of money!</p><p>Getting the most out of BigQuery means understanding how to use your slots wisely while balancing performance and cost.</p><p>Don't want to deal with the hassle? Our platform has a simulation model that can accurately predict the optimal slot allocation for you. 
Let <a href="http://www.artemisdata.io">Artemis</a> handle the heavy lifting - we'll optimize your slots automatically so you can focus on what matters.</p>]]></content:encoded></item><item><title><![CDATA[What the Heck Happened to the MDS?]]></title><description><![CDATA[Not too long ago, the Modern Data Stack (MDS) was hailed as a game-changer in the data engineering world.]]></description><link>https://blog.artemisdata.io/p/what-the-heck-happened-to-the-mds</link><guid isPermaLink="false">https://blog.artemisdata.io/p/what-the-heck-happened-to-the-mds</guid><dc:creator><![CDATA[Kirsten]]></dc:creator><pubDate>Mon, 02 Dec 2024 17:20:49 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!I7zA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4053c46d-8e64-4e79-8a33-b4207c3bfd82_1792x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!I7zA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4053c46d-8e64-4e79-8a33-b4207c3bfd82_1792x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!I7zA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4053c46d-8e64-4e79-8a33-b4207c3bfd82_1792x1024.png 424w, https://substackcdn.com/image/fetch/$s_!I7zA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4053c46d-8e64-4e79-8a33-b4207c3bfd82_1792x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!I7zA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4053c46d-8e64-4e79-8a33-b4207c3bfd82_1792x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!I7zA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4053c46d-8e64-4e79-8a33-b4207c3bfd82_1792x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!I7zA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4053c46d-8e64-4e79-8a33-b4207c3bfd82_1792x1024.png" width="1456" height="832" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4053c46d-8e64-4e79-8a33-b4207c3bfd82_1792x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:832,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3318896,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!I7zA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4053c46d-8e64-4e79-8a33-b4207c3bfd82_1792x1024.png 424w, https://substackcdn.com/image/fetch/$s_!I7zA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4053c46d-8e64-4e79-8a33-b4207c3bfd82_1792x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!I7zA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4053c46d-8e64-4e79-8a33-b4207c3bfd82_1792x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!I7zA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4053c46d-8e64-4e79-8a33-b4207c3bfd82_1792x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Not too long ago, the <strong>Modern Data Stack (MDS)</strong> was hailed as a game-changer in the data engineering world. 
Its modular approach&#8212;featuring tools like Fivetran for ingestion, dbt for transformations, Snowflake for warehousing, and reverse ETL solutions such as Hightouch&#8212;promised a revolution. It offered flexibility, scalability, and a chance for organizations to <em>democratize</em> data. But today, the buzz has sorta died down. So, what happened?</p><h3><strong>From Game-Changer to Table Stakes</strong></h3><p>The MDS quickly became the go-to architecture for many companies, and in doing so, it transitioned from being modern to simply expected. <strong>What was once revolutionary became routine.</strong> Tools like Snowflake and dbt are now commonplace, and many of the stack&#8217;s principles have been internalized across the industry.</p><p>Yet alongside its success came its (many) pitfalls. Here are just some:</p><ul><li><p><strong>Complexity:</strong> The "six-vendor problem" emerged, where organizations needed to integrate half a dozen tools from separate vendors. (More in a bit on my bet that the pendulum is starting to swing back.)</p></li><li><p><strong>Cost:</strong> At the beginning it was a bit cool to brag about your fancy tool stack, but now people know that each tool came with a fat price tag as well.</p></li><li><p><strong>Operational Overhead:</strong> Managing multiple contracts, SLAs, and integrations introduced friction that undercut the promised efficiency of the MDS.</p></li></ul><h3><strong>The Buzzword Problem</strong></h3><p>Tristan Handy and Benn Stancil, in their <a href="https://roundup.getdbt.com/p/ep-56-the-end-of-the-modern-data">dbt Roundup episode on the MDS</a>, argue that the term "Modern Data Stack" was always more of a <strong>marketing construct</strong> than a clearly defined standard. 
It promised best-in-class modularity but often left organizations struggling to piece together a fragmented ecosystem.</p><p>Stancil succinctly summarized the problem: "The Modern Data Stack wasn&#8217;t a product&#8212;it was a collection of vendors selling different pieces of the puzzle." And while modularity was a strength, it became a liability when too many tools needed to be orchestrated together.</p><h3><strong>The Pendulum is Swinging: The Shift Towards Consolidation</strong></h3><p>A key development in the MDS&#8217;s evolution is <strong>platform consolidation.</strong> Organizations have begun moving away from assembling their stacks tool by tool, opting instead for single platforms that can manage multiple layers of the pipeline:</p><ul><li><p><strong>Snowflake</strong>: Once primarily a warehouse, Snowflake now offers tools for app-building and more integrated transformation capabilities.</p></li><li><p><strong>Databricks</strong>: Initially focused on machine learning, it has expanded into data warehousing with Databricks SQL, offering an all-in-one lakehouse solution.</p></li></ul><p>These shifts address the core weaknesses of the MDS&#8212;cost, complexity, and interoperability&#8212;by simplifying vendor relationships and unifying tooling.</p><h3><strong>The AI Effect</strong></h3><p>The rise of AI has also stolen the spotlight. Companies are now focused on LLMs, GenAI, and AI-native data architectures that promise insights far beyond traditional analytics. While the MDS once captured the zeitgeist of innovation, it has been overshadowed by this next wave of excitement.</p><p>In Handy&#8217;s view, "The conversation has shifted." The energy that once fuelled the MDS is now powering AI-driven architectures.</p><p><strong>Before we get too deep into AI, let&#8217;s take a look at some lessons learned from the MDS&#8230;</strong></p><ol><li><p><strong>"Modern" Is a Moving Target:</strong> The term modern dates pretty poorly. 
Tools that feel cutting-edge today will probably be legacy tomorrow.</p></li><li><p><strong>Integration Fatigue Is Real:</strong> Too many vendors and tools create operational friction, even if each tool excels individually.</p></li><li><p><strong>Adaptability Wins:</strong> Platforms that evolve to simplify workflows&#8212;offering consolidated, efficient, end-to-end solutions&#8212;tend to thrive in the long term.</p></li><li><p><strong>Culture Matters:</strong> No matter how modern your stack is, you still need a modern data culture.</p></li></ol><h3><strong>So, Is the Modern Data Stack Dead?</strong></h3><p>Not quite&#8212;but it&#8217;s definitely evolved. The Modern Data Stack isn&#8217;t a failure; it&#8217;s a stepping stone. It&#8217;s taught the industry valuable lessons about modularity, scalability, and the importance of flexibility. Today, many organizations still use MDS-inspired architectures, albeit with fewer tools and more reliance on integrated platforms.</p><p>The data landscape will continue to change, whether through the rise of AI-native stacks, further platform consolidation, or entirely new paradigms. But the spirit of the Modern Data Stack&#8212;solving hard problems with creative, scalable solutions&#8212;remains alive and well.</p><p>So, what happened to the MDS? It matured, and in doing so, it paved the way for whatever comes next. 
Whether that&#8217;s called the "Next-Gen Data Stack" or something else, one thing is clear: <strong>data engineering isn&#8217;t slowing down anytime soon.</strong></p>]]></content:encoded></item><item><title><![CDATA[Reimagining Data Engineering: From Reactive to Proactive Systems]]></title><description><![CDATA[How many of us feel like every time we work on a data and analytical project we are putting out a fire?]]></description><link>https://blog.artemisdata.io/p/reimagining-data-engineering-from</link><guid isPermaLink="false">https://blog.artemisdata.io/p/reimagining-data-engineering-from</guid><dc:creator><![CDATA[Kirsten]]></dc:creator><pubDate>Wed, 27 Nov 2024 17:15:49 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/45e85b66-7ce5-43b7-b4ab-85f2a2a09895_220x220.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>How many of us feel like every time we work on a data and analytical project we are putting out a fire?</strong></p><p>If you've ever found yourself drowning in urgent requests, constantly jumping from one crisis to another, you're experiencing the all-too-familiar world of reactive data engineering. But what if there's a better way?</p><h2>The Reactive Reality: Trapped in the Moment</h2><p>Let me paint a picture that might sound uncomfortably familiar. Your typical day as a data engineer looks like a never-ending game of whack-a-mole. Urgent requests flood your inbox. Dashboards break mysteriously. 
Stakeholders demand <em>immediate</em> fixes and you're responsive &#8211; maybe <em>too</em> responsive.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9KxU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8dc544d-ed1e-49b4-b91c-380683b1f3e3_1258x664.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9KxU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8dc544d-ed1e-49b4-b91c-380683b1f3e3_1258x664.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9KxU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8dc544d-ed1e-49b4-b91c-380683b1f3e3_1258x664.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9KxU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8dc544d-ed1e-49b4-b91c-380683b1f3e3_1258x664.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9KxU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8dc544d-ed1e-49b4-b91c-380683b1f3e3_1258x664.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9KxU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8dc544d-ed1e-49b4-b91c-380683b1f3e3_1258x664.jpeg" width="1258" height="664" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c8dc544d-ed1e-49b4-b91c-380683b1f3e3_1258x664.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:664,&quot;width&quot;:1258,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:107143,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9KxU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8dc544d-ed1e-49b4-b91c-380683b1f3e3_1258x664.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9KxU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8dc544d-ed1e-49b4-b91c-380683b1f3e3_1258x664.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9KxU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8dc544d-ed1e-49b4-b91c-380683b1f3e3_1258x664.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9KxU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8dc544d-ed1e-49b4-b91c-380683b1f3e3_1258x664.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A reactive data team is characterized by constant firefighting:</p><ul><li><p>Responding to problems after they've already occurred</p></li><li><p>Spending more time fixing than preventing</p></li><li><p>Feeling like you're always one step behind</p></li><li><p>Struggling to focus on strategic initiatives</p></li></ul><h3>Why Reactivity Becomes a Trap</h3><p>Why do we find ourselves perpetually reacting instead of proactively planning? The root causes are both systemic and cultural:</p><ul><li><p>Lack of comprehensive data literacy</p></li><li><p>Insufficient strategic planning</p></li><li><p>Organizational structures that prioritize immediate fixes over long-term solutions</p></li><li><p>A culture that celebrates heroic problem-solving rather than preventative thinking</p></li></ul><blockquote><p>The most insidious consequence? Your team is no longer seen as a strategic partner. 
Instead, <strong>you're viewed as a support function</strong> &#8211; always available, always fixing, but never truly driving innovation.</p></blockquote><h2>The Proactive Paradigm: A New Approach</h2><p>Proactive data management is about anticipation, not just reaction. It's the difference between treating symptoms and preventing the disease.</p><p><strong>A Concrete Example:</strong> Imagine two scenarios for a user experience tracking project:</p><p>Reactive Approach:</p><ul><li><p>Discover a usability issue after customer complaints</p></li><li><p>Spend weeks investigating the problem</p></li><li><p>Develop a retroactive fix</p></li><li><p>Apologize to affected users</p></li></ul><p>Proactive Approach:</p><ul><li><p>Continuously monitor user interaction patterns</p></li><li><p>Develop predictive models that identify potential friction points</p></li><li><p>Create preemptive improvements</p></li><li><p>Enhance user experience before problems arise</p></li></ul><p>The proactive method doesn't just solve problems &#8211; it prevents them from happening in the first place.</p><h2>Changing the Status Quo: A Roadmap to Proactivity</h2><p>Becoming a proactive data team requires a fundamental business shift:</p><ol><li><p>Become Outcome-Focused</p><ul><li><p>Align data initiatives directly with business objectives</p></li><li><p>Create measurable, forward-looking goals</p></li></ul></li><li><p>Embrace Data-Informed Decision Making</p><ul><li><p>Develop predictive analytics capabilities</p></li><li><p>Look beyond historical data to future possibilities</p></li></ul></li><li><p>Increase Data Literacy</p><ul><li><p>Educate stakeholders about data's strategic potential</p></li><li><p>Create clear, understandable data narratives</p></li></ul></li><li><p>Document and Share</p><ul><li><p>Maintain comprehensive project documentation</p></li><li><p>Create knowledge repositories that enable continuous learning</p></li></ul></li><li><p>Be a Strategic 
Collaborator</p><ul><li><p>Proactively suggest improvements</p></li><li><p>Position your team as innovation partners, not just service providers</p></li></ul></li></ol><h2>The Time Constraint Dilemma</h2><p><strong>But who has time for this?</strong></p><p>We get it: firefighting happens in the first place because teams are stretched thin. But this is where solutions like <a href="http://www.artemisdata.io">Artemis</a> become a game-changer. By automating maintenance and optimization tasks, Artemis gives data teams a critical gift: time.</p><h3>Artemis: Transforming Time into Innovation</h3><p>Imagine reclaiming 500 hours annually. That's what Artemis offers by:</p><ul><li><p>Automatically maintaining and optimizing dbt models</p></li><li><p>Streamlining documentation processes</p></li><li><p>Improving warehouse performance</p></li><li><p>Reducing manual technical inefficiencies</p></li></ul><p>The result? Your team can shift from surviving to thriving, focusing on innovation rather than endless troubleshooting.</p><h2>The Future of Data Engineering</h2><p>The most valuable data engineers won't be those who are fastest at putting out fires, but those who prevent fires from starting.</p><p>By embracing proactive strategies and leveraging intelligent automation, you're not just improving efficiency &#8211; you're reimagining the entire role of data engineering.</p><p><strong>Are you ready to step out of the emergency room and into the innovation lab?</strong></p><p>The future is proactive. And it starts with you.</p>]]></content:encoded></item><item><title><![CDATA[The Bloated dbt Repo]]></title><description><![CDATA[As we shifted from ETL to ELT, transformation workloads have become the main driver of data warehouse costs, and teams are more bloated than ever. 
How did it get this bad?]]></description><link>https://blog.artemisdata.io/p/the-bloated-dbt-repo</link><guid isPermaLink="false">https://blog.artemisdata.io/p/the-bloated-dbt-repo</guid><dc:creator><![CDATA[Josh Gray]]></dc:creator><pubDate>Tue, 26 Nov 2024 15:45:07 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!nw68!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376023a-f8d8-4ad7-8e7f-36d072300426_648x377.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote><p>Costs of transformation workloads do add up quickly. Data teams consistently tell us that dbt-orchestrated workloads drive the vast majority of their data warehouse spend.</p></blockquote><p>This is a quote from an <a href="https://www.linkedin.com/pulse/we-need-talk-dbt-tino-tereshko--zsn5c/">article</a> I wrote with <a href="https://www.linkedin.com/search/results/all/?fetchDeterministicClustersOnly=true&amp;heroEntityKey=urn:li:fsd_profile:ACoAAAEKEf0Bl1tkkC3dtOfhk8VJx7WuukqAL4M&amp;keywords=tino%20tereshko%20%F0%9F%87%BA%F0%9F%87%A6&amp;origin=RICH_QUERY_TYPEAHEAD_HISTORY&amp;position=0&amp;searchId=5884e173-4068-41c3-948e-758ab5fe311a&amp;sid=XiB&amp;spellCorrectionEnabled=true">Tino</a>. There are a lot of reasons why Snowflake, Fivetran, and dbt are worth billions, but the explosion of SaaS is a big one. The amount of data with different schemas and tools used within organizations forces teams to transform hundreds of datasets. As we shifted from ETL to ELT, transformation workloads have become the main driver of data warehouse costs and <a href="https://www.fivetran.com/blog/how-do-people-use-snowflake-and-redshift">Fivetran</a> has the data to back it up. So how does this happen?</p><h3>The Model Bloat</h3><p>When teams rolled out dbt core in their organizations, it was fantastic. They finally had a central place to write transformations, version control and more. 
As <a href="https://x.com/criccomini">Chris Riccomini</a> referenced in his <a href="https://substack.com/home/post/p-151623503?source=queue">interview</a>, a lot of large organizations have built a simple version of dbt internally. So when dbt Labs came knocking, it made sense to adopt it. However, slowly but surely, over the years, as teams added contributors, wrote more models, and used their data platform, the dbt model sprawl became unmanageable. It's a familiar story that we hear all the time. &#8220;We started with 30 dbt models, and now we have over 400.&#8221; For what? Sometimes, there are little mistakes, configurations set incorrectly, or dated logic that has yet to be updated. Sometimes, it&#8217;s because the original architect of the models left and the new one has a new way of building.</p><p>It's important to remember this is happening across the stack. This post from&nbsp;<a href="https://www.linkedin.com/in/janssenryan/">Ryan Janssen</a>,&nbsp;while funny, is quite accurate. 
The MDS was incredible at lowering the barrier to working with data; however, with that, we lost a lot of fundamentals, and teams were not disciplined in their approach.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.linkedin.com/posts/janssenryan_2010-tableau-makes-it-easy-for-anyone-to-activity-7253116472829345792-yJ-2?utm_source=share&amp;utm_medium=member_desktop" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nw68!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376023a-f8d8-4ad7-8e7f-36d072300426_648x377.png 424w, https://substackcdn.com/image/fetch/$s_!nw68!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376023a-f8d8-4ad7-8e7f-36d072300426_648x377.png 848w, https://substackcdn.com/image/fetch/$s_!nw68!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376023a-f8d8-4ad7-8e7f-36d072300426_648x377.png 1272w, https://substackcdn.com/image/fetch/$s_!nw68!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376023a-f8d8-4ad7-8e7f-36d072300426_648x377.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nw68!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376023a-f8d8-4ad7-8e7f-36d072300426_648x377.png" width="586" height="340.929012345679" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c376023a-f8d8-4ad7-8e7f-36d072300426_648x377.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:377,&quot;width&quot;:648,&quot;resizeWidth&quot;:586,&quot;bytes&quot;:58835,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.linkedin.com/posts/janssenryan_2010-tableau-makes-it-easy-for-anyone-to-activity-7253116472829345792-yJ-2?utm_source=share&amp;utm_medium=member_desktop&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nw68!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376023a-f8d8-4ad7-8e7f-36d072300426_648x377.png 424w, https://substackcdn.com/image/fetch/$s_!nw68!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376023a-f8d8-4ad7-8e7f-36d072300426_648x377.png 848w, https://substackcdn.com/image/fetch/$s_!nw68!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376023a-f8d8-4ad7-8e7f-36d072300426_648x377.png 1272w, https://substackcdn.com/image/fetch/$s_!nw68!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc376023a-f8d8-4ad7-8e7f-36d072300426_648x377.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h3>The Alert Crisis</h3><p>Alongside the bloat issue, <a href="https://www.linkedin.com/feed/update/urn:li:activity:7242921713083056128/">data observability tools constantly ping you</a> when things break down. The more junk in the system, the more likely it is to fail. Most of these alerts aren't helpful for two key reasons. First, they lack context&#8212;you only get the issue without information about why it occurred, how to fix it, or how it affects other parts of your stack. Second, as alerts pile up, your backlog grows, and stress mounts until you feel overwhelmed by unsolvable problems. Each alert disrupts your work, forcing you to spend 3 to 4 hours diagnosing and fixing issues instead of focusing on your primary tasks. 
While monitoring stack issues is important, it's time for tools that detect problems and automatically resolve them, allowing teams to focus on strategic initiatives rather than constant optimization and maintenance.</p><h3>Insights &amp; Resolution</h3><p>Engineers should focus on building solutions that help businesses grow and scale&#8212;not on optimizing tools like Snowflake, dbt, or Airflow. Yet data teams today fall into two categories: those spending over a quarter of their week on maintenance tasks with limited context, and those avoiding maintenance altogether. The latter group burns through its budget, accumulates technical debt, and moves at a snail&#8217;s pace.</p><p>Teams crave insight and visibility into what's going wrong in their data platforms. The exciting part is that uncovering these issues gives us the opportunity to automate the work in one experience. Insights lead to tasks, tasks are resolved automatically, and each resolution ends in a merged PR. These automated workflows aren't built on rigid rules&#8212;they're driven by customized insights from your environment! This means less time on maintenance and more time spent on work that makes an impact.</p><p>This is the world we are building. Come join us on the ride!</p><h3>About Artemis</h3><p><a href="http://www.artemisdata.io">Artemis</a> monitors your data stack, finds issues, and automatically resolves them. Our users approve over 120 insights, merge 60+ PRs, and save over 20 hours a week. There is no need to migrate; our platform integrates with your data stack within 15 minutes! 
</p>]]></content:encoded></item><item><title><![CDATA[Data Context: The Critical Currency of Modern Data Engineering]]></title><description><![CDATA[In 2006 when British mathematician Clive Humby declared that "data is the new oil", he illuminated a fundamental truth that many organizations are still grappling with: raw data, like crude oil, requires sophisticated (meaning costly) refinement to deliver real value.]]></description><link>https://blog.artemisdata.io/p/data-context-the-critical-currency</link><guid isPermaLink="false">https://blog.artemisdata.io/p/data-context-the-critical-currency</guid><dc:creator><![CDATA[Kirsten]]></dc:creator><pubDate>Mon, 25 Nov 2024 19:07:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!0IBm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4de2b1c-fca3-43e6-863b-73b513d82f8e_768x421.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0IBm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4de2b1c-fca3-43e6-863b-73b513d82f8e_768x421.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0IBm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4de2b1c-fca3-43e6-863b-73b513d82f8e_768x421.webp 424w, https://substackcdn.com/image/fetch/$s_!0IBm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4de2b1c-fca3-43e6-863b-73b513d82f8e_768x421.webp 848w, 
https://substackcdn.com/image/fetch/$s_!0IBm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4de2b1c-fca3-43e6-863b-73b513d82f8e_768x421.webp 1272w, https://substackcdn.com/image/fetch/$s_!0IBm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4de2b1c-fca3-43e6-863b-73b513d82f8e_768x421.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0IBm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4de2b1c-fca3-43e6-863b-73b513d82f8e_768x421.webp" width="768" height="421" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a4de2b1c-fca3-43e6-863b-73b513d82f8e_768x421.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:421,&quot;width&quot;:768,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:114912,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0IBm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4de2b1c-fca3-43e6-863b-73b513d82f8e_768x421.webp 424w, https://substackcdn.com/image/fetch/$s_!0IBm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4de2b1c-fca3-43e6-863b-73b513d82f8e_768x421.webp 848w, 
https://substackcdn.com/image/fetch/$s_!0IBm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4de2b1c-fca3-43e6-863b-73b513d82f8e_768x421.webp 1272w, https://substackcdn.com/image/fetch/$s_!0IBm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4de2b1c-fca3-43e6-863b-73b513d82f8e_768x421.webp 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>In 2006 when British mathematician Clive Humby declared that "data is the new oil", he illuminated a fundamental truth 
that many organizations are still grappling with: raw data, like crude oil, requires sophisticated (meaning costly) refinement to deliver real value. Fast forward to today: the global &#8220;datasphere&#8221; is projected to grow from 33 zettabytes in 2018 to 175 zettabytes by 2025, according to <a href="https://www.seagate.com/files/www-content/our-story/trends/files/idc-seagate-dataage-whitepaper.pdf">IDC</a>. Yet despite this explosive growth, <a href="https://www.notion.so/3bb69bfc115646178de183c6efb3c923?pvs=21">68% of data</a> goes unused in most organizations.</p><blockquote><p>One zettabyte is equal to a trillion gigabytes.</p></blockquote><p>This underutilization isn&#8217;t for lack of effort. Organizations have poured resources into building sprawling data lakes, warehouses and pipelines, but without the <strong>context</strong> to make sense of it all, most data remains an untapped resource, languishing in obscurity.</p><h2>The Challenge: Contextless Data in the Real World</h2><p>For data engineers, the sheer volume is less daunting than the complexity of working with <strong>contextless data</strong>. Imagine constructing a skyscraper without a blueprint&#8212;just piles of bricks and steel. That&#8217;s the daily reality of working with disorganized datasets lacking proper documentation, lineage, or structure.</p><p>This lack of context manifests in several ways:</p><h3><strong>1. Query Optimization Challenges</strong></h3><ul><li><p>Query performance suffers when relationships between tables are unclear.</p></li><li><p>Data engineers spend an average of <a href="https://www.montecarlodata.com/blog-data-quality-survey">40% of their time on non-engineering work</a> like deciphering poorly documented datasets.</p></li><li><p>Complex JOIN operations often fail or run inefficiently due to misunderstood table relationships.</p></li></ul><h3><strong>2. 
Schema Evolution Problems</strong></h3><ul><li><p>As business needs evolve, so do data schemas&#8212;but <a href="https://greatexpectations.io/static/abb8fc238738a68f75b0207b21131298/State_of_Data_Quality_Report-MQ.pdf">77% of data teams</a> report struggling with <strong>schema drift</strong>.</p></li><li><p>Resolving data incidents takes an average of <a href="https://boomi.com/content/ebook/esg-state-of-data-ops">40% longer</a> when documentation is sparse.</p></li><li><p>The result? Technical debt accumulates <a href="https://www.notion.so/Data-Context-The-Critical-Currency-of-Modern-Data-Engineering-14316c397eb9802bba13d2589213cd2e?pvs=21">faster</a>, bogging down engineering teams with avoidable problems.</p></li></ul><div><hr></div><h2>A Blueprint for Data Context</h2><p>Modern data engineering success hinges on robust <strong>contextualization frameworks</strong> that transform raw data into actionable insights. Think of this as building a detailed architectural plan for your skyscraper before laying the foundation.</p><h3><strong>1. Metadata Management Infrastructure</strong></h3><p>To create a sustainable framework, data teams need to connect the dots between raw data and its meaning:</p><p>Raw Data &#8594; Technical Metadata &#8594; Business Metadata &#8594; Semantic Layer</p><p>&#8595; &#8595; &#8595; &#8595;</p><p>Schema Definitions | Business Glossary | Knowledge Graph</p><h3><strong>2. 
Key Components of Context</strong></h3><ul><li><p><strong>Lineage Tracking</strong></p><p>Understand how data flows through your systems:</p><ul><li><p>Source-to-target mappings</p></li><li><p>Documentation of transformation logic</p></li><li><p>Impact analysis and version control</p></li></ul></li><li><p><strong>Business Process Integration</strong></p><p>Align data context with operational needs:</p><ul><li><p>APIs and service mappings</p></li><li><p>SLAs and quality thresholds</p></li><li><p>Clear ownership and accountability</p></li></ul></li></ul><h2>From Pain Points to Performance Gains</h2><p>Organizations that prioritize contextualization consistently outperform their peers. Let&#8217;s contrast the <strong>before</strong> and <strong>after</strong> of implementing robust data context management.</p><h3><strong>The Before:</strong></h3><ul><li><p>Poor data quality costs companies <a href="https://www.gartner.com/smarterwithgartner/how-to-improve-your-data-quality">$12.9 million annually</a>.</p></li><li><p>Data teams spend about <a href="https://greatexpectations.io/static/abb8fc238738a68f75b0207b21131298/State_of_Data_Quality_Report-MQ.pdf">35% of their time</a> firefighting quality issues instead of building innovative solutions.</p></li><li><p>Discovery processes are painfully slow, with teams taking <a href="https://www.dataversity.net/data-analytics-and-bi-trends-in-2023/">30% more time</a> to locate relevant datasets.</p></li></ul><h3><strong>The After:</strong></h3><ul><li><p>Implementing data quality frameworks reduces incident resolution times by <a href="https://greatexpectations.io/static/abb8fc238738a68f75b0207b21131298/State_of_Data_Quality_Report-MQ.pdf">26%</a>.</p></li><li><p>Data lineage documentation makes teams <a href="https://www.dataversity.net/data-analytics-and-bi-trends-in-2023/">23% more likely</a> to make data-driven decisions.</p></li><li><p>Automated data catalogs cut discovery time by <a 
href="https://www.collibra.com/us/en/resources/collibra-recognized-as-a-leader-in-the-forrester-wave-data-governance-solutions-q-3-2023">40%</a>.</p></li></ul><h2>The Road Ahead: Evolving Context</h2><p>The future of data engineering is context-first. Advancements like AI and real-time processing are pushing the boundaries of what&#8217;s possible:</p><h3><strong>1. AI-Assisted Context Generation</strong></h3><ul><li><p>Algorithms that uncover hidden relationships.</p></li><li><p>Natural language processing for automated documentation.</p></li><li><p>Predictive models to assess the impact of changes.</p></li></ul><h3><strong>2. Real-Time Context Updates</strong></h3><ul><li><p>Stream processing for evolving schemas.</p></li><li><p>Automated lineage updates to reflect changes instantly.</p></li></ul><h3><strong>3. Context-Aware Governance</strong></h3><ul><li><p>Proactive compliance monitoring.</p></li><li><p>Automated privacy controls and secure context sharing.</p></li></ul><h2>Conclusion: Context as a Competitive Advantage</h2><p>In today&#8217;s data-driven economy, <strong>context is everything</strong>. Organizations must treat context not as a nice-to-have but as a <strong>critical currency</strong> for modern data operations. It&#8217;s not the size of your data lake that matters&#8212;it&#8217;s how well you understand and utilize the data within.</p><p>The most successful teams:</p><p>&#10003; Automate metadata collection.</p><p>&#10003; Build clear mappings between technical and business contexts.</p><p>&#10003; Monitor and enhance context quality continuously.</p><p>In the words of a modern data architect:</p><p><em>"Context turns data from a liability into an asset."</em></p><p>By investing in context, you&#8217;re not just managing data&#8212;you&#8217;re unlocking its full potential.</p>]]></content:encoded></item><item><title><![CDATA[Is your Data Warehouse Bleeding Money? 
Signs You need Optimization]]></title><description><![CDATA[In today's data-driven world, cloud data warehouses are the backbone of modern analytics.]]></description><link>https://blog.artemisdata.io/p/is-your-data-warehouse-bleeding-money</link><guid isPermaLink="false">https://blog.artemisdata.io/p/is-your-data-warehouse-bleeding-money</guid><dc:creator><![CDATA[Kirsten]]></dc:creator><pubDate>Thu, 21 Nov 2024 18:00:44 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1057fc53-445c-4500-9f2c-d9e31f8ad033_1080x1080.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In today's data-driven world, cloud data warehouses are the backbone of modern analytics. But like any powerful tool, they come with a cost&#8212;one that's often higher than necessary. According to <a href="https://www.seagate.com/files/www-content/our-story/trends/files/dataage-idc-report-final.pdf">IDC</a>, global data creation will grow to 175 zettabytes by 2025, putting unprecedented pressure on data infrastructure and budgets.</p><h3><strong>The Hidden Cost Centres</strong></h3><p>Your data warehouse might be silently draining your budget in ways you haven't noticed. That "simple" dashboard query that runs for 20 minutes isn't just frustrating&#8212;it's expensive. Long-running queries consume unnecessary compute resources, and yes, even failed queries cost money.</p><p>Storage bloat is another silent budget killer. Duplicate test tables, outdated data that should have been archived months ago, and unnecessarily wide tables with unused columns&#8212;they're all taking up expensive storage space. Think of it like paying rent for a storage unit full of items you'll never use again.</p><p>Perhaps the most wasteful practice is inefficient resource allocation. Imagine running your air conditioning at full blast in an empty house&#8212;that's what an over-provisioned warehouse looks like. 
Multiple users running heavy transformations simultaneously without any resource prioritization only compounds the problem.</p><h3><strong>Why Manual Optimization Falls Short</strong></h3><p>Traditional approaches to optimization are like trying to manually balance your chequebook in the age of digital banking&#8212;it's time-consuming, error-prone, and ultimately inefficient. Data teams spend countless hours on manual query analysis and performance tuning, only to fall behind as query patterns change and data volumes grow.</p><p>According to <a href="https://www.notion.so/Is-your-Data-Warehouse-Bleeding-Money-Signs-You-need-Optimization-13f16c397eb9801199f0fbfcf9faa06d?pvs=21">Gartner</a>, poor data quality alone costs organizations an average of $12.9M annually. This figure becomes even more concerning when you consider that much of this cost comes from inefficient data processing and storage.</p><h3><strong>The Power of Automated Optimization</strong></h3><p>This is where automated optimization comes into play. Modern solutions can continuously monitor your warehouse's performance, automatically identifying and optimizing problematic queries, managing resources dynamically, and cleaning up unused assets. It's like having a dedicated team of database performance experts working 24/7, just without the overhead.</p><p>Think of automated optimization as your warehouse's financial advisor&#8212;constantly looking for ways to save money while improving performance. 
It can identify when queries are consuming excessive resources, automatically adjust warehouse sizing based on actual usage patterns, and alert you to cost anomalies before they become budget disasters.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kuNR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b74a83-3406-40f1-9bfd-d01d82cceff5_1326x631.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kuNR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b74a83-3406-40f1-9bfd-d01d82cceff5_1326x631.png 424w, https://substackcdn.com/image/fetch/$s_!kuNR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b74a83-3406-40f1-9bfd-d01d82cceff5_1326x631.png 848w, https://substackcdn.com/image/fetch/$s_!kuNR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b74a83-3406-40f1-9bfd-d01d82cceff5_1326x631.png 1272w, https://substackcdn.com/image/fetch/$s_!kuNR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b74a83-3406-40f1-9bfd-d01d82cceff5_1326x631.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kuNR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b74a83-3406-40f1-9bfd-d01d82cceff5_1326x631.png" width="1326" height="631" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/63b74a83-3406-40f1-9bfd-d01d82cceff5_1326x631.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:631,&quot;width&quot;:1326,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:208930,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kuNR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b74a83-3406-40f1-9bfd-d01d82cceff5_1326x631.png 424w, https://substackcdn.com/image/fetch/$s_!kuNR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b74a83-3406-40f1-9bfd-d01d82cceff5_1326x631.png 848w, https://substackcdn.com/image/fetch/$s_!kuNR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b74a83-3406-40f1-9bfd-d01d82cceff5_1326x631.png 1272w, https://substackcdn.com/image/fetch/$s_!kuNR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b74a83-3406-40f1-9bfd-d01d82cceff5_1326x631.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Beyond Cost Savings</strong></h3><p>While cost reduction is important, the benefits of optimization extend far beyond just saving money. When your warehouse runs efficiently, queries run faster, data teams become more productive, and your entire data infrastructure becomes more reliable. This means better insights, faster decision-making, and more time for innovation.</p><p>At <a href="http://www.artemisdata.io">Artemis</a>, we've seen firsthand how automated optimization transforms data operations. Our platform gives data teams back an estimated <strong>500 hours per year</strong>&#8212;time that can be invested in strategic initiatives rather than maintenance and troubleshooting.</p><h3><strong>Taking Action</strong></h3><p>Don't wait for your next shocking cloud bill to start thinking about optimization. Begin by understanding your current warehouse usage patterns and biggest cost drivers. 
Implement automated optimization tools to handle the heavy lifting of performance tuning and resource management. Most importantly, establish clear cost governance policies to prevent future waste.</p><p>Remember: Every dollar saved on warehouse operations is a dollar that can be invested in actual data innovation. In today's competitive landscape, can you afford not to optimize?</p>]]></content:encoded></item><item><title><![CDATA[Data Engineers Deserve better]]></title><description><![CDATA[The Unsung Heroes]]></description><link>https://blog.artemisdata.io/p/data-engineers-deserve-better</link><guid isPermaLink="false">https://blog.artemisdata.io/p/data-engineers-deserve-better</guid><dc:creator><![CDATA[Kirsten]]></dc:creator><pubDate>Tue, 19 Nov 2024 19:00:49 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a8389f6f-2b20-48fc-a92f-8a3d7c59ebb5_764x399.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3><strong>The Unsung Heroes</strong></h3><p><em>Imagine this</em>: You're a data engineer. It's 3 p.m. on a Friday, and your Slack pings with an ad hoc request from a business analyst for a highly specific, rarely used metric. This means diving into a dataset you've never touched before, scouring documentation (if it exists), and wrangling a pipeline to produce something coherent&#8212;all for a piece of information that will likely be used once. The response? A quick &#8220;thanks,&#8221; if you&#8217;re lucky.</p><p>Data engineers are the backbone of modern businesses, yet their work often goes unrecognized, and they are regularly eclipsed by their flashier cousin, the software engineer. Data engineers are the ones transforming raw data into actionable insights, ensuring systems run smoothly, and enabling teams to leverage the power of analytics. 
But where&#8217;s the love for these unsung heroes?</p><h3><strong>The Current State of Data Engineering</strong></h3><p>The data landscape in 2024 is, to put it lightly, a mess. With studies showing <a href="https://www.notion.so/Data-Engineers-Deserve-better-13f16c397eb98024ba1fcda4a0a55f29?pvs=21">97% of data engineers</a> reporting burnout and dissatisfaction with their roles, it's no surprise: they're overworked, often juggling multiple priorities, and rarely acknowledged for their technical abilities.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4XkO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d5927be-8f20-4884-918c-ad1d7f4fec89_1104x480.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4XkO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d5927be-8f20-4884-918c-ad1d7f4fec89_1104x480.png 424w, https://substackcdn.com/image/fetch/$s_!4XkO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d5927be-8f20-4884-918c-ad1d7f4fec89_1104x480.png 848w, https://substackcdn.com/image/fetch/$s_!4XkO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d5927be-8f20-4884-918c-ad1d7f4fec89_1104x480.png 1272w, https://substackcdn.com/image/fetch/$s_!4XkO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d5927be-8f20-4884-918c-ad1d7f4fec89_1104x480.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!4XkO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d5927be-8f20-4884-918c-ad1d7f4fec89_1104x480.png" width="1104" height="480" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0d5927be-8f20-4884-918c-ad1d7f4fec89_1104x480.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:480,&quot;width&quot;:1104,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:107374,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4XkO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d5927be-8f20-4884-918c-ad1d7f4fec89_1104x480.png 424w, https://substackcdn.com/image/fetch/$s_!4XkO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d5927be-8f20-4884-918c-ad1d7f4fec89_1104x480.png 848w, https://substackcdn.com/image/fetch/$s_!4XkO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d5927be-8f20-4884-918c-ad1d7f4fec89_1104x480.png 1272w, https://substackcdn.com/image/fetch/$s_!4XkO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d5927be-8f20-4884-918c-ad1d7f4fec89_1104x480.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft 
pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>At the same time, executive teams are clamouring for AI-powered solutions, often without fully understanding the resource investment required. With global data creation projected to reach <a href="https://www.seagate.com/files/www-content/our-story/trends/files/dataage-idc-report-final.pdf">175 zettabytes by 2025</a>, implementing AI systems isn't just about flashy models; it's about building robust pipelines, maintaining clean and accessible data, and ensuring everything scales seamlessly. Without proper resourcing, this demand further exacerbates the strain on data teams (if that&#8217;s even possible).</p><p><em>Example:</em> A C-suite decides they want real-time, AI-powered analytics for their customer dashboards. The expectation? Deliver it in two months. 
The reality? Data engineers scramble to retrofit legacy systems, integrate machine learning models with incomplete training data, and build fragile pipelines that barely hold under the strain&#8212;all while racing against the clock.</p><h3><strong>What This Means for New Projects</strong></h3><p>When data engineers are perpetually stuck in firefighting mode&#8212;fixing broken pipelines, addressing tech debt, and fielding endless requests&#8212;it leaves little room for innovation. New projects, which should be exciting opportunities to leverage cutting-edge tools and techniques, can instead feel like nothing more than a headache.</p><p>This constant grind isn&#8217;t sustainable. Without better systems and recognition, businesses risk burning out their most critical technical teams.</p><h3><strong>Building a Better Future for Data Teams</strong></h3><p>It&#8217;s not all doom and gloom, though; there&#8217;s hope. By addressing the root causes of these challenges, we can create a more sustainable, rewarding environment for data engineers. Here&#8217;s how I would do it:</p><ol><li><p><strong>Optimization</strong>: Evaluate and streamline your current data stack to reduce inefficiencies.</p></li><li><p><strong>Cleaning Up Stacks</strong>: Invest in modernizing outdated systems and addressing tech debt.</p></li><li><p><strong>Automation</strong>: Offload repetitive, manual tasks like pipeline maintenance and documentation generation.</p></li><li><p><strong>Empowering Teams</strong>: Give data engineers the bandwidth to focus on meaningful, innovative projects rather than just staying afloat.</p></li></ol><h3><strong>How Artemis Makes a Difference</strong></h3><p>At <a href="http://www.artemisdata.io">Artemis</a>, we believe data engineers deserve better. Our platform is designed to give data teams their time back&#8212;an estimated <strong>500 hours per year</strong>&#8212;by automating and optimizing their stack. 
Here&#8217;s how:</p><ul><li><p><strong>Automated dbt Model Optimization</strong>: We fine-tune and maintain your models to ensure peak performance.</p></li><li><p><strong>Streamlined Documentation</strong>: Our tools improve and manage documentation effortlessly, so your teams don&#8217;t have to.</p></li><li><p><strong>Warehouse Efficiency</strong>: By continuously monitoring and optimizing your warehouse, we ensure cost savings and enhanced performance.</p></li></ul><p>By reducing the grunt work and improving overall efficiency, Artemis empowers data teams to innovate, save money, and scale seamlessly&#8212;all without drowning in the day-to-day grind.</p><h3><strong>Conclusion</strong></h3><p>Data engineers are the backbone of every data-driven decision&#8212;quietly building, maintaining, and optimizing the systems that power insights.</p><p>It&#8217;s time to recognize not only their impact but the deep technical expertise they bring. With the right support and tools, data engineers can move beyond the firefighting of day-to-day maintenance and focus on what they do best: building scalable, reliable, and efficient data systems.</p>]]></content:encoded></item><item><title><![CDATA[In Data, Why is our Daily Driver an F1 Racecar?]]></title><description><![CDATA[Most data teams build a custom racecar for their analytics&#8212;not because they want to, but because that's just how the modern data stack works.]]></description><link>https://blog.artemisdata.io/p/why-is-our-daily-is-an-f1-racecar</link><guid isPermaLink="false">https://blog.artemisdata.io/p/why-is-our-daily-is-an-f1-racecar</guid><dc:creator><![CDATA[Josh Gray]]></dc:creator><pubDate>Tue, 19 Nov 2024 15:45:55 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!KBCr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2373df25-029c-4237-bdf0-da2d062a5277_736x372.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div 
class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KBCr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2373df25-029c-4237-bdf0-da2d062a5277_736x372.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KBCr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2373df25-029c-4237-bdf0-da2d062a5277_736x372.jpeg 424w, https://substackcdn.com/image/fetch/$s_!KBCr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2373df25-029c-4237-bdf0-da2d062a5277_736x372.jpeg 848w, https://substackcdn.com/image/fetch/$s_!KBCr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2373df25-029c-4237-bdf0-da2d062a5277_736x372.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!KBCr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2373df25-029c-4237-bdf0-da2d062a5277_736x372.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KBCr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2373df25-029c-4237-bdf0-da2d062a5277_736x372.jpeg" width="736" height="372" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2373df25-029c-4237-bdf0-da2d062a5277_736x372.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:372,&quot;width&quot;:736,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:47690,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KBCr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2373df25-029c-4237-bdf0-da2d062a5277_736x372.jpeg 424w, https://substackcdn.com/image/fetch/$s_!KBCr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2373df25-029c-4237-bdf0-da2d062a5277_736x372.jpeg 848w, https://substackcdn.com/image/fetch/$s_!KBCr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2373df25-029c-4237-bdf0-da2d062a5277_736x372.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!KBCr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2373df25-029c-4237-bdf0-da2d062a5277_736x372.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Most data teams build a custom racecar for their analytics&#8212;not because they want to, but because that's just how the modern data stack works. You pick Snowflake, BigQuery, or Databricks as your engine and build the rest of the car around it. Whether you're using Fivetran or Airbyte, dbt or Coalesce, you've got to do a ton of custom work to get this car running. But hey, this part of the job is a blast! Teams love getting the budget to design and build a data platform from scratch. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!K-lm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e1b3b0-96f1-4741-8e11-dcca51cf8665_2848x1408.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!K-lm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e1b3b0-96f1-4741-8e11-dcca51cf8665_2848x1408.png 424w, https://substackcdn.com/image/fetch/$s_!K-lm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e1b3b0-96f1-4741-8e11-dcca51cf8665_2848x1408.png 848w, https://substackcdn.com/image/fetch/$s_!K-lm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e1b3b0-96f1-4741-8e11-dcca51cf8665_2848x1408.png 1272w, https://substackcdn.com/image/fetch/$s_!K-lm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e1b3b0-96f1-4741-8e11-dcca51cf8665_2848x1408.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!K-lm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e1b3b0-96f1-4741-8e11-dcca51cf8665_2848x1408.png" width="1456" height="720" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d5e1b3b0-96f1-4741-8e11-dcca51cf8665_2848x1408.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:335877,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!K-lm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e1b3b0-96f1-4741-8e11-dcca51cf8665_2848x1408.png 424w, https://substackcdn.com/image/fetch/$s_!K-lm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e1b3b0-96f1-4741-8e11-dcca51cf8665_2848x1408.png 848w, https://substackcdn.com/image/fetch/$s_!K-lm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e1b3b0-96f1-4741-8e11-dcca51cf8665_2848x1408.png 1272w, https://substackcdn.com/image/fetch/$s_!K-lm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e1b3b0-96f1-4741-8e11-dcca51cf8665_2848x1408.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">How simple we think it will be</figcaption></figure></div><p>The issue is that as the company scales, the car gets more complex. You start adding more features to handle all the new requests and edge cases. Before you know it, you've got new dashboards, a reverse ETL tool thrown into the mix, and even a catalogue to tie it all together. You and your team get busy building foundational data models that the analysts then run with. Even though these models pull data from standard data sources such as QuickBooks, Stripe, ERPs, etc., the models are customized to your organization and often overfitted with business logic.</p><p>All these tools are beneficial and serve a purpose, but the complexity grows with them. To make matters worse, while your core team does most of the work, you occasionally let a product team member create a few dashboards and write some queries in the name of "self-service." 
As a result, context is added, or assumptions are made, that don't fit your team's standard model. This mess lives mainly in dbt and Airflow, where the core logic is stored, but it can spread into BI tools like Looker or upstream integration tools. This customization and maintenance is a huge drag on teams. A dbt Labs survey suggests teams spend 26% of their week fixing and maintaining their infrastructure. The point is, it gets complicated quickly! </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XChQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61dc15f-f1fa-4aba-bfb9-d55fec0a782b_478x480.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XChQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61dc15f-f1fa-4aba-bfb9-d55fec0a782b_478x480.gif 424w, https://substackcdn.com/image/fetch/$s_!XChQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61dc15f-f1fa-4aba-bfb9-d55fec0a782b_478x480.gif 848w, https://substackcdn.com/image/fetch/$s_!XChQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61dc15f-f1fa-4aba-bfb9-d55fec0a782b_478x480.gif 1272w, https://substackcdn.com/image/fetch/$s_!XChQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61dc15f-f1fa-4aba-bfb9-d55fec0a782b_478x480.gif 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!XChQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61dc15f-f1fa-4aba-bfb9-d55fec0a782b_478x480.gif" width="478" height="480" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d61dc15f-f1fa-4aba-bfb9-d55fec0a782b_478x480.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:480,&quot;width&quot;:478,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1411417,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XChQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61dc15f-f1fa-4aba-bfb9-d55fec0a782b_478x480.gif 424w, https://substackcdn.com/image/fetch/$s_!XChQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61dc15f-f1fa-4aba-bfb9-d55fec0a782b_478x480.gif 848w, https://substackcdn.com/image/fetch/$s_!XChQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61dc15f-f1fa-4aba-bfb9-d55fec0a782b_478x480.gif 1272w, https://substackcdn.com/image/fetch/$s_!XChQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd61dc15f-f1fa-4aba-bfb9-d55fec0a782b_478x480.gif 1456w" sizes="100vw"></picture></div></a></figure></div><p>Before you know it, the models start to break down, and the custom racecar you and the team built is in the shop and takes forever to fix. Why? It's for a multitude of reasons. A collection of people (some not at your company anymore) have written hundreds of models; you have ten workloads calculating similar things in slightly different ways hooked up to &#8216;core sources of truth.&#8217; A lot of times, dbt models are added without rules or assumptions. You start to see duplications, models not referencing the correct upstream table, and models running hourly where the data is only updated every 6 hours.</p><p>The outcome? 
It's been 2-5 years with the MDS, and your costs are way higher than you predicted; your pipelines are fragile and constantly breaking, and logic is siloed across many tools. This is a reality for data teams.</p><p>So how do you fix it? This is a question a lot of data teams are asking themselves. From the teams we have spoken to, there are a few outcomes.</p><ol><li><p>They do nothing and hope it&#8217;ll figure itself out one day. (Spoiler: it won&#8217;t.)</p></li><li><p>They commission a team of 1 or 2 engineers to focus on this project. Those engineers spend 6-9 months on it and are pulled off other projects.</p></li><li><p>They do the opposite and commission 1-2 engineers to focus on new work while the rest of the team is set on fixing their foundation. This is a costly outcome.</p></li><li><p>They use <a href="http://www.artemisdata.io">Artemis</a>.</p></li></ol><p>Each of the first three options is time-consuming, expensive, and carries a huge opportunity cost. When trying to solve these issues manually, you discover how hard it is to search for issues across the stack while holding the entire platform's design in one or two people's heads. That is a big reason why this work doesn&#8217;t get done.</p><p>Data platforms are held together with duct tape, which is the nature of the beast. Still, teams can invest in tooling that helps maintain those systems so they can focus on delivering value for the business instead of constantly repairing what&#8217;s broken.</p><h3>What is Artemis?</h3><p><a href="http://www.artemisdata.io">Artemis</a> is an end-to-end data platform mechanic. Our platform monitors your stack, finds issues within your warehouse, dbt models, and BI tools, and then auto-resolves them, so all you need to do is approve insights and merge PRs. The platform does all the research and highlights where you can cut bloat, improve performance, and reduce warehouse costs.</p><p>We tune your data platform to be a lean, fast, and smooth machine so your engine is humming, and you fly by your OKRs. 
The average team approves 120 insights, merges 60 PRs and saves 25 hours a week!</p><p>If you want to optimize costs or cut dbt bloat, reach out!</p>]]></content:encoded></item><item><title><![CDATA[Fragmentation in the Data Stack and cost structure with Chris Riccomini ]]></title><description><![CDATA[Chris Riccomini shares his thoughts on the fragmented data stack, dbt, and how to get data platform costs down.]]></description><link>https://blog.artemisdata.io/p/fragmentation-in-the-data-stack-and</link><guid isPermaLink="false">https://blog.artemisdata.io/p/fragmentation-in-the-data-stack-and</guid><dc:creator><![CDATA[Josh Gray]]></dc:creator><pubDate>Thu, 14 Nov 2024 15:55:50 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/7a1e1cc6-160d-45ff-9f8e-e69badeb5600_2930x1971.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div><hr></div><p>I recently had the chance to interview <a href="https://cnr.sh/">Chris Riccomini</a>. Chris currently manages Materialized View Capital. He spent over 15 years working on infrastructure at major tech companies such as PayPal, LinkedIn, and WePay. He was involved in open-source projects like&nbsp;<a href="https://samza.apache.org/">Apache Samza</a>&nbsp;and&nbsp;<a href="https://airflow.apache.org/">Apache Airflow</a>. </p><div><hr></div><p><em><strong>JG: How do you see the relationship between data engineers and analytics engineers (I am curious how your position has shifted since you tweeted in Nov 2022)?</strong></em></p><p>CR<strong>:</strong> Good question. I realize that these definitions are quite fluid.</p><p>To me, a data engineer is responsible for the E and L in the data pipeline&#8212;extracting and loading data. Analytics engineers are responsible for the T&#8212;transformation. A data engineer&#8217;s job is to build a data integration layer to get data into the system where it&#8217;s needed reliably. One of those systems is a data warehouse. 
An analytics engineer&#8217;s job is to transform the raw (loaded) data into something more usable for product managers, sales ops, business analysts, and others. Note here that I did not mention deriving value and insight from the data; that&#8217;s a role I would describe as a business analyst.</p><p>In practice, I find that most people actually do more than one of these roles in an organization. For example, someone with an &#8220;analytics engineer&#8221; title might end up doing business analyst stuff, too: creating reports, data visualization, building business review dashboards, fielding questions from the executive team, and so on. </p><p><em><strong>JG: We&#8217;ve spoken about how data warehouse spending became an issue at your last company. What happened, and how much work did it take to get it under control?</strong></em> </p><p>CR<strong>:</strong> The short answer is that we allowed the data warehouse to grow organically for several years with a focus on convenience instead of cost. At first, the data engineering team was just me. For many years, it was just two or three people. With the number of pipelines we had, we simply didn&#8217;t have time to optimize for cost.</p><p>As time went on, though, we grew&#8212;not just in team size but also in data and query size. We had more pipelines and more people running queries. Cost slowly crept up. It wasn&#8217;t overnight, but over a year or two. Then, ZIRP ended, and the CFO came knocking.</p><p>Fortunately for us, there was a lot of low hanging fruit. For example, we exposed views rather than tables to users. When a user queried a view, the view would de-duplicate rows to make sure that users only saw the latest version of each row in a table (we were using Debezium and CDC to load data, so a new table record was inserted for each update to a row). 
For frequently queried tables, it was cheaper to simply materialize those views rather than keep de-duplicating the entire table each time a user queried the data. Another more basic example was that we loaded all tables from each database into our data warehouse by default. Many of the tables were rarely used. We could remove such tables to save some money. There are many such examples, but you get the idea. </p><p><em><strong>JG: What are your thoughts on dbt? What is it missing? How can it be better?</strong></em> </p><p>CR: <a href="https://www.getdbt.com/">dbt</a> is a very useful tool. We built something internally at my previous job that looked similar, but was much more basic. Nearly every company I interacted with had some version of a Python script that could be used to auto-generate tables and views. And nearly every company had some kind of data workflow orchestrator. The use case for something like dbt was self-evident. Moreover, there was real value in standardizing such a tool. It benefited from network effects as developers contributed integrations and packages. It also, rather sneakily, snuck in SDLC best practices so analytics engineers could build better transformations.</p><p>As for how dbt could be improved, I think the folks at <a href="https://sqlmesh.com/">SQLMesh</a>, <a href="https://www.sdf.com/">sdf</a>, <a href="https://coalesce.io/">Coalesce</a>, and <a href="https://dagster.io/">Dagster</a> would be better positioned to answer that. What I can say is that&#8212;going back to your very first question&#8212;what I&#8217;ve never been able to understand is why we split the transform tool (dbt) out from the extract and load tools (workflow orchestrators, CDC, and such). These two things seem very much related (so much so, we keep them in a single acronym&#8212;ELT). I think Dagster was early in recognizing this. 
Another signal of this oddity is that we now have <a href="https://dlthub.com/">dlt</a> as well as dbt; they&#8217;re the same idea but one is for extract/load and the other is for transform. Unifying extract, load, and transform tools back together is something that needs to happen, though we&#8217;re in the infancy of this transition. </p><p><em><strong>JG: What challenges have you experienced or seen data teams experience when optimizing or maintaining their data platform?</strong></em> </p><p>CR: Oh gosh, it&#8217;s really endless. Data quality issues, schema compatibility, data loss, cost, managing Python runtime environments, keeping pipelines within latency SLAs, difficulty getting access to observability data, dealing with an explosion of tools (the so called MDS), and so on. And that&#8217;s just on the data pipeline side.</p><p>Straddling the production and non-production domains presents some unique challenges, too. Many data teams have to act as a support team, security team, and ops team as well. They end up having to help users fix queries, are on-call when the platforms break, and often need to gatekeep access, monitor for sensitive information, and so on. While production engineers often have a single (usually standard) process to manage such things, data teams often get left behind for various reasons. </p><p><em><strong>JG: In your opinion, why does dbt increase spending on data warehouses?</strong></em></p><p>CR: dbt decreases the friction of running transformations in the data warehouse. This is a good thing, provided these new transformations are providing business value. However, given the way cloud data warehouses bill, this necessarily means you&#8217;re spending more money. dbt has attempted to help with this by providing incremental models, and the community has tried to do some transformation outside of the cloud data warehouse (in DuckDB, for example). Still, the default is that things run in an expensive way. 
</p><p><em><strong>JG: What are your thoughts on increasing the focus on optimizing data warehouse costs, especially in relation to dbt workloads?</strong></em> </p><p>CR: I think the experience I discussed above is relevant to this question. We had to choose whether to optimize our data architecture for simplicity or cost when rolling out our data pipeline. There was no simple, cost-optimal way to build the pipeline at the time.</p><p>The ideal scenario is one in which we can build both simple <em>and</em> cost-effective data pipelines. We need to lower the cost of lowering the cost, so to speak. To do so, I think we need optimal-by-default tooling. Incremental ETL pipelines, integrated EL and T tools that can figure out where transformations should occur (i.e. not just in the data warehouse), cost observability tools, and so on. </p><p><em><strong>JG: How do you see tools like Artemis changing how data engineers work?</strong></em> </p><p>CR: I think Artemis has an opportunity to address the stuff I just mentioned: make it easy for me to build a simple, cost-optimal data pipeline. Leveraging AI to figure out how to clean up waste in dbt pipelines is a great first step. I expect a lot more from you guys, though. As I said, I can imagine query monitoring and optimization with AI, query rewriting to move transformations outside of the CDWH and into local (cheaper) engines like DuckDB, and a lot more. </p><p><em><strong>JG: How do you think tools like Artemis will impact how data teams perform maintenance on their data stack?</strong></em> </p><p>CR: I see two approaches to solving the fragmentation in the modern data stack (MDS) space. The first is to provide a fully integrated, opinionated solution for managing your data warehouse or data lakehouse. The second is to provide AI-based solutions that help manage the MDS fragmentation and reduce toil for data engineers. 
The nice thing about the latter approach is that it meets organizations where they are. That&#8217;s what I like about Artemis. You don&#8217;t need to move off your data warehouse, you don&#8217;t need to move off of dbt, and you don&#8217;t need to adopt some vertically integrated solution. I&#8217;m excited to see more tools like this pop up. We&#8217;ve dug ourselves into a pretty big hole with the MDS tech stack, and tools like Artemis will play a role in helping us wrangle all this stuff. </p><p><em><strong>JG: What advice would you give to organizations looking to implement cost optimization strategies for their data warehouses without sacrificing agility and innovation?</strong></em> </p><p>CR: Every organization has 7&#177;2 tables that are really business critical and 7&#177;2 tables that are really expensive. Most organizations know which tables are business critical, but it might not be obvious which tables (and query patterns) are most expensive. So job #1 is to set up some observability to figure out what is costing you the most.</p><p>Once you&#8217;ve got a clear picture of business critical and expensive tables, take the union of these two table sets and apply the 80-20 rule: figure out how to optimize each of them <em>enough</em> to reap 80% of the savings with 20% of the effort.</p><p>Next, you need to set up some monitoring to alert you when future tables and queries go astray. Once you&#8217;ve established this beachhead, go back and ask yourself what (if anything) needs to be done longer term to manage costs. Do you need to fundamentally re-architect your pipeline? Do you need to change certain tools or eliminate query patterns? You don&#8217;t need to be perfect with this stuff, so it&#8217;s quite possible that the playbook I just outlined will get you where you need to be. 
Don&#8217;t go overboard.</p>]]></content:encoded></item><item><title><![CDATA[From Ants to Algorithms: Biomimicry in AI]]></title><description><![CDATA[In 2010, researchers in Japan made headlines with an astonishing demonstration: they used a species of slime mold to design an optimal railway network for Tokyo.]]></description><link>https://blog.artemisdata.io/p/from-ants-to-algorithms-biomimicry</link><guid isPermaLink="false">https://blog.artemisdata.io/p/from-ants-to-algorithms-biomimicry</guid><dc:creator><![CDATA[Kirsten]]></dc:creator><pubDate>Tue, 12 Nov 2024 00:02:01 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/59510415-f4e4-43e8-a192-923273e87115_634x384.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XxkG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c228891-b33e-4661-bfef-478d6cd2f4aa_860x394.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!XxkG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c228891-b33e-4661-bfef-478d6cd2f4aa_860x394.jpeg 424w, https://substackcdn.com/image/fetch/$s_!XxkG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c228891-b33e-4661-bfef-478d6cd2f4aa_860x394.jpeg 848w, https://substackcdn.com/image/fetch/$s_!XxkG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c228891-b33e-4661-bfef-478d6cd2f4aa_860x394.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!XxkG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c228891-b33e-4661-bfef-478d6cd2f4aa_860x394.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XxkG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c228891-b33e-4661-bfef-478d6cd2f4aa_860x394.jpeg" width="860" height="394" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8c228891-b33e-4661-bfef-478d6cd2f4aa_860x394.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:394,&quot;width&quot;:860,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:77944,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!XxkG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c228891-b33e-4661-bfef-478d6cd2f4aa_860x394.jpeg 424w, https://substackcdn.com/image/fetch/$s_!XxkG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c228891-b33e-4661-bfef-478d6cd2f4aa_860x394.jpeg 848w, https://substackcdn.com/image/fetch/$s_!XxkG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c228891-b33e-4661-bfef-478d6cd2f4aa_860x394.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!XxkG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c228891-b33e-4661-bfef-478d6cd2f4aa_860x394.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>In 2010, researchers in Japan made headlines with an astonishing demonstration: they used a species of <a href="https://www.discovermagazine.com/planet-earth/what-is-slime-mold">slime mold</a> to design an optimal railway network for Tokyo. The slime mold replicated the city's complex rail system with remarkable efficiency, using its natural ability to find the shortest paths between nutrient sources. The Tokyo train system carries nearly 40 million riders a day, so even a small loss in route efficiency adds up to millions of additional load hours on the system each year.</p><p>This example of biomimicry &#8211; where solutions from nature inspire human innovation &#8211; highlights how studying biological systems can lead to groundbreaking advancements. Now, imagine applying these principles to the realm of data science and artificial intelligence (AI). Enter the world of swarm intelligence.</p><p><strong>Understanding Biomimicry</strong></p><p>Biomimicry is the practice of drawing inspiration from nature to solve human problems. Famous examples include Velcro, inspired by burrs that stick to animal fur, and the design of wind turbine blades, which are modelled after whale fins. In the tech world, biomimicry is revolutionizing how we approach data processing and artificial intelligence.</p><p><strong>The Concept of Swarms</strong></p><p>Swarm intelligence refers to the collective behaviour of decentralized, self-organized systems, particularly natural ones like ant colonies, bird flocks, and bee hives. 
Each individual, following simple rules and local interactions, contributes to the complex, intelligent behaviour of the group.</p><p>Imagine the way a school of fish moves in harmony to distract and confuse predators.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Eo8H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dbde7d2-d44c-44e4-b09b-996f98f7a7ea_634x384.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Eo8H!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dbde7d2-d44c-44e4-b09b-996f98f7a7ea_634x384.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Eo8H!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dbde7d2-d44c-44e4-b09b-996f98f7a7ea_634x384.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Eo8H!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dbde7d2-d44c-44e4-b09b-996f98f7a7ea_634x384.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Eo8H!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dbde7d2-d44c-44e4-b09b-996f98f7a7ea_634x384.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Eo8H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dbde7d2-d44c-44e4-b09b-996f98f7a7ea_634x384.jpeg" width="634" height="384" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6dbde7d2-d44c-44e4-b09b-996f98f7a7ea_634x384.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:384,&quot;width&quot;:634,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:92785,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Eo8H!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dbde7d2-d44c-44e4-b09b-996f98f7a7ea_634x384.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Eo8H!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dbde7d2-d44c-44e4-b09b-996f98f7a7ea_634x384.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Eo8H!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dbde7d2-d44c-44e4-b09b-996f98f7a7ea_634x384.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Eo8H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dbde7d2-d44c-44e4-b09b-996f98f7a7ea_634x384.jpeg 1456w" sizes="100vw"></picture></div></a></figure></div><p>Key characteristics of swarm intelligence include:</p><ol><li><p><strong>Decentralization</strong>: No single leader; individuals operate based on local information.</p></li><li><p><strong>Self-Organization</strong>: Order emerges from simple interactions among individuals.</p></li><li><p><strong>Robustness</strong>: The system can adapt to changes and recover from disruptions.</p></li></ol><p><strong>Biomimicry in Data Science</strong></p><p>In data science, biomimicry has led to innovative algorithms that solve complex problems efficiently. 
For example, ant colony optimization algorithms simulate the pheromone trails of ants to find the shortest paths in network routing and logistics.</p><p><strong>Other Examples</strong></p><ul><li><p><strong><a href="https://machinelearningmastery.com/a-gentle-introduction-to-particle-swarm-optimization/">Particle Swarm Optimization</a></strong><a href="https://machinelearningmastery.com/a-gentle-introduction-to-particle-swarm-optimization/">:</a> Inspired by bird flocking behaviour, used for optimization problems.</p></li><li><p><strong><a href="https://www.mathworks.com/help/gads/what-is-the-genetic-algorithm.html">Genetic Algorithms</a></strong><a href="https://www.mathworks.com/help/gads/what-is-the-genetic-algorithm.html">:</a> Mimic natural selection to solve optimization and search problems. (See below)</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!42MH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe3b906-bc6c-4a46-84b8-6c878eacfc27_357x237.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!42MH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe3b906-bc6c-4a46-84b8-6c878eacfc27_357x237.png 424w, https://substackcdn.com/image/fetch/$s_!42MH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe3b906-bc6c-4a46-84b8-6c878eacfc27_357x237.png 848w, https://substackcdn.com/image/fetch/$s_!42MH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe3b906-bc6c-4a46-84b8-6c878eacfc27_357x237.png 1272w, 
https://substackcdn.com/image/fetch/$s_!42MH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe3b906-bc6c-4a46-84b8-6c878eacfc27_357x237.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!42MH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe3b906-bc6c-4a46-84b8-6c878eacfc27_357x237.png" width="357" height="237" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cfe3b906-bc6c-4a46-84b8-6c878eacfc27_357x237.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:237,&quot;width&quot;:357,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5657,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!42MH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe3b906-bc6c-4a46-84b8-6c878eacfc27_357x237.png 424w, https://substackcdn.com/image/fetch/$s_!42MH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe3b906-bc6c-4a46-84b8-6c878eacfc27_357x237.png 848w, https://substackcdn.com/image/fetch/$s_!42MH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe3b906-bc6c-4a46-84b8-6c878eacfc27_357x237.png 1272w, 
https://substackcdn.com/image/fetch/$s_!42MH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfe3b906-bc6c-4a46-84b8-6c878eacfc27_357x237.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p></p><p><strong>The Power of Swarms and The Implications for Agents</strong></p><p>For AI agents to be effective, they need well-defined roles, much like bees in a hive. Each bee has a specific task, from foraging to caring for the queen, and these defined roles ensure the hive's success.</p><p>In nature, specialization allows individuals to excel, contributing to the overall efficiency and success of the group. This principle applies to AI agents as well. When agents are given very specific jobs with clear constraints, they perform these tasks more effectively than generalists. Specialization allows agents to:</p><ul><li><p><strong>Optimize Skills</strong>: Focus on mastering a particular task, leading to higher proficiency.</p></li><li><p><strong>Reduce Complexity</strong>: Simplify problem-solving by narrowing the scope of their responsibilities.</p></li><li><p><strong>Enhance Coordination</strong>: Work seamlessly with other specialized agents, each contributing their expertise to achieve a common goal.</p></li></ul><p><strong>Examples of Role-Specific AI Agents</strong></p><ul><li><p><strong>Foraging Agents</strong>: In logistics, specialized agents could be tasked with finding the best routes for delivery, much like foraging bees.</p></li><li><p><strong>Security Agents</strong>: In cybersecurity, dedicated agents can focus solely on detecting and mitigating specific types of threats.</p></li><li><p><strong>Data Processing Agents</strong>: In data science, agents could be designed to handle specific types of data processing tasks, such as cleaning, transforming, or analyzing data sets.</p></li></ul><p>In a bee hive, communication occurs through dances, pheromones, and direct 
interactions, guiding the collective behaviour. Similarly, AI agents can use environmental data, peer communication, and hierarchical directives to perform specialized roles. The efficiency and success of natural swarms hinge on the clear division of labor and specialized functions of each member, which can be mirrored in AI systems.</p><p>Adopting swarm principles leads to AI systems that are more efficient, adaptable, and resilient. By giving AI agents specific roles and leveraging their interactions, we can create intelligent systems capable of tackling complex challenges. This approach ensures that each agent contributes optimally to the system&#8217;s objectives, enhancing overall performance.</p><p><strong>Future of AI agents</strong></p><p>Swarm-based AI agents will revolutionize how tasks are managed in dynamic environments, enhancing collaboration and problem-solving. Without a doubt, specialized agents working together can address complex issues more effectively than a single, generalist agent.</p><p>Each agent&#8217;s expertise in a particular domain ensures that the collective system can handle a wide range of tasks with greater proficiency.</p><p><strong>Potential Challenges</strong></p><p>Implementing swarm intelligence in AI systems presents significant challenges, primarily centred around coordination and communication, complexity management, and robustness. As the number of agents increases, ensuring effective coordination and communication becomes more complex, introducing latency and requiring significant bandwidth. Achieving consensus among decentralized agents can be particularly difficult in dynamic or large-scale environments. Additionally, designing systems with multiple specialized agents necessitates careful planning to manage the complexity of their interactions and predict the emergent behaviour of the system.</p><p>Robustness and fault tolerance are also critical concerns. 
The system must handle individual agent failures without compromising overall performance, and prevent errors from propagating through the network.</p><p>Ensuring security and privacy in decentralized systems is also challenging, as they can be more vulnerable to attacks and make data privacy harder to guarantee. Adapting to rapidly changing environments and unexpected conditions requires sophisticated algorithms that can quickly reconfigure the swarm&#8217;s behavior, balancing the trade-offs between exploration and exploitation to optimize performance. Addressing these challenges is essential for harnessing the full potential of swarm intelligence in AI.</p><p><strong>To Wrap It Up</strong></p><p>Biomimicry offers powerful insights for the future of AI. By mimicking natural systems, we can develop AI agents that are efficient, resilient, and capable of complex problem-solving. At <a href="https://www.notion.so/641180a280d245658233242a350208c0?pvs=21">Artemis</a>, we are leveraging these methods to build AI agents specifically designed for data engineers. 
These agents, inspired by the collective behaviour of swarms, are crafted to create robust and optimal systems that enhance data processing and analysis.</p><p>Looking ahead, the principles of biomimicry and swarm intelligence will continue to shape the evolution of AI, driving innovation and unlocking new possibilities. At Artemis, we are committed to harnessing the power of nature&#8217;s code to revolutionize the field of data engineering.</p>]]></content:encoded></item><item><title><![CDATA[The Debt Trap: How Tech Debt Sabotages Innovation]]></title><description><![CDATA[In today&#8217;s fast-paced tech landscape, companies are always striving to be at the cutting edge of innovation.]]></description><link>https://blog.artemisdata.io/p/the-debt-trap-how-tech-debt-sabotages</link><guid isPermaLink="false">https://blog.artemisdata.io/p/the-debt-trap-how-tech-debt-sabotages</guid><dc:creator><![CDATA[Kirsten]]></dc:creator><pubDate>Mon, 11 Nov 2024 23:58:44 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/08b42983-f3bc-4645-b24c-1fb9c47c9a57_540x360.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In today&#8217;s fast-paced tech landscape, companies are always striving to be at the cutting edge of innovation. However, for many organizations, that goal is hampered by a silent enemy&#8212;technical debt. The pressure to release new features quickly and meet market demands often leads to shortcuts in code quality, system architecture, and software design. While these shortcuts may offer short-term wins, they can result in long-term losses, stalling innovation and crippling growth.</p><p>At Artemis, we&#8217;ve seen firsthand how technical debt impacts companies of all sizes. As a data quality company, our mission is to help businesses optimize and alleviate their tech debt, allowing them to innovate without being held back by the limitations of the past. 
But what exactly is technical debt, how does it accumulate, and what can companies do to mitigate it?</p><h3>What is Technical Debt?</h3><p>Technical debt refers to the implied cost of additional rework caused by choosing an easier, faster solution now instead of a better, more sustainable one that would take longer to implement. Just as with financial debt, technical debt accrues interest over time&#8212;eventually, it must be repaid, and the longer you wait, the more costly that repayment becomes. This can manifest in several ways, such as not maintaining adequate test coverage, failing to document new code, or implementing quick fixes that sacrifice long-term maintainability for short-term reliability. These shortcuts may produce code that works but is hard to read, making it difficult for future developers to troubleshoot or extend and increasing costs down the road as systems become harder to manage and optimize.</p><blockquote><p>&#8220;<strong>With borrowed money, you can do something sooner than you might otherwise, but then until you pay back that money you'll be paying interest.&#8221; - Ward Cunningham</strong></p></blockquote><p>At its core, technical debt is the result of short-term thinking. When developers make trade-offs to meet immediate needs, they often leave behind incomplete or suboptimal code, patches, or infrastructure. These quick fixes pile up and can hinder future development efforts by increasing complexity, reducing code quality, and making systems harder to maintain or extend.</p><h3>The Hidden Costs of Technical Debt</h3><p>While the metaphor of debt captures the essence of the problem, the impact of technical debt is more than just a matter of technical inefficiency&#8212;it&#8217;s an anchor that weighs down innovation. As technical debt accumulates, development slows, bugs become more frequent, and teams find themselves spending more time fixing old problems than building new features. 
This creates a vicious cycle where innovation grinds to a halt, and competitive advantage dwindles.</p><p>Here are a few specific ways that tech debt can sabotage innovation:</p><ol><li><p><strong>Slower Development Cycles</strong>: Technical debt creates friction in the development process. As codebases become more complex, it takes longer for developers to implement new features or changes. Every new line of code is an opportunity to introduce new bugs, and older code is prone to breaking under new conditions.</p></li><li><p><strong>Increased Maintenance Costs</strong>: Maintenance of poorly written or hastily implemented systems requires more resources&#8212;both in terms of time and money. More time spent fixing problems means less time for developing innovative features. As the backlog grows, the weight of unresolved issues can feel like an insurmountable barrier.</p></li><li><p><strong>Lower Morale</strong>: No one enjoys working with buggy, unmanageable systems. Tech debt can drain team morale, causing frustration and burnout. When engineers are consistently forced to work around broken code and inefficient systems, it&#8217;s hard to stay motivated and focused on innovation.</p></li><li><p><strong>Stagnation of Ideas</strong>: When all available resources are used to manage technical debt, there&#8217;s little bandwidth left for creativity or experimentation. Innovation often requires risk-taking and flexibility, but technical debt restricts both, leading to stagnation.</p></li><li><p><strong>Security Vulnerabilities</strong>: Technical debt can also introduce security risks. 
Outdated and unsupported code exposes the organization to vulnerabilities, making it harder to protect against threats and increasing the risk of a security breach.</p></li></ol><h3>But Kirsten, we all understand best practices and know how to write good code, this would never happen to us&#8230;</h3><p>Nobody goes into a new venture or project with the plan to create tech debt. But like all things in life, sh*t happens, and debt starts to accumulate for a variety of reasons. It also tends to snowball.</p><p>Here&#8217;s some of what I&#8217;ve seen happen to the most well-meaning development teams. As they work under tight deadlines and pressure to deliver, they often find themselves balancing competing priorities, such as <strong>maintainability, scalability, and reliability</strong>&#8212;what can be thought of as the triangle of software development.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZBl_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9e536db-201e-4098-a683-6837aa3337cf_509x394.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZBl_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9e536db-201e-4098-a683-6837aa3337cf_509x394.png 424w, https://substackcdn.com/image/fetch/$s_!ZBl_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9e536db-201e-4098-a683-6837aa3337cf_509x394.png 848w, https://substackcdn.com/image/fetch/$s_!ZBl_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9e536db-201e-4098-a683-6837aa3337cf_509x394.png 
1272w, https://substackcdn.com/image/fetch/$s_!ZBl_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9e536db-201e-4098-a683-6837aa3337cf_509x394.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZBl_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9e536db-201e-4098-a683-6837aa3337cf_509x394.png" width="509" height="394" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b9e536db-201e-4098-a683-6837aa3337cf_509x394.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:394,&quot;width&quot;:509,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:58695,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZBl_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9e536db-201e-4098-a683-6837aa3337cf_509x394.png 424w, https://substackcdn.com/image/fetch/$s_!ZBl_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9e536db-201e-4098-a683-6837aa3337cf_509x394.png 848w, https://substackcdn.com/image/fetch/$s_!ZBl_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9e536db-201e-4098-a683-6837aa3337cf_509x394.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ZBl_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9e536db-201e-4098-a683-6837aa3337cf_509x394.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>Really what that means for the business case is:</p><ul><li><p><strong>Time-to-Market Pressures</strong>: In the race to get products or features to market quickly, corners are cut, and best practices are ignored.</p></li><li><p><strong>Inconsistent Processes</strong>: Lack of standardized coding practices across teams can lead to inconsistent codebases 
that are difficult to maintain.</p></li><li><p><strong>Legacy Systems</strong>: Older systems may not integrate well with newer technologies, creating a patchwork of temporary fixes that accumulate over time.</p></li><li><p><strong>Limited Resources</strong>: Teams often face constraints in terms of time, budget, and personnel, leading them to prioritize speed over long-term stability.</p></li></ul><h3>So let&#8217;s talk about the companies that handle it well</h3><p><strong>TLDR: Do: modernize early, build scalable systems, and stay agile to foster continuous innovation. Don&#8217;t: rely on outdated systems, this will kill you 10/10 times.</strong></p><p>Some companies have shown remarkable success in managing their tech debt, turning what could have been a long-term liability into an opportunity for growth and innovation.</p><p><strong>Netflix: A Tech Debt Success Story</strong></p><p>Netflix is a great example of a company that managed tech debt wisely. In its early days as a DVD rental service, Netflix made the strategic decision to embrace <a href="https://www.notion.so/From-Ants-to-Algorithms-Biomimicry-in-AI-aa31a3b9c312472f8923c151dd8cef90?pvs=21">microservices</a> and cloud-native architecture as it transitioned to streaming.</p><p>Today, Netflix continues to manage technical debt by giving developers the freedom to fix it as part of their everyday work, through initiatives like "Freedom &amp; Responsibility" (F&amp;R) time. This allows engineers to focus on refactoring code and improving systems without needing formal approval.</p><p>They also use a "paved road" approach, providing developers with easy-to-use, well-supported tools to avoid bad practices that could lead to debt. 
Additionally, Netflix's <a href="https://medium.com/tag/chaos-engineering">chaos engineering</a> identifies weak spots early, preventing future problems.</p><p>By breaking technical debt into categories like Code, Design, and Documentation, Netflix is able to prioritize and tackle it regularly, ensuring it doesn&#8217;t pile up and slow down innovation. If you are interested in learning more about Netflix&#8217;s technical debt tackling techniques I suggest checking out <a href="https://typeset.io/questions/how-does-netflix-manage-technical-debt-effectively-3hvbkfjprv">this article.</a></p><h3>How can my team fix this problem?</h3><p>Assuming limited resources and the inability to turn back time*</p><p>While the challenges posed by legacy systems and inefficiencies are significant, they are far from insurmountable. In an ideal world, companies would proactively balance fast delivery with long-term sustainability. But for many, accumulated system issues are already weighing them down.</p><p>This is where AI agents, like the ones we develop at <a href="https://www.notion.so/2ec8076fa7fb42df927aa0d7def1cdde?pvs=21">Artemis</a>, step in to help. Our AI agents assist teams in tackling existing inefficiencies by optimizing codebases, automating documentation, and improving system performance. They handle the routine tasks that often create bottlenecks, giving your engineering teams the space to focus on innovation and high-impact work.</p><p>This combination helps clear the way for data engineers to focus on innovation by automating documentation, optimizing data warehouses, and ensuring that as new features roll out, projects remain up to date and systems run smoothly. By providing both visibility and automated resolution, Artemis allows teams to streamline their operations, scale efficiently, and stay ahead without being bogged down by past inefficiencies. 
Ultimately, this empowers data teams of all sizes to focus on what truly matters&#8212;innovation&#8212;by removing operational obstacles and enabling them to continuously improve.</p>]]></content:encoded></item><item><title><![CDATA[Will AI Replace Data Engineers?]]></title><description><![CDATA[No, and here's why.]]></description><link>https://blog.artemisdata.io/p/will-ai-replace-data-engineers</link><guid isPermaLink="false">https://blog.artemisdata.io/p/will-ai-replace-data-engineers</guid><dc:creator><![CDATA[Kirsten]]></dc:creator><pubDate>Mon, 11 Nov 2024 23:54:25 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/02c03a7c-e859-4313-a69a-c6da78232329_1280x953.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>As we move deeper into this era of AI-driven automation, a critical question looms, and is often the first thing prospective data engineers ask us at <a href="https://www.notion.so/2ec8076fa7fb42df927aa0d7def1cdde?pvs=21">Artemis</a>: <strong>Will AI replace us?</strong></p><p>Even with automation becoming more and more accessible, the role of human data engineers remains irreplaceable. The question to ask isn&#8217;t whether AI will replace data engineers&#8212;it&#8217;s how AI can <strong>empower</strong> data engineers to work smarter and more efficiently.</p><h3><strong>AI as an Ally, Not a Replacement</strong></h3><p>We've already proven that AI can automate tasks like schema generation, query writing, warehouse optimization and even basic data analysis. These advances certainly change the day-to-day tasks of data engineers, but they don&#8217;t replace the core skills that engineers bring to the table. 
Instead, AI becomes an <strong>ally</strong>&#8212;handling repetitive or tedious tasks, freeing engineers to focus on higher-level problem-solving and innovation.</p><p>As noted in <a href="https://www.notion.so/aa31a3b9c312472f8923c151dd8cef90?pvs=21">previous</a> Block Bulletin blogs, AI thrives in predefined environments with clear parameters, but when systems grow in complexity, ambiguity, and scale, humans have to step in. AI can suggest solutions based on algorithms, but it takes a data engineer&#8217;s experience and understanding of business context to determine the right approach.</p><h3><strong>AI&#8217;s Role: Automation of Tedious Tasks, but Not Strategy</strong></h3><p>AI lacks the strategic mindset needed to design resilient, future-proof data architectures. It can&#8217;t sit in a meeting and understand business requirements, nor can it predict how data models need to evolve with a company&#8217;s growth.</p><p>This is where the human element shines. When it comes to understanding the bigger picture&#8212;how different data systems interact, how to optimize for performance over time, or when to prioritize certain datasets&#8212;human engineers are irreplaceable. They bring a level of creativity and adaptability that AI just can&#8217;t match.</p><h3><strong>Preserving the Human Element</strong></h3><p>While there&#8217;s no denying that data engineers provide a level of strategic thinking and problem-solving that AI can&#8217;t replicate, if we are being honest with ourselves, AI is closing the gap. Emerging technologies, such as AI-powered anomaly detection and real-time monitoring, are enabling more autonomous data systems. AI tools can already alert engineers to potential issues before they escalate, but as of now they still rely on humans to fix more complex problems and make high-level decisions.</p><p>Today&#8217;s leading companies are focusing on creating a balance between AI automation and human oversight. 
For example, our agent <strong><a href="https://www.loom.com/share/e4f62ce823f54a909e91b521245c90be?sid=5aa6211d-b8b3-4528-bf29-a691cd96f19e">Diana</a></strong> is designed to handle routine tasks like optimizing data warehouses, but data engineers remain at the helm, guiding overall strategy and ensuring data integrity.</p><h3><strong>The Human Edge: Problem-Solving and Creativity</strong></h3><p>It&#8217;s important to remember what GenAI actually stands for and means. You would think that <em>generative</em> and <em>artificial</em> make it pretty self-explanatory, but the whole <em>intelligence</em> thing still seems to be throwing people for a loop. AI may complete tasks that are traditionally reserved for human levels of intelligence, but that definitely doesn&#8217;t mean it actually possesses that intelligence. It may seem obvious, but to clarify, the term "artificial" indicates that the intelligence is simply simulated, not genuine. Humans are essential in interpreting nuanced requirements, making decisions in complex scenarios, and understanding the intricate details of the data ecosystem that AI hasn&#8217;t yet mastered. Problem-solving in real-time, handling unexpected issues, and making strategic adjustments based on business changes remain firmly within the domain of human engineers.</p><h3><strong>Balancing AI and Human Expertise</strong></h3><p>The best data engineering teams of today are those that combine the efficiency of AI with the strategic, unmatched problem-solving abilities of human engineers. In this new landscape, it&#8217;s not about AI vs. humans&#8212;it&#8217;s about collaboration. 
AI takes on the heavy lifting and grunt work, like automating routine tasks and running performance optimizations, allowing data engineers to focus on higher-value activities like designing systems, ensuring data governance, and driving business insights.</p><p>In the shifting landscape of data engineering, the question isn&#8217;t whether AI is better than humans or vice versa. The future holds a two-pronged approach, where AI handles the menial tasks of data management, freeing human engineers to focus on higher-order responsibilities like innovation, architecture, and solving complex problems.</p><h3><strong>A Collaborative Future</strong></h3><p>The rise of AI in data engineering is transformative, but it isn&#8217;t an existential threat to data engineers. Although extremist fear-mongering about AI taking over from humans exists on the fringe, the vast majority, myself included, believe AI is unlikely to replace data engineers but will instead augment their work, taking over tedious, repetitive tasks and allowing us to focus on what matters most.</p><p>So, will AI replace data engineers? <strong>No</strong>&#8212;and the future looks even brighter because of it.</p>]]></content:encoded></item><item><title><![CDATA[The Evolution of Data]]></title><description><![CDATA[How Modular Systems Unlocked Success & A Whole New Wave of Issues]]></description><link>https://blog.artemisdata.io/p/the-evolution-of-data</link><guid isPermaLink="false">https://blog.artemisdata.io/p/the-evolution-of-data</guid><dc:creator><![CDATA[Kirsten]]></dc:creator><pubDate>Mon, 11 Nov 2024 23:39:13 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0e2c6587-0349-44a4-8588-070fbc7377f7_1000x562.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When it comes to data engineering, we&#8217;ve all seen the growing complexity of the data stack&#8212;whether it's new data sources, expanding analytics requirements, or keeping up with new tools. 
The modern data stack brought modular systems, and they&#8217;ve been a game-changer for data teams, but not without their drawbacks.</p><h3>Quick Recap: What Are Modular Systems?</h3><p>The modern data stack turned end-to-end monoliths from Oracle and other providers into Lego blocks. Instead of being stuck with one fixed pipeline, the modular stack lets you break processes into smaller, manageable pieces (or modules). You can then mix and match these to create a workflow that works best for you and your data needs.</p><p>Over the past 10 years or so, tools like dbt have popularized this approach, especially when it comes to managing data transformations. Instead of being locked into a complex, monolithic pipeline with scrappy SQL, dbt&#8217;s modular techniques allow data engineers to work with more freedom, swapping in new models without breaking the whole thing. It&#8217;s like having a toolbox where you can just grab the right tool at the right time&#8212;no overhauling your entire system just because of a little change.</p><h3>The Initial Promise</h3><p>Modular data stacks <em>were</em> saving teams <strong>a ton of time</strong>. For example, in 2017, <a href="https://www.getdbt.com/case-studies/siemens">Siemens reported that using dbt reduced their dashboard maintenance time by 90%</a>. Picture this&#8212;your team is handling a new data source, and instead of ripping apart your entire pipeline to fit it in, you just pop in a new model. Done. This kind of agility means your team can react quickly to new business requirements, new data sources, or just&#8230; you know, change without sinking weeks into reworking the whole system.</p><p>According to <a href="https://www.gartner.com/en/articles/gartner-strategic-data-and-analytics-predictions-through-2028">Gartner</a>, and based on personal conversations with data professionals, this trend isn't slowing down. 
Gartner predicts that by 2025, over 75% of enterprise data engineering organizations will adopt these modern architectures. It's easy to understand why. The initial ROI is compelling, and the flexibility <em>seems</em> perfect for scaling operations. What could go wrong?</p><h3>The Dark Side: Costs and Complexity Creep Up</h3><p>But let&#8217;s not get too carried away with the "modularity is magic" idea of the mid-2010s&#8212;there are now obvious downsides that have become more prevalent as companies commit to modular tools.</p><p>On the surface, modularity seems perfect for growth&#8212;just keep adding more pieces, right? But as systems have expanded, managing those individual modules has become a nightmare. Each module might need its own versioning, monitoring, and optimization, and suddenly data teams are spending just as much time maintaining the system as they are building it. According to <a href="https://www.rtinsights.com/data-engineers-bad-data-two-days/">Wakefield</a>, data engineers spend up to 40% of their time on maintenance. It&#8217;s ironic, but modular systems actually <strong>introduced more technical debt.</strong></p><p>Secondly, modular systems <strong>can drive up costs fast</strong>. Each new piece or service you add to your stack often comes with its own set of dependencies, resource requirements, and sometimes licensing or operational costs. It's easy to fall into the trap of adding one more tool here or another module there until you realize your bill is way higher than it would&#8217;ve been if you&#8217;d stuck to a more streamlined, less modular setup.</p><p>When you have multiple services or modules interacting, dependencies grow, and things can break in ways you wouldn&#8217;t expect. 
You might also find that tracking down issues gets harder.</p><h3>What This Means</h3><p>In essence, while modularity offers flexibility and agility, <strong>it has also led to fragmentation,</strong> where different parts of the system are updated at different paces, causing friction. Managing the complexity of all these independent modules can be just as challenging as the limitations of a traditional, monolithic system&#8212;especially as your company grows.</p><h3>What We Do At Artemis</h3><p>Artemis gives data teams 500 hours back a year by maintaining and optimizing your stack for you. Our platform optimizes and maintains dbt models automatically, helping teams manage and improve documentation, warehouse performance, and overall system efficiency. By streamlining processes and reducing manual tasks, Artemis empowers data teams to focus on innovation, cut costs, and scale effectively without being bogged down by technical inefficiencies.</p>]]></content:encoded></item><item><title><![CDATA[How Data is Transforming Businesses in 2024]]></title><description><![CDATA[As consumer wallets and attention span shrink, data and AI have become increasingly important for standing out and winning business.]]></description><link>https://blog.artemisdata.io/p/how-data-is-transforming-businesses</link><guid isPermaLink="false">https://blog.artemisdata.io/p/how-data-is-transforming-businesses</guid><dc:creator><![CDATA[Kirsten]]></dc:creator><pubDate>Mon, 11 Nov 2024 23:36:12 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/3f5a1df8-7dcc-4256-ae60-924a4ef7ee63_1145x631.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>As consumer wallets and attention span shrink, data and AI have become increasingly important for standing out and winning business.</p><h3>1. Starbucks&#8217; Data-Driven Success: Boosting Revenue Through Analytics</h3><blockquote><p>According to <a 
href="https://www.linkedin.com/pulse/newvantage-partners-2019-big-data-ai-executive-survey-randy-bean/">NewVantage Partners</a>, 79% of executives report that their organizations' investments in data initiatives are accelerating, and their revenue is growing in tandem.</p></blockquote><p><strong>Case Study: Starbucks</strong> &#9749; Starbucks built its app and Starbucks Rewards program, then used the resulting data, including customer purchase history, app usage, and even Wi-Fi connection data, to understand customer preferences and behaviours.</p><p>This approach allows Starbucks to personalize offerings and create targeted marketing campaigns. The results have been remarkable: As of 2023, <strong><a href="https://s203.q4cdn.com/326826266/files/doc_financials/2023/q1/SBUX_Corrected_Transcript.pdf">mobile orders accounted for 27%</a></strong> <strong><a href="https://s203.q4cdn.com/326826266/files/doc_financials/2023/q1/SBUX_Corrected_Transcript.pdf">of all U.S. Starbucks transactions</a></strong>, and the Starbucks Rewards program has grown to 32.6 million active members.</p><p>Going one step further, the addition of <a href="https://www.geekwire.com/2016/starbucks-using-artificial-intelligence-connect-customers-boost-sales/">AI-powered personalization has led to a threefold increase in customer response rates to marketing offers</a>, showing the power personal data can have.</p><h3>2. Zara: Making Timely Decisions with Real-Time Data</h3><blockquote><p>According to a <a href="https://www.deloitte.com/middle-east/en/our-thinking/mepov-magazine/dynamic-evolution/demystifying-insights-though-analytics.html">Deloitte survey</a>, <strong>companies with strong data-driven decision-making are twice as likely to exceed business goals.</strong></p></blockquote><p><strong>Case Study: Zara</strong> &#128087; Zara's design-to-store time is the industry's best, at just three weeks, while others take an average of six months. They did this using in-store customer data. 
Zara equipped its store managers with handheld devices to report daily on customer reactions, popular items, and emerging trends. This data is then distributed to centralized design teams so they can make quick decisions about new products and adjustments to existing lines.</p><p>This rapid response to customer preferences has led to an impressive <a href="https://hbr.org/2004/11/rapid-fire-fulfillment">85% full-price sales rate</a> compared to the industry average of <a href="https://hbr.org/2004/11/rapid-fire-fulfillment">60-70%</a>.</p><h3>3. Netflix&#8217;s Predictive Insights Create The Movies of Tomorrow</h3><blockquote><p><strong>The predictive analytics market continues to rise, with its value expected to reach $28.1 billion by 2026</strong>, growing at a compound annual growth rate (CAGR) of 21.7%.</p></blockquote><p><strong>Case Study: Netflix</strong> &#128250; Netflix spends a billion dollars a year on cloud computing costs. They do this to perfect your recommended watch list and drive content creation. The streaming giant analyzes everything from viewing patterns and completion rates to pause and rewind behaviours.</p><p>Netflix understands what people like to watch and <em>how</em> they watch it. For its hit show Stranger Things, Netflix used viewing data that showed a significant overlap between viewers who enjoyed supernatural themes and those who responded positively to nostalgic 80s content. Stranger Things, a combination of both, was <a href="https://www.hollywoodreporter.com/tv/tv-news/stranger-things-2022-most-streamed-tv-show-1235310226/">2022&#8217;s most streamed show on Netflix, pulling in more than 52 billion minutes of views. 
It had a 36%</a> margin over the second most streamed show.</p><h3>4. Spotify Knows Us Personally</h3><blockquote><p>Personalization <strong><a href="https://www.mckinsey.com/capabilities/growth-marketing-and-sales/our-insights/the-value-of-getting-personalization-right-or-wrong-is-multiplying">increases average order value by 40% and customer satisfaction by 44%.</a></strong></p></blockquote><p><strong>Case Study: Spotify</strong> &#127911; Spotify's algorithms process billions of data points daily and have transformed music discovery through a personalization engine that analyzes listening habits and preferences. These insights help Spotify understand what music people like, when they like it, and how they prefer to listen to it, and they power its <em>Discover Weekly</em> feature.</p><p>The result? Users have discovered over 2.3 billion hours of music through these personalized features, and Discover Weekly users stream 24% more music than non-users. By thoughtfully applying data to enhance the listening experience, Spotify has improved user satisfaction and increased the number of daily active users. 95% of Discover Weekly users return for more personalized recommendations the following week.</p><h3>5. UPS is Finding Efficiencies Through Data</h3><blockquote><p>Data-driven operational improvements have led to significant cost savings across industries. McKinsey reports that <strong>data-driven organizations are <a href="https://www.mckinsey.com/capabilities/growth-marketing-and-sales/our-insights/five-facts-how-customer-analytics-boosts-corporate-performance">23 times more likely to acquire customers and six times more likely to retain them</a>.</strong></p></blockquote><p><strong>Case Study: UPS</strong> &#128230; UPS revolutionized delivery logistics by developing ORION (On-Road Integrated Optimization and Navigation), a sophisticated system that uses data to optimize delivery routes. 
The company installed telematics sensors in its delivery vehicles to collect real-time data on everything from route progress to vehicle performance.</p><p>ORION combines this data with information about package destinations, promised delivery times, and weather conditions to calculate the most efficient routes.</p><p>The results have been remarkable: <strong><a href="https://about.ups.com/ca/en/newsroom/press-releases/innovation-driven/ups-to-enhance-orion-with-continuous-delivery-route-optimization.html">UPS now saves 100 million miles driven annually, reduces fuel consumption by 10 million gallons, and cuts 100,000 metric tons of CO2 emissions each year</a>.</strong> The system has also generated significant cost savings, with an estimated $300-400 million cut from operational expenses annually.</p><h3>Embracing Data-Informed Decision Making</h3><p>While data is known to create a competitive edge, it can sometimes be hard to find concrete use cases and examples of massive results for organizations. Companies that have embedded data into their decision-making consistently outperform competitors by exceeding revenue targets, accelerating time to market, and improving customer retention. The best way to predict the future is to create it.</p>]]></content:encoded></item></channel></rss>