Run DBT 2x faster
Big DBT jobs involve a lot of queries. Here's what a typical job might look like:
This representation of the queries in a job is called a DAG, or a directed acyclic graph. A DAG is a good way to represent dependencies between different queries. We can use it to understand how much work can be done in parallel, which in turn lets us speed up the job. In a typical DBT job most queries don't depend on each other, so we can run many queries in parallel to get big jobs done quickly.
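To make this concrete, here's a minimal sketch of that analysis: a job represented as a dependency map, grouped into "levels" of queries that can all run in parallel (Kahn's algorithm). The job shape and query names here are hypothetical, not taken from any real DBT project.

```python
from collections import defaultdict, deque

def parallel_levels(deps):
    """Group queries into levels where everything in a level can run
    in parallel (all of its dependencies finished in earlier levels).

    deps maps each query name to the set of queries it depends on."""
    # Count unmet dependencies and build reverse edges (Kahn's algorithm).
    indegree = {q: len(d) for q, d in deps.items()}
    dependents = defaultdict(list)
    for query, parents in deps.items():
        for parent in parents:
            dependents[parent].append(query)

    levels = []
    ready = deque(q for q, n in indegree.items() if n == 0)
    while ready:
        level = sorted(ready)  # everything here is runnable right now
        levels.append(level)
        ready = deque()
        for query in level:
            for child in dependents[query]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    return levels

# Hypothetical job: C depends on A and B; F depends on D and E.
deps = {"A": set(), "B": set(), "C": {"A", "B"},
        "D": set(), "E": set(), "F": {"D", "E"}}
print(parallel_levels(deps))  # [['A', 'B', 'D', 'E'], ['C', 'F']]
```

The first level is exactly the set of queries with no dependencies, which is what a scheduler can launch immediately.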
That said, if you're a DBT user you'll know that big jobs are not exactly fast. With so much parallelizable work, why do you still have to wait hours for things to run?
It’s not DBT's fault - it's Snowflake's. While DBT knows how your job is structured, Snowflake doesn't, which means that when you schedule your jobs on a standard Snowflake warehouse you get a fixed amount of parallelism. You can, of course, speed things up by using a larger warehouse, but you’ll pay for the privilege.
You might think this isn't a problem if you pay for Snowflake’s multi-cluster support, but that's not always the case. While multi-cluster warehouses provide some scheduling flexibility, they still use a one-size-fits-all algorithm that doesn't take the structure of your DBT job into account. It turns out that you can do much better by analyzing your job.
Let’s take this simple job as an example:
With this job DBT will analyze the DAG, notice that A, B, D, and E have no dependencies, and schedule them immediately.
When run on a standard warehouse, these four queries split the resources of a single cluster, and the whole job takes 12 minutes. If you instead split A, B, D, and E across two separate warehouses, all four finish in the first 4 minutes, reducing the total runtime of the job by 33%, to 8 minutes.
A multi-cluster warehouse likely wouldn’t help here. Because Snowflake’s algorithm is unaware of your overall DBT job, it may put all four queries on the same cluster, performing no better than a standard warehouse. Or it may put three queries on one cluster and the fourth on its own cluster. This is faster than using one cluster (the job would run in 10 minutes instead of 12), but it's also more expensive, because the second cluster is underutilized with just one query. Even worse, it could put all four queries on one cluster and then spin up a second cluster without using it at all, costing you money with no improvement in runtime.
(This example assumes that your queries parallelize well on the cluster size you're using. Figuring out when that's true, and what size cluster to use, are topics for a different post.)
We’ve found that extending this analysis to optimize throughput on real DBT DAGs lets us improve runtimes significantly on both standard and multi-cluster warehouses. Our customers are using this in production today to run DBT jobs up to 50% faster with no increase in costs.
Want to try it out? Contact us: dbt-accelerator@espresso.ai