A Script to Find and Delete Unused Snowflake Tables without Enterprise Access History

February 21, 2025

If you're managing a Snowflake data warehouse, knowing which tables are gathering digital dust can save you significant storage costs and help maintain a cleaner data environment. While Snowflake's Enterprise Edition offers access tracking out of the box via the ACCESS_HISTORY view, there's a way to get essentially the same information using just the Standard Edition and your query history.
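
For reference, on Enterprise Edition the last-access time comes straight out of ACCESS_HISTORY - a minimal sketch, assuming the standard base_objects_accessed layout, looks like this:

-- Enterprise Edition only: last access time per table via ACCESS_HISTORY.
SELECT
  f.value:"objectName"::STRING AS table_name,
  MAX(query_start_time)        AS last_access_time
FROM snowflake.account_usage.access_history,
LATERAL FLATTEN(input => base_objects_accessed) f
WHERE f.value:"objectDomain"::STRING = 'Table'
GROUP BY 1
ORDER BY last_access_time;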

One Snowflake user we shared the script with was able to find and delete 150 TB of stale data - that's over $40,000/year in saved storage costs.
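
For context on that number: at Snowflake's capacity storage rate of roughly $23 per TB per month (rates vary by region and billing plan), 150 TB works out to about 150 × $23 × 12 ≈ $41,400 per year.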

The Code

To cut to the chase, here's a query that returns the last time each table was referenced in your query history, making it easy to spot tables that haven't been accessed in the last 3 months:

-- Tokenize every query run in the last 3 months into individual words
WITH tokens AS (
  SELECT
    start_time,
    f.value AS token
  FROM snowflake.account_usage.query_history,
  LATERAL FLATTEN(REGEXP_SUBSTR_ALL(UPPER(query_text), '\\b\\w+\\b')) f
  WHERE start_time >= DATEADD(month, -3, CURRENT_TIMESTAMP())
)
-- Match tokens against table names; tables with no match get a NULL last_query_time
SELECT
  m.table_catalog,
  m.table_schema,
  m.table_name,
  SUM(m.ACTIVE_BYTES + m.TIME_TRAVEL_BYTES) AS TOTAL_BYTES,
  MAX(t.start_time) AS last_query_time
FROM tokens t
RIGHT JOIN snowflake.account_usage.table_storage_metrics m
  ON t.token = UPPER(m.table_name)
WHERE m.table_name IS NOT NULL
  AND m.TABLE_DROPPED IS NULL
  AND m.ACTIVE_BYTES + m.TIME_TRAVEL_BYTES > 0
GROUP BY ALL
ORDER BY last_query_time DESC;
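
The query above only finds stale tables - actually deleting them is a separate step. One way to make that easier is to wrap the same query and emit DROP statements for tables that never showed up in query history. Here's a sketch (review the output carefully before running anything, and keep the limitations below in mind):

-- Sketch: emit DROP statements for tables that never appeared in the last
-- 3 months of query history. Review the output before executing anything.
WITH tokens AS (
  SELECT start_time, f.value AS token
  FROM snowflake.account_usage.query_history,
  LATERAL FLATTEN(REGEXP_SUBSTR_ALL(UPPER(query_text), '\\b\\w+\\b')) f
  WHERE start_time >= DATEADD(month, -3, CURRENT_TIMESTAMP())
),
stale AS (
  SELECT m.table_catalog, m.table_schema, m.table_name,
         MAX(t.start_time) AS last_query_time
  FROM tokens t
  RIGHT JOIN snowflake.account_usage.table_storage_metrics m
    ON t.token = UPPER(m.table_name)
  WHERE m.table_name IS NOT NULL
    AND m.TABLE_DROPPED IS NULL
    AND m.ACTIVE_BYTES + m.TIME_TRAVEL_BYTES > 0
  GROUP BY ALL
)
SELECT 'DROP TABLE "' || table_catalog || '"."' || table_schema || '"."' || table_name || '";' AS drop_statement
FROM stale
WHERE last_query_time IS NULL;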

How It Works

Let's break down this query step by step:

  1. Query History Tokenization
    • The tokens CTE extracts individual tokens from every SQL query run in the last 3 months
    • It uses REGEXP_SUBSTR_ALL to split each query's text into word tokens (see the example after this list)
    • LATERAL FLATTEN turns the resulting array of tokens into rows

  2. Storage Metrics Join
    • We RIGHT JOIN with table_storage_metrics to ensure we get ALL tables
    • Tables that don't appear in any query will have a NULL last_query_time
    • The join matches table names against the extracted tokens

  3. Results Analysis
    • TOTAL_BYTES shows the storage footprint (active + time travel bytes)
    • last_query_time shows when the table was last referenced in a query
    • Tables with a NULL last_query_time haven't been queried in the last 3 months
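
To see what step 1 actually produces, you can run the tokenizer on a sample query string (the query text below is just a made-up example):

-- Illustration: tokenize a sample query the same way the tokens CTE does.
-- Returns an array like ['SELECT', 'COUNT', '1', 'FROM', 'ORDERS', 'O', 'JOIN', 'CUSTOMERS', ...]
SELECT REGEXP_SUBSTR_ALL(
  UPPER('select count(1) from orders o join customers c on o.customer_id = c.id'),
  '\\b\\w+\\b'
) AS tokens;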

Limitations to Consider

This approach is effective, but it isn’t perfect. In particular, it has the following issues:

  1. False Positives: Tables mentioned in comments or unused CTEs might appear as "used"
  2. Indirect References: Tables accessed through views or stored procedures will not be detected
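
As an example of the first limitation, a query like the one below would make ORDERS_BACKUP look "used" even though it's only named in a comment (both table names here are made up):

-- ORDERS_BACKUP is never actually read, but the tokenizer still extracts it from the
-- comment, so it matches against table_storage_metrics and picks up a last_query_time.
SELECT COUNT(*) FROM orders;  -- TODO: reconcile counts with ORDERS_BACKUP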

Storage vs. Compute

Most Snowflake users spend around 90% of their bill on compute, not storage. If you're looking to cut Snowflake costs, Espresso AI can help you automatically bring down your compute bill by up to 70% - you can get a savings estimate here.

Alex Kouzemtchenko
Founder