How to Write Faster Queries with Snowflake Features

Published on

30 Jun 26

Hidden within Snowflake’s extensive feature set are four features that can help make your queries more performant: Search Optimization Service, Query Acceleration Service, dynamic tables, and materialized views. All of these target queries that aren’t easy to optimize by just rewriting the SQL.

Search Optimization Service

Snowflake tables are organized into micro-partitions, each storing 50 to 500 MB of data. Snowflake automatically manages a table's micro-partitions as the table grows.

The Search Optimization service stores metadata about the values in your table's columns, allowing Snowflake to skip scanning micro-partitions that can't match a WHERE clause. This metadata, which Snowflake calls "search access paths," is similar to a column index, but provides data about the column values in micro-partitions rather than the values in an entire column. Very large tables can contain millions of micro-partitions, so being able to skip most of them can dramatically improve performance.

NOTE What kind of data structure is Snowflake using in search access paths? Snowflake doesn't document the implementation details, but according to their patent on the technique, it's a Bloom filter or a cuckoo filter—probably the latter, since cuckoo filters allow removing entries.

‍

The Search Optimization Service mainly accelerates two kinds of queries:

Point queries, or queries that specify a few records using an equality or inequality comparison on a string, numeric, or Boolean column. For example, opening a specific customer record is a point query. Looking up a customer's orders is also a point query when the number of matching records is relatively small. Finally, a date range specifying a single day can be a point query.
Searches involving other data types with specific predicates, such as SEARCH and SEARCH_IP on text columns, substring and regular expression searches on text using LIKE, ILIKE, RLIKE, etc., values in structured and semi-structured data types, and NULL checks.

TIP For the Search Optimization Service to work best, records that are likely to be returned by common queries should be in the same micro-partition, or in a small group of micro-partitions. You can use a clustering key to influence how Snowflake clusters your data into micro-partitions.

For a full explanation of situations where SOS can be enabled, see Search optimization service.

Search optimization is enabled at the table level for specific columns. Each type of query that can be accelerated requires its own search access path. The Search Optimization Service lets you choose which kinds of searches are optimized on each column.

Search access paths consume storage, and building and maintaining them consumes compute. Building a full search access path on a very large table could be massively parallelized, which can consume a lot of compute in a very short time. You can minimize both storage and compute consumption by carefully choosing the columns to accelerate and the specific kinds of queries you want to be faster.

To enable search optimization on a table, you use the ALTER TABLE … ADD SEARCH OPTIMIZATION command (for Apache Iceberg tables, use ALTER ICEBERG TABLE).

When you enable search optimization on a table and don't specify which columns to accelerate, the Search Optimization Service builds search paths for point queries on all eligible columns. To specify additional columns or to choose the types of queries that will be answered, use ALTER TABLE. … ADD SEARCH OPTIMIZATION ON .. and specify the type of predicate and the columns, such as SUBSTRING(column1, column2).

You can estimate the cost of the Search Optimization Service using a system function, SYSTEM$ESTIMATE_SEACH_OPTIMIZATION_COSTS. To see how much the Search Optimization Service is actually costing you, query the SNOWFLAKE.ACCOUNT_USAGE.SEARCH_OPTIMIZATION_USAGE view.

Query Acceleration Service

Snowflake's Query Acceleration Service (QAS) automatically assigns additional serverless compute resources to queries that it determines can benefit from further parallelization. QAS might be applied to any SELECT, INSERT, COPY INTO, CREATE TABLE AS SELECT (CTAS), MERGE, or UPDATE query that runs on a warehouse where the service is enabled. QAS can also accelerate dynamic table refresh and data export. According to Snowflake, it is not unusual for QAS to accelerate SELECT statements by ten times or more.

QAS specifically accelerates queries that touch many of a table's micro-partitions, as queries of multiple micro-partitions can be executed in parallel. Ad-hoc queries to answer one-time questions often qualify. Queries on unindexed columns (or where an index can't be used because a query term isn't sargable) are also obvious candidates. In general, any query that takes more than a couple of minutes to run might be a candidate for QAS acceleration. Use the QUERY_ACCELERATION_ELIGIBLE account usage view to identify eligible queries and to help gauge the potential impact of QAS.

New multi-cluster or Gen 2 warehouses are provisioned with QAS enabled. To enable QAS on other warehouses, use the following SQL:

ALTER WAREHOUSE <warehouse>

ENABLE_QUERY_ACCELERATION = TRUE;

In theory, QAS might not accelerate a query if not enough shared compute resources are available when they're needed. In practice, Snowflake makes makes plenty of resources available for QAS so they can charge you for the extra compute. It should be rare for QAS to not have the compute it needs.

QAS compute is billed as serverless compute based on hours used, rounded up to the nearest whole second. To see actual Query Acceleration Service usage, query the table SNOWFLAKE.ACCOUNT_USAGE.QUERY_ACCELERATION_HISTORY.

Materialized views and dynamic tables

Both materialized views and dynamic tables are similar to ordinary views in that they provide a way to predefine a different view on your data, but whereas views are simply a way to name queries on one or more tables, materialized views and dynamic tables actually store the results of the query you specify, and refresh their data from its source query from time to time. Because they contain real data, you can add indexes to columns in a materialized view or a dynamic table. You can't do that in an ordinary view.

NOTE Going forward, we're going to use MV/DT to mean "materialized view and/or dynamic table," because typing (and reading) that over and over is a drag.

A materialized view is updated whenever a row is inserted, updated, or deleted from an underlying data source. A dynamic table allows its data to lag behind the underlying data source, and is updated when that lag time is exceeded. Dynamic tables can also be updated manually.

As with ordinary views, MVs/DTs can have columns that are derived from other columns. The difference is that a view's derived columns might be calculated each time the view is queried (although a cache may be used when the underlying data sources haven't changed). A MV/DT stores the actual results of derived column calculations, so when querying a MV/DT, those columns don't need to be recalculated the way they would be in an ordinary view, or in a query that defines derived columns in the SELECT clause.

In this way, MVs/DTs provide a way to avoid performing expensive operations with every query, including not just derived columns but also costly JOINs. This characteristic is especially useful with the results of Cortex AI Functions, which use a large language model (LLM) to generate completions from your prompt. These functions can take a significant amount of time to execute, and they also incur per-token compute costs. Most AI Functions are non-deterministic, meaning you can get slightly different results from repeated calls, so storing the result instead of calculating it repeatedly can also provide a more stable result.

NOTE You might prefer to store derived data in the same table as columns it's calculated from, and calculate derived values when a row is added or updated. But this isn't always possible. You might not own the source table(s) or the code that inserts rows and updates them. Or the derived column might be calculated from columns in two or more tables.

When should you choose a materialized view versus a dynamic table?

Use a materialized view when it is important for the results to always be up-to-date. For example, if you're using a derived column to make a query term sargable, you generally want a materialized view.
Use a dynamic table when it is acceptable for the data to lag behind its source of truth. For example, for a report that covers only months previous to the current month, you can update the dynamic table once a month using a manual update triggered by a task. For other uses, the dynamic table's lag feature might be more appropriate.

Because they contain data, MVs/DTs increase storage consumption. To reduce storage requirements, store only the rows and columns you need along with a suitable key, and join the materialized view or query back to the source table(s). Creating and updating a MV/DT also can consume compute time, especially if their underlying SELECT clause is expensive.

Conclusion

Several Snowflake features can help make high-latency queries run faster. Search Optimization Service, Query Acceleration Service, materialized views, and dynamic tables are different ways to trade storage, compute, or freshness for performance, and the right choice depends on which part of the query is actually slow and how much you're willing to spend to address it. Know what you're optimizing, use the tool that best matches the problem, and be sure to actually measure the improvement to judge whether the cost is worth it.

Frequently Asked Questions

The Search Optimization Service stores metadata — called "search access paths" — about the values in a table's columns, enabling Snowflake to skip micro-partitions that can't match a WHERE clause. It primarily accelerates point queries (equality or inequality comparisons returning a small number of records) and predicate-heavy searches such as LIKE, RLIKE, SEARCH, and SEARCH_IP. Enabled per table with ALTER TABLE … ADD SEARCH OPTIMIZATION, it consumes both storage and compute to build and maintain those paths.

Enable the Query Acceleration Service on warehouses running queries that scan many micro-partitions — typically ad-hoc analytical queries taking more than a couple of minutes. QAS automatically assigns serverless compute to eligible queries including SELECT, CTAS, MERGE, and UPDATE statements. New multi-cluster and Gen 2 warehouses have it on by default; use ALTER WAREHOUSE … ENABLE_QUERY_ACCELERATION = TRUE for older warehouses. Billing is based on serverless compute hours rounded to the nearest second.

Materialized views store the precomputed results of a query — including derived columns and JOIN outputs — rather than recalculating them at query time. Unlike ordinary views, they hold real data that can be indexed, and they refresh automatically whenever the source data changes. This makes them especially useful for Cortex AI Function outputs, where recomputing results on every query is both slow and expensive in per-token compute costs, and where a stable, non-recalculated result is often preferable.

Both dynamic tables and materialized views store precomputed query results, but differ in freshness guarantees. A materialized view updates immediately whenever the underlying data changes, making it the right choice when results must always be current — for example, when making a query term sargable. A dynamic table allows its data to lag behind the source of truth by a configurable amount and can also be refreshed manually, making it suitable for workloads like monthly reports where some staleness is acceptable.

Jerry Kindall

Technical Content Writer

Share this post

Never miss an update

Subscribe to our newsletter. Get exclusive insights delivered straight to your inbox.