Skip to content

Table

  • types: permanent, transient, temporary, external
  • no failsafe period for transient and temporary
  • only permanent tables (enterprise+) have up to 90 days of time-travel, other tables have 0-1 days
  • temporary: exists only for the session, can have the same name as permanent tables but will shadow them
  • Only constrained enforced is NOT NULL
  • RI is not enforced, but may be used by the optimizer for join elimination
  • copying tables
  • create table t2 like t1 [copy grants]: copy only structure (and grants)
  • create table t2 as select * from t1 [copy grants]: copy structure and data (and grants)
  • Consists of micro partitions each max 16MB compressed => 50-500MB sized object storage
  • re-clustering is automatic and non-blocking
  • can be suspended/resumed using ALTER TABLE t1 SUSPEND/RESUME RECLUSTER
  • charges available in information_schema.automatic_clustering_history
  • Statistics maintained by cloud-services layer:
  • table level: Row count, table size (bytes)
  • micro-partition level: Counts: Row, NULLs, Distinct Values; Values: MIN, MAX (only for integer and date data types?)

Dynamic Table

  • Supported in incremental refresh:
  • WITH,
  • SELECT: expression must have columns or deterministic functions, immutable UDF
  • FROM: tables, views, dynamic tables, Iceberg tables
  • OVER: all window functions
  • JOIN: inner, outer-equi, cross, lateral flatten (non-static FLATTEN)
  • lateral joins are unsupported except lateral flatten

External Table

  • Can have Snowflake leverage partition pruning
SELECT metadata$filename FROM @s1/;
--  METADATA$FILENAME
--  ---------------------------------------
--  files/logs/2018/08/05/0524/log.parquet
--  files/logs/2018/08/27/1408/log.parquet

CREATE EXTERNAL TABLE et1(
  date_part      date   AS TO_DATE(SPLIT_PART(metadata$filename, '/', 3) || '/' ||
                                   SPLIT_PART(metadata$filename, '/', 4) || '/' ||
                                   SPLIT_PART(metadata$filename, '/', 5), 'YYYY/MM/DD'),
  timestamp     bigint  AS (value:timestamp::bigint),
  col2          varchar AS (value:col2::varchar)
) PARTITION BY (date_part) LOCATION=@s1/logs/
  • Allows row access policy
  • Workflow for creating:
  • create named stage: create storage integration using IAM role and S3 prefix, edit IAM policy trust relation using SI IAM user and external ID
  • create external table
  • verify using alter refresh,
  • configure s3 notification: two options depending on if there is already an existing notification configured for the S3 prefix:
    • no existing notification: configure S3 notification for ObjectCreate and ObjectRemoved events using Ext table's SQS ARN (notification_channel)
    • already existing notification:
    • Create SNS topic
    • subscribe existing SQS to the SNS topic
    • Grant Snowflake SQS permission to subscribe to the SNS topic using system$get_aws_sns_iam_policy('<sns_topic_arn>') to get
    • Create external table with SNS topic
  • alter refresh
  • Use ALTER EXTERNAL TABLE <tb> REMOVE FILES ('file1', ...) to remove old files
  • Use table function EXTERNAL_TABLE_FILES(TABLE_NAME => '<TB>') to list all files

Event Tables

  • One per account, used for logging

Hybrid Tables

  • Supports and enforces primary-key, foreign-key, unique constraints and indexes; must have a primary-key
  • Row-level locking
  • Bulk-loading: supported only if table is empty. Use,
  • initial bulk-load as a single transaction
  • CTAS (does not support foreign key)
  • quotas and throttles
  • 2TB per database
  • 8,000 operations per second per database
  • 100 databases that can contain hybrid tables
  • unsupported: failsafe, data sharing, streams, snowpipe, dynamic tables, materialized view, replication, QAS, SO
  • limited support:
feature support
time-travel only timestamp option
cloning only at database level
undrop table only by cloning database
  • Queries that run under 500ms don't appear in QUERY_HISTORY, but AGGREGATE_QUERY_HISTORY will be updated
  • hybrid tables are run on a separate compute cluster, but must be submitted through the standard WH
  • quotas:
  • storage ~ 2.5TB
  • compute: throttling is activated when the compute cluster is being dominated by a single customer
  • cost:
  • storage cost can be as high as 8 to 10 times the regular tables
  • VWH cost + "request" costs
  • supports: time-travel
  • does not support:
  • result cache
  • streams
  • replication (not yet)

BP

  • x-small warehouse is sufficient for most queries
  • for concurrency use MCW with standard scaling policy
  • use separate WH for hybrid tables, since initial warming up time can take up to 2-3 minutes
  • Supported by may offer sub-optimal performance:
  • stored procedures (prefer multi-statement transactions and autocommit)
  • serverless tasks
  • use aggregate_query_history, which aggregates repeated SQL statements every minute
  • several metric columns are of object type, and contain sum, avg, p99 etc values

Performance BP

  • loading into empty table is much faster
  • Single and multi-statement transaction offer better performance
  • order of key columns matters
  • query-profile: look for and prefer "row scan mode" v/s column (object storage)

Interactive Tables

  • must be added to a interactive warehouse
  • max of 10 tables can be added to an interactive warehouse
  • Interactive WH cache data locally