
Table

  • types: permanent, transient, temporary, external
  • no failsafe period for transient and temporary
  • only permanent tables have up to 90 days of time-travel (Enterprise edition and above); all other table types have 0-1 days
  • temporary: exists only for the session, can have the same name as permanent tables but will shadow them
  • The only constraint enforced is NOT NULL
  • copying tables
    • create table t2 like t1 [copy grants]: copy only structure (and grants)
    • create table t2 as select * from t1 [copy grants]: copy structure and data (and grants)
  • Consists of micro-partitions, each holding 50-500MB of uncompressed data, stored compressed (max ~16MB) as objects in cloud storage
  • re-clustering is automatic and non-blocking
    • can be suspended/resumed using ALTER TABLE t1 SUSPEND/RESUME RECLUSTER
    • charges available in information_schema.automatic_clustering_history
  • Statistics maintained by cloud-services layer:
    • table level: Row count, table size (bytes)
    • micro-partition level: Counts: Row, NULLs, Distinct Values; Values: MIN, MAX (only for integer and date data types?)
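  • Sketch of the copy and re-clustering commands above (t1 is a hypothetical clustered table):
    -- copy only structure (and grants)
    CREATE TABLE t2 LIKE t1 COPY GRANTS;
    -- copy structure and data (and grants)
    CREATE TABLE t3 COPY GRANTS AS SELECT * FROM t1;
    -- pause/resume automatic re-clustering
    ALTER TABLE t1 SUSPEND RECLUSTER;
    ALTER TABLE t1 RESUME RECLUSTER;
    -- inspect re-clustering charges for the last 7 days
    SELECT * FROM TABLE(information_schema.automatic_clustering_history(
        date_range_start => DATEADD('day', -7, CURRENT_DATE),
        table_name       => 'T1'));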

Dynamic Table

  • Supported in incremental refresh:
    • WITH,
    • SELECT: expressions must use columns or deterministic functions, including immutable UDFs
    • FROM: tables, views, dynamic tables, Iceberg tables
    • OVER: all window functions
    • JOIN: inner, outer equi-joins, cross, LATERAL FLATTEN (non-static FLATTEN)
  • lateral joins other than LATERAL FLATTEN are unsupported
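  • Minimal sketch, assuming a base table src and a warehouse wh1 exist:
    CREATE DYNAMIC TABLE dt1
        TARGET_LAG = '5 minutes'
        WAREHOUSE  = wh1
        AS SELECT id, SUM(amount) AS total
           FROM src
           GROUP BY id;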

External Table

  • Define partition columns from metadata$filename so Snowflake can leverage partition pruning
    SELECT metadata$filename FROM @s1/;
    --  METADATA$FILENAME
    --  ---------------------------------------
    --  files/logs/2018/08/05/0524/log.parquet
    --  files/logs/2018/08/27/1408/log.parquet
    
    CREATE EXTERNAL TABLE et1(
        date_part      date   AS TO_DATE(SPLIT_PART(metadata$filename, '/', 3) || '/' ||
                                         SPLIT_PART(metadata$filename, '/', 4) || '/' ||
                                         SPLIT_PART(metadata$filename, '/', 5), 'YYYY/MM/DD'),
        timestamp     bigint  AS (value:timestamp::bigint),
        col2          varchar AS (value:col2::varchar)
    ) PARTITION BY (date_part)
      LOCATION = @s1/logs/
      FILE_FORMAT = (TYPE = PARQUET);  -- FILE_FORMAT is required
    
  • Allows row access policy
  • Workflow for creating:
    1. create named stage: create a storage integration using the IAM role and S3 prefix, then edit the IAM role's trust policy with the storage integration's IAM user and external ID
    2. create external table
    3. verify using ALTER EXTERNAL TABLE ... REFRESH
    4. configure s3 notification: two options depending on if there is already an existing notification configured for the S3 prefix:
      • no existing notification: configure an S3 notification for s3:ObjectCreated:* and s3:ObjectRemoved:* events using the external table's SQS ARN (the notification_channel column)
      • already existing notification:
        1. Create an SNS topic
        2. Subscribe the existing SQS queue to the SNS topic
        3. Grant Snowflake's SQS queue permission to subscribe to the SNS topic; use system$get_aws_sns_iam_policy('<sns_topic_arn>') to get the required policy statement
        4. Recreate the external table with AWS_SNS_TOPIC = '<sns_topic_arn>'
    5. alter refresh
  • Use ALTER EXTERNAL TABLE <tb> REMOVE FILES ('file1', ...) to remove old files
  • Use table function EXTERNAL_TABLE_FILES(TABLE_NAME => '<TB>') to list all files
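  • Sketch of refreshing and pruning the file metadata (et1 from the example above):
    ALTER EXTERNAL TABLE et1 REFRESH;
    -- list all files currently registered
    SELECT * FROM TABLE(information_schema.external_table_files(table_name => 'ET1'));
    -- drop a stale file from the metadata (path relative to the table's location)
    ALTER EXTERNAL TABLE et1 REMOVE FILES ('2018/08/05/0524/log.parquet');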

Event Tables

  • One active event table per account, used for logging and tracing
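  • Minimal sketch (my_db.my_sch.my_events is a hypothetical name):
    CREATE EVENT TABLE my_db.my_sch.my_events;
    -- make it the account's active event table
    ALTER ACCOUNT SET EVENT_TABLE = my_db.my_sch.my_events;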

Hybrid Tables

  • Supports and enforces primary-key, foreign-key, and unique constraints, plus indexes; a primary key is required
  • Row-level locking
  • Bulk loading: supported only if the table is empty. Use:
    • initial bulk-load as a single transaction
    • CTAS (does not support foreign key)
  • quotas and throttles
    • 2TB per database
    • 8,000 operations per second per database
    • 100 databases that can contain hybrid tables
  • unsupported: failsafe, data sharing, streams, snowpipe, dynamic tables, materialized views, replication, query acceleration service (QAS), search optimization (SO)
  • limited support:
    • time-travel: only the timestamp option
    • cloning: only at database level
    • undrop table: only by cloning the database
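  • Minimal sketch of a hybrid table with an enforced primary key, a foreign key, and a secondary index (customers is a hypothetical parent table):
    CREATE HYBRID TABLE orders (
        order_id   INT PRIMARY KEY,
        cust_id    INT NOT NULL,
        created_at TIMESTAMP,
        FOREIGN KEY (cust_id) REFERENCES customers (cust_id),
        INDEX idx_created_at (created_at)
    );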

BP

  • x-small warehouse is sufficient for most queries
  • for concurrency use MCW with standard scaling policy
  • use a separate warehouse for hybrid tables, since initial warm-up can take 2-3 minutes
  • Supported but may offer sub-optimal performance:
    • stored procedures (prefer multi-statement transactions and autocommit)
    • serverless tasks
  • use aggregate_query_history, which aggregates repeated SQL statements every minute
    • several metric columns are of OBJECT type and contain sum, avg, p99, etc. values
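  • Sketch of reading a p99 value out of an OBJECT-typed metric column (column names assumed from the ACCOUNT_USAGE view):
    SELECT interval_start_time,
           query_parameterized_hash,
           total_elapsed_time:p99::NUMBER AS p99_elapsed_ms
    FROM snowflake.account_usage.aggregate_query_history
    WHERE interval_start_time > DATEADD('hour', -1, CURRENT_TIMESTAMP);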

Hybrid Tables

  • Queries that run under 500ms don't appear in QUERY_HISTORY, but AGGREGATE_QUERY_HISTORY will be updated
  • queries on hybrid tables run on a separate compute cluster, but must be submitted through a standard warehouse
  • quotas:
    • storage ~ 2.5TB
    • compute: throttling is activated when the compute cluster is being dominated by a single customer
  • cost:
    • storage cost can be as high as 8 to 10 times that of regular tables
    • VWH cost + "request" costs
  • supports: time-travel
  • does not support:
    • result cache
    • streams
    • replication (not yet)

Performance BP

  • loading into empty table is much faster
  • single- and multi-statement transactions offer better performance
  • order of key columns matters
  • query-profile: look for and prefer "row scan mode" over column scan mode (object storage)

Interactive Tables

  • must be added to an interactive warehouse
  • max of 10 tables can be added to an interactive warehouse
  • Interactive warehouses cache data locally