Skip to content

SQL

  • IN same as = ANY. NOT IN same as NOT = ALL. NOT = ANY is true for all except NULL - NOT IN ('a', 'b', 'c', NULL) will never return any rows
  • cast to integer truncates and cast to decimal rounds (up to even and down to even)
  • truncation of display data by cast is not permitted in ANSI mode
  • NUMBER data type is stored as floating point with 100 as exponent base 1. Has variable length storage: 1 byte exponent and up to 17 bytes of mantissa 1. Value of 0 and NULL are stored as 0 length 1. if COMPRESS is specified, needs 1 byte as length, otherwise 2 bytes 1. can have floating values: fixed scale NUMBER(*,m) or both floating NUMBER(*)
  • ANSI SQL, auto rollback only on failures (deadlock, failed DDL, ROLLBACK)
  • ANSI style CAST CHAR implicitly trims leading spaces irrespective of ANSI or BTET mode
  • TD style CAST CHAR front pads with spaces for max digits possible + 1 for the sign
  • Syntax error is a failure in BTET model, but an error in ANSI mode
  • MSR cannot contain a DDL statement and if it does, then it’s considered failure.
  • Sampling: FROM … SAMPLE [WITH REPLACEMENT] [RANDOMIZED ALLOCATION] WHEN … THEN n[,n…] … END. n: fraction or count
    1. Stratified: multiple sampling group, each with different probability
    2. WITH REPLACEMENT: non mutually exclusive groups
    3. RANDOMIZED ALLOCATION: Sample across all AMPs instead of per AMP
    4. SAMPLEID function returns the ID for each of the multiple sample sets
  • Order of evaluation
    • WHERE > GROUP BY/HAVING > OLAP/QUALIFY > SAMPLE/TOP > SELECT/ORDER BY
  • TOP N with no ORDER BY is always faster or equivalent to QUALIFY with ROW_NUMBER()
  • SUBSTRING can accept 0 or -ve values for start pos, but no -ve values for length
  • SUBSTRING works on INT by first converting them to string that has place of signs
  • Specification for Data Formatting (SDF)
    1. G/D: number grouping/decimal character (,/. for USA)
    2. I/F: # of Integer/Fractional digits
    3. L/C/N-O/U/A: currency/iso/name-dual currency symbol. can appear either at beginning or end of the format string. LL/OO/--: floating symbols
    4. EEE/EEEE: short/long day
    5. HH/MI/SS/T/DS(n)/Z: hour/min/sec/(12 hour format with AM/PM)/fractional seconds/zone
  • UPDATE ELSE INSERT/MERGE must qualify PI and partitioning columns
  • A positioned UPDATE/DELETE on a table with triggers causes error
  • Time value is acquired at request level. All stmt in MSR will have same CURRENT_TIME
  • SELECT AND CONSUME:No WHERE clause, blocks on empty table, TOP 1 only, rollback only if explicit transaction rollbacks, no rollback if is part of MSR/Macro and fails
  • ROT: Frequent UPDATE and DELETE on queue table causes PEs to rebuild FIFO queues
  • SET tables only ignore duplicates in multi-row operations, give error on single row dups
  • An unusally long MERGE step indicates inserting into a SET table with large number of duplicates
  • IDENTITY col:
    1. if GENERATED ALWAYS, user specified values ignored with a warning
    2. if exceeds MAXVALUE and NO CYCLE, it’s failure
    3. Each PE has range assigned, so it’s possible to exceed MAXVALUE with few rows
    4. For bulk insert (load, insert-select?) each AMP reserves a block of IDs
    5. GENERATED BY DEFAULT doesn’t consider any previously explicit values.
  • Search path for unqualified functions: current DB, SYSLIB and then TD_SYSFNLIB
  • TD_SYSFNLIB database contains Domain Specific functions
  • TD14+: Can create PI and SI on most UDT except LOB,PERIOD,VARIANT,ARRAY
  • TD_ANYTYPE in C UDFs, is resolved at run-time. (COLTYPE = ‘++’)
  • CURRENT_USER/CURRENT_ROLE, unlike USER/ROLE return proxy user/role for trusted session if set
  • NULL values are ordered first unless explicitly NULL LAST clause in ORDER BY
  • REGEX_{SUBSTR,REPLACE,INSTR,SIMILAR,SPLIT_TO_TABLE}
  • Prefer macros over SP for tactical because stored procedures are evaluated on PE

Date and Time

  • Date format set at session, user or DBS Control level. ANSI is alway YYYY-MM-DD, integer date defaults to YY/MM/DD unless overridden by SDF. DATEFORM
  • TIMEZONE at 3 levels: dbscontrol, user (TIME ZONE = …) or session (SET TIME ZONE)
    1. unnormalized timestamps ordering ignore timezone
  • INTERVAL: broadly two categories YEAR/MONTH or DAY/TIME.
    1. Expression must evaluate to valid date e.g. DATE ‘2001-12-30’ + ‘2’ MONTHS will generate error. Use ADD_MONTH function to not generate an error
    2. ‘dt1’ - ‘dt2’ is number, but (‘dt1’ - ‘dt2’) DAY is INTERVAL
    3. default INTERVAL length is 2 digits, max 4 digits
    4. division of an interval truncates
    5. Add/Subtraction of different precision, do operation first and then truncate
  • range are pair of two time or date or time or date and an interval.
    1. exact same types can be checked for OVERLAPS condition.
    2. Adjacent ranges do not overlap (only one point is common)
    3. range with one point as NULL, is a single instant. Must fall in the other range for OVERLAP
  • TD14.10+: Can collect stats on PERIOD columns using BEGIN() or END()

Bus_Calendar: Teradata, ISO and Compatible (Oracle)

  • Teradata: BOW Sunday, Week 1 is full week, Week 0, if exists, is partial
  • ISO: BOW Monday, Week 1 has at least 4 days, no partial weeks
  • Compatible: BOW and Week 1 is always Jan 1. Last week is always partial
  • dbc.BusinessCalendarPattern: patterns about business and non-business days
  • dbc.BusinessCalendarException: Exceptions to above patterns

Array

  • ARRAY or VARRAY are UDT and support multi-dimensions
    1. CREATE TYPE mytype AS INTEGER ARRAY[10]; NEW mytype(1,2,3,45,6)
  • Functions
    1. ARRAY_AGG: Turns normal columns into ARRAY using GROUP BY
    2. Table function UNNEST([<key>,]<array>) normalizes arrays into multiple rows
      1. returns up to 3 column ([key], val, [pos]) pos WITH CARDINALITY clause
    3. CARDINALITY: Count # of elements in an ARRAY.
    4. || : concats two arrays into one.

Ordered Analytical functions (OAF)

  • ROLLUP produces aggregates of hierarchy in the top-bot order of columns listed
    • GROUP BY ROLLUP (A,B) = GROUP BY GROUPING SET ((A,B),A,())
  • CUBE produces aggregates of all combinations of columns (2**n)
    • GROUP BY CUBE(A,B) = GROUP BY GROUPING SET (A,()), GROUPING SET (B,())
  • GROUPING() returns one if current row corresponds to a group
  • Multiple GROUPING SETS is equivalent to multiplying.
    • e.g. GS(A), GS(B,()) = GS( (A,B), (A) )
  • RESET WHEN clause cannot contain agg function that include RESET WHEN
  • GROUP BY clause collapses rows before window function executes
  • GROUP BYclause defines the partition for Teradata specific functions
  • ROWS clause specifies starting window, ending is implied to be current row
  • QUANTILE(100,col [<DESC>]): Ranked normalized to 100. DESC indicates, quantile value descends with col value descends (both descend)
  • WIDTH_BUCKET is like QUANTILE, except it has lower and upper bounds
  • When window isn’t specified (no ROWS clause), entire window is default
  • When ROWS UNBOUNDED PRECEDING specified, current row is default end
  • PROCEEDING includes current row, and FOLLOWING excludes it
  • AMP may choose HASH or RANGE partitioning for PARTITION BY clause
    • HASH partitioning is chosen only for cumulative or reporting window types because they are not semantically order dependent

Temporal

  • Must be MULTISET and cannot have UNIQUE PI
  • Qualifiers: {CURRENT,SEQUENCED,NONSEQUENCED} {VT,TT}, {VT,TT} AS OF, NONTEMPORAL
  • Temporal modifications: Applies only to VT tables
    1. Current: PA = [current, UNTIL_CHANGED) PV: current rows only
    2. Seq: PA = explicitly specified, PV: all rows that overlap PA
    3. Non-seq: Disables temporal => no history rows are created
  • Non-temporal operations: Disables VT and TT both => can do physical DELETEs as well
  • Temporal queries:
    1. Current: TT: only open rows, VT: Only current rows, Result: non-temporal
    2. As Of: VT and TT that overlap.
      1. Unlike CURRENT, AS OF can return closed rows
      2. AS OF can be applied independently to TT or VT
    3. Seq: TT: only open, VT: All rows that overlap PA. default PA=all time, Result: temporal
    4. Non-seq VT: Ignores temporal columns. If PA specified, result is temporal
    5. Non-seq TT: Ignores temporal columns, cannot specify PA, result always non-temporal
  • TT_TIMESTAMP is different for each row even within the same transaction, same statement
  • TEMPORAL_TIMESTAMP or TEMPORAL_DATE: is the value set at either first non-locking statement or when explicitly referencing these functions.
Feature CURR SEQ VT SEQ TT NONS VT NONS TT AS OF NONT
Result Temporal? - Y PA?Y:- - - -
Update - VT-PA - - Y
set sess Y VT Y Y Y
Ins/Del/Mrg/Upd VT VT VT Y
Only 1 Tbl in qry - TT - - -
STJI Y Y - Y Y -
MTJI - Y - Y Y
Positioned Cursor Y - - - - -
non-RI Constraints Y Y - Y - - -
RI Constraint Y Y Y non-temp parent
  • NONSEQUENCED DML do not create history rows in valid-time
  • Default PA for SEQUENCED queries is all times
  • FLOAD Yes, MLoad No
  • RI are not enforced. Child FK must include DATE/TIMESTAMP column. No gaps in VALIDTIME values in parent tables allowed.
    1. Temporal Rel Const: When Child is non-temporal
  • NONTEMPORAL modifications are always logged in AccessLoggging tables
    1. NONTEMPORAL cannot be used in SP
    2. Can be disabled completely DBSControl flag
  • TT_TIMESTAMP is tracked individually for each row, even within the same transactions, unless a row modification results in multiple rows. e.g update causes insert of another row.
  • If PV isn’t specified explicitly, PV for the row is TEMPORAL_TIMESTAMP built-in function, which is the time temporal table is first non-locking referenced in a transaction.
  • Uniqueness: current, sequenced or nonsequenced uniqueness
  • To minimize high JI maintenance, use NONSEQUENCED/NONTEMPORAL DML
  • EXPAND ON can be interval, anchor point or anchor period.
    1. EXPAND ON Period_Dt_Tm AS Day_Dt_Tm BY ANCHOR PERIOD DAY AT TIME '00:00:00'
    2. Anchor can be day, month_begin, month_end, any week day
    3. TD14+: Anchor can also be hour, minute, second, millisecond
    4. For interval, expands on exact interval, last expansion might be partial
    5. For anchor point, expands at anchor point if begin(a.p.) falls in the expanding period
    6. For anchor period, expands when expanding period overlaps expanded period
  • Constraints:
    1. Table/Column: CURRENT VT, SQUENCED VT, NONSEQUENCED VT
    2. RI: CURRENT VT/TT, SEQUENCED VT/TT, NONSEQ VT/TT (only allowed bet temporal child and either non-temporal parent or different type temporal parent).
    3. For RI constraints, only partial time of child row may be sufficient.
    4. Temporal Relationship Constraint: When only parent is VT temporal.
      1. child must include date/timestamp column as FK
      2. parent VT must be non-overlapping and contiguous with no gaps
  • SEQUENCED VALIDTIME aggregates return values for each distinct validtime period (Solution for for Real-World Temporal Problems slide #53)
  • Suggested partitioning:
    1. VT: CASE_N(<valid AT -‘12:59’> OR VCOL is NULL, NO CASE);
    2. TT: CASE_N(<open>, NO CASE);
    3. BT: CASE_N(<valid> AND <open>, <invalid> AND <open>, <closed>)
  • SELECT *: includes temporal columns only for non-sequenced queries