Skip to content

Misc

  • Major components: PE, MPL, AMP, VDISK
  • hash of DECIMAL is same as INT if it has no decimal places
  • hash of uppercase and lowercase letter is same
  • TPA Node memory:
    • After allocating memory of all VPROCs and a small buffer, remaining memory is allocated to FSG Cache as % (90% for 66xx systems)
    • Full Cylinder reads: more efficient IO than data block IO. carved out of FSG Cache.
  • Network attached clients use CLI/ODBC -> MTDP (Micro TDP)-> MOSI (OS dependent libraries).
    • MTDP connects to Gateway that manages sessions
    • Gateway runs on every node. Consists of two tasks
      • Assign (ycgastsk): one per system, handles initial connection and hands off to connect task.
      • Connect (ycgcntsk): one per node. communicates with the PEs on that node.
    • Gateway can distribute session request to PE that is on a diff node
    • Multiple LANs can be assigned different host id, which essential is a Gateway
    • Unlike MP-RAS, gateway on Linux are VPROCs and there can be >1/node
    • Gateway receive parcels from clients and can forward it to session PE, which can be on a different node
  • TD Systems
    • One CMIC per cabinet that communicates via SWS (AWS)
    • Appliances don’t have HSN
    • Appliances generally emulate BYNET over ethernet
    • VMS combines CMIC+SWS+Viewpoint functionalities
    • AMP cluster is group of, usually, 2 AMPs, one primary and one fallback.
  • ROWID(SQL reserved word) = PART#+ROWHASH+UNIQNESS
  • TD background tasks: deadlock detection, AutoCylPack,Cylinder defrag, TJ purging, FSG cache flushing
  • TD12+: QueryID in DBQL is longer and unique => no need for ProcID in joins
  • User can have multiple account strings, whereas database can only have one.
  • Rowkey = Partition # + Rowhash
  • All DB space is per AMP basis. Skew will artificially cause out-of-space scenario
  • Spool/Temp space
    • cannot exceed that of the owner.
    • limit is per user including all sessions
    • limit is taken from Profile > explicit > parent
  • USI changes are logged in TJ, whereas NUSI changes are not
  • Nodes: TPA, HSN (Host standby), NONTPA (Application nodes, ECN), TMS. Except TMS, all nodes are connected to BYNET. All are managed by SWS
    • ECN: Extended Communication Node hosts hardware for connecting to Mainframe
  • Clique: Group of nodes connected to the same storage system.
  • VDISK: one per AMP, consists of multiple PDISKs, each PDISK is full LUN or partition
    • LUN maybe multiple disk array ranks.
    • Disk Array Rank is set of disks with same SCSI ID, but on different SCSI Buses
  • SI have fallback if base table has fallback
  • Deleted uniqueness values aren’t reused unless they are the highest values.
  • Databases have only one account string, whereas Users can have multiple
  • BYNET: Supports point-to-point, multicast and broadcast messaging
    • TD and TCP/IP network see only one logical BYNET.
    • communication using channels, which use signaling rather than messaging
  • A data block cannot have more free space that >= one sector
  • SP and UDF are always FALLBACK protected
  • CLI (as opposed to ODBC) clients use COP entries. Generally there are as many COP entries as there are nodes.
  • Channel attached clients connect using a separately running TDP.
    • Teradata Director Program manages, balances, queues sessions. IO from/to PEs. May enforce security.
  • FLOAD: Apply phase time increases log base 8 times volume.
  • PUT to deliver software updates via AWS
  • EDW systems: 6650, 6690, Appliances: 2650, 2690
  • In a clique, one node is a control node that monitors health of other nodes
  • // in AS or TITLE clause indicates new line
  • There is no YEAR/DAY interval.
  • subqueries implicitly include DISTINCT clause
  • Analytical Architecture Cube dimensions:
    • Context: Business, Information, Application, System
    • Def Level: Reference, Logical, Solution, Deployment
    • From..To: Current, Transition, Target
  • RIA: Corresponds to Reference, Information, Target from the AA cube
  • When TASM is not enabled, priority is determined by Acct string and is for the session
  • When TASM is enabled, WD classifications may cause each query to have different priority
  • Down AMP recovery journal records changes to AMP while it is down.
  • All ON clause in MERGE DML must use PI equality.
  • DBC might have higher CurrPerm than MaxPerm, because TransientJournal table even though reported in DBC, has space residing in WAL
    • TD12 DB Design: TJ space comes from DBC PERM. Must have enough PERM to make sure TJ always gets space
  • Free cylinder list is not maintained in Master Index, but in Filesys Info Block (Cyl 0)
  • Oversized row is one that requires one data block to itself
  • Changes to Cylinder Index are not written to TJ, instead an alternate CI is maintained for data integrity.
  • FASTLOAD
    • Default load sessions = # of AMPs rounded up to multiples of 16
  • TPT
    • SCHEMA * (deferred schema) can only be used with consumer operators
    • DDL op does not support SQL statements that return data nor the USING clause
    • private log is specific to an op. If specified, it’s used in place of the public log
  • Full Cylinder Read: TD<13.10 had reserved (typically 6 to 8) slots per AMP
  • AMP may use much more memory (230M) than initial (80M) for operations such as redistribution (180M) that comes from free space
  • PDE Manages FSG Cache and TD Intelligent Memory(VH Cache)
  • Free memory (managed by OS): Used by OS and VPROCs AMP, GTW, RSG (Relay Services Gateway used for replication), PDE, PE and TVS (Teradata Virtualize Storage)
  • Multi-node Single clique system may have more than one TVS vproc per node.
  • Volatile tables must have current username as qualifier if explicitly specified
  • RSS
    • Two components: PDE and Resource Sampling Subsystem.
    • Gather/Live Buffer -> Work Buffer -> Log Buffer -> ResUsage Tables
    • Gather/Live Buffer -> Collect Buffer -> ViewPoint
    • PM/API provides interfaces to collect session and resource level data.
      • Some of the Session level data is cumulative.
      • Resource level data is not cumulative, but overwritten at each interval
  • Legacy Archive methods: methods a, b, c are legacy
    • All AMP: Single job, single stream, sessions >= 1
    • Parallel All AMP: Multiple jobs, each with single stream, sessions >=1 1/job
      • Each job backs up subset of objects.
      • Need multiple user-ids for restore
    • Cluster: Backup AMP cluster. Single cluster job, multiple streams. Each stream has different Teradata DB job ID. Legacy, only supported by NetVault
      • Need multiple user-ids for restore
      • AMP configuration change is problem.
    • Group read: allows write locks, requires after image journal, restore point is archive end, uses HUT and rolling read locks. No DDLs are allowed.
      • Archive: enable PJ, archive, checkpoint with save, archive journal, delete saved journal, disable PJ
      • restore: restore data tables, restore journal tables, rollforward, disable PJ
      • can backup partitions
      • DBC can only be restored to an initialized TD system (after SYSINIT)
    • On restore if config has changed, AMP redistributes rows in BUILD phase (TD12)
  • Value ordered NUSI counts as two indexes
  • Spool defined for DB is only used for inheriting value for any owned USERs
  • ALTER TABLE DATABLOCKSIZE IMMEDIATE causes reblocking
  • Unnamed constraints cannot be modified. must be dropped and re-added
  • SP: Updatable cursor only in ANSI mode and if no triggers
  • TD 14: if total partitions < 65535 then 2 byte else 8 byte partition number
  • TD13.10+: Partition expression can be char type data
  • MSR are like single transaction, except in TD13.10+ multiple INSERT, where only failing INSERT is rolled back (Statement Independence Feature)
  • SET SESSION ACCOUNT: to change a/c for request or session without logging off
  • RANDOM function:
    • cannot be used in both SELECT and WHERE
    • cannot be used with aggregate, OLAP functions, GROUP BY or ORDER BY
  • default time/timestamp precision is 6
    • Timestamp cannot be cast to a lower precision
  • WITH BY includes detail rows and the GROUP BY line aggregates
  • Non-ANSI QUANTILE and WIDTH_BUCKET use GROUP BY for partitioning
  • Non-window func like QUANTILE aren’t valid with std or wind aggregates
  • TOP n with unordered select, results in same rows being returned.??
  • TOP n PERCENT always rounds up (e.g. 1.2 rows is 2 rows etc).
  • TVS cold data if it was always in FSG Cache

SLES10 PSF & TASM

  • AWT: Must divide counts by collect interval on <TD13
  • PS evaluates priority on each node at node level.
    • A query can be running in different AG on different nodes
    • Relative weight is calculated in integers, fractions are truncated
    • Only active AG and RP participate in weight calc => any unused CPU from an AG is distributed to other AG in the same RP first
    • Placing CPU limit inhibits use of spare CPU. Can be at AG, RP, System level.
    • Rollback/Abort processing will not honor CPU limit
    • Console utilities and some internal DB tasks run in RP0 (default RP)
  • AG:
    • Age interval: last n (default 60) sec which are evaluated for resource calc (reaction time)
    • Active time: inactivity time in seconds, after which AG is considered inactive.
    • Two ways to distribute CPU. S: session and N:none (processes) (obsolete)
      • ROT: simple or single AMP queries:S; complex, all-amp:N
    • An AG can be assigned to >1 PG, as long as all PG belong to same RP
    • A Workload can be mapped to an AG only if they both have the same enforcement priority (Tactical, Priority, Normal, Background)
    • Performance AG (#200) is reserved for system activities such as recovery, abort, deadlock detection, TVS, GTW and AWT management
  • Performance Periods: Link PG to AG.
    • Milestone types: T: time of day (day optional), S: session CPU, Q:Query CPU sec
    • For a given PG, all milestone types must be same
  • Rollback if with USER priority, do not honor CPU limits
  • Before the first request is parsed in a new session, PG is used assign priority
  • ROT:
    • use dedicated RP for tactical RP
    • make RP0 assigned wt = highest non-tactical RP’s weight
    • if Age interval is shortened, shorten Active interval by same degree

Legacy Utilities

Fast-Load

  • 2 phases:
    • Acquisition: supports chkpt, receives blocks, unblocks, redistributes to work area
    • Application: internal chkpt, sorts data and writes to perm table. AMP local
  • Restart:
    • loading phase: resubmit (remove ddls) and restart from last checkpoint
    • end loading: not needed normally. else submit abbr job
  • 2 SQL sessions + n load sessions
  • TENACITY: # hours to try. SLEEP: # of minutes to pause between retries
  • cmds: BEGIN LOADING, SET RECORD, DEFINE, INSERT, END LOADING
  • Formats: FORMATTED(LEN+IND+LF), UNFORMATTED(IND), BINARY(LEN+IND), TEXT(LF), VARTEXT ‘c’
  • err1: translation or constraint errors. err2: UPI violations for non-dup rows
  • opts before .logon:tenacity, sleep, sessions
  • config file: floadcfg.dat, must be in curr dir or FLOADLIB env var
  • unpacked-signed decimal load using FORMAT with PIC clause (#TF 34-23)
  • PUBLIC needs SIUD access on SYSADMIN.FASTLOG restart log table
  • NOPI tables: Are accessible with ACCESS lock. Dup rows are not discarded
  • BEGIN LOADING … NODROP option to manually drop error tables later. Must be dropped.
  • ROT: For NUPI tables, not sorting on PI columns improves performance

Support Environment

  • .LOGTABLE before .LOGON
  • sys variables: &SYSDATE/4, &SYSRC, &SYSTIME, &SYSUSER
  • cmds: .accept,.display,.if,.else,.endif,.route,.run,.set,.system

Multi-load

  • 5 phases: prelim (log/work tables, lock), DML tran (parse DML), Acq (block/deblock data, redist, sort, chg lock to write), Appl (process DML), cleanup
  • doesn’t honor freespace when loading a non-empty table
  • Import task can perform delete, update, upsert with a WHERE clause
  • If WHERE used, import task must reference PI, but delete must not
  • upto 5 tables per import, only 1 table per delete task
  • NO: USI, Trigger, JI, RI
  • BEGIN MLOAD … TABLEWAIT: hours to wait for another MLOAD to release table
  • Read lock in acq, rw lock in apply. Utility Acq lock to prevent any DDL except DROP
  • Locks are released after each phase; must wait for new lock to be acquired
  • cmds: .begin import mload, .layout, .field, .dml, .import, .end mload
  • cmds: .begin delete mload, .end mload
  • sysvar: &SYS{aply,noaply,del,ins,upd,rcd,rjct,et,uv}CNT{1..5}
  • checkpoint < 60 is minutes (15 is default), >=60 is number of records
  • .DML LABEL lbl [MARK|IGNORE (DUP INS, DUP UPD, MISS UPD, MISS DEL) ROWS)]
  • .DML LABEL lbl DO INSERT FOR MISSING UPDATE ROWS
  • Restart if mload failed in apply phase:
    • use OVERRIDE parm if log table has been dropped
    • any corrupt tables, drop data/work/error tables. submit as new mload
    • use abbreviated script (just BEGIN MLOAD and END MLOAD) if no script
  • RELEASE MLOAD has no effect if in APPLY phase or DELETE task is in DML phase.
  • fallback tables are written after base tables in background to allow quicker access.
  • Staging table has fallback, if the target table is defined with fallback
  • Work tables are always fallback protected, have same PI as target table
  • Eval order: cmd line > script > mloadcfg.dat/$MLOADLIB > defaults
  • ET_ for all application phase errors and when PI cannot be build in apply phase
  • UV_ for all apply phase errors when PI has been built successfully.
  • if there is PJ, MLOAD maintains private journal and is transferred to PJ at end
  • AMPCHECK: For fallback tables, fail when AMP is down. NONE|APPLY|ALL
  • CONTINUEIF > INDICATORS
  • MLOAD Performance:
    • Too many errors and error table has fallback. Fallback is written at SQL speed
    • Too many NUSIs. Each NUSI requires change row after is table loaded
    • Inefficient for highly non-unique (>100 rows/value) NUPI due to dup row checking unless NUPI helps locality of reference and table is multiset
    • Efficient when it affects at least one row per data block
    • PPI helps locality leading to improved performance
    • NUSI is spread-out, it hurts performance.
  • If target has PJ, MLOAD maintains private PJ then transfers at the end to true PJ

TPUMP

  • Allows throttling, uses macros
  • serializeon: data with same keys processed in the same session, same order as source
  • For tables with JI, serialization can still deadlock if base and JI have diff PI
    • Use JI PI if base table P\I is unique or almost unique
    • Use common set of columns as key between base table and JI
    • Use single session
  • arraysupport: send multiple sets of data using 1 dml stmt (INS/UPD/DEL/MACRO)
  • pack: send multiple statements in 1 request
    • PACKMAXIMUM determines max pack rate by trial & error. Do not use in prod
  • robust on: required for certain complex dml, which cannot be repeated on restart
  • max 4 import commands
  • cmds: .begin load, .layout, .field, .dml, .import, .end load
  • errortable options: APPEND, NODROP QUEUETABLE
  • DML LABEL same as MLOAD + IGNORE|MARK EXTRA UPD|DEL ROWS
  • Can use atomic upsert (UPDATE ELSE INSERT)
    • NO: if UPD/INS trigger could be fired or JI/HI could be updated
    • NO: UPD FROM, INS ... SEL, sub-query
    • TPUMP automatically converts to atomic when possible
  • PARTITIONed DML: segregates DML into diff session => cache reuse
    • smart USING: only packs used columns in DML (in case of multi-table)
  • Can specify EXEC MACRO INS|UPD|DEL|UPSERT as DML
  • Monitoring: SysAdmin.TPumpStatusTbl: Updated/checked by Tpump every min.
    • SysAdmin.TPumpUpdateSelect(M), TPumpMacro.UserUpdateSelect(M): Set new stmt rate
  • SysAdmin.TPumpActionRequest(M), TPumpMacro.UserActionRequest(M)
    • Chkpt,Pause,Resume,Terminate(chkpt taken),Abort(immediate)
  • High DBS wait-time (response from DBS) in log => more sessions may help
  • When UPSERT processing, high # of INSERT => lower throughput
  • ROT: Use TPUMP if % of rows updated is less than
    • NUSI:5%, USI:5%, STJI:2-5%, STAJI: 0.2-1%, MTJI/MTAJI:0.2%
  • ROT: For UPSERT processing, large # of INSERT will reduce performance
  • Overhead: 1 NUSI (50%), 2 NUSI (100%), 1% errors (40%), fallback (45%), SERIALIZE (30%), ROBUST (15%)
  • ROT: Use NOMONITOR, persistent macros, for short jobs
  • ROT: If high amount of data contain errors, use lower pack factor to reduce large rollbacks

FastExport

  • Two sorts, vertical & horizontal
  • Formats: FASTLOAD(len+eor), BINARY(len), UNFORMAT, TEXT(eor)
  • Modes: RECORD, INDICATOR
  • MLSCRIPT option generates MLOAD script to loads the exported data.