Teradata Analytic Ecosystem

  • Data movement: Ingest, prepare and consume
  • Governance and security
  • Metadata: operational, technical and business
  • Sources: Business, Human, Machine, External (Social)
  • Reference Information Architecture: Acquisition, Integration, Access
  • Consumers: General, Analysts, Statisticians, Autonomous Applications, Data Scientists
  • Analytical methods: Reporting, Visualization, Statistical Analysis, Data Mining, Simulation, Optimization, NLP, Machine Learning

Data Movement

  • Acquisition
    • Typical sources: OLTP, enterprise applications, IoT, web/log
    • Landing: raw, unprocessed, lowest granularity
    • Staging: data that is ready to be ingested and processed, usually in-database
    • Standardization: consumable format
      • light standardization, such as gender codes, medical codes, etc.
      • physical layout optimization, such as partitioning, indexing, etc.
  • Integration
    • Common keys: standardize common keys to connect various subject areas
    • Derived values: derive and automate KPIs
    • Common summaries: standard aggregations (not just for performance)
  • Access
    • Optimized structures: partitioning and indexing to optimize resource usage and query speed
    • Shared views and services: for easier navigation; provide metadata
  • Data pipeline: the process that turns raw data into consumption-ready data. It includes ingestion, transformation, and aggregation
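
A minimal sketch of the three pipeline steps (ingestion, transformation, aggregation), assuming in-memory records. The sample data, the gender-code mapping, and all function names are hypothetical and only illustrate the idea; a real pipeline would read from the landing area and write to staging tables.

```python
from collections import defaultdict

# Hypothetical raw records as they might arrive in the landing area.
RAW_ORDERS = [
    {"order_id": "1", "gender": "M", "region": "east", "amount": "10.50"},
    {"order_id": "2", "gender": "female", "region": "East", "amount": "4.50"},
    {"order_id": "3", "gender": "F", "region": "west", "amount": "20.00"},
]

# Assumed mapping used for light standardization of gender codes.
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


def ingest(records):
    """Ingestion: copy raw records unchanged (landing keeps the lowest granularity)."""
    return [dict(r) for r in records]


def transform(records):
    """Transformation: standardize codes and cast types into a consumable format."""
    return [
        {
            "order_id": int(r["order_id"]),
            "gender": GENDER_CODES.get(r["gender"].strip().lower(), "U"),
            "region": r["region"].strip().upper(),
            "amount": float(r["amount"]),
        }
        for r in records
    ]


def aggregate(records):
    """Aggregation: a common summary, here the total amount per region."""
    totals = defaultdict(float)
    for r in records:
        totals[r["region"]] += r["amount"]
    return dict(totals)


def run_pipeline(records):
    return aggregate(transform(ingest(records)))


summary = run_pipeline(RAW_ORDERS)
print(summary)  # {'EAST': 15.0, 'WEST': 20.0}
```

Keeping each step a separate function mirrors the layering above: the raw copy stays untouched, and standardization happens before any summary is computed.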

Data organization

  • Data Lake: long term, low cost, optionally light level of data integration
    • applicability: Acquisition layer; may replace parts of the staging layer in a traditional DW
  • Data Warehouse: Integrated data from one or more disparate sources, usually 3NF
    • applicability: Integration layer
  • MDM: consistent enterprise-wide reference data
    • applicability: Standardization tier of the acquisition layer, and the integration layer
  • Data projections: purpose-built data structures that solve specific business problems
    • applicability: Access layer
    • Physical and/or virtual data marts (DM)
    • A DM can be dependent (single source of truth) or independent (multiple sources of truth)
  • Metadata store: business, technical, operational
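
To make the physical/virtual distinction concrete, here is a small sketch using Python's built-in `sqlite3` module. The table and view names are made up for illustration, and SQLite merely stands in for the warehouse. Both marts are dependent, since each is derived from the same warehouse table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-in for an integrated warehouse table (hypothetical schema).
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES ('EAST', 10.0), ('EAST', 5.0), ('WEST', 20.0);

    -- Virtual data mart: a view, evaluated against the warehouse at query time.
    CREATE VIEW mart_sales_virtual AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;

    -- Physical data mart: a table populated once from the warehouse.
    CREATE TABLE mart_sales_physical AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;
""")

# A new warehouse row arrives after both marts were created.
conn.execute("INSERT INTO sales VALUES ('WEST', 1.0)")

virtual = dict(conn.execute("SELECT region, total FROM mart_sales_virtual"))
physical = dict(conn.execute("SELECT region, total FROM mart_sales_physical"))

print(virtual)   # the view reflects the new row
print(physical)  # the table is stale until it is refreshed
```

The virtual mart always reflects the current warehouse contents but pays the query cost on each read; the physical mart reads quickly but needs a refresh process.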

Data Zones

  • Ingestion source:
    • Source systems, enterprise applications (e.g. Salesforce), logs, IoT
    • Landing zone: bulk storage, Kafka topics
  • Operational zone:
    • Raw: raw data, no transformations
    • Conformed: Raw + de-duplication + standardization
    • Modeled: integrated, cleaned, modeled
    • Presentation (Semantic): optimized for analytics and BI queries
    • Share (Snowflake): sharing for monetization
  • Exploration zone: data discovery and experiments, before the data becomes part of the operational zone
    • quick to set up
    • limited life span
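
The promotion from Raw to Conformed to Presentation in the operational zone can be sketched as below. This is an illustration under stated assumptions: the customer records, the field names, and the de-duplication key (`customer_id` plus `country`) are hypothetical.

```python
from collections import Counter

# Hypothetical raw-zone rows: stored exactly as received, duplicates included.
raw_zone = [
    {"customer_id": " c1", "country": "us"},
    {"customer_id": "c1", "country": "US"},
    {"customer_id": "c2", "country": "de"},
]


def to_conformed(raw_rows):
    """Conformed = raw + standardization + de-duplication."""
    seen = set()
    conformed = []
    for row in raw_rows:
        std = {
            "customer_id": row["customer_id"].strip(),
            "country": row["country"].strip().upper(),
        }
        key = (std["customer_id"], std["country"])
        if key not in seen:  # keep only the first occurrence of each key
            seen.add(key)
            conformed.append(std)
    return conformed


def to_presentation(conformed_rows):
    """Presentation: a shape optimized for BI queries, here customers per country."""
    return dict(Counter(row["country"] for row in conformed_rows))


conformed_zone = to_conformed(raw_zone)
presentation_zone = to_presentation(conformed_zone)

print(conformed_zone)
print(presentation_zone)  # {'US': 1, 'DE': 1}
```

Each zone is derived from the previous one without modifying it, so the raw zone can always be reprocessed if the standardization rules change.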