Layers: mainline layers, where data flows from each lower-order layer to the next
Sources: systems of record (SoRs)
Landing: Cloud storage
Raw: data ingested into the database for processing
analysts and data scientists can be granted access for exploration
Integrated/Refined: conformed, after business transformations have been applied
Presentation/Curated: Semantic, secured, optimized for consumption
Shared: business entities make data available for consumption by other business entities, or by other corporations for monetization
other areas that complement the mainline layers
common: contains shared assets, such as UDFs, that are used across all business entities
workspace: dedicated work areas for analysts and data scientists to persist intermediate results
BP: split layers into different databases (vs. schemas) if replication of a particular layer is required (e.g. cross-geo data access)
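A minimal sketch of the layer ordering and the database-per-layer practice above. The enum, the function names, and the `<ENTITY>_<LAYER>` naming convention are assumptions made for illustration, not part of any platform API.

```python
from enum import IntEnum


class Layer(IntEnum):
    """Mainline layers, ordered from lowest to highest."""
    SOURCES = 0       # systems of record
    LANDING = 1       # cloud storage
    RAW = 2           # first layer that lives in the database
    INTEGRATED = 3    # conformed, business transformations applied
    PRESENTATION = 4  # semantic, secured, optimized for consumption
    SHARED = 5        # exposed to other entities or corporations


# Complementary areas sit outside the mainline flow.
COMPLEMENTARY_AREAS = ("COMMON", "WORKSPACE")


def next_layer(layer):
    """Return the layer that data flows into next, or None at the end."""
    if layer is Layer.SHARED:
        return None
    return Layer(layer + 1)


def database_name(entity, layer):
    """Name one database per (business entity, layer).

    Hypothetical convention: a separate database per layer means a single
    layer can be replicated (e.g. cross-geo) without the others.
    """
    if layer < Layer.RAW:
        raise ValueError(f"{layer.name} is not stored in the database")
    return f"{entity}_{layer.name}".upper()
```

For example, `database_name("sales", Layer.PRESENTATION)` yields `SALES_PRESENTATION`, which could then be replicated on its own while `SALES_RAW` stays local.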
Service architecture: how responsibilities are divided
Platform as a Service: the central team provisions databases and access; business entities are responsible for managing the data layers end to end
Data as a Service: the central team brings the data from all source systems into the Raw layer; business entities are responsible for the Integrated through Presentation (access or semantic) layers
Analytics as a Service: the central team is responsible for all layers, including Presentation
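The division of responsibilities across the three service models can be sketched as a lookup table. This is an illustrative model only: the dictionary keys, the layer names, and the `owner` helper are assumptions, and provisioning of databases and access is treated as a central-team duty in every model.

```python
# In-database mainline layers, lowest to highest.
DATA_LAYERS = ("raw", "integrated", "presentation")

# Layers the central team manages under each service model (hypothetical keys).
CENTRAL_TEAM_LAYERS = {
    "platform_as_a_service": (),                   # provisioning only
    "data_as_a_service": ("raw",),                 # loads sources into Raw
    "analytics_as_a_service": DATA_LAYERS,         # all layers
}


def owner(model, layer):
    """Return who is responsible for a layer under a given service model."""
    if layer not in DATA_LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    if layer in CENTRAL_TEAM_LAYERS[model]:
        return "central team"
    return "business entity"
```

Under this sketch, `owner("data_as_a_service", "integrated")` returns `"business entity"`, while the same layer under `"analytics_as_a_service"` belongs to the central team.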
Data security:
Functional role: a collection of access roles, assigned to users