Storage

  • A managed storage location can be set at the metastore, catalog, or schema level
  • An external location is configured with a storage credential and is used by external tables and external volumes
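
When managed storage locations are set at more than one level, the most specific one applies: schema first, then catalog, then metastore. A minimal sketch of that lookup order follows. The function name and the example bucket paths are hypothetical and only illustrate the precedence; this is not a Databricks API.

```python
from typing import Optional


def resolve_managed_location(
    schema_location: Optional[str],
    catalog_location: Optional[str],
    metastore_location: Optional[str],
) -> str:
    """Return the most specific managed storage location that is set.

    Hypothetical helper: checks schema, then catalog, then metastore.
    """
    for location in (schema_location, catalog_location, metastore_location):
        if location:
            return location
    raise ValueError("no managed storage location configured at any level")


# Example: no schema-level location, so the catalog-level one is used.
chosen = resolve_managed_location(
    None,
    "s3://example-bucket/catalog",    # placeholder path
    "s3://example-bucket/metastore",  # placeholder path
)
print(chosen)
```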

Workspace storage bucket

  • Workspace system data: data for Databricks features such as notebooks and revisions, job run details, Spark logs, etc.
  • DBFS: now-deprecated storage
  • Unity Catalog workspace catalog

DBFS

  • The Databricks File System (DBFS) is deprecated
  • The DBFS root is the storage location provisioned when a workspace is created
  • A DBFS mount is an external storage location that is mounted and accessible as part of DBFS
  • Predefined locations, deprecated since Unity Catalog (available through dbfs:/Volumes/)
    • /databricks-datasets: standard read-only, open-source datasets provided by Databricks
    • /user/hive/warehouse: EXAM: default location for managed tables registered with the Hive metastore
    • /FileStore: files uploaded via the UI and image files of generated plots
    • /databricks-results: result sets of queries that were run
    • /databricks-init: legacy global init scripts
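
As a quick way to memorise the predefined locations above, here is a small sketch that maps a DBFS path to the purpose of its top-level directory. The dictionary and function names are hypothetical, and the descriptions simply restate the list; nothing here calls Databricks.

```python
# Predefined DBFS root locations and their purpose (restated from the list above).
DBFS_ROOT_LOCATIONS = {
    "/databricks-datasets": "read-only open-source datasets provided by Databricks",
    "/user/hive/warehouse": "default location for managed Hive metastore tables",
    "/FileStore": "UI uploads and image files of generated plots",
    "/databricks-results": "result sets of queries that were run",
    "/databricks-init": "legacy global init scripts",
}


def describe_dbfs_path(path: str) -> str:
    """Return the purpose of the predefined location that contains `path`."""
    # Accept both "dbfs:/FileStore/x" and "/FileStore/x".
    if path.startswith("dbfs:"):
        path = path[len("dbfs:"):]
    for prefix, description in DBFS_ROOT_LOCATIONS.items():
        # Match the directory itself or anything below it, not look-alike names.
        if path == prefix or path.startswith(prefix + "/"):
            return description
    return "not a predefined DBFS root location"


print(describe_dbfs_path("dbfs:/user/hive/warehouse/sales.db/orders"))
```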

Tables

  • Managed
  • External: connected through an external location
  • Foreign: discovered by crawling the Hive metastore
    • can be located at the DBFS root
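
In SQL, the visible difference between a managed and an external table is the LOCATION clause: a managed table omits it, while an external table points at a path covered by an external location. The sketch below only builds the statement text to show that contrast. The helper name, table name, and bucket path are hypothetical, and foreign tables are not created this way.

```python
from typing import Optional


def create_table_sql(name: str, columns: str, location: Optional[str] = None) -> str:
    """Build a CREATE TABLE statement (hypothetical helper).

    Without `location` the table is managed; with it, the table is external.
    """
    sql = f"CREATE TABLE {name} ({columns})"
    if location is not None:
        sql += f" LOCATION '{location}'"
    return sql


# Managed table: storage is chosen by the managed storage location.
print(create_table_sql("main.default.events", "id INT"))

# External table: the path is a placeholder and must fall under an external location.
print(create_table_sql("main.default.events_ext", "id INT", "s3://example-bucket/events"))
```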