Data Classification

  • classify sensitive data by applying Snowflake system tags:
    1. snowflake.core.semantic_category: the kind of data a column holds; the examples here are non-exhaustive, e.g. gender, country, passport
    2. snowflake.core.privacy_category: how sensitive the data is, with these values:
      1. identifier: able to identify an individual, e.g. name, phone, SSN
      2. quasi-identifier: only partially identifies an individual, e.g. age, gender
      3. sensitive: personal information that is not an identifier, currently only salary
      4. insensitive: no personally identifiable information
  • automatic classification: create a classification profile that controls
    • how often sensitive data in a schema is automatically classified
    • whether the system tags (semantic_category and privacy_category) are applied automatically
    • optionally, a mapping between user-defined tags and system tags, so that the mapped user-defined tag is applied automatically
    • optionally, a custom classifier that classifies data using user-defined semantic and privacy categories
  • ~~Legacy API~~
    • ~~system function extract_semantic_categories(<obj>) returns all identified system tags for all columns as a JSON object~~
    • ~~stored procedure associate_semantic_category_tags(<obj>, <json>) applies the output of extract_semantic_categories()~~
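
The classification profile described above can be sketched as a small Python helper that renders the SQL for creating a profile and attaching it to a schema. This is a rough sketch, not a definitive reference: the config keys (minimum_object_age_for_classification_days, maximum_classification_validity_days, auto_tag) and the CREATE / ALTER statements are written as I recall them from the Snowflake documentation and should be checked against the current docs. The object names (mydb.sch, my_profile) and the helper functions themselves are hypothetical.

```python
def render_value(value):
    """Render a Python bool/int/str as a Snowflake SQL literal."""
    # Check bool before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    # Strings: single-quoted, with embedded single quotes doubled.
    return "'" + str(value).replace("'", "''") + "'"


def build_profile_sql(profile_name, min_age_days=0, validity_days=30, auto_tag=True):
    """Build a CREATE statement for a classification profile.

    Assumed meaning of the keys (verify against the Snowflake docs):
      - minimum_object_age_for_classification_days: wait before classifying new objects
      - maximum_classification_validity_days: how often data is re-classified
      - auto_tag: whether system tags are applied automatically
    """
    config = {
        "minimum_object_age_for_classification_days": min_age_days,
        "maximum_classification_validity_days": validity_days,
        "auto_tag": auto_tag,
    }
    body = ",\n".join(f"    '{key}': {render_value(val)}" for key, val in config.items())
    return (
        f"CREATE OR REPLACE SNOWFLAKE.DATA_PRIVACY.CLASSIFICATION_PROFILE {profile_name}(\n"
        f"  {{\n{body}\n  }});"
    )


def attach_profile_sql(schema_name, profile_name):
    """Build the ALTER SCHEMA statement that enables the profile for a schema."""
    return f"ALTER SCHEMA {schema_name} SET CLASSIFICATION_PROFILE = '{profile_name}';"


if __name__ == "__main__":
    # Hypothetical database, schema, and profile names.
    print(build_profile_sql("mydb.sch.my_profile"))
    print(attach_profile_sql("mydb.sch", "mydb.sch.my_profile"))
```

The helpers only build strings; running the statements still requires a Snowflake session with the appropriate privileges. The optional tag mapping and custom classifier settings are left out because their exact config shape is not given in these notes.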