Other

SageMaker

  • Types of SageMaker Pipeline steps
    • Processing: requires a processor and a Python script; used for data processing and model evaluation, e.g. a PySpark job that does data cleaning
    • Training: Training a model, requires estimator, training and validation inputs
    • Tuning: hyperparameter tuning. Associated with SageMaker experiment, runs multiple training jobs as trials. Requires a HyperparameterTuner and TrainingInput
      • can get top performing model(s) (max 50) from the job
    • CreateModel: create a model
    • RegisterModel: register a Model or PipelineModel with the SageMaker Model Registry
    • Transform: batch transformation to run inference on entire dataset
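One way to picture how these step types chain together is as a dependency graph. The sketch below uses only the Python standard library, not the SageMaker SDK. The dependency edges are assumptions chosen for illustration, and a real pipeline may wire the steps differently.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each step maps to the set of steps it waits for.
# These edges are illustrative assumptions, not taken from the SageMaker SDK.
STEP_DEPENDENCIES = {
    "Processing": set(),
    "Training": {"Processing"},
    "CreateModel": {"Training"},
    "RegisterModel": {"Training"},
    "Transform": {"CreateModel"},
}


def execution_order(dependencies):
    """Return one valid order in which the steps could run."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(STEP_DEPENDENCIES)
print(order)
```

Any order the sorter returns keeps Processing before Training and CreateModel before Transform; steps with no mutual dependency (here RegisterModel and CreateModel) may appear in either order.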

Kinesis Streaming Data

  • Streams
    • Scaling using shards
    • Retention: 24 hours by default, extendable to 7 days (newer limits allow up to 365 days)
    • EC2 consumers can write to S3, Redshift etc.
  • Firehose
    • No scaling required
    • No retention period, because data is delivered straight to destinations such as S3, Redshift (via S3), Lambda, Elasticsearch
  • Analytics
    • Uses SQL to process data in Streams/Firehose
    • Write data to S3, ElasticSearch, Redshift
  • BP: Kinesis is for handling massive streaming data, whereas SQS is for inter application communication
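To make "scaling using shards" concrete: Kinesis Streams hashes each record's partition key with MD5 into a 128-bit space, and each shard owns a range of that space. The sketch below assumes the shards split the hash space into equal ranges, which is a simplification (real shards carry explicit hash key ranges that can be uneven after splits and merges). The function name is hypothetical.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index, assuming equal-sized hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    range_size = HASH_SPACE // shard_count
    # min() keeps the top few hash values in the last shard when the
    # hash space does not divide evenly by shard_count.
    return min(hash_value // range_size, shard_count - 1)


for key in ("device-1", "device-2", "device-3"):
    print(key, "->", shard_for_key(key, shard_count=4))
```

Records with the same partition key always land on the same shard, so adding shards raises total throughput but does not help a single hot key.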

SWF Simple Workflow Service

  • 1 year retention
  • Has 3 types of actors: workflow starters, deciders and activity workers
  • Can involve non-AWS or manual tasks as part of workflow
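The three actor roles can be sketched in plain Python. This is a toy in-memory model, not the SWF API: the activity names and the dictionary used as execution history are hypothetical, and in real SWF the decider and workers poll the service for tasks rather than calling each other directly.

```python
# Hypothetical activity list for an order-handling workflow.
ACTIVITIES = ["verify_order", "charge_card", "ship_order"]


def start_workflow(order_id):
    """Workflow starter: begins an execution with its input."""
    return {"input": order_id, "completed": [], "results": []}


def decide(execution):
    """Decider: inspects the history and picks the next activity, or None when done."""
    remaining = [name for name in ACTIVITIES if name not in execution["completed"]]
    return remaining[0] if remaining else None


def run_activity(name, execution):
    """Activity worker: performs one task and records the outcome in the history."""
    execution["results"].append(f"{name}({execution['input']})")
    execution["completed"].append(name)


def run(order_id):
    """Drive one execution until the decider reports nothing left to do."""
    execution = start_workflow(order_id)
    while (task := decide(execution)) is not None:
        run_activity(task, execution)
    return execution["results"]


print(run("A1"))
```

Because the decider only looks at recorded history, an activity worker could just as well be a person or an on-premise system that reports completion later.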

Management Tools

  • CloudWatch: monitors CPU, network and I/O
    • Available via: console, API, SDK or CLI
  • CloudFormation: JSON- (or YAML-) formatted template
  • AWS Trusted Advisor: Best practices in Cost, Security, Performance and Fault Tolerance
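As a rough illustration of what a CloudFormation template looks like, the sketch below builds a minimal one as a Python dictionary and serializes it to JSON. The logical ID `NotesBucket` and the description text are made up for the example; the template declares a single S3 bucket with default settings.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Minimal example: a single S3 bucket",
    "Resources": {
        # "NotesBucket" is a hypothetical logical ID chosen for this example.
        "NotesBucket": {
            "Type": "AWS::S3::Bucket",
        }
    },
}

template_json = json.dumps(template, indent=2)
print(template_json)
```

The `Resources` section is the only part a template must have; everything else (parameters, outputs, and so on) is optional.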

Analytics

  • Elastic MapReduce (EMR): Apache Hadoop + EC2 + S3
  • Kinesis: Streaming

Application Services

  • API Gateway: front end to Lambda, Kinesis or HTTP endpoints
    • Traffic management, Access, Monitoring, API versioning
  • Simple Queue Service (SQS): message queuing

Enterprise Applications

  • WorkSpaces: clients for Windows, Mac, Chromebook, iPad, Fire/Android tablets
    • Ports: TCP 443, TCP+UDP 4172
    • Bundles: Value (2 GB), Standard Plus (4 GB), Performance Plus (7.5 GB)
    • MS Office, Trend Micro
  • SWF: Simple WorkFlow

AWS Services Access

  • Management Console, CLI, SDK, Query API (Using HTTP)
  • Support: 4 levels (Basic (free), Developer, Business, Enterprise)

AWS DevOps

  • Code pipeline: CodeCommit, CodeBuild, CodeDeploy, CodePipeline
  • AWS Infrastructure as code: CloudFormation, OpsWorks (Chef), Config
  • Active Monitoring: CloudWatch, CloudTrail (API caller info such as IP, time params etc)
  • PaaS: Elastic Beanstalk: upload code and Beanstalk handles provisioning, scaling, load balancing and health monitoring

Cloud Migration

  • Unmanaged: rsync, S3 and Glacier CLI
    • Optimize or replace the Internet: Direct Connect, Snowball, S3 Transfer Acceleration
    • Friendly interfaces to S3: Gateways, Partners, Kinesis Firehose
      • Gateways: on-premise device that interfaces with SAN or VTL
      • Partners: write backups directly to S3
      • Kinesis Firehose: load streaming data to S3/Redshift/Elasticsearch
  • Application Discovery Service: identifies running on-premise applications, performance profiles, configuration data
  • AWS SMS, Server Migration Service: migrate VM servers including volumes
  • AWS DMS: Database Migration Service
    • Uses replication; allows migrating to a different database engine

AWS Web Architecture

  • Auto Scaling Group=[Security Group[EC2+CloudWatch] in AZ1 + SecGroup in AZ2]
    • Can be triggered to expand/shrink
  • Sec Group = Protocol + Ports + IP ranges
    • E.g. Internet => only ports 80 and 443 to web servers
    • E.g. Corporate => only port 22 to app servers
    • No access to DB servers
  • Mobile
    • Cognito: Sign-In
    • Sync data using Cognito and S3
    • SNS: Push notification
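The security group model above (protocol + ports + IP ranges, allow-only, everything else denied) can be sketched as a small rule matcher. The `Rule` class, the function name and the `10.0.0.0/8` corporate range are assumptions made for this example; this is not how AWS evaluates rules internally.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    protocol: str
    from_port: int
    to_port: int
    cidr: str


def is_allowed(rules, protocol, port, source_ip):
    """Allow traffic only if some rule matches; unmatched traffic is denied."""
    return any(
        rule.protocol == protocol
        and rule.from_port <= port <= rule.to_port
        and ip_address(source_ip) in ip_network(rule.cidr)
        for rule in rules
    )


# Web servers: ports 80 and 443 open to the whole Internet.
WEB_RULES = [Rule("tcp", 80, 80, "0.0.0.0/0"), Rule("tcp", 443, 443, "0.0.0.0/0")]
# App servers: SSH only from a hypothetical corporate address range.
APP_RULES = [Rule("tcp", 22, 22, "10.0.0.0/8")]
# DB servers: no inbound rules in this sketch, so nothing gets in.
DB_RULES = []

print(is_allowed(WEB_RULES, "tcp", 443, "203.0.113.7"))
print(is_allowed(APP_RULES, "tcp", 22, "203.0.113.7"))
```

An empty rule list denies everything, which matches the "no access to DB servers" entry: access has to be granted explicitly.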

KMS

  • AWS

AWS Pricing

  • Compute, Storage, Data Transfer
  • Simple Monthly Calculator
    • TCO Calculator: compare on-premise vs AWS
    • Billing and Cost Management Console: View, pay bills

Security and Compliance

  • Infrastructure:
    • Facility: video surveillance, two factor
    • Decommissioning: magnetic storage is degaussed and physically destroyed
    • Compliance: HIPAA, ISO 27001, NIST etc
  • Network
    • Boundary devices, ACL
    • Secure Access points: API endpoint, HTTPS access
    • Transmission protection: TLS, VPC, IPSec VPN
    • Amazon Corporate is segregated from AWS
    • Fault Tolerant
    • Network monitoring and protection against DDoS, man-in-the-middle, IP spoofing, packet sniffing, port scanning
  • Compute: shared instances are isolated by the Xen hypervisor, AWS firewall, signed API calls
  • Database security:
    • RDS: DB security groups, permissions
    • DynamoDB: requests must be signed with HMAC-SHA256
    • Storage: KMS (Key Management Service), S3 encryption client, S3 server side encryption
  • IAM: least privilege, explicit permissions, temporary access, multi-factor authentication
  • CloudTrail: API call tracking
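The "signed API calls" and HMAC-SHA256 items above refer to request signing. The sketch below shows the chained HMAC-SHA256 key derivation as I understand it from AWS Signature Version 4, using only the standard library. The secret key, date, region and string to sign are placeholders, and building the canonical request (the part that is actually signed) is left out.

```python
import hashlib
import hmac


def _sign(key: bytes, message: str) -> bytes:
    """One HMAC-SHA256 step in the derivation chain."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive a key scoped to one date, region and service."""
    k_date = _sign(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _sign(k_date, region)
    k_service = _sign(k_region, service)
    return _sign(k_service, "aws4_request")


def sign_string(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature that would accompany the request."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder values for illustration only.
key = derive_signing_key("EXAMPLE-SECRET-KEY", "20240101", "us-east-1", "dynamodb")
print(sign_string(key, "placeholder string to sign"))
```

Because the derived key is scoped to a date, region and service, a leaked signature or derived key is of limited use elsewhere; in practice the SDKs and CLI do this signing automatically.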