Other

SageMaker

  • Types of SageMaker Pipeline steps
    • Processing: runs a data-processing or model-evaluation job; requires a processor and a Python script, e.g. a PySpark job that cleans data
    • Training: trains a model; requires an estimator plus training and validation inputs
    • Tuning: hyperparameter tuning; associated with a SageMaker experiment and runs multiple training jobs as trials. Requires a HyperparameterTuner and TrainingInput
      • can get the top-performing model(s) (max 50) from the job
    • CreateModel: creates a SageMaker model
    • RegisterModel: registers a Model or PipelineModel with the SageMaker Model Registry
    • Transform: batch transform that runs inference on an entire dataset
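
The step types above are usually chained so that each step consumes the output of an earlier one. The following is a minimal sketch of that ordering using only the Python standard library, not the SageMaker SDK. The dependency graph, the `PIPELINE_DEPENDENCIES` name and the `execution_order` helper are assumptions made for illustration; a real pipeline derives these dependencies from the step definitions.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical pipeline: each key is a step, each value is the set of
# steps whose outputs it needs. "Evaluation" stands for a second
# Processing step that scores the trained model.
PIPELINE_DEPENDENCIES = {
    "Processing": set(),
    "Training": {"Processing"},
    "Evaluation": {"Training"},
    "CreateModel": {"Training"},
    "RegisterModel": {"Evaluation"},
    "Transform": {"CreateModel"},
}


def execution_order(dependencies):
    """Return one valid run order in which every step follows its inputs."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for position, step in enumerate(execution_order(PIPELINE_DEPENDENCIES), 1):
        print(position, step)
```

Steps with no dependency between them (here `Evaluation` and `CreateModel`) have no fixed relative order, so only the dependency constraints are guaranteed.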

Kinesis Streaming Data

  • Streams
    • Scales using shards
    • 24 hours to 7 days retention
    • Consumers on EC2 can write to S3, Redshift etc.
  • Firehose
    • No manual scaling required
    • No retention period, because data is delivered straight to destinations such as
      • S3, Redshift (via S3), Lambda, Elasticsearch
  • Analytics
    • Uses SQL to process data in Streams/Firehose
    • Writes results to S3, Elasticsearch, Redshift
  • BP: Kinesis is for handling massive streaming data, whereas SQS is for inter-application communication
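
To make "scaling using shards" concrete: Kinesis Streams hashes each record's partition key with MD5 into a 128-bit integer, and each shard owns a range of that hash space. The sketch below imitates that routing locally with the standard library. The helper names and the evenly split ranges are assumptions for illustration; it does not call the Kinesis API.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_ranges(num_shards):
    """Split the hash key space into contiguous, roughly equal ranges."""
    step = HASH_SPACE // num_shards
    ranges = []
    for i in range(num_shards):
        start = i * step
        # The last shard absorbs any remainder so the space is fully covered.
        end = HASH_SPACE - 1 if i == num_shards - 1 else (i + 1) * step - 1
        ranges.append((start, end))
    return ranges


def shard_for(partition_key, ranges):
    """Return the index of the shard whose range contains the key's hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    for index, (start, end) in enumerate(ranges):
        if start <= hash_key <= end:
            return index
    raise ValueError("hash key is outside every shard range")


if __name__ == "__main__":
    ranges = shard_ranges(4)
    for key in ("sensor-1", "sensor-2", "sensor-3"):
        print(key, "-> shard", shard_for(key, ranges))
```

Because the same partition key always hashes to the same shard, ordering is preserved per key, and adding shards spreads keys over more ranges.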

SWF Simple Workflow Service

  • Up to 1 year retention (a workflow execution can run for up to 1 year)
  • Has 3 types of actors: workflow starters, deciders and activity workers
  • Can involve non-AWS or manual tasks as part of workflow

Management Tools

  • CloudWatch: monitors CPU, network, I/O etc.
    • Available via: console, API, SDK or CLI
  • CloudFormation: infrastructure described in a JSON-formatted (or YAML-formatted) template
  • AWS Trusted Advisor: best-practice checks for Cost, Security, Performance and Fault Tolerance
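
As an illustration of what a JSON-formatted CloudFormation template looks like, here is a minimal sketch that builds one in Python and serializes it. The logical ID `NotesBucket` and the `minimal_template` helper are made up for this example; the template declares a single S3 bucket with default settings and is not deployed anywhere by this code.

```python
import json


def minimal_template(bucket_logical_id="NotesBucket"):
    """Build a small CloudFormation template as a plain dictionary."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Minimal example: one S3 bucket",
        "Resources": {
            # Logical ID -> resource definition; the ID is hypothetical.
            bucket_logical_id: {"Type": "AWS::S3::Bucket"},
        },
    }


template_json = json.dumps(minimal_template(), indent=2)

if __name__ == "__main__":
    print(template_json)
```

The only mandatory top-level section is `Resources`; everything else in the sketch is optional metadata.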

Analytics

  • Elastic MapReduce (EMR): Apache Hadoop + EC2 + S3
  • Kinesis: Streaming

Application Services

  • API Gateway: front end to Lambda, Kinesis or HTTP endpoints
    • Traffic management, access control, monitoring, API versioning
  • Simple Queue Service (SQS): message queuing

Enterprise Applications

  • WorkSpaces: clients for Windows, Mac, Chromebook, iPad, Fire/Android tablets
    • Ports: TCP 443, TCP+UDP 4172
    • Bundles: Value (2GB), Standard Plus (4GB), Performance Plus (7.5GB)
    • MS Office, Trend Micro
  • SWF: Simple Workflow

AWS Services Access

  • Management Console, CLI, SDK, Query API (Using HTTP)
  • Support: 4 levels (Basic (free), Developer, Business, Enterprise)

AWS DevOps

  • Code pipeline: CodeCommit, CodeBuild, CodeDeploy, CodePipeline
  • AWS Infrastructure as code: CloudFormation, OpsWorks (Chef), Config
  • Active Monitoring: CloudWatch, CloudTrail (API caller info such as IP, time, parameters etc.)
  • PaaS: Elastic Beanstalk: upload code and Beanstalk handles provisioning, scaling, load balancing and health monitoring

Cloud Migration

  • Unmanaged: rsync, S3 and Glacier CLI
  • Replace Internet: Direct Connect, Snowball, S3 Transfer Acceleration
  • Friends of S3: Gateways, Partners, Kinesis Firehose
    • Gateways: on-premises device that interfaces with SAN or VTL
    • Partners: write backups directly to S3
    • Kinesis Firehose: loads streaming data into S3/Redshift/Elasticsearch
  • Application Discovery Service: identifies running on-premises applications, performance profiles, configuration data
  • AWS SMS (Server Migration Service): migrates VM servers including volumes
  • AWS DMS (Database Migration Service)
    • Uses replication; allows migrating to a different database engine

AWS Web Architecture

  • Auto Scaling Group = [Security Group [EC2 + CloudWatch] in AZ1 + Security Group in AZ2]
    • Can be triggered to expand/shrink
  • Security Group = protocol + ports + IP ranges
    • E.g. Internet => only ports 80 + 443 to web servers
    • E.g. Corporate => only port 22 to app servers
    • No access to DB servers
  • Mobile
    • Cognito: sign-in
    • Sync data using Cognito and S3
    • SNS: push notifications
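
The "protocol + ports + IP ranges" idea can be sketched as an allow-list check. This is a simplified local model, not the EC2 API: the `Rule` class, the `is_allowed` helper and the corporate CIDR `203.0.113.0/24` (a documentation-only range) are assumptions for illustration, and real security groups have features this model ignores, such as referencing other groups.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    protocol: str   # e.g. "tcp"
    from_port: int  # first port in the allowed range
    to_port: int    # last port in the allowed range
    cidr: str       # allowed source IP range


def is_allowed(rules, protocol, port, source_ip):
    """Traffic passes only if at least one rule matches; otherwise it is denied."""
    address = ip_address(source_ip)
    return any(
        rule.protocol == protocol
        and rule.from_port <= port <= rule.to_port
        and address in ip_network(rule.cidr)
        for rule in rules
    )


# Web tier: HTTP and HTTPS from anywhere on the Internet.
WEB_SG = [Rule("tcp", 80, 80, "0.0.0.0/0"), Rule("tcp", 443, 443, "0.0.0.0/0")]
# App tier: SSH only from a hypothetical corporate range.
APP_SG = [Rule("tcp", 22, 22, "203.0.113.0/24")]

if __name__ == "__main__":
    print(is_allowed(WEB_SG, "tcp", 443, "198.51.100.7"))
    print(is_allowed(APP_SG, "tcp", 22, "198.51.100.7"))
```

A DB tier with no matching rule for Internet or corporate sources would return `False` for every such check, which is the "no access to DB servers" case.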

KMS

  • AWS

AWS Pricing

  • Compute, Storage, Data Transfer
  • Simple Monthly Calculator
  • TCO Calculator: compare on-premise vs AWS
  • Billing and Cost Management Console: View, pay bills

Security and Compliance

  • Infrastructure
    • Facility: video surveillance, two-factor authentication
    • Decommissioning: magnetic storage is degaussed and physically destroyed
    • Compliance: HIPAA, ISO 27001, NIST etc.
  • Network
    • Boundary devices, ACLs
    • Secure access points: API endpoints, HTTPS access
    • Transmission protection: TLS, VPC, IPsec VPN
    • Amazon corporate network is segregated from AWS
    • Fault tolerant
    • Network monitoring and protection, which guards against
      • DDoS, man-in-the-middle, IP spoofing, packet sniffing, port scanning
  • Compute: shared instances are isolated by the Xen hypervisor, AWS firewall, signed API calls
  • Database security
    • RDS: DB security groups, permissions
    • DynamoDB: requests must be signed with HMAC-SHA256
  • Storage: KMS (Key Management Service), S3 encryption client, S3 server-side encryption
  • IAM: least privilege, explicit permissions, temporary access, multi-factor authentication
  • CloudTrail: API call tracking
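
The HMAC-SHA256 requirement for DynamoDB refers to request signing. In AWS Signature Version 4, a signing key is derived from the secret key by a chain of HMAC-SHA256 operations over the date, region and service, and that key then signs the request. The sketch below shows only this derivation with the standard library. The secret key, date and string to sign are placeholders, and building the real canonical request and string to sign is left out.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key, date_stamp, region, service):
    """Derive a Signature Version 4 signing key (date format: YYYYMMDD)."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature that goes into the Authorization header."""
    return hmac.new(
        signing_key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()


if __name__ == "__main__":
    # Placeholder values; never hard-code a real secret key.
    key = derive_signing_key("EXAMPLE-SECRET-KEY", "20150830", "us-east-1", "dynamodb")
    print(sign(key, "placeholder string to sign"))
```

In practice the AWS SDKs and CLI perform this signing automatically, so hand-rolled signing is only needed when calling the HTTP API directly.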