WAF
- AWS Well-Architected Framework
- Follows certain design principles and best practices within 5 pillars
- Key AWS services: CloudFormation (operations as code), CloudWatch/CloudTrail (monitoring), Elasticsearch (analyzing logs for improvement opportunities)
Operational Excellence
- Design principles
- Perform operations as code
- Annotate documentation
- Make frequent, small, reversible changes
- Refine operations procedures frequently
- Anticipate failures (pre-mortem)
- Learn from failures
- Best practices
- Prepare: set goals; design workloads with mechanisms to monitor and validate their operation
- Operate: define success criteria, health measurement, establish baselines
- Evolve: continuous, incremental improvement
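The "perform operations as code" principle above can be sketched roughly as follows. This is only an illustration, assuming the workload is a single versioned S3 bucket; the function name `build_template` and the logical ID `NotesBucket` are hypothetical, and the template is built as a plain dict so it can be reviewed and version-controlled like any other code before being handed to CloudFormation.

```python
import json


def build_template(bucket_logical_id: str = "NotesBucket") -> dict:
    """Return a minimal CloudFormation template as a plain dict.

    The logical ID and description are illustrative assumptions.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative stack: one versioned S3 bucket",
        "Resources": {
            bucket_logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    # Versioning keeps changes to objects reversible
                    "VersioningConfiguration": {"Status": "Enabled"},
                },
            }
        },
    }


# Serialize the template so it can be committed and deployed as a file
template_json = json.dumps(build_template(), indent=2)
print(template_json)
```

Keeping the template in source control is what makes changes small, reviewable, and reversible.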
Security
- Design Principles
- strong identity foundation: principle of least privilege, separation of duty
- traceability: monitor, alert, audit
- secure all layers: not just the outer layer; secure edge network, VPC, subnet, LB, instance, OS, and apps
- automate: use policies and controls as code
- keep people and data separate: no direct access to or manual processing of data
- incident management: have a process for reporting and mitigating security incidents
- Best practices
- IAM: Authenticate and authorize, Use AWS IAM
- Detective controls: logging, event processing, and monitoring => CloudTrail, CloudWatch, S3 logs, API logs
- Infrastructure protection: Use VPC controls for networks; Use compute resource config (EC2/ECS/Beanstalk)
- Data protection: Use encryption and versioning
- Incident response:
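The least-privilege and "controls as code" points above might look like this in practice. It is a sketch under assumptions: the bucket name is made up, and the helpers `read_only_bucket_policy` and `has_wildcard_action` are hypothetical names, not part of any AWS SDK. The policy uses the standard IAM JSON policy structure and grants read-only access to a single bucket; the second helper is a simple check that could run in a review pipeline.

```python
import json


def read_only_bucket_policy(bucket: str) -> dict:
    """Build an IAM policy document granting read-only access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


def has_wildcard_action(policy: dict) -> bool:
    """Return True if any Allow statement uses '*' or 'service:*' as an action."""
    for stmt in policy["Statement"]:
        actions = stmt["Action"]
        if isinstance(actions, str):
            actions = [actions]
        if stmt["Effect"] == "Allow" and any(
            a == "*" or a.endswith(":*") for a in actions
        ):
            return True
    return False


policy = read_only_bucket_policy("example-notes-bucket")  # hypothetical bucket name
print(json.dumps(policy, indent=2))
print("wildcard actions present:", has_wildcard_action(policy))
```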
Reliability
- Ability to recover from infra, service, capacity failures
- Design Principles
- test recovery procedures:
- automate: monitor KPIs and trigger recovery
- scale horizontally: HA
- enough capacity: let the cloud ensure capacity to prevent capacity-related failures (e.g. DoS)
- change automation: changes should be automated
- Best practices
- Foundations: manage service limits (to avoid under- or over-provisioning) and network topology, using IAM and VPC
- Change mgt: elasticity (for auto scale), logs (for failure detection)
- Failure mgt: have backups, with a target MTTR (mean time to recover) and RPO (recovery point objective)
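The MTTR and RPO terms above reduce to simple arithmetic, sketched here. The function names and the sample incident timestamps are assumptions for illustration. The sketch treats the backup interval as the worst-case data loss window (a failure just before the next backup loses everything since the last one) and computes MTTR as the average of failure-to-recovery durations.

```python
from datetime import datetime, timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst-case data loss equals the backup interval, so it must not exceed the RPO."""
    return backup_interval <= rpo


def mean_time_to_recover(incidents: list[tuple[datetime, datetime]]) -> timedelta:
    """Average the (failed_at, recovered_at) durations across incidents."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((recovered - failed for failed, recovered in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident log: (failure detected, service restored)
incidents = [
    (datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 1, 0, 30)),
    (datetime(2024, 1, 2, 0, 0), datetime(2024, 1, 2, 1, 30)),
]

print("MTTR:", mean_time_to_recover(incidents))
print("hourly backups meet a 4h RPO:", meets_rpo(timedelta(hours=1), timedelta(hours=4)))
print("daily backups meet a 4h RPO:", meets_rpo(timedelta(days=1), timedelta(hours=4)))
```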
Performance Efficiency
- Using computing resources efficiently as demand and technologies change
- Design Principles
- democratize advanced technologies: let the cloud manage advanced technologies (e.g. NoSQL DB, transcoding)
- easy globalization: use cloud to expand easily
- serverless architecture: services make efficient use of resources
- experiment: try alternate resources (such as storage, compute) easily
- mechanical sympathy: pick best technology (e.g. pick right DB platform)
- Best practices
- Selection: compute/storage/database/network; choose the best tech for the need (e.g. pick fully managed DynamoDB for low latency)
- Review: read AWS blogs to constantly look for better solutions
- Monitoring: AWS CloudWatch to monitor performance
- Tradeoffs: ElastiCache, CloudFront to increase performance. Use read-replicas in RDS
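The caching tradeoff above (ElastiCache in front of a slower store) is commonly implemented as a cache-aside read path. The sketch below is an assumption-laden stand-in: an in-memory dict plays the role of the cache, `slow_lookup` is a hypothetical loader representing a database query, and the class name `CacheAside` is made up. The tradeoff is that reads get faster at the cost of serving data up to `ttl_seconds` stale.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Read-through helper: check the cache first, fall back to the loader on a miss."""

    def __init__(
        self,
        loader: Callable[[str], str],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired entry: load from the backing store and cache the result
        self.misses += 1
        value = self._loader(key)
        self._store[key] = (now + self._ttl, value)
        return value


def slow_lookup(key: str) -> str:
    """Hypothetical stand-in for a database read."""
    return f"value-for-{key}"


cache = CacheAside(slow_lookup, ttl_seconds=30.0)
cache.get("user:1")
cache.get("user:1")
print("hits:", cache.hits, "misses:", cache.misses)
```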
Cost Optimization
- Running systems to deliver business value at the lowest price point
- Design Principles
- adopt a consumption model: (vs. a capacity model), pay only for what you consume
- measure efficiency: measure business output vs. the cost to deliver it
- avoid data-center ops: focus on business customers instead of IT infrastructure
- use managed services or applications: such as databases, to reduce TCO
- Best practices
- Expenditure awareness: AWS Cost Explorer to track exact spending, AWS Budgets to set up notifications
- Cost-effective resources: choose the right resource (e.g. a compute-optimized EC2 instance might be the right fit); use spot/reserved instances
- Match supply and demand: use Auto Scaling, Lambda
- Optimize over time: stay informed about new AWS services, consult AWS Trusted Advisor
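Two of the points above, "measure efficiency" and choosing between on-demand and reserved instances, come down to small calculations, sketched here. All prices and volumes are made-up placeholders, not real AWS rates, and the function names are hypothetical. The break-even figure is the fraction of hours an instance must run before a reserved commitment (billed for every hour) becomes cheaper than paying on-demand only for hours used.

```python
def cost_per_unit(total_cost: float, units_delivered: int) -> float:
    """Business efficiency: what it costs to deliver one unit of output."""
    if units_delivered <= 0:
        raise ValueError("units_delivered must be positive")
    return total_cost / units_delivered


def break_even_utilization(on_demand_hourly: float, reserved_effective_hourly: float) -> float:
    """Fraction of hours the instance must run for reserved pricing to win.

    On-demand cost = utilization * on_demand_hourly per hour of the term;
    reserved cost  = reserved_effective_hourly for every hour of the term.
    """
    if on_demand_hourly <= 0:
        raise ValueError("on_demand_hourly must be positive")
    return reserved_effective_hourly / on_demand_hourly


# Placeholder numbers for illustration only
monthly_bill = 1200.0
orders_processed = 400
print("cost per order:", cost_per_unit(monthly_bill, orders_processed))

threshold = break_even_utilization(on_demand_hourly=0.10, reserved_effective_hourly=0.06)
print(f"reserved is cheaper above {threshold:.0%} utilization")
```

Workloads that run below the break-even utilization are better candidates for on-demand, spot, or Auto Scaling than for a reserved commitment.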