Learning More About AWS (Part 3) - Notes for Certified Developer Associate Exam
Splitting up the notes for the Certified Developer Associate Exam because I have not yet implemented lazy loading for articles.
Amazon S3 Glacier
Amazon S3 Glacier
- In addition to existing as an S3 storage class, S3 Glacier is also a separate AWS service
- Extremely low cost storage for archives and long term backups:
- Old media content
- Archives to meet regulatory requirements (old patient records, etc.)
- As a replacement for magnetic tapes
- High durability (11 9s)
- High scalability (unlimited storage)
- High Security (encrypted at rest and in transfer)
- Cannot upload objects to Glacier using Management Console
- Use REST API, AWS CLI, AWS SDK
Retrieving archives from S3 Glacier
- Asynchronous two-step process (use the REST API, AWS CLI, or SDK)
- Initiate an archive retrieval
- (After archive is available) Download the archive
- Reduce costs by optionally specifying a range, or portion, of the archive to retrieve
- Reduce costs by requesting longer access times
- Amazon S3 Glacier:
- Expedited (1-5 minutes)
- Standard (3-5 hours)
- Bulk Retrieval (5-12 hours)
- Amazon S3 Glacier Deep Archive
- Standard Retrieval (up to 12 hours)
- Bulk Retrieval (up to 48 hours)
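The trade-off between retrieval speed and cost can be sketched as a simple lookup: pick the slowest tier whose worst-case time still meets your deadline. This is plain Python, not an AWS API call. The retrieval windows are the S3 Glacier ones listed above; the idea that slower tiers are cheaper is an assumption taken from the "longer access times" note, and `pick_tier` is a hypothetical helper name.

```python
# Worst-case retrieval time in hours for each S3 Glacier tier, from the notes above.
GLACIER_TIERS = {
    "Expedited": 5 / 60,  # 1-5 minutes
    "Standard": 5,        # 3-5 hours
    "Bulk": 12,           # 5-12 hours
}


def pick_tier(deadline_hours, tiers=GLACIER_TIERS):
    """Return the slowest tier that still finishes within the deadline.

    Assumes slower tiers cost less, so the slowest acceptable tier is the cheapest.
    Returns None if no tier can meet the deadline.
    """
    candidates = [(hours, name) for name, hours in tiers.items() if hours <= deadline_hours]
    return max(candidates)[1] if candidates else None


print(pick_tier(24))   # Bulk
print(pick_tier(6))    # Standard
print(pick_tier(0.5))  # Expedited
```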
IAM - Fundamentals
Typical identity management in the cloud
- You have resources in the cloud (examples - a virtual server, a database, etc.)
- You have identities (human and non-human) that need to access those resources and perform actions
- For example: launch (stop, start, or terminate) a virtual server
- How do you identify users in the cloud?
- How do you configure resources they can access?
- How can you configure what actions to allow?
- In AWS, Identity and Access Management (IAM) provides this service
AWS Identity and Access Management (IAM)
- Authentication (is it the right user?)
- Authorization (do they have the right access?)
- Identities can be
- Provides very granular control
- Limit a single user:
- to perform a single action
- on a specific AWS resource
- from a specific IP address
- during a specific time window
Important IAM Concepts
- IAM Users: Users created in an AWS account
- Has credentials attached (name/password or access keys)
- IAM groups: Collection of IAM users
- Roles: Temporary identities
- Does NOT have credentials attached
- (Advantage) Expire after a set period of time
- Policies: Define permissions
- AWS managed policies - Standalone policy predefined by AWS
- Customer managed policies - Standalone policy created by you
- Inline Policies - Directly embedded into a user, group, or role
AWS IAM Policies
- Policy is a JSON document with one or more permissions
- Effect - Allow or Deny
- Resource - Which resource are you providing access to?
- Action - What actions are allowed on the resource?
- Condition - Are there any restrictions on IP address ranges or time intervals?
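To make the four elements concrete, here is a small sketch that builds a policy document as a Python dictionary and prints it as JSON. The bucket name and IP range are made up for illustration; the element names (`Effect`, `Action`, `Resource`, `Condition`) are the ones described above.

```python
import json

# Hypothetical bucket name and IP range, used only for illustration.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",                              # Allow or Deny
            "Action": ["s3:GetObject"],                     # what may be done
            "Resource": "arn:aws:s3:::example-bucket/*",    # on which resource
            "Condition": {                                  # under which restrictions
                "IpAddress": {"aws:SourceIp": "203.0.113.0/24"}
            },
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```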
Authentication with IAM - Remember
- IAM User identities exist until they are explicitly deleted
- IAM allows you to create a password policy
- What characters should your password contain?
- When does your password expire?
- Access keys should be rotated regularly
- Two access keys can be active simultaneously. Makes rotation of keys easier
- An IAM role can be added to already running EC2 instances. Immediately effective.
- An IAM user can assume IAM role temporarily.
- An IAM role is NOT associated with long-term credentials
- When a user, a resource (For example, an EC2 instance) or an application assumes a Role, it is provided with temporary credentials
- DO NOT use AWS IAM root user for regular everyday tasks. Lock it away after creating an admin IAM user.
- Enable Multi Factor Authentication for all important IAM operations
- Extra layer of security
- MFA Devices
- Hardware device - Gemalto
- Virtual device - An app on a smartphone
IAM Best Practices - Recommended by AWS
- Users - Create individual users
- Groups - Manage permissions with groups
- Permissions - Grant least privilege
- Auditing - Turn on AWS CloudTrail
- Password - Configure a strong password policy
- MFA - Enable MFA for privileged users
- Roles - Use IAM roles for Amazon EC2 instances
- Sharing - Use IAM roles to share access
- Rotate - Rotate security credentials regularly
- Conditions - Restrict privileged access further with conditions
- Root - Reduce or remove use of root
Data Encryption KMS and Cloud HSM
Data States
- Data at rest: Stored on a device or a backup
- Data in motion: Being transferred across a network
- Also called Data in transit
- Examples:
- Data copied from on-premise to cloud storage
- An application in a VPC talking to a database
- Two Types:
- In and out of AWS
- Within AWS
- Data in use: Active data processed in a non-persistent state
- Example: Data in your RAM
Encryption
- If you store data as is, what would happen if an unauthorized entity gets access to it?
- Imagine losing an unencrypted hard disk
- First law of security: Defense in Depth
- Typically, enterprises encrypt all data
- Is it sufficient if you encrypt data at rest?
- No. Encrypt data in transit - between application and database - as well
Symmetric Key Encryption algorithms use the same key for encryption and decryption
Asymmetric Key Encryption (Also Called Public Key Cryptography) uses two keys: public and private. Encrypt data with public key and decrypt with private key.
AWS provides two important services - KMS and Cloud HSM that allow you to manage your keys and perform encryption and decryption.
AWS KMS
- Create and manage cryptographic keys
- Control their use in your application and AWS Services
- Defined usage permissions
- Track key usage in AWS CloudTrail
- Integrated with almost all AWS services that need data encryption
- Automatically rotate master keys once a year
- No need to re-encrypt previously encrypted data (versions of master key are maintained)
- Schedule key deletion to verify if the key is used
- Mandatory minimum wait period of 7 days (max-30 days)
- The process KMS uses for encryption is called Envelope Encryption
- Data is encrypted using data key
- Data key is encrypted using Master Key
- Master Key never leaves KMS
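The envelope encryption flow can be sketched in a few lines. This is only a structural illustration: `toy_cipher` is a made-up XOR keystream that stands in for a real cipher such as AES and is NOT secure, and `MASTER_KEY` is a local stand-in for a key that, in KMS, would stay inside the service. None of this is the KMS API.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream.

    Toy stand-in for a real cipher such as AES. NOT secure; illustration only.
    Applying it twice with the same key returns the original data.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Stand-in for the master key. In KMS it would stay inside the service.
MASTER_KEY = secrets.token_bytes(32)


def encrypt_envelope(plaintext: bytes):
    data_key = secrets.token_bytes(32)                     # fresh data key
    ciphertext = toy_cipher(data_key, plaintext)           # data encrypted with the data key
    encrypted_data_key = toy_cipher(MASTER_KEY, data_key)  # data key encrypted with the master key
    return ciphertext, encrypted_data_key                  # stored together; plaintext data key is dropped


def decrypt_envelope(ciphertext: bytes, encrypted_data_key: bytes) -> bytes:
    data_key = toy_cipher(MASTER_KEY, encrypted_data_key)  # unwrap the data key first
    return toy_cipher(data_key, ciphertext)


ciphertext, wrapped_key = encrypt_envelope(b"patient record 42")
print(decrypt_envelope(ciphertext, wrapped_key))  # b'patient record 42'
```

The point of the design is that only the small data key ever needs to be sent to the key service, while the bulk data is encrypted locally.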
AWS CloudHSM
- Managed (highly available and auto scaling) dedicated single-tenant Hardware Security Module (HSM) for regulatory compliance
- AWS CANNOT access your encryption master keys in CloudHSM
- In KMS, AWS can access your master keys
- Be ultra safe with your keys when you are using CloudHSM
- (Recommendation) Use two or more HSMs in separate AZs in a production cluster
- Recommended to use HTTPS endpoints to ensure encryption of data in transit
- All AWS services (including S3) provide HTTPS endpoints
- Encryption is optional with S3 but highly recommended in flight and at rest
Networking
Need for Amazon VPC
- In a corporate network or an on-premises data center:
- Can anyone on the internet see the data exchange between the application and the database?
- No
- Can anyone from internet directly connect to your database?
- Typically NO
- You need to connect to your corporate network and then access your applications or databases
- Corporate network provides a secure internal network protecting your resources, data, and communication from external users
- How do you create your own private network in the cloud?
- Enter Virtual Private Cloud (VPC)
Amazon VPC (Virtual Private Cloud)
- Your own isolated network in AWS cloud
- Network traffic within a VPC is isolated (not visible) from all other Amazon VPCs
- You control all the traffic coming in and going outside a VPC
- (Best Practice) Create all your AWS resources (compute, storage, databases etc.) within a VPC
- Secure resources from unauthorized access AND
- Enable secure communication between your cloud resources
Need for VPC Subnets
- Different resources are created on cloud - databases, compute (EC2) etc
- Each type of resource has its own access needs
- Public Elastic Load Balancers are accessible from the internet (public resources)
- Databases or EC2 instances should NOT be accessible from internet
- ONLY applications within your network (VPC) should be able to access them (private resources)
- How do you separate public resources from private resources inside a VPC?
VPC Subnets
- (Solution) Create different subnets for public and private resources
- Resources in a public subnet CAN be accessed from the internet
- Resources in a private subnet CANNOT be accessed from internet
- BUT resources in public subnet can talk to resources in private subnet
- Each VPC is created in a Region
- Each Subnet is created in an Availability Zone
Routing on the internet
- You have an IP address of a website you want to visit
- There is no direct connection from your computer to the website
- Internet is actually a set of routers routing traffic
- Each router has a set of rules that help it decide the path to the destination IP address
- In AWS, route tables are used for routing
- Each route table consists of a set of rules called routes
- Each route or routing rule has a destination and target
- What range of addresses should be routed to which target resource?
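A route lookup can be sketched with the standard `ipaddress` module. The route table below is a made-up example (the `igw-0example` target ID is hypothetical), and the sketch assumes the usual rule that the most specific matching route wins.

```python
import ipaddress

# (destination CIDR, target) pairs; the target IDs are made up for illustration.
ROUTE_TABLE = [
    ("10.0.0.0/16", "local"),
    ("0.0.0.0/0", "igw-0example"),
]


def pick_target(route_table, destination_ip):
    """Return the target of the most specific route that covers destination_ip."""
    address = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table
        if address in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # Longest prefix wins: /16 is more specific than /0.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


print(pick_target(ROUTE_TABLE, "10.0.1.5"))  # local
print(pick_target(ROUTE_TABLE, "8.8.8.8"))   # igw-0example
```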
Public Subnet vs Private Subnet
- Public Subnet
- Communication allowed from subnet to internet
- Communication allowed from internet to subnet
- Private Subnet
- Communication NOT allowed from internet to subnet
- An internet gateway enables internet communication for subnets
- Any subnet which has a route to an internet gateway is called a public subnet
- Any subnet which DOES NOT have route to an internet gateway is called a private subnet
Network Address Translation (NAT) Instance and Gateway
- How do you allow instances in a private subnet to download software updates and security patches while denying inbound traffic from internet?
- How do you allow instances in a private subnet to connect privately to other AWS Services outside the VPC?
- Three Options:
- NAT Gateway: Managed Service
- NAT Instance: Launch an EC2 instance with a specific NAT AMI and configure it as a gateway
- Egress-Only Internet Gateways: For IPv6 subnets
Network Access Control List
- Security groups control traffic to a specific resource in a subnet
- How about stopping traffic from even entering the subnet?
- NACL provides stateless firewall at subnet level
- Each subnet must be associated with a NACL
- Default NACL allows all inbound and outbound traffic
- Custom created NACL denies all inbound and outbound traffic by default
- Rules have a priority number
- Lower number => Higher Priority
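Rule priority can be illustrated with a toy evaluator: rules are checked in ascending number order, the first match decides, and anything unmatched is denied. The `Rule` class, the rule numbers, and the addresses are all hypothetical; real NACL rules also carry protocol and port ranges, which this sketch leaves out.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Rule:
    number: int   # lower number = higher priority
    cidr: str     # source range the rule applies to
    port: int     # single port, to keep the sketch small
    allow: bool


def evaluate(rules, source_ip, port):
    """Return 'ALLOW' or 'DENY' for inbound traffic, first matching rule wins."""
    address = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port == port and address in ipaddress.ip_network(rule.cidr):
            return "ALLOW" if rule.allow else "DENY"
    return "DENY"  # nothing matched: fall through to the implicit deny


RULES = [
    Rule(100, "0.0.0.0/0", 443, allow=True),
    Rule(90, "198.51.100.0/24", 443, allow=False),  # checked before rule 100
]

print(evaluate(RULES, "198.51.100.7", 443))  # DENY
print(evaluate(RULES, "203.0.113.9", 443))   # ALLOW
print(evaluate(RULES, "203.0.113.9", 22))    # DENY
```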
VPC Flow Logs
- Monitor Network Traffic
- Troubleshoot connectivity issues (NACL and/or security groups misconfiguration)
- Capture traffic going in and out of your VPC (network interfaces)
- Can be created for
- a VPC
- a subnet
- Publish logs to Amazon CloudWatch Logs or Amazon S3
- Flow Log records contain ACCEPT or REJECT
- Is traffic permitted by security groups or network ACLs?
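A flow log record in the default format is a space-separated line, so it can be split into named fields. The sample line below is made up; the field order is my understanding of the default (version 2) format, so check it against the records you actually receive.

```python
# Field names of the default flow log format, in order (assumed version 2 layout).
FIELDS = [
    "version", "account-id", "interface-id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log-status",
]


def parse_flow_log(line):
    """Split one default-format flow log record into a dict of strings."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


# Made-up record: an SSH attempt (port 22, protocol 6 = TCP) that was rejected.
SAMPLE = "2 123456789012 eni-0example 203.0.113.12 10.0.1.5 49152 22 6 10 840 1700000000 1700000060 REJECT OK"

record = parse_flow_log(SAMPLE)
print(record["srcaddr"], record["dstport"], record["action"])  # 203.0.113.12 22 REJECT
```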
VPC Peering
- Connect VPCs belonging to same or different AWS accounts irrespective of the region of the VPCs
- Allows private communication between the connected VPCs
- Peering uses a request / accept protocol
- Owner of requesting VPC sends a request
- Owner of the peer VPC has one week to accept
VPC - Review
- VPC: Virtual Network to protect resources and communication from outside world
- Subnet: Separate private resources from public resources
- Internet Gateway: Allows public subnets to send traffic to and accept traffic from the internet
- NAT Gateway: Allow internet traffic from private subnets
- VPC Peering: Connect one VPC with other VPCs
- VPC Flow Logs: Enable logs to debug problems
- AWS Direct Connect: Private pipe from AWS to on-premises
- AWS VPN: Encrypted (IPsec) tunnel over internet to on-premises
Database Fundamentals
Databases Primer
- Databases provide organized and persistent storage for your data
- To choose between different database types, we would need to understand
- Availability
- Durability
- RTO
- RPO
- Consistency
- Transactions, etc.
- Availability
- Will I be able to access my data now and when I need it?
- Percentage of time an application provides the operations expected of it
- Durability
- Will my data be available after 10 or 100 or 1000 years?
- Typically, an availability of four 9's is considered very good
- Typically, a durability of eleven 9's is considered very good
- Meaning: if you store ten million files, you would expect to lose one file every 10,000 years on average
- Increasing Availability
- Having multiple standbys available
- in multiple AZs
- in multiple Regions
- Increasing Durability
- Multiple copies of data (standbys, snapshots, transaction logs and replicas)
- in multiple AZs
- in multiple Regions
- Replicating data comes with its own challenges!
- RPO (Recovery Point Objective): Maximum acceptable period of data loss
- RTO (Recovery Time Objective): Maximum acceptable downtime
- Vertically scale the database - increase CPU and memory
- create a database cluster - typically database clusters are expensive to setup
- Create read replicas - Run read only applications against read replicas
- Strong Consistency - Synchronous replication to all replicas
- Will be slow if you have multiple replicas or standbys
- Eventual Consistency - Asynchronous replication. A little lag - few seconds - before the change is available in all replicas
- In the intermediate period, different replicas might return different values
- Used when scalability is more important than data integrity
- Read-after-Write consistency - Inserts are immediately available and deletes are eventually consistent
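The "number of 9s" figures for availability and durability become easier to reason about with a little arithmetic. The sketch below assumes that N nines of availability means a fraction of 10^-N of the year is downtime, and that N nines of durability means each object has a 10^-N chance of being lost per year. Both function names are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(nines):
    """Allowed downtime per year for an availability with the given number of 9s."""
    return 10.0 ** -nines * MINUTES_PER_YEAR


def expected_objects_lost(objects, years, nines):
    """Expected number of lost objects, assuming an annual loss chance of 10^-nines per object."""
    return objects * years * 10.0 ** -nines


print(round(downtime_minutes_per_year(4), 2))                    # 52.56 minutes for four 9s
print(round(expected_objects_lost(10_000_000, 10_000, 11), 2))   # 1.0 object for eleven 9s
```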
Database Categories
- There are several categories of databases:
- Relational (OLTP and OLAP), Document, Key Value, Graph, In Memory among others
- Choosing the type of database for your use case is not easy. A few factors:
- Do you want a fixed schema?
- Do you want flexibility in defining and changing your schema (schema-less)?
- What level of transaction properties do you need? (atomicity and consistency)
- What kind of latency do you want? (seconds, milliseconds, or microseconds)
- How many transactions do you expect? (hundreds or thousands or millions of transactions per second)
- How much data will be stored? (MBs or GBs or TBs or PBs)
Relational Databases
- Predefined schema with tables and relationships
- Very strong transactional capabilities
- Used for
- OLTP (Online Transaction Processing) use cases and
- OLAP (Online Analytical Processing) use cases
Relational Database - Online Transaction Processing
- Applications where large number of users make large number of small transactions
- Recommended AWS Managed Service: Amazon RDS
Relational Database - Online Analytical Processing
- Applications allowing users to analyze petabytes of data
- Recommended AWS Managed Service: Amazon Redshift
OLTP databases use row storage. OLAP databases use columnar storage.
Document Database
- Structure data the way your application needs it
- Recommended AWS Managed Service: Amazon DynamoDB
Key-Value
- Recommended AWS managed service: Amazon DynamoDB again
Graph
- Store and navigate data with complex relationships
- Recommended AWS Managed Service: Amazon Neptune
In-memory Databases
- Retrieving data from memory is much faster than retrieving data from disk
- Recommended AWS Service: Amazon ElastiCache
Amazon RDS
- AWS is responsible for
- Availability
- Durability
- Scaling (according to your configuration)
- Maintenance (patches)
- Backups
- Multi-AZ makes maintenance easy
- Standby in a different AZ
- Synchronous replication (strong consistency)
- Perform maintenance on standby
- Promote standby to primary
- Perform maintenance on (old) primary
- Amazon Aurora
- MySQL and PostgreSQL compatible
- 2 copies of data each in a minimum of 3 AZs
RDS - Scaling
- Vertical Scaling: Change DB instance type and scale storage
- Storage and compute changes are typically applied during maintenance window
- You can choose to "apply immediately"
- RDS would take care of data migration
- Vertical Scaling: RDS also supports auto scaling storage
- Horizontal Scaling
- Configure read replicas
RDS - Security and Encryption
- Create in a VPC private subnet
- Use security groups to control access
- Option to use IAM Authentication with PostgreSQL
- Enable encryption with Keys from KMS
- When encryption is enabled
- Data in the database, automated backups, read replicas and snapshots are all encrypted
- Data in-flight encryption
- Using SSL certificates
RDS Costs - Key Elements
- DB instance hours - How many hours is the DB instance running?
- Storage (per GB per month) - How much storage have you provisioned for your DB instance?
- Provisioned IOPS per month - If you are using Amazon RDS Provisioned IOPS (SSD) Storage
- Backups and snapshot storage - More backups, More snapshots => More cost
- Data transfer costs
Amazon DynamoDB
- Fast, scalable, distributed for any scale
- Flexible NoSQL Key-value & document database (schema-less)
- Single-digit millisecond responses for millions of TPS
- Do not worry about scaling, availability or durability
- Automatically partitions data as it grows
- Maintains 3 replicas within the same region
- No need to provision a database
- Create a table and configure read and write capacity (RCU and WCU)
- Automatically scales to meet your RCU and WCU
- Provides an expensive serverless mode
- Use cases: User profiles, shopping carts, high volume read and write applications
DynamoDB Tables
- Hierarchy: Table -> item(s) -> attribute (key value pair)
- Mandatory primary key
- Other than the primary key, tables are schema less
- No need to define the other attributes or types
- Each item can have distinct attributes
Data Types
- Scalar (one value) - String, Number, Binary (base 64 encoded), Boolean (true or false) and null (unknown or undefined state)
- Document (List and Map)
- Supports complex JSON structures
- Set (multiple values) - String set, Number set, and Binary set
- All elements of the same scalar type
- Each value within a set must be unique
- Order is not important
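The data types map naturally onto Python values. The sketch below converts a Python value into DynamoDB's typed JSON notation (`S`, `N`, `B`, `BOOL`, `NULL`, `L`, `M`, `SS`, `NS`). It is a simplified, hand-written converter and not the SDK's serializer; `to_attribute_value` is a hypothetical name, binary sets are left out, and the sample item is made up.

```python
import base64


def to_attribute_value(value):
    """Convert a Python value to DynamoDB-style typed JSON (simplified sketch)."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # check bool before numbers: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": base64.b64encode(value).decode("ascii")}
    if isinstance(value, (set, frozenset)):
        items = sorted(value)  # order is not important, sorting just makes output stable
        if items and all(isinstance(v, str) for v in items):
            return {"SS": items}
        if items and all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in items):
            return {"NS": [str(v) for v in items]}
        raise TypeError("sets must be non-empty and hold a single scalar type")
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {key: to_attribute_value(v) for key, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Made-up item: only the primary key (user_id here) would be mandatory.
item = {
    "user_id": "u-123",
    "age": 30,
    "active": True,
    "tags": {"admin", "beta"},
    "address": {"city": "Chicago", "zip": None},
}
encoded = {name: to_attribute_value(v) for name, v in item.items()}
print(encoded["tags"])     # {'SS': ['admin', 'beta']}
print(encoded["address"])  # {'M': {'city': {'S': 'Chicago'}, 'zip': {'NULL': True}}}
```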
I don't plan on using a NoSQL database - except in the case of Elasticsearch - so I will come back and look at DynamoDB in depth if I need to use it.
Decoupling Applications with SQS, SNS and MQ
Synchronous Communication
- Applications on your web server make synchronous calls to the logging service
- What if your logging service goes down?
- Will your applications go down too?
- What if all of a sudden, there is a high load and there are a lot of logs coming in?
- Log Service is not able to handle the load and goes down very often
Asynchronous Communication
- Create a queue or a topic
- Your applications put the logs on the queue
- They would be picked up when the logging service is ready
- Good example of decoupling
Asynchronous Communication - Pull Model - SQS
- Producers put messages on the queue
- Consumers poll on the queue
- Only one of the consumers will successfully process a given message
- Scalability
- Scale consumer instances under high load
- Availability
- Producer up even if a consumer is down
- Reliability
- Work is not lost due to insufficient resources
- Decoupling
- Make changes to consumers without producers having to worry about them
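The pull model can be imitated with an in-memory queue. This is a toy model of the pattern, not the SQS API: `ToyQueue` and the consumer names are made up, and it ignores visibility timeouts and retries. What it shows is that producers only talk to the queue, and each message is handed to exactly one of the competing consumers.

```python
from collections import deque


class ToyQueue:
    """In-memory stand-in for a message queue."""

    def __init__(self):
        self._messages = deque()

    def send(self, body):
        self._messages.append(body)

    def receive(self):
        """Hand out the next message, or None when the queue is empty."""
        return self._messages.popleft() if self._messages else None


queue = ToyQueue()

# Producer side: put log lines on the queue and move on.
for i in range(4):
    queue.send(f"log-{i}")

# Consumer side: two consumers take turns polling until the queue is drained.
processed = {"consumer-a": [], "consumer-b": []}
consumers = list(processed)
turn = 0
while (message := queue.receive()) is not None:
    processed[consumers[turn % len(consumers)]].append(message)
    turn += 1

print(processed)  # {'consumer-a': ['log-0', 'log-2'], 'consumer-b': ['log-1', 'log-3']}
```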
Asynchronous Communication - Push Model - SNS
- Subscribers subscribe to a topic
- Producers send notifications to a topic
- Notification sent to all subscribers
- Decoupling
- Producers don't care about who is listening
- Availability
- Producer up even if a subscriber is down
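The push model can be imitated the same way. Again this is a toy model and not the SNS API: `ToyTopic` is a made-up class, and subscribers are plain Python callables. Every subscriber receives every notification, and the publisher never needs to know who is listening.

```python
class ToyTopic:
    """In-memory stand-in for a pub-sub topic."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        # Push the same message to every subscriber.
        for callback in self._subscribers:
            callback(message)


topic = ToyTopic()
email_inbox, sms_inbox = [], []
topic.subscribe(email_inbox.append)
topic.subscribe(sms_inbox.append)

topic.publish("disk usage above 90%")

print(email_inbox)  # ['disk usage above 90%']
print(sms_inbox)    # ['disk usage above 90%']
```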
Amazon Simple Queue Service (SQS)
- Reliable, scalable, fully-managed message queueing service
- High availability
- Unlimited scaling
- Auto scale to process billions of messages per day
- Low cost (pay for use)
Standard and FIFO Queues
- Standard Queue
- Unlimited throughput
- BUT NO guarantee of ordering (Best Effort Ordering)
- and NO guarantee of exactly once processing
- Guarantees at-least-once delivery (some messages can be processed twice)
- FIFO (first-in-first-out) Queue
- First-In-First-Out Delivery
- Exactly-Once Processing
- BUT throughput is lower
- Up to 300 messages per second (300 send, receive, or delete operations per second)
- If you batch 10 messages per operation (maximum), up to 3,000 messages per second
- Choose
- Standard SQS queue if throughput is important
- FIFO queue if order of events is important
- Simple Queuing Service Lifecycle of a Message
Amazon Simple Notification Service (SNS)
- Publish-Subscribe (pub-sub) paradigm
- Broadcast asynchronous event notifications
- Simple process
- Create an SNS topic
- Subscribers can register for a topic
- When a SNS Topic receives an event notification (from publisher), it is broadcast to all Subscribers
- Use Cases: Monitoring Apps, workflow systems, mobile apps
- Provide mobile and enterprise messaging web services
- Push notifications to Apple, Android, FireOS, Windows devices
- Send SMS to mobile users
- Send Emails
- REMEMBER: SNS does not need SQS or a Queue
- You can allow access to other AWS accounts using AWS SNS generated policy
Amazon MQ
- Managed message broker service for Apache ActiveMQ
- (Functionally) Amazon MQ = Amazon SQS (Queues) + Amazon SNS (Topics)
- But with restricted scalability
- Supports traditional APIs (JMS) and protocols (AMQP, MQTT, OpenWire, and STOMP)
- Easy to migrate on-premises applications using traditional message brokers
- Start with Amazon MQ as first step and slowly re-design apps to use Amazon SQS and/or SNS
Handling Data Streams
Streaming Data
- Characteristics of streaming data:
- Continuously generated
- Small pieces of data
- Sequenced - mostly associated with time
Amazon Kinesis
- Handle streaming data
- NOT recommended for ETL batch jobs
- Amazon Kinesis Data Streams
- Process Data Streams
- Amazon Kinesis Firehose
- Data ingestion for streaming data: S3, Elasticsearch, etc.
- Amazon Kinesis Analytics
- Run queries against streaming data
- Amazon Kinesis Video Streams
- Monitor video streams
Amazon Kinesis Data Streams
Routing and Content Delivery
Content Delivery Network
- You want to deliver content to your global audience
- Content Delivery Networks distribute content to multiple edge locations around the world
- AWS provides 200+ edge locations around the world
- Provides high availability and performance
Amazon CloudFront
- How do you enable serving content directly from AWS edge locations?
- Amazon CloudFront (one of the options)
- Serve users from nearest edge location (based on user location)
- Source content can be from S3, EC2, ELB, and External websites
- If content is not available at the edge location, it is retrieved from the origin server and cached
- No minimum usage commitment
- Provides features to protect your private content.
- Use Cases
- Static web apps. Audio, video, and software downloads. Dynamic web apps
- Support media streaming with HTTP and RTMP
- Integrates with
- AWS Shield to protect from DDoS attacks
- AWS Web Application Firewall (WAF) to protect from SQL injection, cross-site scripting, etc.
- Cost Benefits
- Zero cost for data transfer between S3 and CloudFront
- Reduce compute workload for your EC2 instance
Amazon CloudFront Distribution
- Create a CloudFront distribution to distribute your content to edge locations
- DNS domain name - example: abc.cloudfront.net
- Origins - Where do you get content from? S3, EC2, ELB, External Website
- Cache-Control
- By default, objects expire after 24 hours
- Customize min, max, default TTL in CloudFront distributions
- (For file level customization) Use Cache-Control max-age and Expires headers in origin server
- You can configure CloudFront to only use HTTPS (or) use HTTPS for certain objects
- Default is to support both HTTP and HTTPS
- You can configure CloudFront to redirect HTTP to HTTPS
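The interaction between the origin's `Cache-Control: max-age` header and the distribution's min, max, and default TTL can be sketched as a clamp. This is a simplified model and not CloudFront's full algorithm (it ignores `s-maxage`, `Expires`, `no-cache`, and similar directives). The defaults of 0, 86400 (24 hours), and 31536000 seconds are assumed, and `effective_ttl` is a hypothetical helper.

```python
import re


def effective_ttl(cache_control, min_ttl=0, default_ttl=86400, max_ttl=31536000):
    """Seconds an object stays cached in this simplified model.

    No max-age from the origin: use the default TTL.
    Otherwise: clamp max-age between the minimum and maximum TTL.
    """
    match = re.search(r"max-age=(\d+)", cache_control or "")
    if match is None:
        return default_ttl
    return max(min_ttl, min(int(match.group(1)), max_ttl))


print(effective_ttl(None))                        # 86400
print(effective_ttl("public, max-age=60"))        # 60
print(effective_ttl("max-age=60", min_ttl=300))   # 300
```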
Store Static content in S3 and use CloudFront to reduce latency
Amazon CloudFront - Cache Behaviors
- Configure different CloudFront behavior for different URL path patterns from the same origin
- Path pattern (can use wild cards - *.php, *.jsp)
- Do you want to forward query strings?
- Should we use https?
- TTL
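Cache behaviors can be pictured as an ordered list of path patterns where the first match decides the settings and `*` acts as the catch-all default. The sketch uses Python's `fnmatchcase` as a stand-in for CloudFront's own pattern matching, which may differ in details. The patterns and settings are made up.

```python
from fnmatch import fnmatchcase

# Ordered list of behaviors; values are hypothetical.
BEHAVIORS = [
    {"path_pattern": "*.php", "forward_query_strings": True, "ttl": 0},
    {"path_pattern": "/images/*", "forward_query_strings": False, "ttl": 86400},
    {"path_pattern": "*", "forward_query_strings": False, "ttl": 3600},  # default behavior
]


def match_behavior(path, behaviors=BEHAVIORS):
    """Return the first behavior whose path pattern matches the request path."""
    for behavior in behaviors:
        if fnmatchcase(path, behavior["path_pattern"]):
            return behavior
    return None


print(match_behavior("/app/index.php")["ttl"])    # 0
print(match_behavior("/images/logo.png")["ttl"])  # 86400
print(match_behavior("/about")["ttl"])            # 3600
```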
Amazon CloudFront - Private Content
- Signed URLs
- Signed cookies using key pairs
- Origin Access Identities (OAI)
- Ensures that only CloudFront can access S3
- Allow access to S3 only to a special CloudFront user
Amazon CloudFront - Signed URLs and Cookies
- Signed URLs
- RTMP distribution
- Application downloads (individual files) and
- Situations where cookies are not supported
- Signed Cookies
- Multiple files (You have a subscriber website)
- Does not need any change in application URLs
Amazon CloudFront - Origin Access Identities (OAI)
- Only CloudFront can access S3
- Create a special CloudFront user - Origin Access Identities (OAI)
- Associate OAI with CloudFront distribution
- Create a S3 Bucket Policy allowing access to OAI
Amazon CloudFront - Remember
- Old content automatically expires from CloudFront
- Invalidation API - remove object from cache
- REMEMBER: Designed for use in emergencies
- Best Practice - Use versioning in object path name
- Example: /images/profile?version=1
- Prevents the need to invalidate content
- Do not use CloudFront for
- all requests from single location
- all requests from corporate VPN
Route 53
- Route 53 = Domain Registrar + DNS
- Buy your domain name
- Setup your DNS routing for domain name
- Look into Route53 as a DNS provider. Look into the differences of speeds between DNS providers
DevOps
Getting better at the Three Great Elements of Software Teams - Communication, Feedback, and Automation
- AWS CodeDeploy - Automate deployment
- AWS CodeBuild - Fully managed build service in AWS
- Provides preconfigured build environments (Docker images) for popular programming languages
I think the main thing that I would want to look into here is IAC (Infrastructure as Code)
Infrastructure as Code
- Treat infrastructure the same way as application code
- Track your infrastructure changes over time (version control)
- Bring repeatability into your infrastructure
- Two Key Parts
- Infrastructure Provisioning
- Provisioning compute, database, storage, and networking
- Open source, cloud neutral - Terraform
- AWS CloudFormation
- Provision AWS Resources
- AWS SAM (Serverless Application Model)
- Provision Serverless Resources
- Configuration Management
- Install right software tools on the provisioned resources
- Open source tools - Chef, Puppet, Ansible
AWS CloudFormation
- Let's consider an example:
- I would want to create a new VPC and a subnet
- I want to provision an ELB and an ASG with 5 EC2 instances and an RDS database
- I would want to setup the right security groups
- AND I would want to create 4 environments
- Dev, QA, Stage, and Production!
- CloudFormation can help you do all these with a simple (actually NOT so simple) script!
- Advantages (Infrastructure as Code - IAC and CloudFormation)
- Automate deployment of AWS resources in a controlled, predictable way
- Avoid mistakes with manual configuration
- Think of it as version control for your environments
AWS CloudFormation
- All configuration is defined in a simple text file - JSON or YAML
- I want a VPC, a subnet, a database, and ...
- CloudFormation understands dependencies
- Creates VPCs first, then subnets, and then the database
- (Default) Automatic rollbacks on errors (Easier to retry)
- If creation of database fails, it would automatically delete the subnet and VPC
- Version control your configuration file and make changes to it over time
- Free to use - Pay only for the resources provisioned
- Get an automated estimate for your configuration
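Dependency ordering and rollback can be illustrated with a topological sort from the standard library (`graphlib`, Python 3.9+). This is a conceptual sketch of the idea and not how CloudFormation is implemented. The resource names and the `deploy` helper are hypothetical, and the status strings are used only as labels.

```python
from graphlib import TopologicalSorter

# resource -> resources it depends on (hypothetical names)
DEPENDS_ON = {
    "Vpc": set(),
    "Subnet": {"Vpc"},
    "Database": {"Subnet"},
}

# Dependencies come first: VPC, then subnet, then database.
creation_order = list(TopologicalSorter(DEPENDS_ON).static_order())


def deploy(order, failing=None):
    """Create resources in order; on a failure, delete what exists, newest first."""
    created = []
    for name in order:
        if name == failing:
            return {"status": "ROLLBACK_COMPLETE", "deleted": created[::-1]}
        created.append(name)
    return {"status": "CREATE_COMPLETE", "created": created}


print(creation_order)                              # ['Vpc', 'Subnet', 'Database']
print(deploy(creation_order, failing="Database"))  # {'status': 'ROLLBACK_COMPLETE', 'deleted': ['Subnet', 'Vpc']}
```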
AWS CloudFormation - Terminology
- Template
- A CloudFormation JSON or YAML file defining multiple resources
- Stack
- A group of resources that are created from a CloudFormation template
- Change Sets
- To make changes to a stack, update the template
- Change set shows what would change if you execute
- Allows you to verify the changes and then execute
AWS CloudFormation - Important Template Elements
{
  "AWSTemplateFormatVersion": "version date",
  "Description": "JSON string",
  "Metadata": {},
  "Parameters": {},
  "Mappings": {},
  "Resources": {},
  "Outputs": {}
}
- Resources - what do you want to create?
- One and only mandatory element
- Parameters - Values to pass to your template at runtime
- Which EC2 instance to create? ("t2.micro", "m1.small", "m1.large")
- Mappings - Key value pairs
- Example: Configure different values for different regions
- Outputs - Return values from execution
- See them on console and use in automation
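Here is how those elements might fit together in one small template, built as a Python dictionary and printed as JSON. It is an illustrative sketch only: the logical names (`InstanceTypeParam`, `RegionToAmi`, `WebServer`) and the AMI IDs are made up, and the template has not been deployed.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Minimal illustrative template",
    # Parameters: values passed in at runtime
    "Parameters": {
        "InstanceTypeParam": {
            "Type": "String",
            "Default": "t2.micro",
            "AllowedValues": ["t2.micro", "m1.small", "m1.large"],
        }
    },
    # Mappings: different values for different regions (placeholder AMI IDs)
    "Mappings": {
        "RegionToAmi": {
            "us-east-1": {"Ami": "ami-00000000000000001"},
            "eu-west-1": {"Ami": "ami-00000000000000002"},
        }
    },
    # Resources: the only mandatory section
    "Resources": {
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "InstanceType": {"Ref": "InstanceTypeParam"},
                "ImageId": {"Fn::FindInMap": ["RegionToAmi", {"Ref": "AWS::Region"}, "Ami"]},
            },
        }
    },
    # Outputs: values returned after execution
    "Outputs": {"InstanceId": {"Value": {"Ref": "WebServer"}}},
}

template_json = json.dumps(template, indent=2)
print(template_json)
```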
AWS CloudFormation - Resources
- The only mandatory section in the template
- Contains the list of resource objects to be created
- Each resource has different attributes (mandatory & optional)
- ImageId attribute for an EC2 instance resource
- Specified under Properties
- Type attribute specifies the type of the resource
- Format for type attribute is: (AWS::ProductIdentifier::ResourceType)
AWS CloudFormation - Parameters
- Parameters make the template dynamic
- You can define constraints on parameters - AllowedPattern, AllowedValues, MaxLength, MaxValue, MinLength, MinValue, etc.
- Type is Mandatory (String, Number etc.)
- Can be AWS-specific parameter type
AWS CloudFormation - Pseudo Parameters
- Parameters predefined by AWS CloudFormation
Common Resource Attributes - CreationPolicy
- CreationPolicy: When is the creation of a resource complete?
- AutoScalingCreationPolicy: How many instances in ASG should be ready?
- ResourceSignal: No of signals and max wait time
- Used with Amazon EC2 and Auto Scaling resources
- (Alternative) For coordination with external configuration actions use WaitCondition
Common Resource Attributes - Others
- Condition can be attached to a resource or output section
- Based on the condition: Resource or output is created
AWS CloudFormation - Conditions
AWS CloudFormation - Mappings
- Matches a key to the set of values (can contain one or multiple values)
AWS CloudFormation - Outputs
- Export values from templates for later use
- Maximum of 60 outputs in a template
- Can be used to create cross stack reference by exporting the value
- CloudFormation does not hide or encrypt the output section
- If you export password it will be visible
...
AWS CloudFormation - Remember
- Deleting a stack deletes all the associated resources
- EXCEPT for resources with DeletionPolicy attribute set to "Retain"
- You can enable termination protection for the entire stack
- You can execute CloudFormation templates from AWS CLI:
aws cloudformation create-stack/list-stacks/describe-stacks
- Python helper scripts simplify deployment on EC2 instances:
- cfn-init: Retrieve resource metadata, install packages, etc.
- cfn-signal: Synchronize with other resources in the stack by signaling a CreationPolicy or WaitCondition
- cfn-get-metadata: Retrieve metadata for a resource or the path to a specific key
- cfn-hup: Check for updates to metadata and execute custom hooks
Actually, not sure if I want to use this service. It seems like it would only complicate things.
Serverless Application Model SAM
- Serverless projects can become a maintenance headache
- 1000s of Lambda functions to manage, versioning, deployment, etc.
- How to set up serverless projects with Lambda, DynamoDB on your local machine?
- How to ensure that your serverless projects are adhering to best practices?
- Welcome SAM - Serverless Application Model
- Infrastructure as Code (IAC) for Serverless Applications
Serverless Application Model - Approach and Advantages
- Open source framework for building serverless applications
- Define YAML file with resources (Functions, APIs, Databases...)
- (BEHIND THE SCENES) SAM config => CloudFormation scripts
- Benefits of SAM
- Single deployment configuration
- Extends CloudFormation and hides complexity
- Built-in best practices
- Local debugging and testing
- Benefits of IAC (Infrastructure as Code)
- No manual errors, Version control, Avoid configuration drift
EC2 - Advanced
Vertical Scaling
- Deploying application/database to bigger instance:
- A larger hard drive
- A faster CPU
- More RAM, CPU, I/O, or networking capabilities
- There are limits to vertical scaling
Horizontal Scaling
- Deploying multiple instances of application/database
- (Typically but not always) Horizontal Scaling is preferred to Vertical Scaling
- Vertical Scaling has limits
- Vertical scaling can be expensive
- Horizontal scaling increases availability
- (BUT) Horizontal Scaling needs additional infrastructure
- Load Balancers etc.
Horizontal Scaling for EC2
- Distribute EC2 instances
- in a single AZ
- in multiple AZs in single region
- in multiple AZs in multiple regions
- Auto scale: Auto Scaling Group
- Distribute Load: Elastic Load Balancer
EC2 Tenancy - Shared vs Dedicated
- Shared Tenancy (Default)
- Single host machine can have instances from multiple customers
- EC2 Dedicated Instances
- Virtualized instances on hardware dedicated to one customer
- You do NOT have visibility into the hardware of the underlying host
- EC2 Dedicated Hosts
- Physical servers dedicated to one customer
- You have visibility into the hardware of underlying host (sockets and physical cores)
- (Use cases) Regulatory needs or server-bound software licenses like Windows Server, SQL Server
Pricing Model | Description | Details |
---|---|---|
On Demand | Request when you want it | Flexible and Most Expensive |
Spot | Quote the maximum price | Cheapest (up to 90% off) BUT NO Guarantees |
Reserved | Reserve ahead of time | Up to 75% off. 1 or 3 years reservation. |
Savings Plan | Commit spending $X per hour (EC2 or AWS Fargate or Lambda) | Up to 66% off. No restrictions. 1 or 3 years reservation. |
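To compare the models, the "up to" discounts in the table can be applied to an on-demand price. The discounts come from the table and are best-case figures, not guarantees. The $0.10 hourly price is a made-up number and `best_case_hourly` is a hypothetical helper.

```python
# Best-case discounts relative to On Demand, taken from the table above.
MAX_DISCOUNT = {
    "On Demand": 0.00,
    "Spot": 0.90,
    "Reserved": 0.75,
    "Savings Plan": 0.66,
}


def best_case_hourly(on_demand_price, model):
    """Lowest possible hourly price for a model, given the on-demand price."""
    return round(on_demand_price * (1 - MAX_DISCOUNT[model]), 4)


ON_DEMAND_PRICE = 0.10  # hypothetical on-demand price in dollars per hour
for model in MAX_DISCOUNT:
    print(model, best_case_hourly(ON_DEMAND_PRICE, model))
```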
- EC2 On-Demand
- Ideal for:
- A web application that receives spiky traffic
- A batch program which has unpredictable runtime and cannot be interrupted
- A batch program being moved from on-premises to cloud for the first time
- EC2 Spot Instances
- Ideal for Non time-critical workloads that can tolerate interruptions (fault tolerant)
- A batch program that does not have a strict deadline AND can be stopped at short notice and re-started.
- EC2 Reserved Instances
- Reserve EC2 instances ahead of time
- EC2 Pricing Models Overview
EC2 Placement Groups
- Certain use cases need control over placement of a group of EC2 instances
- Low latency network communication
- High availability
- EC2 Placement groups:
- Cluster (low latency)
- Spread (avoid simultaneous failures)
- Partition (multiple partitions with low network latency)
- Cluster Placement Groups
- EC2 instances placed near each other in single AZ
- High Network Throughput: EC2 instances can use 10 Gbps or 25 Gbps network
Storage in Cloud - Block Storage and File Storage
Storage Types - Block Storage and File Storage
- What is the type of storage of your hard disk?
- Block Storage
- Amazon Elastic Block Store (EBS)
- Instance Store
- You've created a file share to share a set of files with your colleagues in an enterprise. What type of storage are you using?
- File storage
- Amazon EFS (for Linux instances)
- Amazon FSx Windows File Servers
- Amazon FSx for Lustre (high performance use cases)
EC2 - Block Storage
- Two popular types of block storage can be attached to EC2 instances
- Elastic Block Store
- Instance Store
- Instance stores are physically attached to the EC2 instance
- Elastic Block Store (EBS) is network storage
AWS Managed Services
PAAS vs IAAS
AWS Managed Service Offerings
- Elastic Load Balancing - Distribute incoming traffic across multiple targets
- AWS Elastic Beanstalk - Run and manage web apps
- Amazon Elastic Container Service - Container orchestration on AWS
- AWS Fargate - Serverless compute for containers
- Amazon Elastic Kubernetes Service - Run Kubernetes on AWS
- Amazon RDS - Relational Database