Learning More About AWS (Part 3) - Notes for Certified Developer Associate Exam

Splitting up the notes for the Certified Developer Associate Exam because I have not yet implemented lazy loading for articles.

Date Created:
Last Edited:

Amazon S3 Glacier


Amazon S3 Glacier

  • In addition to existing as an S3 Storage Class, S3 Glacier is a separate AWS Service on its own
  • Extremely low cost storage for archives and long term backups:
    • Old media content
    • Archives to meet regulatory requirements (old patient records, etc.)
    • As a replacement for magnetic tapes
  • High durability (11 9s)
  • High scalability (unlimited storage)
  • High Security (encrypted at rest and in transit)
  • Cannot upload objects to Glacier using Management Console
    • Use REST API, AWS CLI, AWS SDK

Retrieving archives from S3 Glacier

  • Asynchronous two step process (Use REST API, AWS CLI, or SDK)
    • Initiate an archive retrieval
    • (After archive is available) Download the archive
  • Reduce costs by optionally specifying a range, or portion, of the archive to retrieve
  • Reduce costs by requesting longer access times
    • Amazon S3 Glacier:
      • Expedited (1-5 minutes)
      • Standard (3-5 hours)
      • Bulk Retrieval (5-12 hours)
    • Amazon S3 Glacier Deep Archive
      • Standard Retrieval (up to 12 hours)
      • Bulk Retrieval (up to 48 hours)
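
A small sketch of the trade-off above: given how long you can wait, pick the slowest S3 Glacier tier that still meets the deadline, on the assumption (from these notes) that slower retrieval is cheaper. The table uses the worst-case times listed above, and `pick_tier` is a hypothetical helper, not an AWS API.

```python
# Worst-case retrieval times in hours, taken from the list above.
GLACIER_TIERS = {
    "Expedited": 5 / 60,   # 1-5 minutes
    "Standard": 5,         # 3-5 hours
    "Bulk": 12,            # 5-12 hours
}

def pick_tier(max_wait_hours, tiers=GLACIER_TIERS):
    """Return the slowest tier whose worst case fits within max_wait_hours."""
    fitting = {name: hours for name, hours in tiers.items() if hours <= max_wait_hours}
    if not fitting:
        raise ValueError("no tier can meet this deadline")
    return max(fitting, key=fitting.get)

print(pick_tier(24))   # Bulk
print(pick_tier(6))    # Standard
print(pick_tier(0.5))  # Expedited
```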


IAM - Fundamentals


Typical identity management in the cloud

  • You have resources in the cloud (examples - a virtual server, a database, etc.)
  • You have identities (human and non-human) that need to access those resources and perform actions
    • For example: launch (stop, start, or terminate) a virtual server
  • How do you identify users in the cloud?
  • How do you configure resources they can access?
  • How can you configure what actions to allow?
  • In AWS, Identity and Access Management (IAM) provides this service

AWS Identity and Access Management (IAM)

  • Authentication (is it the right user?)
  • Authorization (do they have the right access?)
  • Identities can be
  • Provides very granular control
    • Limit a single user:
      • to perform a single action
      • on a specific AWS resource
      • from a specific IP address
      • during a specific time window

Important IAM Concepts

  • IAM Users: Users created in an AWS account
    • Has credentials attached (name/password or access keys)
  • IAM groups: Collection of IAM users
  • Roles: Temporary identities
    • Does NOT have credentials attached
    • (Advantage) Expire after a set period of time
  • Policies: Define permissions
    • AWS managed policies - Standalone policy predefined by AWS
    • Customer managed policies - Standalone policy created by you
    • Inline Policies - Directly embedded into a user, group, or role

AWS IAM Policies

  • Policy is a JSON document with one or more permissions
    • Effect - Allow or Deny
    • Resource - Which resource are you providing access to?
    • Action - What actions are allowed on the resource?
    • Condition - Are there any restrictions on IP address ranges or time intervals?
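
To make the four elements concrete, here is a minimal sketch of a policy document built as a Python dict and serialized to JSON. The bucket name and IP range are made up for illustration. `Version`, `Statement`, and the `IpAddress` / `aws:SourceIp` condition are standard IAM policy syntax.

```python
import json

# Hypothetical bucket name and IP range, used only for illustration.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",                              # Allow or Deny
            "Action": ["s3:GetObject"],                     # what is allowed
            "Resource": "arn:aws:s3:::example-bucket/*",    # on which resource
            "Condition": {                                  # under which restrictions
                "IpAddress": {"aws:SourceIp": "203.0.113.0/24"}
            },
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```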

Authentication with IAM - Remember

  • IAM User identities exist until they are explicitly deleted
  • IAM allows you to create a password policy
    • What characters should your password contain?
    • When does your password expire?
  • Access keys should be rotated regularly
  • Two access keys can be active simultaneously. Makes rotation of keys easier
  • An IAM role can be added to already running EC2 instances. Immediately effective.
  • An IAM user can assume IAM role temporarily.
  • An IAM role is NOT associated with long-term credentials
    • When a user, a resource (For example, an EC2 instance) or an application assumes a Role, it is provided with temporary credentials
  • DO NOT use AWS IAM root user for regular everyday tasks. Lock it away after creating an admin IAM user.
  • Enable Multi Factor Authentication for all important IAM operations
    • Extra layer of security
    • MFA Devices
      • Hardware device - Gemalto
      • Virtual device - An app on a smartphone

IAM Best Practices - Recommended by AWS

  • Users - Create individual users
  • Groups - Manage permissions with groups
  • Permissions - Grant least privilege
  • Auditing - Turn on AWS CloudTrail
  • Password - Configure a strong password policy
  • MFA - Enable MFA for privileged users
  • Roles - Use IAM roles for Amazon EC2 instances
  • Sharing - Use IAM roles to share access
  • Rotate - Rotate security credentials regularly
  • Conditions - Restrict privileged access further with conditions
  • Root - Reduce or remove use of root

Data Encryption KMS and Cloud HSM


Data States

  • Data at rest: Stored on a device or a backup
  • Data in motion: Being transferred across a network
    • Also called Data in transit
    • Examples:
      • Data copied from on-premise to cloud storage
      • An application in a VPC talking to a database
    • Two Types:
      • In and out of AWS
      • Within AWS
  • Data in use: Active data processed in a non-persistent state
    • Example: Data in your RAM

Encryption

  • If you store data as is, what would happen if an unauthorized entity gets access to it?
    • Imagine losing an unencrypted hard disk
  • First law of security: Defense in Depth
  • Typically, enterprises encrypt all data
  • Is it sufficient if you encrypt data at rest?
    • No. Encrypt data in transit - between application and database as well

Symmetric Key Encryption algorithms use the same key for encryption and decryption

Asymmetric Key Encryption (Also Called Public Key Cryptography) uses two keys: public and private. Encrypt data with public key and decrypt with private key.

AWS provides two important services, KMS and CloudHSM, that allow you to manage your keys and perform encryption and decryption.

AWS KMS

  • Create and manage cryptographic keys
  • Control their use in your application and AWS Services
  • Defined usage permissions
  • Track key usage in AWS CloudTrail
  • Integrated with almost all AWS services that need data encryption
  • Automatically rotate master keys once a year
    • No need to re-encrypt previously encrypted data (versions of master key are maintained)
    • Schedule key deletion to verify if the key is used
      • Mandatory minimum wait period of 7 days (maximum 30 days)
  • The process KMS uses for encryption is called Envelope Encryption
    • Data is encrypted using data key
    • Data key is encrypted using Master Key
    • Master Key never leaves KMS
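
A rough sketch of the envelope encryption flow above. This does not call KMS. The `toy_cipher` function is a deliberately insecure stand-in for a real cipher such as AES, and `master_key` is a local variable standing in for the key that would stay inside KMS. Only the shape of the flow (data key encrypts data, master key encrypts data key) is the point.

```python
import hashlib
import secrets

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream. Toy stand-in for AES: NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# Stands in for the master key that would never leave KMS.
master_key = secrets.token_bytes(32)

def encrypt(plaintext: bytes):
    data_key = secrets.token_bytes(32)               # fresh data key per object
    ciphertext = toy_cipher(data_key, plaintext)     # data is encrypted with the data key
    wrapped_key = toy_cipher(master_key, data_key)   # data key is encrypted with the master key
    return ciphertext, wrapped_key                   # the plaintext data key is not kept

def decrypt(ciphertext: bytes, wrapped_key: bytes) -> bytes:
    data_key = toy_cipher(master_key, wrapped_key)   # unwrap the data key first
    return toy_cipher(data_key, ciphertext)

ciphertext, wrapped_key = encrypt(b"patient record 42")
print(decrypt(ciphertext, wrapped_key))
```

Storing the wrapped data key next to the ciphertext is what makes this an "envelope": only the small data key needs a round trip to the key service, not the data itself.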

AWS CloudHSM

  • Managed (highly available and auto scaling) dedicated single-tenant Hardware Security Module (HSM) for regulatory compliance
  • AWS CANNOT access your encryption master keys in CloudHSM
    • In KMS, AWS can access your master keys
    • Be ultra safe with your keys when you are using CloudHSM
    • (Recommendation) Use two or more HSMs in separate AZs in a production cluster

  • Recommended to use HTTPS endpoints to ensure encryption of data in transit
    • All AWS services (including S3) provide HTTPS endpoints
    • Encryption is optional with S3 but highly recommended in flight and at rest

Networking


Need for Amazon VPC

  • In a corporate network or an on-premises data center:
    • Can anyone on the internet see the data exchange between the application and the database?
      • No
    • Can anyone from internet directly connect to your database?
      • Typically NO
      • You need to connect to your corporate network and then access your applications or databases
  • Corporate network provides a secure internal network protecting your resources, data, and communication from external users
  • How do you create your own private network in the cloud?
    • Enter Virtual Private Cloud (VPC)

Amazon VPC (Virtual Private Cloud)

  • Your own isolated network in AWS cloud
    • Network traffic within a VPC is isolated (not visible) from all other Amazon VPCs
  • You control all the traffic coming in and going outside a VPC
  • (Best Practice) Create all your AWS resources (compute, storage, databases etc.) within a VPC
    • Secure resources from unauthorized access AND
    • Enable secure communication between your cloud resources

Need for VPC Subnets

  • Different resources are created on cloud - databases, compute (EC2) etc
  • Each type of resource has its own access needs
  • Public Elastic Load Balancers are accessible from the internet (public resources)
  • Databases or EC2 instances should NOT be accessible from internet
    • ONLY applications within your network (VPC) should be able to access them (private resources)
  • How do you separate public resources from private resources inside a VPC?

VPC Subnets

  • (Solution) Create different subnets for public and private resources
    • Resources in a public subnet CAN be accessed from the internet
    • Resources in a private subnet CANNOT be accessed from internet
    • BUT resources in public subnet can talk to resources in private subnet
  • Each VPC is created in a Region
  • Each Subnet is created in an Availability Zone

Routing on the internet

  • You have an IP address of a website you want to visit
  • There is no direct connection from your computer to the website
  • Internet is actually a set of routers routing traffic
  • Each router has a set of rules that help it decide the path to the destination ip address
  • In AWS, route tables are used for routing
  • Each route table consists of a set of rules called routes
    • Each route or routing rule has a destination and target
    • What range of addresses should be routed to which target resource?
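
A minimal sketch of how a route table lookup can work: each route maps a destination CIDR range to a target, and the most specific matching route wins. The route table and the `igw-example` target name below are made up for illustration.

```python
import ipaddress

# Hypothetical route table: destination CIDR -> target.
ROUTES = {
    "10.0.0.0/16": "local",
    "0.0.0.0/0": "igw-example",
}

def lookup_target(ip, routes=ROUTES):
    """Return the target of the most specific route that contains ip."""
    address = ipaddress.ip_address(ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes.items()
        if address in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    network, target = max(matches, key=lambda m: m[0].prefixlen)
    return target

print(lookup_target("10.0.1.25"))      # local
print(lookup_target("93.184.216.34"))  # igw-example
```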

Public Subnet vs Private Subnet

  • Public Subnet
    • Communication allowed from subnet to internet
    • Communication allowed from internet to subnet
  • Private Subnet
    • Communication NOT allowed from internet to subnet
  • An internet gateway enables internet communication for subnets
  • Any subnet which has a route to an internet gateway is called a public subnet
  • Any subnet which DOES NOT have route to an internet gateway is called a private subnet

Network Address Translation (NAT) Instance and Gateway

  • How do you allow instances in a private subnet to download software updates and security patches while denying inbound traffic from internet?
  • How do you allow instances in a private subnet to connect privately to other AWS Services outside the VPC?
  • Three Options:
    • NAT Gateway: Managed Service
    • NAT Instance: Launch an EC2 instance with specific NAT AMI and configure as a gateway
    • Egress-Only Internet Gateways: For IPv6 subnets

Network Access Control List

  • Security groups control traffic to a specific resource in a subnet
  • How about stopping traffic from even entering the subnet?
  • NACL provides stateless firewall at subnet level
  • Each subnet must be associated with a NACL
  • Default NACL allows all inbound and outbound traffic
  • Custom created NACL denies all inbound and outbound traffic by default
  • Rules have a priority number
    • Lower number => Higher Priority
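
A small sketch of how numbered rules are evaluated: go through them in ascending rule number and stop at the first match. If nothing matches, the traffic is denied. The rule list is invented for illustration and only looks at source IP and port.

```python
import ipaddress

# Hypothetical inbound rules: (rule number, source CIDR, port, action).
RULES = [
    (200, "0.0.0.0/0", 22, "DENY"),
    (100, "203.0.113.0/24", 22, "ALLOW"),
    (300, "0.0.0.0/0", 443, "ALLOW"),
]

def evaluate(source_ip, port, rules=RULES):
    """Check rules in ascending rule-number order; the first match decides."""
    address = ipaddress.ip_address(source_ip)
    for _, cidr, rule_port, action in sorted(rules):
        if address in ipaddress.ip_network(cidr) and port == rule_port:
            return action
    return "DENY"  # nothing matched: traffic is denied

print(evaluate("203.0.113.7", 22))    # ALLOW (rule 100 matches before rule 200)
print(evaluate("198.51.100.9", 22))   # DENY  (rule 200)
print(evaluate("198.51.100.9", 443))  # ALLOW (rule 300)
print(evaluate("198.51.100.9", 80))   # DENY  (no rule matched)
```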

VPC Flow Logs

  • Monitor Network Traffic
  • Troubleshoot connectivity issues (NACL and/or security groups misconfiguration)
  • Capture traffic going in and out of your VPC (network interfaces)
  • Can be created for
    • a VPC
    • a subnet
  • Publish logs to Amazon CloudWatch Logs or Amazon S3
  • Flow Log records contain ACCEPT or REJECT
    • Is traffic permitted by security groups or network ACLs?
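
A quick sketch for reading a flow log record, assuming the default space-separated record format (version 2 field order). The sample line uses made-up values and the `parse_record` helper is hypothetical.

```python
# Field names for the default flow log record format, in order.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]

def parse_record(line):
    """Split a space-separated flow log record into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError("unexpected number of fields")
    return dict(zip(FIELDS, values))

# Made-up sample record in the default field order.
sample = (
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
)
record = parse_record(sample)
print(record["srcaddr"], "->", record["dstaddr"], record["action"])
```

Filtering parsed records on `action == "REJECT"` is a simple way to spot traffic blocked by a security group or NACL.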

VPC Peering

  • Connect VPCs belonging to same or different AWS accounts irrespective of the region of the VPCs
  • Allows private communication between the connected VPCs
  • Peering uses a request / accept protocol
    • Owner of requesting VPC sends a request
    • Owner of the peer VPC has one week to accept

VPC - Review

  • VPC: Virtual Network to protect resources and communication from outside world
  • Subnet: Separate private resources from public resources
  • Internet Gateway: Allows Public Subnets to connect/accept traffic to/from internet
  • NAT Gateway: Allow internet traffic from private subnets
  • VPC Peering: Connect one VPC with other VPCs
  • VPC Flow Logs: Enable logs to debug problems
  • AWS Direct Connect: Private pipe from AWS to on-premises
  • AWS VPN: Encrypted (IPsec) tunnel over internet to on-premises

Database Fundamentals


Databases Primer

  • Databases provide organized and persistent storage for your data
  • To choose between different database types, we would need to understand
    • Availability
    • Durability
    • RTO
    • RPO
    • Consistency
    • Transactions, etc.
  • Availability
    • Will I be able to access my data now and when I need it?
    • Percentage of time an application provides the operations expected of it
  • Durability
    • Will my data be available after 10 or 100 or 1000 years?
  • Typically, an availability of four 9's is considered very good
  • Typically, a durability of eleven 9's is considered very good
    • meaning: If you store ten million files for ten thousand years, you would expect to lose one file
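
A back-of-the-envelope check of what eleven 9's means, assuming the durability figure is an annual per-object number, so that (1 - durability) is the chance of losing a given object in a given year. Under that assumption, ten million objects stored for ten thousand years lose about one object.

```python
def expected_losses(objects, years, durability=0.99999999999):
    """Expected number of objects lost, treating (1 - durability) as the
    yearly loss probability per object."""
    return objects * years * (1 - durability)

# Ten million objects kept for ten thousand years: roughly one object lost.
print(round(expected_losses(10_000_000, 10_000), 2))
```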

  • Increasing Availability
    • Having multiple standbys available
      • in multiple AZs
      • in multiple Regions
  • Increasing Durability
    • Multiple copies of data (standbys, snapshots, transaction logs and replicas)
      • in multiple AZs
      • in multiple Regions
  • Replicating data comes with its own challenges!
  • RPO (Recovery Point Objective): Maximum acceptable period of data loss
  • RTO (Recovery Time Objective): Maximum acceptable downtime
  • Vertically scale the database - increase CPU and memory
  • Create a database cluster - typically database clusters are expensive to set up
  • Create read replicas - Run read only applications against read replicas
  • Strong Consistency - Synchronous replication to all replicas
    • Will be slow if you have multiple replicas or standbys
  • Eventual Consistency - Asynchronous replication. A little lag - few seconds - before the change is available in all replicas
    • In the intermediate period, different replicas might return different values
    • Used when scalability is more important than data integrity
  • Read-after-Write consistency - Inserts are immediately available and deletes are eventually consistent

Database Categories

  • There are several categories of databases:
    • Relational (OLTP and OLAP), Document, Key Value, Graph, In Memory among others
  • Choosing the type of database for your use case is not easy. A few factors:
    • Do you want a fixed schema?
    • Do you want flexibility in defining and changing your schema (schema-less)?
    • What level of transaction properties do you need? (atomicity and consistency)
    • What kind of latency do you want? (seconds, milliseconds, or microseconds)
    • How many transactions do you expect? (hundreds or thousands or millions of transactions per second)
    • How much data will be stored? (MBs or GBs or TBs or PBs)

Relational Databases

  • Predefined schema with tables and relationships
  • Very strong transactional capabilities
  • Used for
    • OLTP (Online Transaction processing) use cases and
    • OLAP (Online Analytics Processing) use cases

Relational database - Online Transaction processing

  • Applications where large number of users make large number of small transactions
    • Recommended AWS Managed Service: Amazon RDS

Relational Database - Online Analytics Processing

  • Applications allowing users to analyze petabytes of data
  • Recommended AWS Managed Service: Amazon Redshift

OLTP databases use row storage. OLAP databases use columnar storage.

Document Database

  • Structure data the way your application needs it
  • Recommended AWS Managed Service: Amazon DynamoDB

Key-Value

  • Recommended AWS Managed Service: Amazon DynamoDB again

Graph

  • Store and navigate data with complex relationships
  • Recommended AWS Managed Service: Amazon Neptune

In-memory Databases

  • Retrieving data from memory is much faster than retrieving data from disk
  • Recommended AWS Service: Amazon ElastiCache

Amazon RDS

  • AWS is responsible for
    • Availability
    • Durability
    • Scaling (according to your configuration)
    • Maintenance (patches)
    • Backups
  • Multi-AZ makes maintenance easy
    • Standby in a different AZ
    • Synchronous replication (strong consistency)
    • Perform maintenance on standby
    • Promote standby to primary
    • Perform maintenance on (old) primary
  • Amazon Aurora
    • MySQL and PostgreSQL compatible
    • 2 copies of data each in a minimum of 3 AZs

RDS - Scaling

  • Vertical Scaling: Change DB instance type and scale storage
    • Storage and compute changes are typically applied during maintenance window
    • You can choose to "apply immediately"
    • RDS would take care of data migration
    • Vertical Scaling: RDS also supports auto scaling storage
  • Horizontal Scaling
    • Configure read replicas

RDS - Security and Encryption

  • Create in a VPC private subnet
  • Use security groups to control access
  • Option to use IAM Authentication with MySQL and PostgreSQL
  • Enable encryption with Keys from KMS
  • When encryption is enabled
    • Data in the database, automated backups, read replicas and snapshots are all encrypted
  • Data In-Flight Encryption
    • Using SSL certificates

RDS Costs - Key Elements

  • DB instance hours - How many hours is the DB instance running?
  • Storage (per GB per month) - How much storage have you provisioned for your DB instance?
  • Provisioned IOPS per month - If you are using Amazon RDS Provisioned IOPS (SSD) Storage
  • Backups and snapshot storage - More backups, More snapshots => More cost
  • Data transfer costs

Amazon DynamoDB


  • Fast, scalable, distributed for any scale
  • Flexible NoSQL Key-value & document database (schema-less)
  • Single-digit millisecond responses for millions of TPS
  • Do not worry about scaling, availability or durability
    • Automatically partitions data as it grows
    • Maintains 3 replicas within the same region
  • No need to provision a database
    • Create a table and configure read and write capacity (RCU and WCU)
    • Automatically scales to meet your RCU and WCU
  • Provides an expensive serverless mode
  • Use cases: User profiles, shopping carts, high volume read and write applications

DynamoDB Tables

  • Hierarchy: Table -> item(s) -> attribute (key value pair)
  • Mandatory primary key
  • Other than the primary key, tables are schema less
    • No need to define the other attributes or types
    • Each item can have distinct attributes

Data Types

  • Scalar (one value) - String, Number, Binary (base 64 encoded), Boolean (true or false) and null (unknown or undefined state)
  • Document (List and Map)
    • Supports complex JSON structures
  • Set (multiple values) - String set, Number set, and Binary set
    • All elements of the same scalar type
    • Each value within a set must be unique
    • Order is not important
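
A sketch of how these types look in DynamoDB's typed JSON, where every value is wrapped in a one-key dict naming its type (`S`, `N`, `BOOL`, `NULL`, `B`, `L`, `M`, `SS`, `NS`). The `to_attribute_value` helper is hypothetical and leaves out binary sets. The AWS SDKs ship their own serializers.

```python
def to_attribute_value(value):
    """Convert a Python value to DynamoDB's typed JSON representation."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):          # check bool before int: bool is an int subclass
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}         # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": value}              # base64-encoded on the wire
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    if isinstance(value, (set, frozenset)) and value:
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in value):
            return {"NS": sorted(str(v) for v in value)}
    raise TypeError(f"unsupported value: {value!r}")

# Made-up item: only user_id (the primary key) would need to be declared on the table.
item = {
    "user_id": "u-1",
    "age": 31,
    "active": True,
    "tags": {"admin", "beta"},
    "address": {"city": "Chicago", "zip": None},
}
print({name: to_attribute_value(value) for name, value in item.items()})
```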

I don't plan on using a NoSQL database - except in the case of Elasticsearch - so I will come back and look at DynamoDB in depth if I need to use it.

Decoupling Applications with SQS, SNS and MQ


Synchronous Communication

  • Applications on your web server make synchronous calls to the logging service
  • What if your logging service goes down?
    • Will your applications go down too?
  • What if all of a sudden, there is a high load and there are a lot of logs coming in?
    • Log Service is not able to handle the load and goes down very often

Asynchronous Communication

  • Create a queue or a topic
  • Your applications put the logs on the queue
  • They would be picked up when the logging service is ready
  • Good example of decoupling

Asynchronous Communication - Pull Model - SQS

  • Producers put messages on the queue
  • Consumers poll on the queue
    • Only one of the consumers will successfully process a given message
  • Scalability
    • Scale consumer instances under high load
  • Availability
    • Producer up even if a consumer is down
  • Reliability
    • Work is not lost due to insufficient resources
  • Decoupling
    • Make changes to consumers without producers worrying about them

Asynchronous Communication - Push Model - SNS

  • Subscribers subscribe to a topic
  • Producers send notifications to a topic
    • Notification sent to all subscribers
  • Decoupling
    • Producers don't care about who is listening
  • Availability
    • Producer up even if a subscriber is down
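
The difference between the pull and push models can be sketched with two tiny in-memory classes. This is only an illustration of the delivery semantics. It has nothing to do with the real SQS or SNS APIs and ignores things like retries and visibility timeouts.

```python
from collections import deque

class Queue:
    """Pull model: consumers poll, and each message goes to one consumer."""
    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        return self._messages.popleft() if self._messages else None

class Topic:
    """Push model: every subscriber receives every published message."""
    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)

queue = Queue()
queue.send("log line 1")
first = queue.receive()    # one consumer gets the message
second = queue.receive()   # a second poll finds nothing left

inbox_a, inbox_b = [], []
topic = Topic()
topic.subscribe(inbox_a.append)
topic.subscribe(inbox_b.append)
topic.publish("deploy finished")

print(first, second)       # log line 1 None
print(inbox_a, inbox_b)    # ['deploy finished'] ['deploy finished']
```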

Simple Queue Service

  • Reliable, scalable, fully-managed message queueing service
  • High availability
  • Unlimited scaling
    • Auto scale to process billions of messages per day
  • Low cost (pay for use)

Standard and FIFO Queues

  • Standard Queue
    • Unlimited throughput
    • BUT NO guarantee of ordering (Best Effort Ordering)
    • and NO guarantee of exactly once processing
      • Guarantees at-least-once delivery system (some messages can be processed twice)
  • FIFO (first-in-first-out) Queue
    • First-In-First-Out Delivery
    • Exactly-Once Processing
    • BUT throughput is lower
      • Up to 300 messages per second (300 send, receive, or delete operations per second)
      • If you batch 10 messages per operation (maximum), up to 3,000 messages per second
  • Choose
    • Standard SQS queue if throughput is important
    • FIFO queue if order of events is important
  • Simple Queuing Service Lifecycle of a Message

Amazon Simple Notification Service (SNS)

  • Publish-Subscribe (pub-sub) paradigm
  • Broadcast asynchronous event notifications
  • Simple process
    • Create an SNS topic
    • Subscribers can register for a topic
    • When a SNS Topic receives an event notification (from publisher), it is broadcast to all Subscribers
  • Use Cases: Monitoring Apps, workflow systems, mobile apps
  • Provide mobile and enterprise messaging web services
    • Push notifications to Apple, Android, FireOS, Windows devices
    • Send SMS to mobile users
    • Send Emails
  • REMEMBER: SNS does not need SQS or a Queue
  • You can allow access to other AWS accounts using AWS SNS generated policy

Amazon MQ

  • Managed message broker service for Apache ActiveMQ
  • (Functionally) Amazon MQ = Amazon SQS (Queues) + Amazon SNS (Topics)
    • But with restricted scalability
  • Supports traditional APIs (JMS) and protocols (AMQP, MQTT, OpenWire, and STOMP)
    • Easy to migrate on-premise applications using traditional message brokers
    • Start with Amazon MQ as first step and slowly re-design apps to use Amazon SQS and/or SNS

Handling Data Streams


Streaming Data

  • Characteristics of streaming data:
    • Continuously generated
    • Small pieces of data
    • Sequenced - mostly associated with time
Amazon Kinesis
  • Handle streaming data
    • NOT recommended for ETL batch jobs
  • Amazon Kinesis Data Streams
    • Process Data Streams
  • Amazon Kinesis Firehose
    • Data ingestion for streaming data: S3, Elasticsearch, etc.
  • Amazon Kinesis Analytics
    • Run queries against streaming data
  • Amazon Kinesis Video Streams
    • Monitor video streams
Amazon Kinesis Data Streams

Routing and Content Delivery


Content Delivery Network
  • You want to deliver content to your global audience
  • Content Delivery Networks distribute content to multiple edge locations around the world
  • AWS provides 200+ edge locations around the world
  • Provides high availability and performance
Amazon CloudFront

  • How do you enable serving content directly from AWS edge locations?
    • Amazon CloudFront (one of the options)
  • Serve users from nearest edge location (based on user location)
  • Source content can be from S3, EC2, ELB, and External websites
  • If content is not available at the edge location, then it is retrieved from the origin server and cached
  • No minimum usage commitment
  • Provides features to protect your private content.
  • Use Cases
    • Static web apps. Audio, video, and software downloads. Dynamic web apps
    • Support media streaming with HTTP and RTMP
  • Integrates with
    • AWS Shield to protect from DDoS attacks
    • AWS Web Application Firewall (WAF) to protect from SQL injection, cross site scripting, etc.
  • Cost Benefits
    • Zero cost for data transfer between S3 and CloudFront
    • Reduce compute workload for your EC2 instance
Amazon CloudFront Distribution

  • Create a CloudFront distribution to distribute your content to edge locations
    • DNS domain name - example: abc.cloudfront.net
    • Origins - Where do you get content from? S3, EC2, ELB, External Website
    • Cache-Control
      • By default, objects expire after 24 hours
      • Customize min, max, default TTL in CloudFront distributions
      • (For file level customization) Use Cache-Control max-age and Expires headers in origin server
  • You can configure CloudFront to only use HTTPS (or) use HTTPS for certain objects
    • Default is to support both HTTP and HTTPS
    • You can configure CloudFront to redirect HTTP to HTTPS

Store Static content in S3 and use CloudFront to reduce latency
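
The caching behavior above (serve from the edge while the object is fresh, otherwise fetch from the origin and cache it) can be sketched as a small TTL cache. `EdgeCache` is a hypothetical class for illustration. The default TTL is the 24 hours mentioned above, and the clock and origin are faked so the example runs without a network.

```python
import time

class EdgeCache:
    """Tiny TTL cache: serve from cache while fresh, otherwise go to the origin."""
    def __init__(self, fetch_from_origin, ttl_seconds=24 * 60 * 60, clock=time.time):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}   # path -> (expires_at, body)

    def get(self, path):
        now = self._clock()
        entry = self._entries.get(path)
        if entry and entry[0] > now:
            return entry[1], "hit"
        body = self._fetch(path)
        self._entries[path] = (now + self._ttl, body)
        return body, "miss"

# Fake clock and origin so the example is self-contained.
now = [0.0]
origin_calls = []

def origin(path):
    origin_calls.append(path)
    return f"content of {path}"

cache = EdgeCache(origin, clock=lambda: now[0])
print(cache.get("/index.html")[1])   # miss
print(cache.get("/index.html")[1])   # hit
now[0] += 24 * 60 * 60 + 1           # a little over 24 hours later
print(cache.get("/index.html")[1])   # miss
```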

Amazon CloudFront - Cache Behaviors
  • Configure different CloudFront behavior for different URL path patterns from the same origin
    • Path pattern (can use wild cards - *.php, *.jsp)
    • Do you want to forward query strings?
    • Should we use https?
    • TTL
Amazon CloudFront - Private Content

  • Signed URLs
  • Signed cookies using key pairs
  • Origin Access Identities (OAI)
    • Ensures that only CloudFront can access S3
    • Allow access to S3 only to a special CloudFront user
Amazon CloudFront - Signed URLs and Cookies

  • Signed URLs
    • RTMP distribution
    • Application downloads (individual files) and
    • Situations where cookies are not supported
  • Signed Cookies
    • Multiple files (You have a subscriber website)
    • Does not need any change in application URLs
Amazon CloudFront - Origin Access Identities (OAI)

  • Only CloudFront can access S3
  • Create a special CloudFront user - Origin Access Identities (OAI)
  • Associate OAI with CloudFront distribution
  • Create an S3 Bucket Policy allowing access to OAI
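
A sketch of what such a bucket policy can look like, built as a Python dict. The bucket name and OAI ID are placeholders I made up. The principal ARN follows the format AWS documents for origin access identities.

```python
import json

BUCKET = "example-bucket"   # hypothetical bucket name
OAI_ID = "EXAMPLEOAIID"     # hypothetical OAI ID

bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": f"arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity {OAI_ID}"
            },
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        }
    ],
}

bucket_policy_json = json.dumps(bucket_policy, indent=2)
print(bucket_policy_json)
```
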
Amazon CloudFront - Remember
  • Old content automatically expires from CloudFront
  • Invalidation API - remove object from cache
    • REMEMBER: Designed for use in emergencies
  • Best Practice - Use versioning in object path name
    • Example: /images/profile?version=1
    • Prevents the need to invalidate content
  • Do not use CloudFront for
    • all requests from single location
    • all requests from corporate VPN
Route 53

  • Route 53 = Domain Registrar + DNS
    • Buy your domain name
    • Setup your DNS routing for domain name
  • Look into Route53 as a DNS provider. Look into the differences of speeds between DNS providers

DevOps


Getting better at the Three Great Elements of Software Teams - Communication, Feedback, and Automation

  • AWS CodeDeploy - Automate deployment
  • AWS CodeBuild - Fully managed build service in AWS
    • Provides preconfigured build environments (Docker Images) for popular programming languages

I think the main thing that I would want to look into here is IAC (Infrastructure as Code)

Infrastructure as Code

  • Treat infrastructure the same way as application code
  • Track your infrastructure changes over time (version control)
  • Bring repeatability into your infrastructure
  • Two Key Parts
    • Infrastructure Provisioning
      • Provisioning compute, database, storage, and networking
      • Open source, cloud neutral
        • Terraform
      • AWS CloudFormation
        • Provision AWS Resources
      • AWS SAM (Serverless Application Model)
        • Provision Serverless Resources
    • Configuration Management
      • Install right software tools on the provisioned resources
      • Open source Tools - Chef, Puppet, Ansible

AWS CloudFormation


  • Let's consider an example:
    • I would want to create a new VPC and a subnet
    • I want to provision an ELB, ASG, with 5 EC2 instances & RDS database
    • I would want to setup the right security groups
  • AND I would want to create 4 environments
    • Dev, QA, Stage, and Production!
  • CloudFormation can help you do all these with a simple (actually NOT so simple) script!
  • Advantages (Infrastructure as Code - IAC and CloudFormation)
    • Automate deployment of AWS resources in a controlled, predictable way
    • Avoid mistakes with manual configuration
    • Think of it as version control for your environments
AWS CloudFormation
  • All configuration is defined in a simple text file - JSON or YAML
    • I want a VPC, a subnet, a database, and ...
  • CloudFormation understands dependencies
    • Creates VPCs first, then subnets, and then the database
  • (Default) Automatic rollbacks on errors (Easier to retry)
    • If creation of database fails, it would automatically delete the subnet and VPC
  • Version control your configuration file and make changes to it over time
  • Free to use - Pay only for the resources provisioned
    • Get an automated estimate for your configuration
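
The dependency ordering described above is essentially a topological sort. Here is a small sketch with made-up resource names. It illustrates the idea and is not a claim about how CloudFormation works internally.

```python
from graphlib import TopologicalSorter

# Hypothetical resources: each key depends on the names in its set.
depends_on = {
    "Vpc": set(),
    "Subnet": {"Vpc"},
    "Database": {"Subnet"},
}

# Dependencies come first when creating, and last when rolling back.
creation_order = list(TopologicalSorter(depends_on).static_order())
rollback_order = list(reversed(creation_order))

print(creation_order)   # ['Vpc', 'Subnet', 'Database']
print(rollback_order)   # ['Database', 'Subnet', 'Vpc']
```
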
AWS CloudFormation - Terminology
  • Template
    • A cloud formation JSON or YAML defining multiple resources
  • Stack
    • A group of resources that are created from a CloudFormation template
  • Change Sets
    • To make changes to a stack, update the template
    • Change set shows what would change if you execute
    • Allows you to verify the changes and then execute
AWS CloudFormation - Important Template Elements
{
  "AWSTemplateFormatVersion": "version date",
  "Description": "JSON string",
  "Metadata": {},
  "Parameters": {},
  "Mappings": {},
  "Resources": {},
  "Outputs": {}
}
  • Resources - what do you want to create?
    • One and only mandatory element
  • Parameters - Values to pass to your template at runtime
    • Which EC2 instance to create? ("t2.micro", "m1.small", "m1.large")
  • Mappings - Key value pairs
    • Example: Configure different values for different regions
  • Outputs - Return values from execution
    • See them on console and use in automation
AWS CloudFormation - Resources
  • The only mandatory section in the template
  • Contains the list of resource objects to be created
  • Each resource has different attributes (mandatory & optional)
    • ImageId attribute for an EC2 instance resource
    • Specified under Properties
  • Type attribute specifies the type of the resource
  • Format for type attribute is: (AWS::ProductIdentifier::ResourceType)
AWS CloudFormation - Parameters
  • Parameters make the template dynamic
    • You can define constraints on parameters - AllowedPattern, AllowedValues, MaxLength, MaxValue, MinLength, MinValue, etc.
    • Type is Mandatory (String, Number etc.)
      • Can be AWS-specific parameter type
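
Putting Parameters, Resources and Outputs together, here is a minimal sketch of a template built as a Python dict and dumped to JSON. The logical names (`InstanceTypeParam`, `WebServer`) and the AMI ID are placeholders I made up. `2010-09-09` is the template format version, and `Ref` is the intrinsic function for referring to a parameter or resource.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Single EC2 instance with a configurable instance type",
    "Parameters": {
        "InstanceTypeParam": {
            "Type": "String",
            "Default": "t2.micro",
            "AllowedValues": ["t2.micro", "m1.small", "m1.large"],
        }
    },
    "Resources": {
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-0123456789abcdef0",   # placeholder AMI ID
                "InstanceType": {"Ref": "InstanceTypeParam"},
            },
        }
    },
    "Outputs": {
        "InstanceId": {"Value": {"Ref": "WebServer"}}
    },
}

template_json = json.dumps(template, indent=2)
print(template_json)
```
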
AWS CloudFormation - Pseudo Parameters
  • Parameters predefined by AWS CloudFormation
Common Resource Attributes - CreationPolicy
  • CreationPolicy: When is the creation of a resource complete?
    • AutoScalingCreationPolicy: How many instances in ASG should be ready?
    • ResourceSignal: No of signals and max wait time
    • Used with Amazon EC2 and Auto Scaling resources
    • (Alternative) For coordination with external configuration actions use WaitCondition
Common Resource Attributes - Others
AWS CloudFormation - Conditions
  • Condition can be attached to a resource or output section
  • Based on the condition: Resource or output is created
AWS CloudFormation - Mappings
  • Matches a key to the set of values (can contain one or multiple values)
AWS CloudFormation - Outputs
  • Export values from templates for later use
    • Maximum of 60 outputs in a template
    • Can be used to create cross stack reference by exporting the value
    • CloudFormation does not hide or encrypt the output section
      • If you export password it will be visible
...
AWS CloudFormation - Remember
  • Deleting a stack deletes all the associated resources
    • EXCEPT for resources with DeletionPolicy attribute set to "Retain"
    • You can enable termination protection for the entire stack
  • You can execute CloudFormation templates from AWS CLI:
    • aws cloudformation create-stack/list-stacks/describe-stacks
  • Python helper scripts simplify deployment on EC2 instances:
    • cfn-init Retrieve resource metadata, install packages etc.
    • cfn-signal Enables you to synchronize with other resources in the stack - Signal with a CreationPolicy or WaitCondition
    • cfn-get-metadata Retrieve metadata for a resource or path to a specific key
    • cfn-hup Check for updates to metadata and execute custom hooks
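A hedged sketch of two of the points above: a resource that survives stack deletion, and the create-stack CLI call assembled as an argument list. The helper create_stack_command, the stack name, and the file name are assumptions for this example. The code only builds and prints the command; it does not call AWS.

```python
import shlex

# A resource with DeletionPolicy "Retain" is kept when the stack is deleted.
retained_bucket = {"Type": "AWS::S3::Bucket", "DeletionPolicy": "Retain"}


def create_stack_command(stack_name, template_path, parameters=None, protect=False):
    """Build (but do not run) an `aws cloudformation create-stack` command."""
    cmd = [
        "aws", "cloudformation", "create-stack",
        "--stack-name", stack_name,
        "--template-body", f"file://{template_path}",
    ]
    if parameters:
        cmd.append("--parameters")
        cmd += [
            f"ParameterKey={key},ParameterValue={value}"
            for key, value in parameters.items()
        ]
    if protect:
        # Termination protection applies to the entire stack
        cmd.append("--enable-termination-protection")
    return cmd


cmd = create_stack_command(
    "demo-stack", "template.json", {"EnvType": "test"}, protect=True
)
print(shlex.join(cmd))
```

The resulting list could be handed to subprocess.run on a machine with the AWS CLI configured.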

Actually, not sure if I want to use this service. It seems like it would only complicate things.

Serverless Application Model SAM


  • Serverless projects can become a maintenance headache
    • 1000s of Lambda functions to manage, versioning, deployment, etc.
    • How to set up serverless projects with Lambda and DynamoDB on your local machine?
    • How to ensure that your serverless projects are adhering to best practices?
  • Welcome SAM - Serverless Application Model
    • Infrastructure as Code (IAC) for Serverless Applications
Serverless Application Model - Approach and Advantages
  • Open source framework for building serverless applications
  • Define YAML file with resources (Functions, APIs, Databases...)
  • (BEHIND THE SCENES) SAM config => CloudFormation scripts
  • Benefits of SAM
    • Single deployment configuration
    • Extends CloudFormation and hides complexity
    • Built-in best practices
    • Local debugging and testing
    • Benefits of IAC (Infrastructure as Code)
      • No manual errors, Version control, Avoid configuration drift
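For a feel of what a SAM configuration contains, here is a sketch of a small template, written as a Python dict for consistency with the other examples (in practice it would usually be a YAML file). The resource names, paths, and the handler function are hypothetical. The Transform line is what tells CloudFormation to expand the SAM resource types behind the scenes.

```python
sam_template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    # This transform turns the SAM shorthand into plain CloudFormation
    "Transform": "AWS::Serverless-2016-10-31",
    "Resources": {
        "HelloFunction": {
            "Type": "AWS::Serverless::Function",
            "Properties": {
                "Handler": "app.handler",  # module.function to invoke
                "Runtime": "python3.12",
                "CodeUri": "src/",  # hypothetical source folder
                "Events": {
                    "HelloApi": {
                        "Type": "Api",
                        "Properties": {"Path": "/hello", "Method": "get"},
                    }
                },
            },
        },
        # Shorthand for a simple DynamoDB table
        "ItemsTable": {"Type": "AWS::Serverless::SimpleTable"},
    },
}


def handler(event, context):
    """Hypothetical Lambda handler that the template above would point at."""
    return {"statusCode": 200, "body": "hello"}


print(handler({}, None))
```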

EC2 - Advanced


Vertical Scaling

  • Deploying application/database to bigger instance:
    • A larger hard drive
    • A faster CPU
    • More RAM, CPU, I/O, or networking capabilities
  • There are limits to vertical scaling (you can only move up to the largest available instance)
Horizontal Scaling

  • Deploying multiple instances of application/database
  • (Typically but not always) Horizontal Scaling is preferred to Vertical Scaling
    • Vertical Scaling has limits
    • Vertical scaling can be expensive
    • Horizontal scaling increases availability
  • (BUT) Horizontal Scaling needs additional infrastructure
    • Load Balancers etc.
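The claim that horizontal scaling increases availability can be illustrated with a little arithmetic. The sketch below assumes instances fail independently and that the application is up as long as at least one instance is up; both are simplifying assumptions, and the 99% figure is an arbitrary example.

```python
def combined_availability(single, instances):
    """Availability of N independent instances behind a load balancer.

    The service is down only if every instance is down at once,
    which happens with probability (1 - single) ** instances.
    """
    return 1 - (1 - single) ** instances


for count in (1, 2, 3):
    print(count, round(combined_availability(0.99, count), 6))
```

Real deployments are less rosy, since instances in one AZ tend to fail together, which is one reason to spread them across AZs.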

Horizontal Scaling for EC2

  • Distribute EC2 instances
    • in a single AZ
    • in multiple AZs in single region
    • in multiple AZs in multiple regions
  • Auto scale: Auto Scaling Group
  • Distribute Load: Elastic Load Balancer
EC2 Tenancy - Shared vs Dedicated
  • Shared Tenancy (Default)
    • Single host machine can have instances from multiple customers
  • EC2 Dedicated Instances
    • Virtualized instances on hardware dedicated to one customer
    • You do NOT have visibility into the hardware of the underlying host
  • EC2 Dedicated Hosts
    • Physical servers dedicated to one customer
    • You have visibility into the hardware of underlying host (sockets and physical cores)
    • (Use cases) Regulatory needs or server-bound software licenses like Windows Server, SQL Server


Pricing Model | Description | Details
On Demand | Request when you want it | Flexible and most expensive
Spot | Quote the maximum price | Cheapest (up to 90% off) BUT no guarantees
Reserved | Reserve ahead of time | Up to 75% off. 1 or 3 year reservation.
Savings Plan | Commit to spending $X per hour (EC2, AWS Fargate, or Lambda) | Up to 66% off. No restrictions. 1 or 3 year commitment.

  • EC2 On-Demand
    • Ideal for:
      • A web application that receives spiky traffic
      • A batch program which has unpredictable runtime and cannot be interrupted
      • A batch program being moved from on-premises to cloud for the first time
  • EC2 Spot Instances
    • Ideal for non-time-critical workloads that can tolerate interruptions (fault tolerant)
      • A batch program that does not have a strict deadline AND can be stopped at short notice and re-started.
  • EC2 Reserved Instances
    • Reserve EC2 instances ahead of time
  • EC2 Pricing Models Overview
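To see what the "up to" discounts in these notes mean in money terms, here is a back-of-the-envelope sketch. The on-demand rate of $0.10 per hour is a made-up number, and the discounts are the best-case figures quoted in the notes, so real bills will usually be higher.

```python
# Best-case discounts as quoted in the notes (fractions off on-demand).
MAX_DISCOUNT = {
    "on_demand": 0.0,
    "spot": 0.90,
    "reserved": 0.75,
    "savings_plan": 0.66,
}


def best_case_hourly(on_demand_rate, model):
    """Lowest possible hourly price under the quoted maximum discount."""
    return on_demand_rate * (1 - MAX_DISCOUNT[model])


def best_case_monthly(on_demand_rate, model, hours=730):
    """Best-case cost of running one instance for roughly a month."""
    return best_case_hourly(on_demand_rate, model) * hours


RATE = 0.10  # hypothetical on-demand price in dollars per hour
for model in MAX_DISCOUNT:
    print(f"{model}: ${best_case_monthly(RATE, model):.2f} per month")
```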
EC2 Placement Groups
  • Certain use cases need control over placement of a group of EC2 instances
    • Low latency network communication
    • High availability
  • EC2 Placement groups:
    • Cluster (low latency)
    • Spread (avoid simultaneous failures)
    • Partition (multiple partitions with low network latency)
  • Cluster Placement Groups
    • EC2 instances placed near each other in single AZ
    • High Network Throughput: EC2 instances can use 10 Gbps or 25 Gbps network
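As a sketch of how the three strategies map to goals, the snippet below picks a strategy and assembles the matching CLI call. The goal names and the helper placement_group_command are invented for this example, and the command is only built and printed, never run.

```python
import shlex

# Goal labels are hypothetical; strategies are the three placement group types.
STRATEGY_FOR_GOAL = {
    "low_latency": "cluster",
    "avoid_simultaneous_failures": "spread",
    "partitioned_workload": "partition",
}


def placement_group_command(group_name, goal):
    """Build (but do not run) an `aws ec2 create-placement-group` command."""
    return [
        "aws", "ec2", "create-placement-group",
        "--group-name", group_name,
        "--strategy", STRATEGY_FOR_GOAL[goal],
    ]


print(shlex.join(placement_group_command("hpc-group", "low_latency")))
```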

Storage in Cloud - Block Storage and File Storage


Storage Types - Block Storage and File Storage

  • What is the type of storage of your hard disk?
    • Block Storage
      • Amazon Elastic Block Store (EBS)
      • Instance Store
  • You've created a file share to share a set of files with your colleagues in an enterprise. What type of storage are you using?
    • File storage
      • Amazon EFS (for Linux instances)
      • Amazon FSx for Windows File Server
      • Amazon FSx for Lustre (high performance use cases)
EC2 - Block Storage
  • Two popular types of block storage can be attached to EC2 instances
    • Elastic Block Store
    • Instance Store
  • Instance stores are physically attached to the EC2 instance
  • Elastic Block Store (EBS) is network storage


AWS Managed Services


PAAS vs IAAS
AWS Managed Service Offerings
  • Elastic Load Balancing - Distribute incoming traffic across multiple targets
  • AWS Elastic Beanstalk - Run and manage web apps
  • Amazon Elastic Container Service - Container orchestration on AWS
  • AWS Fargate - Serverless compute for containers
  • Amazon Elastic Kubernetes Service - Run Kubernetes on AWS
  • Amazon RDS - Relational Database
