AWS DevOps: Management & Governance

AWS DevOps: Cheat Sheet

CloudFormation

  • Stack sets can deploy stacks to an entire organizational unit (OU)
  • Template source: upload a file or point to an S3 path
  • On stack failure: roll back all resources, or preserve successfully provisioned resources
  • Stack Policy: defines the resources that you want to protect from unintentional updates during a stack update
  • There are other configurations as well: rollback settings, timeout period, SNS notifications
  • Create a change set for any update and then execute it; on failure you can choose complete rollback or preservation of provisioned resources
  • Templates: JSON or YAML
  • Stacks: a collection of resources managed as a single unit
  • Change set: to make an update to running resources, generate a change set first, which shows what operations are going to happen
  • You can validate a template against syntax errors
  • The AWS CloudFormation action that CodePipeline invokes when processing the associated stage. Choose one of the following action modes:
  • Create or replace a change set creates the change set if it doesn’t exist based on the stack name and template that you submit. If the change set exists, AWS CloudFormation deletes it, and then creates a new one.
  • Create or update a stack creates the stack if the specified stack doesn’t exist. If the stack exists, AWS CloudFormation updates the stack. Use this action to update existing stacks. CodePipeline won’t replace the stack.
  • Delete a stack deletes a stack. If you specify a stack that doesn’t exist, the action is completed successfully without deleting a stack.
  • Execute a change set executes the change set.
  • Replace a failed stack creates the stack if the specified stack doesn’t exist. If the stack exists and is in a failed state (reported as ROLLBACK_COMPLETE, ROLLBACK_FAILED, CREATE_FAILED, DELETE_FAILED, or UPDATE_ROLLBACK_FAILED), AWS CloudFormation deletes the stack and then creates a new one. If the stack isn’t in a failed state, AWS CloudFormation updates it. Use this action to replace failed stacks without recovering or troubleshooting them. You would typically choose this mode for testing.
  • Stack sets manage identical stacks across multiple accounts and Regions
  • CloudFormation supports Chef & Puppet integration to deploy and configure right down to the application layer
  • By default, the automatic rollback-on-error feature is enabled
  • CloudFormation provides a WaitCondition resource that acts as a barrier, blocking the creation of dependent resources until a success signal is received
  • IAM can be used with CloudFormation for access control: whether users can view stack templates, create stacks, or delete stacks
  • IAM permissions also need to be granted to the user for the AWS services and resources that are provisioned when the stack is created
  • Before a stack is created, AWS CloudFormation validates the template to check for IAM resources that it might create
  • A stack policy is a JSON document that defines the update actions that can be performed on designated resources.
  • After you set a stack policy, all of the resources in the stack are protected by default.
  • Updates on specific resources can be added using an explicit Allow statement for those resources in the stack policy.
  • Only one stack policy can be defined per stack, but multiple resources can be protected within a single policy.
  • A stack policy applies to all CloudFormation users who attempt to update the stack. You can’t associate different stack policies with different users
  • A stack policy applies only during stack updates. It doesn’t provide access controls like an IAM policy.
  • You can’t edit the previous template in place; you need to upload a new version of the template
  • Stacks are identified by name
  • SNS can be configured for notification 
  • Termination protection, timeout and stack policies can be applied
  • While updating a CloudFormation template, the console shows a preview of the changes, including whether each resource will be replaced (Replacement: True/False)
  • If the stack update fails, CloudFormation rolls back to the previous known working state
  • A stack policy supplied during an update (to allow the update) is only valid for that update; afterwards the originally applied policy remains in effect. This is done to bypass the restriction if required (see the sketch below)
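
A minimal sketch of both behaviors, assuming illustrative names (my-stack, ProductionDatabase):

```bash
# Protect one resource from updates (everything else stays updatable)
cat > stack-policy.json <<'EOF'
{
  "Statement": [
    { "Effect": "Allow", "Action": "Update:*", "Principal": "*", "Resource": "*" },
    { "Effect": "Deny", "Action": "Update:*", "Principal": "*",
      "Resource": "LogicalResourceId/ProductionDatabase" }
  ]
}
EOF
aws cloudformation set-stack-policy \
  --stack-name my-stack \
  --stack-policy-body file://stack-policy.json

# Temporarily override the policy for a single update; the original
# policy automatically applies again once this update finishes
aws cloudformation update-stack \
  --stack-name my-stack \
  --use-previous-template \
  --stack-policy-during-update-body \
  '{"Statement":[{"Effect":"Allow","Action":"Update:*","Principal":"*","Resource":"*"}]}'
```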
CloudFormation Update Policies

CFN Hooks:

AWS Config

Resource management

  • Specify the resource types you want AWS Config to record.
  • Set up an Amazon S3 bucket to receive a configuration snapshot on request and configuration history.
  • Set up Amazon SNS to send configuration stream notifications.
  • Grant AWS Config the permissions it needs to access the Amazon S3 bucket and the Amazon SNS topic.
  • SNS notifications are not available at the rule level; they are only available for AWS Config as a whole
  • However, you can use CW Events to automate actions on rule evaluations
  • Now, AWS remediation actions can also be added [pre-defined / custom] (a CLI sketch of the basic setup follows this list)
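
A minimal CLI sketch of the setup steps above; the role, bucket, and topic names are illustrative:

```bash
# Record all supported resource types, including global ones
aws configservice put-configuration-recorder \
  --configuration-recorder name=default,roleARN=arn:aws:iam::111122223333:role/config-role \
  --recording-group allSupported=true,includeGlobalResourceTypes=true

# Deliver snapshots/history to S3 and stream notifications to SNS
cat > delivery-channel.json <<'EOF'
{
  "name": "default",
  "s3BucketName": "my-config-bucket",
  "snsTopicARN": "arn:aws:sns:us-east-1:111122223333:config-topic"
}
EOF
aws configservice put-delivery-channel --delivery-channel file://delivery-channel.json

aws configservice start-configuration-recorder --configuration-recorder-name default
```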

Rules and conformance packs

  • Specify the rules that you want AWS Config to use to evaluate compliance information for the recorded resource types.
  • Use conformance packs: a collection of AWS Config rules and remediation actions that can be deployed and monitored as a single entity in your AWS account.

Aggregators

  • Use an aggregator to get a centralized view of your resource inventory and compliance. An aggregator is an AWS Config resource type that collects AWS Config configuration and compliance data from multiple AWS accounts and AWS Regions into a single account and Region.
  • Config rules trigger on configuration changes or run periodically (see the sketch after this list)
  • Custom rules: implemented either with Lambda or with Guard [policy as code]
  • The default retention period for recorded configuration items is 7 years
  • You will require an S3 bucket to record the configuration history
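
For example, a change-triggered managed rule can be registered like this (the rule chosen here is illustrative):

```bash
cat > rule.json <<'EOF'
{
  "ConfigRuleName": "s3-bucket-versioning-enabled",
  "Source": { "Owner": "AWS", "SourceIdentifier": "S3_BUCKET_VERSIONING_ENABLED" },
  "Scope": { "ComplianceResourceTypes": ["AWS::S3::Bucket"] }
}
EOF
aws configservice put-config-rule --config-rule file://rule.json
```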

Elastic Beanstalk

  • Elastic Beanstalk automates the setup, configuration, and provisioning of other AWS services like EC2, RDS, and Elastic Load Balancing to create a web service.

APP + ENV 

  • An app has versions and saved configurations
  • One can set up a lifecycle policy to retain or delete the app versions uploaded for deployment
  • There is a change history for configuration changes
  • The env has two tiers: a web server tier and a worker tier [which listens for messages on an SQS queue]
  • You can choose the platform: Tomcat, Java, PHP, .NET, Docker, etc.
  • There is a platform version; for Docker: running on EC2 or running on ECS
  • Source code: upload a file or point to an S3 path
  • You can tag applications separately from environments
  • There are configuration presets: single instance, HA, custom
  • AWS X-Ray support, S3 logs with log rotation, CW logs
  • Env contains
    • Software for logs and env properties
    • Instance type, size, and security group
    • Capacity: env type, AZs, AMI, placement, scaling for HA
    • Load balancer, if any
    • Rolling updates and deployments
      • Rolling update type [e.g. immutable]
    • Deployment policy: all at once / immutable, with deployment preferences driven by health checks
    • You can directly add email for notification
    • Can directly create RDS or restore a snapshot
    • Db deletion policy
      • Create snapshot
      • Retain
      • Delete
  • One can restore a terminated env
  • There is a swap URL feature. Swapping the environment URLs will modify the Route 53 DNS configuration, which may take a few minutes (see the CLI sketch after this list)
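
A blue/green cutover with the swap URL feature looks roughly like this; the environment names are illustrative:

```bash
# Point the "blue" environment's CNAME at the "green" environment and vice versa
aws elasticbeanstalk swap-environment-cnames \
  --source-environment-name my-app-blue \
  --destination-environment-name my-app-green
```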

CodePipeline -> EB -> using ECS

CodePipeline -> ECS

  • Elastic Beanstalk can launch Docker environments by building an image described in a Dockerfile or by pulling a remote Docker image
  • If you are using Docker Compose, use a docker-compose.yml file, which specifies an image to use and additional configuration options. If you are not using Docker Compose with your Docker environments, use a Dockerrun.aws.json file instead.
  • You can use a Dockerrun.aws.json v2 file for an ECS-managed Docker environment.
  • On-premises servers are not supported, as Elastic Beanstalk only manages AWS resources
  • The configuration precedence order is: directly applied settings > saved configuration > .ebextensions files > default values
  • In .ebextensions, commands written under the commands section run before application setup, and commands written under the container_commands section run after application setup (see the YAML sketch after this list)
  • There is a lifecycle policy for rotating application versions [by age in days, or by number of versions]: delete from EB but keep in S3, or delete from S3 as well
  • The environment can be cloned
  • EC2 instances can be patched via managed updates, which require a scheduled maintenance window for patching
  • EB environments are either web server environments or worker environments
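
A minimal .ebextensions sketch showing the two command sections; the file name and commands are illustrative:

```yaml
# .ebextensions/01-setup.config
commands:
  01_install_tools:
    command: "yum install -y jq"          # runs before the application is set up
container_commands:
  01_migrate:
    command: "python manage.py migrate"   # runs after application setup
    leader_only: true                     # run on a single instance only
```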

OpsWorks

  • Supports on-premises and EC2 Linux servers with the OpsWorks agent
  • Supports Amazon Linux AMIs, Ubuntu, and Windows Server
  • Stack + layers
    • Stack : blueprints, called layers, used to launch and manage these instances. Applications, user permissions, and other resources are scoped and controlled in the context of the Stack. 
    • Layers: A Layer is a blueprint for how to set up and configure an instance and related resources such as volumes and Elastic IPs. Layers can automatically take care of infrastructure configuration such as SSL settings, with per-layer configuration including installation scripts, initialization tasks, and packages. Layers also include lifecycle events that let you automate configuration actions in response to changes in an instance’s status
  • Lifecycle events: Setup > Configure > Deploy > Undeploy > Shutdown
  • You cannot use EC2 user data to set up instances
  • Supports time-based as well as load-based scaling
  • OpsWorks Stacks can be accessed globally and can be used to create and manage instances globally
  • Layers depend on Chef recipes to handle tasks such as installing packages on instances, deploying apps, and running scripts
  • Custom recipes and related files are packaged in one or more cookbooks and stored in a cookbook repository such as S3 or Git
  • OpsWorks Stacks supports the following instance types
  • 24/7 instances – launched and stopped manually
  • Time-based instances – run on a schedule
  • Load-based instances – automatically started and stopped based on configurable load metrics [CPU / memory / average load]
  • OpsWorks Stacks does not automatically deploy updated code to online instances; this needs to be done manually (see the CLI sketch after this list)
  • OpsWorks needs to know in advance how many servers will be required, as there is no min/max/desired setting; you add the instances beforehand, and OpsWorks will only start and stop them
  • We can deploy many apps on one layer
  • Auto healing needs to be enabled
  • Events on the app layer: Setup [after boot] > Configure [occurs on all of the stack’s instances] > Deploy > Undeploy > Shutdown
    • Configure runs on all of the stack’s instances whenever any instance comes online or goes offline, which is helpful for configuration changes or discovery
    • The rest of the events happen only for that particular instance
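
A sketch of a manual deployment to online instances; the stack and app IDs are illustrative:

```bash
# Run the built-in "deploy" command against the latest app version
aws opsworks create-deployment \
  --stack-id 11111111-2222-3333-4444-555555555555 \
  --app-id aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee \
  --command '{"Name": "deploy"}'
```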

CloudTrail:

  • By default, a trail created from the console is multi-Region
  • A digest file is delivered every hour; it can be used to check whether a trail log file was modified or deleted, and it contains references to the log files it covers (see the CLI sketch after this list)
  • You can create a trail that logs all events for all AWS accounts in an organization: an organization trail. When you set up an organization trail with CloudTrail, CloudTrail replicates the trail to each member account within your organization.
  • A trail means delivering logs to a bucket; SNS notification on delivery is available as well. Log files can be encrypted using SSE-KMS, and log file validation is available, producing hashes for audit reports
  • Integrates with CloudWatch Logs: send logs to CloudWatch Logs to analyze them, or query them from Athena, since the CloudTrail console only covers the last 90 days
  • Event type :
    • Management events: control plane operations [read/write], with the option to choose
    • Data events: for services such as S3 / Lambda / DynamoDB / S3 on Outposts / Managed Blockchain / S3 Object Lambda / Lake Formation / EBS direct APIs / S3 Access Points / DynamoDB Streams
      • By default, trails don’t log data events.
      • You can select all, or specify the ARN of a particular resource
    • Insights events: API call rate, API error rate [unusual activity]
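
A sketch of checking log file integrity against the digest files; the trail ARN and start time are illustrative:

```bash
aws cloudtrail validate-logs \
  --trail-arn arn:aws:cloudtrail:us-east-1:111122223333:trail/my-trail \
  --start-time 2024-01-01T00:00:00Z
```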

CloudWatch

  • Monitoring areas: CloudWatch provides two categories of monitoring: basic monitoring and detailed monitoring [API Gateway / CloudFront / EC2 / EB / Kinesis Data Streams / MSK / S3]
  • Alarm / logs / Events/ Xray / Application Monitoring / Insights
  • Alarm : 
    • Metric + Condition + Datapoints
    • States: ALARM | OK | INSUFFICIENT_DATA
    • Trigger: SNS
    • Action: Auto Scaling [ASG / ECS] / EC2 [recover / stop / terminate / reboot] / Systems Manager [create OpsItem / create incident]
  • Logs: streams of log data
    • Logs cost: data ingestion, storage, and queries
    • Log groups: can set up a retention period [1 day to never expire]
      • A collection of log streams [can be exported to CSV to download the data]
    • Can create subscription filters: Kinesis, Kinesis Firehose, Lambda, OpenSearch
    • You can export the data to S3
    • Logs Insights: a query platform to query log groups
  • Metric
    • A collection of data points in time-series fashion, retained for up to 15 months; you can’t delete metrics, and CloudWatch aggregates data points automatically as they age
  • EVENT
    • Schedule / event
    • Event source
      • AWS events or EventBridge partner events [there are listed partners]
      • Others
      • All event
    • Target
      • Event bus [same account or different account]
      • An HTTP URL can be added as a destination for webhooks
      • AWS services
    • Input transformer
    • Global endpoints: DR / health check / replication
      • Event buses receive events from a variety of sources and match them to rules in your account. Different types of event buses receive events from different sources, including AWS services in your account and other accounts, custom applications and services, and partner applications and services. There is a default event bus
      • Events can be routed from the primary to the secondary Region event bus for DR failover, and a Route 53 health check can be applied for the failover
      • When event replication is enabled, events are sent to the secondary Region event bus to be reprocessed. This is asynchronous replication between the primary and secondary Region event buses.
    • One can archive published events by setting up a filter and a retention period
    • One can replay / retrigger archived events
  • Dashboards: aggregation of metrics in visual form
  • Requires the CW agent, which can also collect logs from on-premises resources
  • You can add cross-account functionality to your CloudWatch console. This functionality provides you with cross-account visibility to your dashboards, alarms, metrics, and automatic dashboards without having to log in and log out of different accounts.
  • You can use Grafana with CloudWatch
  • When log events are sent to the receiving service, they are base64 encoded and compressed with the gzip format.
  • Each log group can have up to two subscription filters associated with it.
  • If the destination service returns a retryable error such as a throttling exception or a retryable service exception (HTTP 5xx for example), CloudWatch Logs continues to retry delivery for up to 24 hours. CloudWatch Logs does not try to re-deliver if the error is a non-retryable error, such as AccessDeniedException or ResourceNotFoundException.
  • CloudWatch Logs also produces CloudWatch metrics about the forwarding of log events to subscription
  • CloudWatch Logs subscription to stream log data in near real time to an Amazon OpenSearch Service cluster
  • There is an API, get-metric-statistics, to export metric data (see the CLI sketch after this list)
  • An alarm can be configured with only one metric
  • A CloudWatch alarm is not a valid event source in CW Events
  • A log group can be set up for expiration, but not an individual log stream
  • S3 events are native to S3; to use CW Events for S3 [object-level events], you need to enable data events in CloudTrail for that bucket
  • Cross account : 
    • You must enable sharing in each account that will make data available to the monitoring account.
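
A sketch of exporting metric data with get-metric-statistics; the instance ID and time window are illustrative:

```bash
# Hourly average CPU for one instance over one day
aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --start-time 2024-01-01T00:00:00Z \
  --end-time 2024-01-02T00:00:00Z \
  --period 3600 \
  --statistics Average
```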

Trusted Advisor

  • Advise on 
    • Cost
    • Performance
    • Security
    • Fault tolerance
    • Service limits
  • Can set up a weekly notification email
  • You can create reports to aggregate the check results for all member accounts in your organization
  • The refresh rate is 5 minutes; checks can be refreshed through the API, as a whole as well as for a single check (see the CLI sketch after this list)
  • The basic security checks and the service limit checks are free
  • Trusted Advisor can check for exposed access keys
  • Can be automated with CW Events
  • Business/Enterprise support plans will see the CloudWatch metrics
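
A sketch of refreshing and reading a single check through the Support API (Business/Enterprise support required; the check ID is illustrative):

```bash
# The Support API lives in us-east-1
aws support refresh-trusted-advisor-check --check-id abc123 --region us-east-1
aws support describe-trusted-advisor-check-result --check-id abc123 --language en --region us-east-1
```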

Systems Manager

  • You need to select an OS patch baseline
  • You can add your own package source while creating a patch baseline
  • You can set up rejected/accepted packages
  • You need to create a schedule (maintenance window)
  • You need to register targets [by tag, instance ID, or resource group]
  • Register tasks: Run Command / Automation / Lambda / Step Functions
  • Can set up concurrency, error thresholds, and SNS notifications (see the CLI sketch below)
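
A sketch of wiring these pieces together with the CLI; all names, IDs, and thresholds are illustrative:

```bash
# Weekly window, 4 hours long, stop scheduling new tasks 1 hour before the end
aws ssm create-maintenance-window \
  --name "weekly-patching" \
  --schedule "cron(0 2 ? * SUN *)" \
  --duration 4 --cutoff 1 \
  --allow-unassociated-targets

# Register targets by tag
aws ssm register-target-with-maintenance-window \
  --window-id mw-0123456789abcdef0 \
  --resource-type INSTANCE \
  --targets "Key=tag:PatchGroup,Values=prod"

# Register a Run Command task with concurrency and error thresholds
aws ssm register-task-with-maintenance-window \
  --window-id mw-0123456789abcdef0 \
  --task-arn AWS-RunPatchBaseline \
  --task-type RUN_COMMAND \
  --targets "Key=WindowTargetIds,Values=<window-target-id>" \
  --max-concurrency 10% --max-errors 5%
```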
