AWS DevOps: Management & Governance

AWS DevOps: Cheat Sheet

CloudFormation

  • Stack sets can deploy stacks to an entire organizational unit (OU)
  • Template source: upload a file or point to an S3 path
  • On stack failure: roll back all resources, or preserve successfully provisioned resources
  • Stack Policy: defines the resources that you want to protect from unintentional updates during a stack update
  • There are other configurations as well: rollback settings, timeout period, SNS notifications
  • Create a change set for any update and then execute it; on failure you can choose complete rollback or preservation of provisioned resources
  • Templates: JSON or YAML
  • Stacks: a collection of resources managed as a single unit
  • Change set: to make an update to running resources, generate a change set first, which shows what operations are going to happen
  • You can validate a template against syntax errors
  • The AWS CloudFormation action that CodePipeline invokes when processing the associated stage. Choose one of the following action modes:
  • Create or replace a change set creates the change set if it doesn’t exist based on the stack name and template that you submit. If the change set exists, AWS CloudFormation deletes it, and then creates a new one.
  • Create or update a stack creates the stack if the specified stack doesn’t exist. If the stack exists, AWS CloudFormation updates the stack. Use this action to update existing stacks. CodePipeline won’t replace the stack.
  • Delete a stack deletes a stack. If you specify a stack that doesn’t exist, the action is completed successfully without deleting a stack.
  • Execute a change set executes the change set.
  • Replace a failed stack creates the stack if the specified stack doesn’t exist. If the stack exists and is in a failed state (reported as ROLLBACK_COMPLETE, ROLLBACK_FAILED, CREATE_FAILED, DELETE_FAILED, or UPDATE_ROLLBACK_FAILED), AWS CloudFormation deletes the stack and then creates a new one. If the stack isn’t in a failed state, AWS CloudFormation updates it. Use this action to replace failed stacks without recovering or troubleshooting them. You would typically choose this mode for testing.
  • Stack sets manage identical stacks across multiple accounts and Regions
  • CloudFormation supports Chef & Puppet integration to deploy and configure right down to the application layer
  • By default, the automatic rollback-on-error feature is enabled
  • CloudFormation provides a WaitCondition resource that acts as a barrier, blocking the creation of dependent resources until a success signal is received
  • IAM can be used with CloudFormation for access control: whether users can view stack templates, create stacks, or delete stacks
  • IAM permissions also need to be granted to the user for the AWS services and resources that are provisioned when the stack is created
  • Before a stack is created, AWS CloudFormation validates the template to check for IAM resources that it might create
  • A stack policy is a JSON document that defines the update actions that can be performed on designated resources.
  • After you set a stack policy, all of the resources in the stack are protected by default.
  • Updates on specific resources can be added using an explicit Allow statement for those resources in the stack policy.
  • Only one stack policy can be defined per stack, but multiple resources can be protected within a single policy.
  • A stack policy applies to all CloudFormation users who attempt to update the stack. You can’t associate different stack policies with different users
  • A stack policy applies only during stack updates. It doesn’t provide access controls like an IAM policy.
  • You can’t edit the previous template in place; you need to upload a new version of the template
  • Stacks are identified by name
  • SNS can be configured for notification 
  • Termination protection, timeout and stack policies can be applied
  • While updating a CloudFormation template, the console shows a preview of the changes, including whether each resource will be replaced (Replacement: True/False)
  • If the stack update fails, CloudFormation rolls back to the previous known working state
  • A stack policy supplied during an update (to allow the update) is only valid for that update; afterwards the originally applied policy remains in effect. This is done to bypass the restriction if required (see the sketch below)
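
A minimal sketch of both behaviors, assuming illustrative names (my-stack, ProductionDatabase):

```bash
# Protect one resource from updates (everything else stays updatable)
cat > stack-policy.json <<'EOF'
{
  "Statement": [
    { "Effect": "Allow", "Action": "Update:*", "Principal": "*", "Resource": "*" },
    { "Effect": "Deny", "Action": "Update:*", "Principal": "*",
      "Resource": "LogicalResourceId/ProductionDatabase" }
  ]
}
EOF
aws cloudformation set-stack-policy \
  --stack-name my-stack \
  --stack-policy-body file://stack-policy.json

# Temporarily override the policy for a single update; the original
# policy automatically applies again once this update finishes
aws cloudformation update-stack \
  --stack-name my-stack \
  --use-previous-template \
  --stack-policy-during-update-body \
  '{"Statement":[{"Effect":"Allow","Action":"Update:*","Principal":"*","Resource":"*"}]}'
```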
CloudFormation Update Policies

CFN Hooks:

AWS Config

Resource management

  • Specify the resource types you want AWS Config to record.
  • Set up an Amazon S3 bucket to receive a configuration snapshot on request and configuration history.
  • Set up Amazon SNS to send configuration stream notifications.
  • Grant AWS Config the permissions it needs to access the Amazon S3 bucket and the Amazon SNS topic.
  • SNS notifications are not available at the rule level; they are only available for AWS Config as a whole
  • However, you can use CW Events to automate actions on rule evaluations
  • Now, AWS remediation actions can also be added [pre-defined / custom] (a CLI sketch of the basic setup follows this list)
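
A minimal CLI sketch of the setup steps above; the role, bucket, and topic names are illustrative:

```bash
# Record all supported resource types, including global ones
aws configservice put-configuration-recorder \
  --configuration-recorder name=default,roleARN=arn:aws:iam::111122223333:role/config-role \
  --recording-group allSupported=true,includeGlobalResourceTypes=true

# Deliver snapshots/history to S3 and stream notifications to SNS
cat > delivery-channel.json <<'EOF'
{
  "name": "default",
  "s3BucketName": "my-config-bucket",
  "snsTopicARN": "arn:aws:sns:us-east-1:111122223333:config-topic"
}
EOF
aws configservice put-delivery-channel --delivery-channel file://delivery-channel.json

aws configservice start-configuration-recorder --configuration-recorder-name default
```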

Rules and conformance packs

  • Specify the rules that you want AWS Config to use to evaluate compliance information for the recorded resource types.
  • Use conformance packs: a collection of AWS Config rules and remediation actions that can be deployed and monitored as a single entity in your AWS account.

Aggregators

  • Use an aggregator to get a centralized view of your resource inventory and compliance. An aggregator is an AWS Config resource type that collects AWS Config configuration and compliance data from multiple AWS accounts and AWS Regions into a single account and Region.
  • Config rules trigger on configuration changes or run periodically (see the sketch after this list)
  • Custom rules: implemented either with Lambda or with Guard [policy as code]
  • The default retention period for recorded configuration items is 7 years
  • You will require an S3 bucket to record the configuration history
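
For example, a change-triggered managed rule can be registered like this (the rule chosen here is illustrative):

```bash
cat > rule.json <<'EOF'
{
  "ConfigRuleName": "s3-bucket-versioning-enabled",
  "Source": { "Owner": "AWS", "SourceIdentifier": "S3_BUCKET_VERSIONING_ENABLED" },
  "Scope": { "ComplianceResourceTypes": ["AWS::S3::Bucket"] }
}
EOF
aws configservice put-config-rule --config-rule file://rule.json
```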

Elastic Beanstalk

  • Elastic Beanstalk automates the setup, configuration, and provisioning of other AWS services like EC2, RDS, and Elastic Load Balancing to create a web service.

APP + ENV 

  • An app has versions and saved configurations
  • One can set up a lifecycle policy to retain or delete the app versions uploaded for deployment
  • There is a change history for configuration changes
  • The env has two tiers: a web server tier and a worker tier [which listens for messages on an SQS queue]
  • You can choose the platform: Tomcat, Java, PHP, .NET, Docker, etc.
  • There is a platform version; for Docker: running on EC2 or running on ECS
  • Source code: upload a file or point to an S3 path
  • You can tag applications separately from environments
  • There are configuration presets: single instance, HA, custom
  • AWS X-Ray support, S3 logs with log rotation, CW logs
  • Env contains
    • Software for logs and env properties
    • Instance type, size, and security group
    • Capacity: env type, AZs, AMI, placement, scaling for HA
    • Load balancer, if any
    • Rolling updates and deployments
      • Rolling update type [e.g. immutable]
    • Deployment policy: all at once / immutable, with deployment preferences driven by health checks
    • You can directly add email for notification
    • Can directly create RDS or restore a snapshot
    • Db deletion policy
      • Create snapshot
      • Retain
      • Delete
  • One can restore a terminated env
  • There is a swap URL feature. Swapping the environment URLs will modify the Route 53 DNS configuration, which may take a few minutes (see the CLI sketch after this list)
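
A blue/green cutover with the swap URL feature looks roughly like this; the environment names are illustrative:

```bash
# Point the "blue" environment's CNAME at the "green" environment and vice versa
aws elasticbeanstalk swap-environment-cnames \
  --source-environment-name my-app-blue \
  --destination-environment-name my-app-green
```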

CodePipeline -> EB -> using ECS

CodePipeline -> ECS

  • Elastic Beanstalk can launch Docker environments by building an image described in a Dockerfile or by pulling a remote Docker image
  • If you are using Docker Compose, use a docker-compose.yml file, which specifies an image to use and additional configuration options. If you are not using Docker Compose with your Docker environments, use a Dockerrun.aws.json file instead.
  • You can use a Dockerrun.aws.json v2 file for an ECS-managed Docker environment.
  • On-premises servers are not supported, as Elastic Beanstalk only manages AWS resources
  • The configuration precedence order is: directly applied settings > saved configuration > .ebextensions files > default values
  • In .ebextensions, commands written under the commands section run before application setup, and commands written under the container_commands section run after application setup (see the YAML sketch after this list)
  • There is a lifecycle policy for rotating application versions [by age in days, or by number of versions]: delete from EB but keep in S3, or delete from S3 as well
  • The environment can be cloned
  • EC2 instances can be patched via managed updates, which require a scheduled maintenance window for patching
  • EB environments are either web server environments or worker environments
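
A minimal .ebextensions sketch showing the two command sections; the file name and commands are illustrative:

```yaml
# .ebextensions/01-setup.config
commands:
  01_install_tools:
    command: "yum install -y jq"          # runs before the application is set up
container_commands:
  01_migrate:
    command: "python manage.py migrate"   # runs after application setup
    leader_only: true                     # run on a single instance only
```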

OpsWorks

  • Supports on-premises and EC2 Linux servers with the OpsWorks agent
  • Supports Amazon Linux AMIs, Ubuntu, and Windows Server
  • Stack + layers
    • Stack : blueprints, called layers, used to launch and manage these instances. Applications, user permissions, and other resources are scoped and controlled in the context of the Stack. 
    • Layers: A Layer is a blueprint for how to set up and configure an instance and related resources such as volumes and Elastic IPs. Layers can automatically take care of infrastructure configuration such as SSL settings, with per-layer configuration including installation scripts, initialization tasks, and packages. Layers also include lifecycle events that let you automate configuration actions in response to changes in an instance’s status
  • Lifecycle events: Setup > Configure > Deploy > Undeploy > Shutdown
  • You cannot use EC2 user data to set up instances
  • Supports time-based as well as load-based scaling
  • OpsWorks Stacks can be accessed globally and can be used to create and manage instances globally
  • Layers depend on Chef recipes to handle tasks such as installing packages on instances, deploying apps, and running scripts
  • Custom recipes and related files are packaged in one or more cookbooks and stored in a cookbook repository such as S3 or Git
  • OpsWorks Stacks supports the following instance types
  • 24/7 instances – launched and stopped manually
  • Time-based instances – run on a schedule
  • Load-based instances – automatically started and stopped based on configurable load metrics [CPU / memory / average load]
  • OpsWorks Stacks does not automatically deploy updated code to online instances; this needs to be done manually (see the CLI sketch after this list)
  • OpsWorks needs to know in advance how many servers will be required, as there is no min/max/desired setting; you add the instances beforehand, and OpsWorks will only start and stop them
  • We can deploy many apps on one layer
  • Auto healing needs to be enabled
  • Events on the app layer: Setup [after boot] > Configure [occurs on all of the stack’s instances] > Deploy > Undeploy > Shutdown
    • Configure runs on all of the stack’s instances whenever any instance comes online or goes offline, which is helpful for configuration changes or discovery
    • The rest of the events happen only for that particular instance
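
A sketch of a manual deployment to online instances; the stack and app IDs are illustrative:

```bash
# Run the built-in "deploy" command against the latest app version
aws opsworks create-deployment \
  --stack-id 11111111-2222-3333-4444-555555555555 \
  --app-id aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee \
  --command '{"Name": "deploy"}'
```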

CloudTrail:

  • By default, a trail created from the console is multi-Region
  • A digest file is delivered every hour; it can be used to check whether a trail log file was modified or deleted, and it contains references to the log files it covers (see the CLI sketch after this list)
  • You can create a trail that logs all events for all AWS accounts in an organization: an organization trail. When you set up an organization trail with CloudTrail, CloudTrail replicates the trail to each member account within your organization.
  • A trail means delivering logs to a bucket; SNS notification on delivery is available as well. Log files can be encrypted using SSE-KMS, and log file validation is available, producing hashes for audit reports
  • Integrates with CloudWatch Logs: send logs to CloudWatch Logs to analyze them, or query them from Athena, since the CloudTrail console only covers the last 90 days
  • Event type :
    • Management events: control plane operations [read/write], with the option to choose
    • Data events: for services such as S3 / Lambda / DynamoDB / S3 on Outposts / Managed Blockchain / S3 Object Lambda / Lake Formation / EBS direct APIs / S3 Access Points / DynamoDB Streams
      • By default, trails don’t log data events.
      • You can select all, or specify the ARN of a particular resource
    • Insights events: API call rate, API error rate [unusual activity]
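
A sketch of checking log file integrity against the digest files; the trail ARN and start time are illustrative:

```bash
aws cloudtrail validate-logs \
  --trail-arn arn:aws:cloudtrail:us-east-1:111122223333:trail/my-trail \
  --start-time 2024-01-01T00:00:00Z
```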

CloudWatch

  • Monitoring areas: CloudWatch provides two categories of monitoring: basic monitoring and detailed monitoring [API Gateway / CloudFront / EC2 / EB / Kinesis Data Streams / MSK / S3]
  • Alarm / logs / Events/ Xray / Application Monitoring / Insights
  • Alarm : 
    • Metric + Condition + Datapoints
    • States: ALARM | OK | INSUFFICIENT_DATA
    • Trigger: SNS
    • Action: Auto Scaling [ASG / ECS] / EC2 [recover / stop / terminate / reboot] / Systems Manager [create OpsItem / create incident]
  • Logs: streams of log data
    • Logs cost: data ingestion, storage, and queries
    • Log groups: can set up a retention period [1 day to never expire]
      • A collection of log streams [can be exported to CSV to download the data]
    • Can create subscription filters: Kinesis, Kinesis Firehose, Lambda, OpenSearch
    • You can export the data to S3
    • Logs Insights: a query platform to query log groups
  • Metric
    • A collection of data points in time-series fashion, retained for up to 15 months; you can’t delete metrics, and CloudWatch aggregates data points automatically as they age
  • EVENT
    • Schedule / event
    • Event source
      • AWS events or EventBridge partner events [there are listed partners]
      • Others
      • All event
    • Target
      • Event bus [same account or different account]
      • An HTTP URL can be added as a destination for webhooks
      • AWS services
    • Input transformer
    • Global endpoints: DR / health check / replication
      • Event buses receive events from a variety of sources and match them to rules in your account. Different types of event buses receive events from different sources, including AWS services in your account and other accounts, custom applications and services, and partner applications and services. There is a default event bus
      • Events can be routed from the primary to the secondary Region event bus for DR failover, and a Route 53 health check can be applied for the failover
      • When event replication is enabled, events are sent to the secondary Region event bus to be reprocessed. This is asynchronous replication between the primary and secondary Region event buses.
    • One can archive published events by setting up a filter and a retention period
    • One can replay / retrigger archived events
  • Dashboards: aggregation of metrics in visual form
  • Requires the CW agent, which can also collect logs from on-premises resources
  • You can add cross-account functionality to your CloudWatch console. This functionality provides you with cross-account visibility to your dashboards, alarms, metrics, and automatic dashboards without having to log in and log out of different accounts.
  • You can use Grafana with CloudWatch
  • When log events are sent to the receiving service, they are base64 encoded and compressed with the gzip format.
  • Each log group can have up to two subscription filters associated with it.
  • If the destination service returns a retryable error such as a throttling exception or a retryable service exception (HTTP 5xx for example), CloudWatch Logs continues to retry delivery for up to 24 hours. CloudWatch Logs does not try to re-deliver if the error is a non-retryable error, such as AccessDeniedException or ResourceNotFoundException.
  • CloudWatch Logs also produces CloudWatch metrics about the forwarding of log events to subscription
  • CloudWatch Logs subscription to stream log data in near real time to an Amazon OpenSearch Service cluster
  • There is an API, get-metric-statistics, to export metric data (see the CLI sketch after this list)
  • An alarm can be configured with only one metric
  • A CloudWatch alarm is not a valid event source in CW Events
  • A log group can be set up for expiration, but not an individual log stream
  • S3 events are native to S3; to use CW Events for S3 [object-level events], you need to enable data events in CloudTrail for that bucket
  • Cross account : 
    • You must enable sharing in each account that will make data available to the monitoring account.
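
A sketch of exporting metric data with get-metric-statistics; the instance ID and time window are illustrative:

```bash
# Hourly average CPU for one instance over one day
aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --start-time 2024-01-01T00:00:00Z \
  --end-time 2024-01-02T00:00:00Z \
  --period 3600 \
  --statistics Average
```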

Trusted Advisor

  • Advise on 
    • Cost
    • Performance
    • Security
    • Fault tolerance
    • Service limits
  • Can set up a weekly notification email
  • You can create reports to aggregate the check results for all member accounts in your organization
  • The refresh rate is 5 minutes; checks can be refreshed through the API, as a whole as well as for a single check (see the CLI sketch after this list)
  • The basic security checks and the service limit checks are free
  • Trusted Advisor can check for exposed access keys
  • Can be automated with CW Events
  • Business/Enterprise support plans will see the CloudWatch metrics
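
A sketch of refreshing and reading a single check through the Support API (Business/Enterprise support required; the check ID is illustrative):

```bash
# The Support API lives in us-east-1
aws support refresh-trusted-advisor-check --check-id abc123 --region us-east-1
aws support describe-trusted-advisor-check-result --check-id abc123 --language en --region us-east-1
```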

Systems Manager

  • You need to select an OS patch baseline
  • You can add your own package source while creating a patch baseline
  • You can set up rejected/accepted packages
  • You need to create a schedule (maintenance window)
  • You need to register targets [by tag, instance ID, or resource group]
  • Register tasks: Run Command / Automation / Lambda / Step Functions
  • Can set up concurrency, error thresholds, and SNS notifications (see the CLI sketch below)
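
A sketch of wiring these pieces together with the CLI; all names, IDs, and thresholds are illustrative:

```bash
# Weekly window, 4 hours long, stop scheduling new tasks 1 hour before the end
aws ssm create-maintenance-window \
  --name "weekly-patching" \
  --schedule "cron(0 2 ? * SUN *)" \
  --duration 4 --cutoff 1 \
  --allow-unassociated-targets

# Register targets by tag
aws ssm register-target-with-maintenance-window \
  --window-id mw-0123456789abcdef0 \
  --resource-type INSTANCE \
  --targets "Key=tag:PatchGroup,Values=prod"

# Register a Run Command task with concurrency and error thresholds
aws ssm register-task-with-maintenance-window \
  --window-id mw-0123456789abcdef0 \
  --task-arn AWS-RunPatchBaseline \
  --task-type RUN_COMMAND \
  --targets "Key=WindowTargetIds,Values=<window-target-id>" \
  --max-concurrency 10% --max-errors 5%
```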
