
Automating SRE and Operations on AWS
Sushanth Mangalore
This audiobook is narrated by a digital voice.
Networks:
Sushanth Mangalore
Digital Voice Madison G
BPB Publications
English Audiobooks
INAudio Audiobooks
Description:
Reliability and efficiency are paramount in the ever-evolving cloud landscape. Whether you are an enterprise adopting the cloud for the first time or have been operating in it for a while, you want to take full advantage of the automation options available to you. This book provides a systematic, chapter-by-chapter approach to automating every aspect of your operations on AWS. You will understand IaC with AWS CloudFormation, automate continuous infrastructure maintenance, and streamline your release pipelines with a focus on DevSecOps.
This book emphasizes building a culture of performing operations as code. You will learn about the different facets of operations on AWS and the services and features AWS offers to automate many operational and SRE tasks. Finally, it demonstrates how to apply automation to cost management and auditing through FinOps.
By the end of this book, you will be able to apply its concepts and ideas to your cloud operations on AWS. This will help enhance the productivity and efficiency of your organization by reducing the operational heavy lifting. You will learn to operate on AWS in the same manner as other successful cloud adopters around the world.
WHAT YOU WILL LEARN
● Introduction to SRE practices in the cloud.
● Distinction between the terms SRE and DevOps.
● Differentiating operational tasks on AWS from traditional approaches.
● Automation of common operational and SRE tasks on AWS using native services and options.
● Using CloudWatch and X-Ray for monitoring and distributed tracing.
● Deploying IaC and performing operations as code.
● Continuous integration and deployment, observability, and incident response for your AWS environments.
● Auditing and cost management for your AWS environment.
Duration - 13h 9m. Author - Sushanth Mangalore. Narrator - Digital Voice Madison G.
Published Date - Friday, 31 January 2025. Copyright - © 2026 BPB ©.
Language:
English
Title Page
Duration:00:00:20
Copyright Page
Duration:00:01:21
Dedication Page
Duration:00:00:19
About the Author
Duration:00:01:07
About the Reviewers
Duration:00:02:57
Acknowledgement
Duration:00:01:05
Preface
Duration:00:17:27
Table of Contents
Duration:00:06:56
1. Site Reliability Engineering Responsibilities
Duration:00:00:06
Introduction
Duration:00:00:38
Structure
Duration:00:00:25
Objectives
Duration:00:00:30
Introduction to operations and site reliability engineering
Duration:00:02:15
Operations in on-premises environments
Duration:00:04:53
SRE responsibilities for cloud workloads
Duration:00:10:29
Categories of operational tasks
Duration:00:16:57
Introduction to our sample workload
Duration:00:04:41
Conclusion
Duration:00:00:49
Points to remember
Duration:00:00:29
Multiple choice questions
Duration:00:01:28
Answers
Duration:00:00:18
2. SRE versus DevOps
Duration:00:00:04
Understanding DevOps
Duration:00:06:34
Comparing DevOps with SRE
Duration:00:03:49
SRE's role in enhancing DevOps practices
Duration:00:02:33
DevOps processes and tools
Duration:00:05:08
Metrics for DevOps
Duration:00:04:37
DORA and SRE
Duration:00:09:40
3. SRE on AWS
Duration:00:00:04
Introduction to SRE practices on AWS
Duration:00:09:10
Benefits of applying engineering to operations
Duration:00:09:00
AWS services for operations and SRE practices
Duration:00:22:13
Organizing for efficient operations on AWS
Duration:00:11:06
4. Infrastructure as Code
Duration:00:00:04
Benefits of automating infrastructure as code
Duration:00:06:16
AWS CloudFormation
Duration:00:06:18
Automating infrastructure for our sample workload
Duration:00:42:54
Using AWS Cloud Development Kit
Duration:00:11:23
AWS Serverless Application Model
Duration:00:03:24
Terraform
Duration:00:03:47
5. Automating Infrastructure Maintenance
Duration:00:00:05
Maintenance tasks for your infrastructure
Duration:00:06:50
AWS Systems Manager basics
Duration:00:01:34
AWS Systems Manager Agent
Duration:00:05:35
Operations management
Duration:00:08:19
Application management
Duration:00:12:48
Change management
Duration:00:24:20
Node management
Duration:00:00:26
Session Manager
Duration:00:04:59
Fleet Manager
Duration:00:02:24
Inventory
Duration:00:03:40
Run Command
Duration:00:06:45
State Manager
Duration:00:06:39
Patch Manager
Duration:00:12:13
Distributor
Duration:00:03:34
6. Release Automation
Duration:00:00:04
Release automation tasks
Duration:00:22:59
Introduction to AWS release automation tools
Duration:00:11:58
Release pipeline for our sample workload
Duration:00:18:17
DevSecOps on AWS
Duration:00:10:23
7. Observability for Reliable Operations
Duration:00:00:05
Observability, monitoring, and distributed tracing on AWS
Duration:00:01:04
Logs
Duration:00:02:53
Metrics
Duration:00:03:06
Traces
Duration:00:02:50
AWS services helping observability
Duration:00:05:45
Amazon CloudWatch
Duration:00:16:57
CloudWatch metrics
Duration:00:11:42
Using metrics to determine workload health
Duration:00:13:07
Custom Metrics and Logs
Duration:00:07:07
Metric streams
Duration:00:02:36
Embedded Metric Format
Duration:00:04:37
Anomaly detection
Duration:00:06:06
AWS X-Ray
Duration:00:14:01
8. Automating Resilience
Duration:00:00:04
Defining resilience
Duration:00:04:01
High availability and disaster recovery
Duration:00:07:29
Achieving high availability on AWS
Duration:00:00:24
Multi-AZ deployment
Duration:00:05:37
Autoscaling
Duration:00:04:29
Application-level high availability
Duration:00:01:28
Additional considerations for high availability
Duration:00:05:14
DR planning and testing
Duration:00:02:39
Declaring a disaster and initiating recovery
Duration:00:03:30
Backup and restore
Duration:00:04:11
Pilot light
Duration:00:04:37
Warm standby
Duration:00:01:58
Active-active
Duration:00:02:52
Considerations for cross-region backup copies
Duration:00:02:36
AWS Backup policies
Duration:00:04:53
Cell-based architecture on AWS for HA and DR
Duration:00:05:35
9. Incident Response Automation
Duration:00:00:04
Principles of incident response
Duration:00:19:29
Responding to events with Amazon EventBridge
Duration:00:15:36
Using runbooks for incident response
Duration:00:03:52
Automating with multi-step workflows
Duration:00:09:50
Testing recoverability
Duration:00:11:13
Chaos engineering
Duration:00:01:46
AWS Fault Injection Service
Duration:00:03:08
10. Auditing, FinOps and Miscellaneous Automation
Duration:00:00:05
Auditing on AWS
Duration:00:08:13
Auditing activities with AWS CloudTrail
Duration:00:17:45
Infrastructure auditing with AWS Config
Duration:00:20:31
FinOps and cost management on AWS
Duration:00:23:05
Miscellaneous automation
Duration:00:10:13
Index
Duration:00:26:56