Site Reliability Engineering: Measuring and Managing Reliability

Platform: Coursera
Provider: Google Cloud
Effort: 2 hours/week
Length: 4 weeks
Language: English
Credentials: Paid Certificate Available
Overview
This course teaches the theory of Service Level Objectives (SLOs), a principled way of describing and measuring the desired reliability of a service. Upon completion, learners should be able to apply these principles to develop the first SLOs for services they are familiar with in their own organizations.

The course also covers how to use Service Level Indicators (SLIs) to quantify reliability and how to use error budgets to drive business decisions about engineering for greater reliability. Learners will come to understand the components of a meaningful SLI and walk through the process of developing SLIs and SLOs for an example service.
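
As a rough illustration of how an SLI, an SLO and an error budget relate, here is a minimal Python sketch. It is not taken from the course material. It assumes a request-based SLI (good events divided by total events) and treats the error budget as the share of events the SLO allows to fail. The names evaluate_slo and SLOReport, and the numbers used afterwards, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SLOReport:
    sli: float               # fraction of events that were good
    allowed_bad: float       # bad events the SLO tolerates for this volume
    budget_remaining: float  # share of error budget left (negative if overspent)


def evaluate_slo(good_events: int, total_events: int, slo_target: float) -> SLOReport:
    """Compare a measured SLI against an SLO target (hypothetical helper).

    Assumption: the SLI is request-based, i.e. good_events / total_events,
    and slo_target is a fraction such as 0.999 for "99.9%".
    """
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= good_events <= total_events:
        raise ValueError("good_events must be between 0 and total_events")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")

    sli = good_events / total_events
    # The error budget is whatever the SLO does not promise: 1 - target.
    allowed_bad = (1.0 - slo_target) * total_events
    actual_bad = total_events - good_events
    budget_remaining = 1.0 - actual_bad / allowed_bad
    return SLOReport(sli=sli, allowed_bad=allowed_bad, budget_remaining=budget_remaining)


if __name__ == "__main__":
    report = evaluate_slo(good_events=999_500, total_events=1_000_000, slo_target=0.999)
    print(f"SLI: {report.sli:.4%}")
    print(f"Error budget remaining: {report.budget_remaining:.1%}")
```

With these made-up numbers the SLI is 99.95% against a 99.9% target, so roughly half of the error budget is still unspent. A negative budget_remaining would mean the SLO was missed for the window.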

WHAT YOU WILL LEARN
  • How to make systems reliable
  • Understanding SLIs, SLOs and SLAs
  • Quantifying risks to and consequences of SLOs
Syllabus
  1. Introduction to SRE
  2. Targeting Reliability
  3. Operating for Reliability
  4. Choosing a Good SLI
  5. Developing SLOs and SLIs
  6. Developing SLOs and SLIs
  7. Consequences of SLO Misses

Taught by
Google Cloud Training
Author
Coursera