Key Features
- 16 hours
- On-Site or Virtual Course with Live Certified Trainer
- Official Certification Exam from the DevOps Institute
- Official DevOps Institute student material
- Practical exercises
- Simulation exam
- 16 PDUs
Course details
This course is an introduction to the principles & practices that enable an organization to reliably and economically scale critical services. Introducing a site-reliability dimension requires organizational re-alignment, a new focus on engineering & automation, and the adoption of a range of new working paradigms.
The course highlights the evolution of SRE and its future direction. It equips participants with the practices, methods, and tools to engage people across the organization who are involved in reliability and stability, illustrated through real-life scenarios and case stories. Upon completion, participants will have tangible takeaways to apply back in the office, such as understanding, setting, and tracking Service Level Objectives (SLOs).
What will I achieve?
- Learn the history of SRE and its emergence at Google.
- Learn the inter-relationship of SRE with DevOps and other popular frameworks.
- Know the underlying principles behind SRE.
- Understand Service Level Objectives (SLOs) and their user focus.
- Understand Service Level Indicators (SLIs) and the modern monitoring landscape.
- Learn about error budgets and the associated error budget policies.
- Identify toil and its effect on an organization’s productivity.
- Define some practical steps that can help to eliminate toil.
- Understand observability as a way to gauge the health of a service.
- Become familiar with SRE tools, automation techniques and the importance of security.
- Learn more about anti-fragility, approaches to failure, and failure testing.
- Understand the organizational impact that introducing SRE brings.
Who is this course for?
- Anyone starting or leading a move towards increased reliability
- Anyone interested in modern IT leadership and organizational change approaches
- Business Managers
- Business Stakeholders
- Change Agents
- Consultants
- DevOps Practitioners
- IT Directors
- IT Managers
- IT Team Leaders
- Product Owners
- Scrum Masters
- Software Engineers
- Site Reliability Engineers
- System Integrators
- Tool Providers
What are the exam characteristics?
- Time allocated: 60 minutes
- Number of questions: 40 multiple-choice
- Passing score: 65% (26 correct answers)
- Format: Online; open-book
- Prerequisites: There are no formal prerequisites for this course. At least 4 hours of personal study during the course are recommended.
When will I know my exam results?
When the exam is paper-based, the participant is notified of the results afterwards by email. When it is web-based, the participant receives the results immediately after finishing the exam.
What happens if I fail the exam?
A participant who fails the exam may retake it any number of times at extra cost. No waiting period between attempts is required.
What are the course contents?
1 SRE Principles & Practices
- What is Site Reliability Engineering?
- SRE & DevOps: What is the Difference?
- SRE Principles & Practices
2 Service Level Objectives & Error Budgets
- Service Level Objectives (SLOs)
- Error Budgets
- Error Budget Policies
3 Reducing Toil
- What is Toil?
- Why is Toil Bad?
- Doing Something About Toil
4 Monitoring & Service Level Indicators
- Service Level Indicators (SLIs)
- Monitoring
- Observability
5 SRE Tools & Automation
- Automation Defined
- Automation Focus
- Hierarchy of Automation Types
- Secure Automation
- Automation Tools
6 Anti-Fragility & Learning from Failure
- Why Learn from Failure
- Benefits of Anti-Fragility
- Shifting the Organizational Balance
7 Organizational Impact of SRE
- Why Organizations Embrace SRE
- Patterns for SRE Adoption
- Sustainable Incident Response
- Blameless Post-Mortems
- SRE & Scale
8 SRE, Other Frameworks, Trends
- SRE & Other Frameworks
- SRE Evolution
- Additional Sources of Information