Designing and Improving Reliability for Systems and Services – c110043gwpl

Course #: c110043gwpl

Duration: 0.8 Hours

The Designing and Improving Reliability for Systems and Services module covers how to select, define, and implement acceptable reliability targets. These targets help you measure system performance, avoid bottlenecks, and arrive at a highly reliable architecture.

Objectives

  • Manage reliability targets for cloud services
  • Identify reliability bottlenecks
  • Design reliable solutions based on reliability targets
  • Implement a high-availability architecture

Audience

This course is intended for learners who are pursuing professional-level site reliability engineer certification on IBM Cloud.

Prerequisites

Before starting this curriculum, the target audience should understand:
  • Systems thinking
  • DevOps practices
  • Cloud architecture
  • Software engineering principles
  • System administration
  • Networking and the OSI model
  • Networking and security practices for IBM Cloud
  • Incident management
  • Root cause analysis

The target audience should also be able to:
  • Write code proficiently
  • Create runbooks as a reference
  • Make system components serviceable
  • Interpret data and statistics to determine actions
  • Use LogDNA, Sysdig, Grafana, Prometheus, and Kibana
  • Interpret schematics
  • Drive incidents to resolution
  • Remediate underlying sources of unreliability
  • Create and configure VMs
  • Create and configure containers on IBM Kubernetes Service (IKS)/Red Hat OpenShift Kubernetes Services (ROKS)
  • Create and configure containers using OpenShift
  • Create and configure serverless applications
  • Configure for high availability and scalability

Topics

Module Introduction
Topic 1: Reliability Targets for Your Service
Topic 2: Key Criteria Used to Identify Reliability Bottlenecks
Topic 3: Designing a Reliable Solution Based on Realistic Reliability Targets
Topic 4: Implementing a High Availability Architecture
Module Summary
