
Managing Incidents on IBM Cloud – c110018gwpl

Course #: c110018gwpl

Duration: 0.8 Hours

The Managing Incidents on IBM Cloud module covers incident characteristics, implementing alerts for incident thresholds, creating runbooks for troubleshooting and mitigating incidents, and problem-solving techniques.

Objectives

  • Learn about incident characteristics
  • Understand the incident management process
  • Learn how to implement alerts for incident thresholds
  • Gain an appreciation for the impacts of incidents on upstream and downstream processes
  • Learn how to create runbooks to troubleshoot and mitigate the most common incidents
  • Learn about automation services available on IBM Cloud
  • Understand essential problem-solving techniques

Audience

This course is intended for learners who are pursuing professional-level site reliability engineer certification on IBM Cloud.

Prerequisites

Before starting this curriculum, the target audience should understand:
  • System Thinking
  • DevOps practices
  • Cloud Architecture
  • Software engineering principles
  • System administration
  • Networking and the OSI model
  • Networking and security practices for IBM Cloud
  • Incident management
  • Root cause analysis

The target audience should also be able to:
  • Write code proficiently
  • Create runbooks as a reference
  • Make system components serviceable
  • Interpret data and statistics to determine actions
  • Use LogDNA, Sysdig, Grafana, Prometheus, and Kibana
  • Interpret schematics
  • Drive incidents to resolution
  • Remediate underlying sources of unreliability
  • Create and configure VMs
  • Create and configure containers on IBM Kubernetes Service (IKS)/Red Hat OpenShift Kubernetes Services (ROKS)
  • Create and configure containers using OpenShift
  • Create and configure serverless applications
  • Configure for high availability and scalability

Topics

Module Introduction
Topic 1: Incident Characteristics
Topic 2: The Incident Management Process
Topic 3: Implementing Alerts for Incident Thresholds
Topic 4: Impacts of Upstream and Downstream Dependencies
Topic 5: Creating Runbooks to Troubleshoot and Mitigate Common Incidents
Topic 6: Types of Automation Services Available on IBM Cloud
Topic 7: Problem Solving Techniques
Module Summary
