IBM InfoSphere QualityStage Essentials v11.7 – 2m214gspl

Course #: 2m214gspl

Duration: 4 Days

This course teaches how to build QualityStage parallel jobs that investigate, standardize, match, and consolidate data records. This course covers common data quality issues, QualityStage architecture, QualityStage clients and their functions, importing metadata, running jobs and reviewing results, building Investigate jobs, the Standardize stage and rule sets, identifying matching records and applying multiple Match passes, building a Survive job, and using a Two-Source match.

Students will gain experience by building an application that combines customer data from three source systems into a single master customer record.

Objectives

After completing this course, learners should be able to:

  • List common data quality contaminants
  • Describe QualityStage architecture, clients, and their functions
  • Build and run DataStage and QualityStage jobs and review results
  • Use Character Discrete, Character Concatenate, and Word investigations to analyze data fields
  • Build jobs using the Standardize stage
  • Build a QualityStage job to identify matching records
  • Interpret, improve, and consolidate match results

Audience

This course is intended for Data Analysts responsible for data quality using QualityStage, Data Quality Architects, and Data Cleansing Developers.

Prerequisites

Participants should have the following skills:

  • Familiarity with the Windows operating system
  • Familiarity with a text editor

Helpful, but not required:

  • Some understanding of elementary statistics principles, such as weighted averages and probabilities

Topics

Unit 1: Data Quality Issues

Unit 2: QualityStage Overview

  • Exercise 1: QualityStage Logon

Unit 3: Developing with QualityStage

  • Exercise 1: Import Table Definition Metadata
  • Exercise 2: Build a QualityStage Job

Unit 4: Investigate

  • Exercise 1: Build Investigate Jobs

Unit 5: Standardize

  • Exercise 1: Standardize Country
  • Exercise 2: Select US Records
  • Exercise 3: Standardize USPREP
  • Exercise 4: Standardize USNAME, USADDR, and USAREA
  • Exercise 5: Investigate Unhandled Patterns
  • Exercise 6: Apply Rule Set Override

Unit 6: Match

  • Exercise 1: Create Match Frequency Job
  • Exercise 2: One-Source Match Specification
  • Exercise 3: Build One-Source Job Using Match Specification

Unit 7: Survive

  • Exercise 1: Survivorship
  • Exercise 2: Create Customer Master Load File

Unit 8: Two-Source Match

  • Exercise 1: Read the Case Study
  • Exercise 2: Prepare the Data Environment
  • Exercise 3: Run the Two-Source Match Job
