Slide1
Monitor the Quality of your Master Data
THOMAS RAVN
TRA@PLATON.NET
March 16th, 2010, San Francisco
Slide2
Platon
A leading Information Management consulting firm
Independent of software vendors
Headquartered in Copenhagen, Denmark
220+ employees in 9 offices
300+ customers and 800+ projects
Founded in 1999
Employee-owned company
“Platon received good feedback in our satisfaction survey. Clients cited the following strengths: experience and skill of consultants, business focus and the ability to remain focused on the needs of the client, and a strong methodological approach”
Gartner, July 2008
Slide3
Key Concepts and Definitions
Information Management: “Information Management is the discipline of managing and leveraging information in a company as a strategic asset”
MDM: “Master Data Management (MDM) is the structured management of Master Data in terms of definitions, governance, architecture, technology and processes”
Data Governance: “Data Governance is the cross-functional discipline of managing, improving, monitoring, maintaining, and protecting data”
DQM: “Data Quality Management is the discipline of ensuring high quality data in enterprise systems”
Slide4
Components of an effective MDM approach
Formalize business ownership and stewardship around data.
Ensure that Master Data is taken into account each and every time a business process or an IT system is changed.
Control in which systems Master Data is entered and how it is synchronized across systems.
Manage Master Data Repository.
To be able to share data you need to share definitions and business rules. Definitions require management, rigor and documentation.
Capturing Master data efficiently needs to be built into the business processes.
Equally consistent usage of Master Data needs to be ensured across business processes and business functions.
Measure and monitor the quality of data
Slide5
Typical Data Problems - 1
No | Name | Address | Purchase
90328574 | IBM | 187 N.Pk. Str. Salem NH 01456 | 8,494.00
90328575 | I.B.M. Inc. | 187 N.Pk. St. Sarem NH 01456 | 3,432.00
90328575 | International Bus. M. | 187 No. Park St Salem NH 04156 | 2,243.00
09243242 | Int. Bus. Machines | 187 Park Ave Salem NH 04156 | 5,900.00
12398732 | Inter-Nation Consults | 15 Main St. Andover MA 02341 | 6,800.00
99643413 | Int. Bus. Consultants | PO Box 9 Boston MA 02210 | 10,243.00
43098436 | I.B. Manufacturing | Park Blvd. Boston MA 04106 | 15,999.00
How much did we spend with IBM last year?
Slide6
Typical Data Problems - 2
Is this the same customer?
Name | Street | Zip Code | City
CAFÉ SPORTSCLUB | 15 3rd Street | 10001 | New York
CAFÉ SPORT KLUB | 15 Third St. | . | NYC
Are these the same products?
Description, System 1 | Description, System 2
1/1L Mathilde Cafe Ice Cappucino | 1 L Cappucino - Mathilde Cafe
45+ FETA M/OLI+HVIDL 60G, 45+ | FETA W/OLIVES & GARLIC 60G, 45+
YOGHURT PÆRE/BANAN, 1000ML | 1000 ML YOG. PEACH/BANANA
Slide7
Typical Problems - 3
A common problem is overloading of fields: using a field for something other than its intended purpose, often because the field the user wanted was not available in the application
Sometimes a field has even been used for different purposes by different parts of the organization
Customer No | Name | Email | Fax
1234 | John | john@mail.com | Vip Customer
3368 | Pete | pete@mail.com | Tel: 11223344
2345 | Bob | bob@mail.com |
Slide8
Where Does the Bad Data Come From?
State is a required field – regardless of country
Slide9
Where Does the Bad Data Come From?
Slide10
Top 5 Sources of Bad Data
Lack of ownership and clearly defined responsibility
Lack of common definitions for data
Lack of control of field usage
Lack of process control
Lack of synchronization between systems
Slide11
What is Good Data Quality?
Larry English:
Quality exists solely in the eye of a customer of a product or service based on the value they perceive
Information quality is consistently meeting ‘end customers’ expectations through information and information services, enabling them to perform their jobs effectively
To define information quality, one must identify the "customer" of the data - the knowledge worker who requires data to perform his or her job
Platon definition:
Data Quality is the degree to which data meets the defined standards
Slide12
“Information producers will create information only to the quality level for which they are trained, measured and held accountable.”
Larry English, “The Law of Information Creation”
Slide13
Data Standards & Data Quality
It’s all about the Meta Data…
Good Meta Data is a prerequisite for achieving great data quality (inferred from the “trained” part of the “Law of Information Creation”)
You can only achieve high quality data if you have standards to measure against!
Slide14
Defining Good Data standards
Business description
Data entry format and conventions
Definition owner
Stakeholders
Definition and keys
Life cycle
Classification(s)
Hierarchies
For every entity define:
For every field define:
Consider what a user needs to know to produce high quality data
Business Owner(s)
Slide15
Data Standards – An Example
Challenges
Relating the data definitions to the process documentation
Keeping the definitions up to date
The same piece of information may be entered in multiple different systems
Slide16
Defining Good Data standards
There are two basic approaches to defining your data standards:
1. Define a system-independent Enterprise Information Model and then map attributes to system fields, or
2. Define data definitions for a system-specific (screen/table) view of data
If you have one primary system where a data entity is used, option 2 is preferable
If you have many different systems where the same data entity is used, option 1 is preferable
Slide17
Generating Garbage
Garbage In = Garbage Out
Quality Standard 1 In + Quality Standard 2 In = Garbage Out
Slide18
Data Quality Monitoring
Like most other things, data quality can only be managed properly if it is measured and monitored
A data quality monitoring concept is necessary to ensure that you identify
Trends in data quality
Data quality issues before they impact critical business processes
Areas where process improvements are needed
Slide19
Data Quality Monitoring
For this to work, clearly-defined standards, targets for data quality and follow-up mechanisms are required
There is little point in monitoring the quality of your data if no one in the business feels responsible and if clear business rules for data have not yet been defined
Thus a data quality monitoring concept should go hand in hand with a data governance model
Slide20
The Dimensions of Data Quality
Accuracy: Does data reflect the real world objects or a trusted source?
Integrity: Are business rules on field and table relationships met?
Consistency: Are shared data elements synchronized correctly across the system landscape?
Completeness: Do we have all required data?
Validity: Are all data values within the valid domain for the field?
Timeliness: Are data available at the time needed?
Slide21
KPI Examples in the different dimensions
Dimension | KPI Example
Completeness | Pct. of active customer records with an email address
Validity | Pct. of active US customers with a phone number of 10 digits
Accuracy | Pct. of active customers with a mailing address that is verified as correct against Dun & Bradstreet
Consistency | Pct. of customer records shared between our CRM system and our ERP system that have identical values for name, address and telephone number
Integrity | Pct. of active product records with [type] = “Service” where [weight] = 0, or Pct. of open sales orders that refer to an active customer
Timeliness | Pct. of supplier records where the time from request of a new record to completion and release of the record is less than 24 hours
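As a rough illustration, the Completeness and Validity examples could be computed along the following lines. This is only a sketch in Python: the record layout, the field names (active, country, email, phone), the sample data and the choice to report 100% when there are no records in scope are all assumptions made for illustration.

```python
import re

# Hypothetical customer records; field names and values are illustrative only.
customers = [
    {"id": 1, "active": True, "country": "US", "email": "john@mail.com", "phone": "212-555-0100"},
    {"id": 2, "active": True, "country": "US", "email": "", "phone": "11223344"},
    {"id": 3, "active": True, "country": "DK", "email": "bob@mail.com", "phone": "+45 1234"},
    {"id": 4, "active": False, "country": "US", "email": None, "phone": ""},
]


def pct(matching, total):
    """Percentage of records meeting a rule (assumed 100.0 when nothing is in scope)."""
    return 100.0 if total == 0 else round(100.0 * matching / total, 1)


def completeness_email(records):
    """Pct. of active customer records with an email address."""
    active = [rec for rec in records if rec["active"]]
    with_email = [rec for rec in active if rec["email"]]
    return pct(len(with_email), len(active))


def validity_us_phone(records):
    """Pct. of active US customers whose phone number contains exactly 10 digits."""
    us = [rec for rec in records if rec["active"] and rec["country"] == "US"]
    # Strip everything that is not a digit before counting.
    valid = [rec for rec in us if len(re.sub(r"\D", "", rec["phone"] or "")) == 10]
    return pct(len(valid), len(us))


print("Completeness (email):", completeness_email(customers))
print("Validity (US phone):", validity_us_phone(customers))
```

Both functions restrict the denominator to the records the KPI is about (active customers, or active US customers), which is what keeps the percentage meaningful.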
Slide22
The Dimensions of Data Quality
[Figure: the dimensions positioned by Business Impact and Difficulty of Measurement, listed in the order Completeness, Validity, Integrity, Timeliness, Consistency, Accuracy]
Slide23
The steps in building a monitoring concept
Building a data quality monitoring concept involves the following five basic steps:
1. Identify stakeholders
2. Conduct interviews with stakeholders and selected business users
3. Identify data quality candidate KPIs
4. Select KPIs for data quality monitoring
5. For each KPI, define details
Slide24
Finding Good Data Quality KPI’s
Perform a thorough data assessment (profiling) exercise searching for common data quality problems and look for abnormalities
Collect business input:
Business process requirements
Data quality pain points
Business Intelligence
Business KPIs
[Figure: KPI Candidates are narrowed down to a table of DEFINED KPIs with the columns KPI, Frq, Target and UoM]
To find good data quality KPIs, collect business input through interviews with stakeholders (use Interviewing Technique) and a data assessment. The technique Data Profiling contains more details on how to analyze data
Slide25
Tying Data Quality KPIs to Business Processes
It is essential that KPIs are not just made up so that your organization has something to measure
Don’t measure data quality because it’s great to have high quality data. Measure it because your business processes depend on it
Derive data quality KPIs from business process requirements
Start with a high level business process like procurement (also known as a macro process) and then break it down.
Slide26
Tying Data Quality KPIs to Business Processes
Macro process: Procurement
Process: Spend analysis, Vendor Selection
Data Entity Scope: Vendor Master Data, Material Master Data
Define the data entities used within the process
Data quality requirements:
No duplicate vendors
Correct industry code for vendors
Correct placement in hierarchy (parent vendor)
Correct email address for vendors
Is the required data quality aspect meaningful to monitor? It may be better to improve data validation or perhaps problems are not experienced
Business Meta Data:
Business Meta Data is required to define the actual KPIs.
Ex: A vendor record is uniquely defined as an address of a vendor where we place orders, receive shipments from or…..
DEFINED KPIs: table with the columns KPI, Frq, Target and UoM (rows A, B, C)
Slide27
Tying Data Quality KPIs to Business Processes
Using a simple model like the one illustrated on the previous slide allows you to tie data quality KPIs to business processes and to business stakeholders
This relationship is critical for the success of the data quality monitoring initiative. Clearly illustrating how poor data quality impacts specific business processes is instrumental in getting executive support and business buy-in
When conducting data quality KPI interviews you may encounter KPI suggestions like “measure if there is a valid relationship between gross weight and product type”. Ask why this is important and which process this is important for
A particular data quality KPI may be important for multiple different processes. Document the relationship to all relevant processes
Slide28
Defining Data Quality KPIs
Data quality KPIs should express the important characteristics of quality of a particular data element
Typically units of measure are percentages, ratios, or number of occurrences
For consistency reasons, try to harmonize the measures. If for instance one measure is “number of customers without a postal code” while another is “percentage of customers with a valid VAT-no” a list of measures will look strange, since one measure should be as high as possible, and the other as low as possible
A good simple approach is to define all data quality KPI’s as percentages, with a 100% meaning all records meet the criteria behind this KPI
Be careful not to define too many measures, as this will just make the organizational implementation more difficult
Pay attention to controlling fields (like material type) that may determine rules like whether a specific attribute is required
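The two points about harmonized percentages and controlling fields can be sketched together as follows. This is a hypothetical Python example: the rule table, the material types, the record layout and the decision to treat unknown types as requiring a weight are assumptions, not part of any specific system.

```python
# Hypothetical rule table keyed by the controlling field (material type).
# True means a weight is required for that type; the types are illustrative.
WEIGHT_REQUIRED = {"Finished good": True, "Raw material": True, "Service": False}

# Hypothetical material records.
materials = [
    {"id": "M1", "type": "Finished good", "weight": 2.5},
    {"id": "M2", "type": "Finished good", "weight": None},
    {"id": "M3", "type": "Service", "weight": None},
    {"id": "M4", "type": "Raw material", "weight": 10.0},
]


def pct_meeting_rule(violations, total):
    """Turn a count of violating records into 'pct of records meeting the rule'."""
    if total <= 0 or not 0 <= violations <= total:
        raise ValueError("need total > 0 and 0 <= violations <= total")
    return round(100.0 * (total - violations) / total, 1)


def pct_weight_maintained(records):
    """Pct of materials with a weight, counting only types where weight is required."""
    # Assumption: types missing from the rule table are treated as requiring a weight.
    in_scope = [m for m in records if WEIGHT_REQUIRED.get(m["type"], True)]
    missing = [m for m in in_scope if not m["weight"]]
    return pct_meeting_rule(len(missing), len(in_scope))


# "Number of customers without a postal code" restated as a percentage,
# using made-up figures: 1,200 of 40,000 customers lack a postal code.
print(pct_meeting_rule(1200, 40000))
print(pct_weight_maintained(materials))
```

Restating every count as "pct of records meeting the rule" means that higher is always better, and the controlling field keeps service materials from being counted as incomplete.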
Slide29
Defining Hierarchies
Use hierarchical measures where possible, so that measures can be rolled up in regions and countries for instance
In the below example a KPI related to customer data is broken down in individual countries to allow detailed follow up
A concern here is that fields may be used differently in different countries. Given the below data insight, it might make sense to define a separate KPI for CA and perhaps ignore MX and US
KPI: Customer Fax number correctly formatted (Avg. Value: 25%)
Customers | Value | Recs
US Customers | 5% | 85,000
CA Customers | 43% | 38,000
MX Customers | 77% | 19,000
Data Insight
Fax numbers are not required for US customers since all communication is done via email.
Fax is the primary communication channel with Canadian customers.
Only some customers in Mexico have a fax machine.
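A roll-up of this kind can be sketched as a record-weighted average. The Python below is illustrative only; it assumes that the three values and record counts on the slide pair with US, CA and MX in the order listed, in which case the weighted result comes out close to the 25% average shown.

```python
# Figures from the slide; the pairing of value and record count to each
# country follows the listing order, which is an assumption.
by_country = {
    "US": {"records": 85_000, "pct_ok": 5.0},
    "CA": {"records": 38_000, "pct_ok": 43.0},
    "MX": {"records": 19_000, "pct_ok": 77.0},
}


def roll_up(nodes):
    """Record-weighted average of a percentage KPI across hierarchy nodes."""
    total = sum(node["records"] for node in nodes.values())
    ok = sum(node["records"] * node["pct_ok"] / 100.0 for node in nodes.values())
    return round(100.0 * ok / total, 1)


print("All countries:", roll_up(by_country))
print("CA only:", roll_up({"CA": by_country["CA"]}))
```

Weighting by record count matters here: a plain average of the three percentages would overstate the result because the US, with the most records, has the lowest value.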
Slide30
Defining KPI Thresholds
Along with each KPI two thresholds should be defined:
Lowest acceptable value
Without specifying the lowest acceptable value (or worst value), it’s difficult to know when to react
If the measure falls below this threshold action is required
Target value
Without target values, you don’t know when the quality is ok. Remember fit-for-purpose
Specifying a low and target threshold allows for traffic light reporting that provides an easy overview
Defining appropriate thresholds can be difficult as even a single product record with wrong dimensions may cause serious process impact. But without any indication of when to be alerted any form of automated monitoring is difficult
Target Value: 95 %
Lowest acceptable value: 80 %
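Traffic light reporting with these two thresholds might look like the sketch below. It is a minimal Python illustration; the function name, the colour labels and the choice to treat a value exactly at a threshold as meeting it are assumptions.

```python
def traffic_light(value, lowest_acceptable=80.0, target=95.0):
    """Map a percentage KPI value to a traffic light status.

    Defaults use the example thresholds from the slide (80 % and 95 %).
    """
    if lowest_acceptable > target:
        raise ValueError("lowest acceptable value must not exceed the target value")
    if value < lowest_acceptable:
        return "red"      # below the lowest acceptable value: action is required
    if value < target:
        return "yellow"   # acceptable, but the target has not been reached
    return "green"        # target reached


for kpi_value in (72.0, 88.5, 96.0):
    print(kpi_value, traffic_light(kpi_value))
```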
Slide31
Indirect Measures
Consider critical fields (e.g. weight of a product or customer type) where the correct value is of utmost importance, but it’s close to impossible to define rules that check whether a newly entered value is correct.
One approach is to measure indirectly, for instance by reporting which users have changed these values for which products over the last 24 hours, week or whatever period is appropriate in your organization
Slide32
Cross field KPIs and Process KPIs
Common KPIs that are not related to a single field
Number of new customer records created this week
Average time from request to completion of a new material record
Number of materials with a non-unique description (or pct. of materials with a unique description)
Number of vendors, where a different payment is defined in different purchasing organizations
Number of open sales orders referring to an inactive customer
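Two of these KPIs can be sketched in a few lines. The Python below is a hypothetical illustration: the record layouts, the identifiers, the decision to count orders for unknown customers as violations, and the case-insensitive, whitespace-normalized comparison of descriptions are all assumptions.

```python
from collections import Counter

# Hypothetical master and transaction data.
customers = {"C1": {"active": True}, "C2": {"active": False}}
sales_orders = [
    {"order": "SO-1", "customer": "C1", "status": "open"},
    {"order": "SO-2", "customer": "C2", "status": "open"},
    {"order": "SO-3", "customer": "C2", "status": "closed"},
    {"order": "SO-4", "customer": "C9", "status": "open"},  # customer not in master data
]
materials = [
    {"id": "M1", "description": "Feta w/olives 60G"},
    {"id": "M2", "description": "FETA W/OLIVES  60G "},
    {"id": "M3", "description": "Yoghurt peach/banana 1000ML"},
]


def open_orders_with_inactive_customer(orders, customer_master):
    """Open sales orders that refer to an inactive (or unknown) customer."""
    return [
        o["order"]
        for o in orders
        if o["status"] == "open"
        and not customer_master.get(o["customer"], {}).get("active", False)
    ]


def normalized(description):
    """Upper-case and collapse whitespace so trivial variants compare equal."""
    return " ".join(description.upper().split())


def materials_with_non_unique_description(records):
    """Materials whose normalized description is shared with another material."""
    counts = Counter(normalized(m["description"]) for m in records)
    return [m["id"] for m in records if counts[normalized(m["description"])] > 1]


print(open_orders_with_inactive_customer(sales_orders, customers))
print(materials_with_non_unique_description(materials))
```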
Slide33
Think Prevention!
Every possible business rule related to completeness, integrity, consistency and validity should be enforced by the system at the time of data entry.
If it isn’t, consider implementing a data input validation rule rather than allowing bad data to be entered and then measuring it!
However, there are cases where the business logic of a field is too ambiguous to be enforced by a simple input validation rule.
Process (workflow) adjustments may also be the answer.
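An input validation rule of the kind suggested here could be sketched as follows. The Python is illustrative only: the field names, the rules and the simple email pattern are assumptions. Note that State is required only for US addresses rather than regardless of country.

```python
import re

# Deliberately loose pattern: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate_customer(record):
    """Return a list of rule violations; an empty list means the record may be saved."""
    errors = []
    if not (record.get("name") or "").strip():
        errors.append("name is required")
    # Conditional rule: State is only required where it is meaningful.
    if record.get("country") == "US" and not (record.get("state") or "").strip():
        errors.append("state is required for US addresses")
    email = record.get("email") or ""
    if email and not EMAIL_PATTERN.fullmatch(email):
        errors.append("email is not well-formed")
    return errors


print(validate_customer({"name": "Bob", "country": "DK", "email": "bob@mail.com"}))
print(validate_customer({"name": "Pete", "country": "US", "email": "Tel: 11223344"}))
```

Rejecting the second record at entry time prevents an overloaded email field from ever reaching the KPI report.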
Slide34
Documentation of KPIs
KPI Name:
A meaningful name of the KPI that expresses what is being measured
Objective:
Why do you measure this? What business processes are impacted if the data is not ok?
Dimensions:
What data quality dimensions (integrity, validity, etc.) is this KPI related to?
Frequency of measure:
How often do you wish to report on this KPI? Daily, weekly or monthly?
Unit of measure:
What is the unit of the KPI? Number of records, pct of records, number of bad values, etc.?
Lowest acceptable measure:
Threshold that indicates if the data quality aspect the KPI represents is at a minimal acceptable level. The value here must be in the unit of measure of the KPI.
Target value:
At what value is the KPI considered to represent data quality at a high level?
Responsible:
The person responsible for the particular KPI.
Formula:
The tables and fields that are used to analyze and calculate the KPI. This is the functional design formula that forms the basis for the technical implementation.
Hierarchies:
When reporting on a KPI it is very useful to be able to slice and dice the measure according to different dimensions or hierarchies. For a customer data KPI for instance, good hierarchies would be regions, country, company code and account group.
Being able to view the KPI through a hierarchy also makes it easier to follow up with specific groups of business users.
Notes and assumptions:
If certain assumptions are made about the KPI, make sure to document them here
Slide35
Remember!
Quality is in the Eye of the beholder!
Data quality is defined by our Information Customers
Data is not always clean or dirty in itself – it may depend on the viewpoint and a defined standard
Focus on what’s important to those that use the data
Slide36
Monitoring Process
A simple example
[Flow diagram: Publish KPI, Analyze KPIs, then the decision “Low value in KPI?” (Y/N), with the steps Evaluate root cause, Plan corrective actions and Implement Improvements]
Slide37
Monitor the Quality of your Master Data
Thomas Ravn
Practice Director, MDM
E: tra@platon.net
M: +1 646-400-2862
PLATON US INC.
5 PENN PLAZA, 23rd Floor
NEW YORK NY 10001
www.platon.net