Slide1
Apr 2017
SDMX Information Model
Rafik Mahjoubi, Kamel ABDELLAOUI
Rafik.MAHJOUBI@oecd.org, abdellaoui.kamel@ins.tn
Slide2
WHAT SDMX IS
This is what SDMX provides and enables:
A model to describe statistical data and metadata and to guide how content is structured
A standard for automated machine-to-machine communication
A technology supporting standardised IT tools
To take advantage of all this:
Statisticians must use a common description for data and metadata
The data exchange process is then driven by that common description
Data descriptions are made available to everybody who wants to understand and reuse the data
Slide3
SDMX governance
Sponsor Organizations (Chief Statisticians)
Secretariat (senior executive officers)
SDMX Technical Working Group (SDMX TWG): Technical Standards
SDMX Statistical Working Group (SDMX SWG): Guidelines and Cross-Domain Artefacts
Slide4
The SDMX Components
...not just a transmission format!
Describe statistics in a standard way
Objects and their relationships: Data Structure Definition (DSD), Concepts, Code List
Central management and standard access: SDMX Registry, SDMX Web Services
Cross-Domain Concepts, Cross-Domain Code Lists, Statistical Domains, SDMX Glossary
Push: the provider generates a file and sends it to the receiver
Pull: the provider opens a web service to the data; the receiver downloads regularly
Hub: a special case of pull in which the receiver downloads on end-user request
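The three exchange modes can be sketched in a few lines of Python. This is only an in-memory illustration: the `Provider` and `Receiver` classes, the function names and the sample observations are all hypothetical, and no real SDMX tool is involved.

```python
from typing import Dict, List

Dataset = List[Dict[str, str]]


class Provider:
    """Hypothetical data provider holding one dataset in memory."""

    def __init__(self, data: Dataset) -> None:
        self._data = data

    def generate_file(self) -> Dataset:
        # Push: the provider itself produces the file to be sent.
        return list(self._data)

    def web_service(self) -> Dataset:
        # Pull and hub: the provider only exposes the data.
        return list(self._data)


class Receiver:
    """Hypothetical receiver that keeps a local copy of what it gets."""

    def __init__(self) -> None:
        self.store: Dataset = []

    def accept(self, dataset: Dataset) -> None:
        self.store = dataset


def push(provider: Provider, receiver: Receiver) -> None:
    # Initiated by the provider.
    receiver.accept(provider.generate_file())


def pull(provider: Provider, receiver: Receiver) -> None:
    # Initiated by the receiver, typically on a regular schedule.
    receiver.accept(provider.web_service())


def hub_request(provider: Provider, ref_area: str) -> Dataset:
    # Hub: nothing is stored; data is fetched when an end user asks.
    return [obs for obs in provider.web_service() if obs["REF_AREA"] == ref_area]


provider = Provider([
    {"REF_AREA": "TN", "OBS_VALUE": "10"},
    {"REF_AREA": "FR", "OBS_VALUE": "20"},
])
pushed, pulled = Receiver(), Receiver()
push(provider, pushed)
pull(provider, pulled)
answer = hub_request(provider, "TN")
```

In this sketch push and pull end with the same local copy; they differ only in who initiates the transfer, while the hub keeps no copy at all.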
Slide5
THE INFORMATION MODEL
Slide6
The Information model:
An information model is a representation of concepts, relationships, constraints, rules and operations.
Slide7
What things does SDMX need to model?
Statistical data, through descriptor concepts. These concepts can be further classified into dimensions, attributes and measures
Metadata
Structural metadata
Reference metadata
Slide8
SDMX provides a way of modelling statistical data, structural metadata and the data exchange process.
SDMX also defines a model for additional explanatory metadata, so-called reference metadata, which is generally in a textual format.
Slide9
Slide10
Describe the Structure of a Table
Table elements, each labelled (Concept): Unit Multiplier, Unit, Topic, Time/Frequency, Country, Measure type, Observation
Slide11
Roles of concepts
Comprises:
Dimensions: concepts that identify the observation value
Attributes: concepts that add additional metadata about the observation value
Measure: the concept that is the observation value
Representation: any of these may be coded, text, date/time, number, etc.
Slide12
Data Structure Definition: Concept Usage
Dimensions: Topic, Time, Frequency, Country, Stock/Flow
Attributes: Unit Multiplier, Unit
Measure: Observation
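The concept roles above can be sketched as a small data structure. This is a minimal illustration, not the SDMX information model itself: the class names, the concept identifiers (`TOPIC`, `STOCK_FLOW`, and so on), the DSD id and the sample values are all assumptions, and the role assignment follows the usual reading of the slide (units as attributes, the observation as the measure).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Role(Enum):
    DIMENSION = "dimension"  # identifies the observation value
    ATTRIBUTE = "attribute"  # adds metadata about the observation value
    MEASURE = "measure"      # is the observation value


@dataclass(frozen=True)
class Component:
    concept_id: str
    role: Role


@dataclass
class DataStructureDefinition:
    id: str
    components: List[Component]

    def ids_with_role(self, role: Role) -> List[str]:
        return [c.concept_id for c in self.components if c.role is role]

    def observation_key(self, observation: Dict[str, str]) -> str:
        # Only dimensions take part in identifying an observation.
        dims = self.ids_with_role(Role.DIMENSION)
        return ".".join(observation[d] for d in dims)


dsd = DataStructureDefinition(
    id="DSD_EXAMPLE",  # hypothetical identifier
    components=[
        Component("TOPIC", Role.DIMENSION),
        Component("FREQ", Role.DIMENSION),
        Component("TIME", Role.DIMENSION),
        Component("COUNTRY", Role.DIMENSION),
        Component("STOCK_FLOW", Role.DIMENSION),
        Component("UNIT", Role.ATTRIBUTE),
        Component("UNIT_MULT", Role.ATTRIBUTE),
        Component("OBS_VALUE", Role.MEASURE),
    ],
)

# Hypothetical observation; the values are made up for illustration.
obs = {
    "TOPIC": "GDP", "FREQ": "A", "TIME": "2016", "COUNTRY": "TN",
    "STOCK_FLOW": "F", "UNIT": "USD", "UNIT_MULT": "6", "OBS_VALUE": "42.1",
}
key = dsd.observation_key(obs)
```

Two observations with the same key but different `UNIT` values would still refer to the same data point, which is why units sit on the attribute side in this sketch.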
Slide13
Identify Concepts
Source: FAO proof of concept project (2007)
Measurement = 1,000 Kg
Slide14
Concepts
Reference Area
Commodity
Frequency and Time
Observation Value
Measure Type
Unit and Unit Multiplier
Measurement = 1,000 Kg
Slide15
Concept Roles
Reference Area (Dimension)
Commodity (Dimension)
Frequency and Time (Dimensions)
Observation Value (Measure)
Measure Type (Dimension)
Unit and Unit Multiplier (Attributes)
Measurement = 1,000 Kg
Slide16
Identify/Define Code Lists
Purpose of a Code List:
List the allowed items for concepts
Define a computer-readable representation
Allow labels to be defined in multiple languages
Agreeing on harmonised code lists is probably the most difficult aspect of defining a data structure definition
Slide17
Example of Codelist
Each Code List is uniquely defined by an ID, a maintenance agency and a version. Its name and description can be provided in several languages.
Each item is defined by an ID. Its name and description can be provided in several languages.
Slide18
A Code List and the items
Commodity code list: CL_COMMODITY
IMTS.CL_COMMODITY(1.0)
HS92-12, SITC 1-4, BEC
Code Id        Code Name
_X             Not specified
HS02_01        LIVE ANIMALS
HS02_0101      Live horses, asses, mules and hinnies
HS02_010110    Pure-bred breeding horses and asses
HS02_010190    Live horses, asses, mules and hinnies (excl. pure-bred for breeding)
HS02_0101XX    Live horses, asses, mules and hinnies // Confidential Item
HS02_0102      Live bovine animals
HS02_010210    Pure-bred breeding bovines
HS02_010290    Live bovine animals (excl. pure-bred for breeding)
HS02_0102XX    Live bovine animals // Confidential Item
HS02_0103      Live swine
HS02_010310    Pure-bred breeding swine
HS02_010391    Live pure-bred swine, weighing < 50 kg (excl. pure-bred for breeding)
HS02_010392    Live pure-bred swine, weighing >= 50 kg (excl. pure-bred for breeding)
HS02_0103XX    Live swine // Confidential Item
HS02_0104      Live sheep and goats
HS02_010410    Live sheep
HS02_010420    Live goats
HS02_0104XX    Live sheep and goats // Confidential Item
HS02_0105      Live poultry, "fowls of the species Gallus domesticus, ducks, geese, turkeys and guinea fowls"
HS02_010511    Live fowls of the species Gallus domesticus, weighing <= 185 g (excl. turkeys and guinea fowls)
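A code list identified by an ID, a maintenance agency and a version, with multilingual item names, could be modelled roughly as follows. The class and method names are assumptions made for this sketch, and the French label is an illustrative translation that does not come from the slide; only the identifiers and English names are taken from it.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Code:
    id: str
    names: Dict[str, str]  # language tag -> name


@dataclass
class CodeList:
    id: str
    agency: str
    version: str
    names: Dict[str, str]
    codes: Dict[str, Code] = field(default_factory=dict)

    @property
    def reference(self) -> str:
        # Mirrors the "IMTS.CL_COMMODITY(1.0)" label shown on the slide.
        return f"{self.agency}.{self.id}({self.version})"

    def add(self, code_id: str, **names: str) -> None:
        self.codes[code_id] = Code(code_id, dict(names))

    def label(self, code_id: str, lang: str = "en") -> Optional[str]:
        # Returns None for unknown codes or missing translations.
        code = self.codes.get(code_id)
        return code.names.get(lang) if code else None


cl = CodeList(
    id="CL_COMMODITY",
    agency="IMTS",
    version="1.0",
    names={"en": "Commodity code list"},
)
cl.add("_X", en="Not specified", fr="Non spécifié")  # "fr" label is illustrative
cl.add("HS02_01", en="LIVE ANIMALS")
cl.add("HS02_0101", en="Live horses, asses, mules and hinnies")
```

Looking up `cl.label("HS02_01")` gives the English name, while an unknown code gives `None`, which is the hook a validator would use to reject values outside the list.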
Slide19
Identifying concepts
Slide20
[Diagram: Domain 1 and Domain 2 each have a set of used concepts; both draw on the cross-domain concepts and code lists (FREQ, REF. AREA), which gives COMPARABILITY]
Slide21
SDMX Cross Domain Code Lists
Slide22
Full Data Structure Definition
Slide23
Data exchange: how?
[Diagram: a Sender passes Dataset1 (data and structure) to a Receiver]
Slide24
Data exchange: the SDMX way
[Diagram: the Sender turns Dataset1 into an SDMX dataset, described by an SDMX DSD, and passes it to the Receiver]
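One practical consequence of exchanging data together with a shared structure is that either side can check a dataset mechanically. The sketch below assumes a very reduced DSD (dimension names mapped to their allowed codes); the dimension names, codes and the `validate` function are hypothetical and only illustrate the idea.

```python
from typing import Dict, List, Set

# Hypothetical, reduced DSD: each dimension maps to the codes it allows.
DSD: Dict[str, Set[str]] = {
    "FREQ": {"A", "Q", "M"},
    "REF_AREA": {"TN", "FR"},
    "COMMODITY": {"HS02_01", "HS02_0101"},
}


def validate(observation: Dict[str, str], dsd: Dict[str, Set[str]]) -> List[str]:
    """Return a list of problems; an empty list means the observation conforms."""
    errors: List[str] = []
    for dim, allowed in dsd.items():
        value = observation.get(dim)
        if value is None:
            errors.append(f"missing dimension {dim}")
        elif value not in allowed:
            errors.append(f"{dim}={value!r} is not in the code list")
    return errors


good = {"FREQ": "A", "REF_AREA": "TN", "COMMODITY": "HS02_01", "OBS_VALUE": "12"}
bad = {"FREQ": "Y", "REF_AREA": "TN"}

good_errors = validate(good, DSD)
bad_errors = validate(bad, DSD)
```

Because sender and receiver refer to the same structure, the same check can run before sending and again on receipt.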
Slide25
Data, Structural metadata and Reference metadata
Data: grouped into a Data set, described by a DSD
Reference Metadata: grouped into a Metadata set, described by an MSD
DSD and MSD: Structural Metadata
Slide26
THE CONTENT-ORIENTED GUIDELINES
Slide27
Content-oriented guidelines
The content-oriented guidelines are a set of recommendations within the scope of the SDMX standard, intended to maximise interoperability.
Slide28
There are three main areas of these content-oriented guidelines:
Cross-domain concepts (and code lists).
Statistical subject-matter domains.
SDMX Glossary.
Slide29
Cross-domain concepts
They are a list of statistical concepts related to statistical processes and data quality.
The list is based on the concepts used by the contributing international organisations.
The concepts can be used on the data side as well as on the metadata side.
Slide30
Examples of cross-domain concepts
Slide31
Examples of cross-domain concepts
Slide32
A cross-domain concept may have a code list as its representation. This means that the concept can take only a limited set of values, enumerated in its corresponding code list.
The code lists associated with cross-domain concepts are called cross-domain code lists.
Slide33
Code lists have a general description, a list of codes with their descriptions, and annotations that provide additional information on the codes.
Examples of cross-domain concepts and code lists:
FREQ and its associated code list CL_FREQ.
SEX and its associated code list CL_SEX.
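The link between shared code lists and comparability can be shown with a short sketch. The code lists below are small illustrative subsets of CL_FREQ and CL_SEX, not the full lists; the three domain structures, `CL_EDU_LEVEL` and the `comparable_concepts` function are hypothetical.

```python
from typing import Dict, List

CodeList = Dict[str, str]

# Illustrative subsets of two cross-domain code lists.
CL_FREQ: CodeList = {"A": "Annual", "Q": "Quarterly", "M": "Monthly"}
CL_SEX: CodeList = {"F": "Female", "M": "Male", "_T": "Total"}

# Hypothetical domain-specific code list.
CL_EDU_LEVEL: CodeList = {"L1": "Primary", "L2": "Secondary"}

# Hypothetical domains: each concept points at the code list it uses.
labour = {"FREQ": CL_FREQ, "SEX": CL_SEX}
education = {"FREQ": CL_FREQ, "SEX": CL_SEX, "EDU_LEVEL": CL_EDU_LEVEL}
prices = {"FREQ": CL_FREQ}


def comparable_concepts(a: Dict[str, CodeList], b: Dict[str, CodeList]) -> List[str]:
    # Two domains are comparable on a concept when both use the very same code list.
    return sorted(c for c in a.keys() & b.keys() if a[c] is b[c])


labour_vs_education = comparable_concepts(labour, education)
education_vs_prices = comparable_concepts(education, prices)
```

Here the code "M" means "Monthly" in CL_FREQ and "Male" in CL_SEX, which is why a code is only meaningful together with the code list it belongs to.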
Slide34
Slide35
Slide36
Statistical subject-matter domains
Statistical subject-matter domains are a high-level classification of statistical areas.
They refer to statistical activities that have common characteristics with respect to variables, concepts and methodologies for data collection.
Examples: price statistics, national accounts, environment statistics or education statistics.
The classification is intended to cover the universe of official statistics.
Slide37
Statistical subject-matter domains
Based on the UNECE Classification of International Statistical Activities
Slide38
Functions of the classification of statistical domains:
A standard against which domain lists of national and international organisations can be mapped to facilitate the exchange of data and metadata.
Provides an identifier for registering and searching statistical data on SDMX registries.
Navigation aid for the identification and organisation of corresponding domain groups.
Slide39
SDMX Glossary
The Glossary is a vocabulary that recommends a common terminology to be used in order to facilitate communication and understanding.
The Glossary is closely linked to the cross-domain concepts, as it also contains all these concepts, stating their definitions and context descriptions.
Slide40
SDMX Glossary
The Glossary covers a selected range of metadata concepts:
General metadata concepts.
Metadata terms describing statistical methodologies and data quality.
Terms referring specifically to data and metadata exchange.
Slide41
Examples of the SDMX Glossary
Slide42
Examples of the SDMX Glossary
Slide43
IT Architecture for data exchange
Slide44
Standard formats for the exchange of data and metadata: SDMX-EDI, SDMX-ML
Architectures for data exchange: Push, Pull, Data-hub
SDMX Tools
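For the pull and data-hub architectures, a receiver typically builds a query against the provider's web service. The sketch below assumes the provider exposes an SDMX 2.1-style RESTful data endpoint, where the key is the dot-separated list of dimension values in DSD order, an empty position means "all values" and `+` joins alternatives. The base URL, the dataflow name `IMTS`, the dimension order and the function name are hypothetical.

```python
from typing import Dict, List, Optional
from urllib.parse import urlencode


def build_data_url(
    base_url: str,
    flow: str,
    dimension_order: List[str],
    selection: Dict[str, List[str]],
    start_period: Optional[str] = None,
    end_period: Optional[str] = None,
) -> str:
    # One position per dimension; unselected dimensions stay empty (wildcard).
    key = ".".join("+".join(selection.get(dim, [])) for dim in dimension_order)
    url = f"{base_url.rstrip('/')}/data/{flow}/{key}"
    params: Dict[str, str] = {}
    if start_period:
        params["startPeriod"] = start_period
    if end_period:
        params["endPeriod"] = end_period
    return f"{url}?{urlencode(params)}" if params else url


url = build_data_url(
    base_url="https://example.org/sdmx/rest",  # hypothetical entry point
    flow="IMTS",                               # hypothetical dataflow
    dimension_order=["FREQ", "REF_AREA", "COMMODITY"],
    selection={"FREQ": ["A"], "COMMODITY": ["HS02_0101", "HS02_0102"]},
    start_period="2010",
    end_period="2016",
)
```

The resulting URL asks for annual data, for every reference area, for the two selected commodity codes between 2010 and 2016; the response format (for example SDMX-ML) is negotiated separately and is outside this sketch.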
Slide45
http://www.sdmx.org
Slide46
http://ec.europa.eu/eurostat/web/sdmx-infospace/trainings-tutorials/tutorials
Slide47
References
SDMX web site: www.sdmx.org
Eurostat Info Space: https://webgate.ec.europa.eu/fpfis/mwikis/sdmx/index.php/Main_Page
ISTAT Training Courses: