Difference between revisions of "ISO 19157:2013 Geographic information - Data quality"

From ICA Wiki
=== Data quality (ISO 19157:2013) ===
'''Overview of ISO 19157:2013'''

{| class="wikitable sortable"
|-
| Full name
| ISO 19157:2013, Geographic information – Data quality
|-
| Version
| Edition 1
|-
| Amendments
| None
|-
| Corrigenda
| None
|-
| Published by
| ISO/TC 211
|-
| Languages
| English, French
|-
| Online overview
| https://www.iso.org/obp/ui/#iso:std:iso:19157:ed-1:v1:en
|-
| Type of standard
| ISO International Standard<br/>Meta level
|-
| Related standard(s)
| ISO 19115-1:2013, Geographic information – Metadata – Part 1: Fundamentals<br/>ISO 19115-2:2009, Geographic information – Metadata – Part 2: Extensions for imagery and gridded data<br/>ISO 19158:2012, Geographic information – Quality assurance of data supply
|-
| Application
| The standard specifies the description, evaluation and reporting of the quality of geographic data.
|-
| Conformance classes
| Data quality evaluation process<br/>Data quality metadata<br/>Standalone quality report<br/>Data quality measure
|}
 
===== Scope =====
 
 
 
 
ISO 19157:2013 establishes the principles for describing the quality of geographic data. It defines components for describing data quality; specifies the components and content structure of a register for data quality measures; describes general procedures for evaluating the quality of geographic data; and establishes principles for reporting data quality.
  
 
The standard also defines a set of data quality measures for use in evaluating and reporting data quality. It is applicable to data producers providing quality information to describe and assess how well a dataset conforms to its product specification and to data users attempting to determine whether or not specific geographic data are of sufficient quality for their particular application.
 
  
The standard does not attempt to define minimum acceptable levels of quality for geographic data.
 
 
===== Implementation benefits =====
 
ISO 19157:2013 provides a standard way for describing the quality of geographic data. Such descriptions are useful when a producer has to evaluate how well a dataset meets the criteria described in its product specification. For example, if the producer outsourced the acquisition of the data, ISO 19157:2013 could be used to evaluate and describe the quality of the received data during acceptance testing.  
 
  
 
Geographic data are increasingly shared and exchanged. As a result, they are often used for purposes that differ from the purpose for which they were originally captured. Complete descriptions of the quality of a dataset encourage and facilitate the sharing, interchange and use of appropriate datasets.
  
Another benefit of implementing ISO 19157:2013 is that the quality information can assist a user who has to decide whether a specific dataset is appropriate for an intended use or application. If the user has to choose between two or more datasets, standardized quality descriptions make the datasets easier to compare: quality reports are expressed in a comparable way and there is a common understanding of the quality measures that have been used. A project to develop an XML schema implementation of ISO 19157:2013 has begun.
  
===== Implementation guidelines =====
 
ISO 19157:2013 cancels and replaces ISO/TS 19138:2006, ISO 19114:2003 and ISO 19113:2002. According to ISO 19157:2013, data quality comprises six elements: completeness, thematic accuracy, logical consistency, temporal quality, positional accuracy and usability. Each element comprises a number of sub-elements, for example completeness (commission and omission) and logical consistency (conceptual, domain, format and topological). These elements are used to describe data quality, i.e. how well a specific dataset meets the criteria for the different elements set forth in its product specification or user requirements. Evaluation against the criteria is done either quantitatively or subjectively (non-quantitatively). The latter applies if a detailed data product specification does not exist or if the data product specification lacks quantitative measures and descriptors. Three metaquality elements (confidence, representativity and homogeneity) provide quantitative and qualitative statements about the evaluation against the criteria and its result.
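
The element structure described above can be sketched as a small lookup table. This is only an illustration, not part of the standard: the names <code>Element</code>, <code>SUB_ELEMENTS</code>, <code>METAQUALITY_ELEMENTS</code> and <code>label</code> are hypothetical, and only the sub-elements named in the text are listed (the standard defines more).

```python
from enum import Enum


class Element(Enum):
    """The six data quality elements named in the text."""
    COMPLETENESS = "completeness"
    THEMATIC_ACCURACY = "thematic accuracy"
    LOGICAL_CONSISTENCY = "logical consistency"
    TEMPORAL_QUALITY = "temporal quality"
    POSITIONAL_ACCURACY = "positional accuracy"
    USABILITY = "usability"


# Hypothetical lookup table: deliberately incomplete, it holds only the
# sub-elements that the text above gives as examples.
SUB_ELEMENTS = {
    Element.COMPLETENESS: ("commission", "omission"),
    Element.LOGICAL_CONSISTENCY: ("conceptual", "domain", "format", "topological"),
}

# The three metaquality elements named in the text.
METAQUALITY_ELEMENTS = ("confidence", "representativity", "homogeneity")


def label(element, sub_element):
    """Return a display label such as 'completeness (omission)'."""
    listed = SUB_ELEMENTS.get(element, ())
    if sub_element not in listed:
        raise ValueError(f"{sub_element!r} is not listed for {element.value}")
    return f"{element.value} ({sub_element})"


print(label(Element.COMPLETENESS, "omission"))
```

Keeping the elements in an enumeration makes it harder to report a result against a misspelled or non-existent element.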
  
  
 
ISO 19157:2013 specifies four conformance classes, i.e. the standard can be implemented for four different quality aspects of geo-spatial datasets, each briefly described below.  
 
''1. Implementing a data quality evaluation process conforming to ISO 19157:2013''

A data quality evaluation process conforming to ISO 19157:2013 comprises four steps:

* Step 1 - Specify the data quality units to be evaluated. Study the data product specification to identify applicable data quality units and their scope. For each data quality unit, identify the applicable data quality element(s). See the table "Example: Data quality units".
* Step 2 - Specify the data quality measures to be used to describe the quality of each data quality element of a data quality unit. The requirements in the data product specification provide guidance on applicable data quality measures. See the table "Example: Data quality measures". The measures in that table are from the list of standardized data quality measures in ISO 19157:2013. User-defined quality measures can also be described (see further below) and maintained in a catalogue or register.
* Step 3 - Specify the data quality evaluation procedures, i.e. the evaluation method(s) to be applied. The method can be direct (based on inspection of the items in the dataset) or indirect (based on external knowledge, such as lineage metadata). Direct evaluation is further classified by the source against which the evaluation is done: internal if only the data in the dataset are evaluated, or external if external reference data are used (e.g. satellite imagery or ground truth). ISO 19157:2013 includes guidance on how to sample data for evaluation.
* Step 4 - Determine the output of the data quality evaluation, i.e. perform the data quality evaluation specified in Steps 1-3 above. Additional results may be produced by aggregating or by deriving from existing results without carrying out a new evaluation. How to report the results of the data quality evaluation is described elsewhere in this chapter.

'''Example: Data quality units'''

{| class="wikitable"
|-
| '''Data quality unit'''
| '''Scope'''
| '''Data quality elements'''
|-
| Topographic dataset
| All features in the dataset
| Completeness (commission and omission), thematic accuracy (correct classification)
|-
| Street network
| Street features in the entire dataset
| Logical consistency (topological consistency)
|}

'''Example: Data quality measures'''

{| class="wikitable"
|-
| '''Data quality unit'''
| '''Data quality element'''
| '''Data quality measure'''
| '''Method'''
|-
| Topographic dataset
| Completeness (commission)
| Measure 1: Excess item<br/>Measure 2: Number of excess items<br/>Measure 3: Number of duplicate feature instances
| Direct external<br/>Direct external<br/>Direct internal
|-
| Topographic dataset
| Completeness (omission)
| Measure 1: Missing item<br/>Measure 2: Number of missing items
| Direct external<br/>Direct external
|-
| Topographic dataset
| Thematic accuracy (correct classification)
| Measure 1: Number of incorrectly classified features<br/>Measure 2: Misclassification rate
| Direct external<br/>Direct external
|-
| Street network
| Logical consistency (topological consistency)
| Measure 1: Number of missing connections due to undershoots<br/>Measure 2: Number of missing connections due to overshoots<br/>Measure 3: Number of invalid self-intersect errors<br/>Measure 4: Number of invalid self-overlap errors
| Direct internal<br/>Direct internal<br/>Direct internal<br/>Direct internal
|}
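
As a rough illustration of the completeness measures in the example above, the following sketch counts excess, missing and duplicate items. It assumes that features carry identifiers that can be matched against an external reference list. The function name <code>completeness_measures</code>, its parameters and the sample identifiers are hypothetical, and matching by identifier is a simplification: a real evaluation might also need to compare geometry or attributes.

```python
from collections import Counter


def completeness_measures(dataset_ids, reference_ids):
    """Count excess, missing and duplicate items for one data quality unit.

    dataset_ids: identifiers of the features in the dataset being evaluated.
    reference_ids: identifiers from the external reference data
    (an assumption of this sketch; e.g. derived from ground truth).
    """
    counts = Counter(dataset_ids)
    present = set(counts)
    reference = set(reference_ids)
    return {
        # Direct external: items in the dataset that the reference lacks.
        "number_of_excess_items": len(present - reference),
        # Direct external: reference items that are absent from the dataset.
        "number_of_missing_items": len(reference - present),
        # Direct internal: extra copies of an identifier within the dataset.
        "number_of_duplicate_feature_instances": sum(n - 1 for n in counts.values()),
    }


result = completeness_measures(
    dataset_ids=["a", "b", "b", "c", "x"],
    reference_ids=["a", "b", "c", "d"],
)
print(result)
```

In this made-up sample, "x" is an excess item, "d" is a missing item and "b" occurs twice, so each count is 1. The two external counts need the reference list, while the duplicate count uses only the dataset itself, which mirrors the direct external and direct internal methods in the table.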

''2. Implementing data quality metadata conforming to ISO 19157:2013''

Data quality metadata describes the quality of geographic data. ISO 19157:2013 specifies a conceptual model of the components to be used when describing the quality of geographic data. The figure "Overview of the components to be used to describe data quality" provides an overview of the components and their relationships to each other. A data dictionary, including definitions for all the components, is provided in the standard. Data quality metadata conforming to ISO 19157:2013 conforms to this conceptual model and is reported in conformance with ISO 19115:2003 and ISO 19115-2:2009.

'''Overview of the components to be used to describe data quality (Source: ISO 19157:2013)'''

''3. Implementing data quality reports conforming to ISO 19157:2013''

The first (and obvious) requirement is that the quality report comprises quality metadata conforming to ISO 19157:2013 (see 2. above), i.e. it includes sections on all appropriate aspects of quality and the description of components follows the rules defined in the standard. Additional information can be added to the report, but the structure of the report is not prescribed. The table "Example: Section of a data quality report" shows a section of a data quality report for the quality evaluation process described above.

'''Example: Section of a data quality report'''

{| style="border-spacing:0;"
 +
| style="border-top:0.0069in solid #00000a;border-bottom:0.0069in solid #00000a;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| '''Data quality unit'''
 +
| style="border-top:0.0069in solid #00000a;border-bottom:0.0069in solid #00000a;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| '''Data quality element'''
 +
| style="border-top:0.0069in solid #00000a;border-bottom:0.0069in solid #00000a;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| '''Data quality measure'''
 +
| style="border-top:0.0069in solid #00000a;border-bottom:0.0069in solid #00000a;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| '''Result'''
 +
 +
|-
 +
| style="border-top:0.0069in solid #00000a;border-bottom:0.0069in solid #00000a;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Topographic dataset
 +
| style="border-top:0.0069in solid #00000a;border-bottom:0.0069in solid #00000a;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Completeness (commission)
 +
| style="border-top:0.0069in solid #00000a;border-bottom:none;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Measure 2: Number of excess items
 +
| style="border-top:0.0069in solid #00000a;border-bottom:none;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| <div align="right">1,036</div>
 +
 +
|-
 +
| style="border:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Measure 3: Number of duplicate feature instances
 +
| style="border:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| <div align="right">153</div>
 +
 +
|-
 +
| style="border-top:0.0069in solid #00000a;border-bottom:0.0069in solid #00000a;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Topographic dataset
 +
| style="border-top:0.0069in solid #00000a;border-bottom:0.0069in solid #00000a;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Completeness (omission)
 +
| style="border-top:0.0069in solid #00000a;border-bottom:none;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Measure 2: Number of missing items
 +
| style="border-top:0.0069in solid #00000a;border-bottom:none;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| <div align="right">697</div>
  
{| class="wikitable"
 
|+Example: Data quality measures
 
 
|-
 
|-
|'''Data quality unit'''
+
| style="border-top:0.0069in solid #00000a;border-bottom:0.0069in solid #00000a;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Topographic dataset
|'''Data quality elements'''
+
| style="border-top:0.0069in solid #00000a;border-bottom:0.0069in solid #00000a;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Thematic accuracy (correct classification)
|'''Data quality measure'''
+
| style="border-top:0.0069in solid #00000a;border-bottom:none;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Measure 1: Number of incorrectly classified features
|'''Method'''
+
| style="border-top:0.0069in solid #00000a;border-bottom:none;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| <div align="right">8,774 </div>
 +
 
 +
|-
 +
| style="border-top:none;border-bottom:0.0069in solid #00000a;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Measure 2: Misclassification rate
 +
| style="border-top:none;border-bottom:0.0069in solid #00000a;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| <div align="right">10%</div>
 +
 
 
|-
 
|-
|Topographic dataset
+
| style="border-top:0.0069in solid #00000a;border-bottom:0.0069in solid #00000a;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Street network
|Completeness (commission)
+
| style="border-top:0.0069in solid #00000a;border-bottom:0.0069in solid #00000a;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Logical inconsistency (topological inconsistency)
|Measure 1: Excess item<br/>Measure 2: Number of excess items <br/>Measure 3: Number of duplicate feature instances
+
| style="border-top:0.0069in solid #00000a;border-bottom:none;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Measure 1: Number of missing connections due to undershoots
|Direct external<br/>Direct external <br/>Direct internal
+
| style="border-top:0.0069in solid #00000a;border-bottom:none;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| <div align="right">139</div>
 +
 
 
|-
 
|-
|Topographic dataset
+
| style="border:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Measure 2: Number of missing connections due to overshoots
|Completeness (omission)
+
| style="border:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| <div align="right">57</div>
|Measure 1: Missing item <br/>Measure 2: Number of missing items
+
 
|Direct external<br/>Direct external
 
 
|-
 
|-
|Topographic dataset
+
| style="border:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Measure 3: Number of invalid self-intersect errors
|Thematic accuracy (correct classification)
+
| style="border:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| <div align="right">11</div>
|Measure 1: Number of incorrectly classified features<br/>Measure 2: Misclassification rate
+
 
|Direct external<br/>Direct external
 
 
|-
 
|-
|Street network
+
| style="border-top:none;border-bottom:0.0069in solid #00000a;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| Measure 4: Number of invalid self-overlap errors
|Logical inconsistency (topological inconsistency)
+
| style="border-top:none;border-bottom:0.0069in solid #00000a;border-left:none;border-right:none;padding-top:0in;padding-bottom:0in;padding-left:0.075in;padding-right:0.075in;"| <div align="right">6</div>
|Measure 1: Number of missing connections due to undershoots<br/>Measure 2: Number of missing connections due to overshoots<br/>Measure 3: Number of invalid self-intersect errors<br/>Measure 4: Number of invalid self-overlap errors
+
 
|Direct internal<br/>Direct internal<br/>Direct internal<br/>Direct internal
 
 
|}
 
|}
 +
''4. Implementing data quality measures conforming to ISO 19157:2013''
 +
 +
A data quality measure conforming to ISO 19157:2013 is structurally and semantically well defined and described and modelled as specified in the standard. Such a measure is described by at least an identifier, a name, an element name, definition and a value type. Optional descriptors are an alias, description, a value structure, example, a basic measure and one or more source references and/or parameters. Note that full inspection is most appropriate for small populations or for tests that can be accomplished by automated means. For larger populations, checking a representative part of the data and reporting the quality result as a percentage rate is more appropriate and practical.

Revision as of 14:47, 4 May 2016

Data quality (ISO 19157:2013)

Overview of ISO 19157:2013


Full name ISO 19157:2013, Geographic information – Data quality
Version Edition 1
Amendments None
Corrigenda None
Published by ISO/TC 211
Languages English
Online overview https://www.iso.org/obp/ui/#iso:std:iso:19157:ed-1:v1:en
Type of standard ISO International Standard

Meta level

Related standard(s) ISO 19115-1:2014, Geographic information – Metadata – Part 1: Fundamentals

ISO 19115-2:2009, Geographic information – Metadata – Part 2: Extensions for imagery and gridded data

ISO 19158:2012, Geographic information – Quality assurance of data supply

Application The standard specifies the description, evaluation and reporting of the quality of geographic data.
Conformance classes Data quality evaluation process

Data quality metadata

Standalone quality report

Data quality measure

Scope

ISO 19157:2013 establishes the principles for describing the quality of geographic data. It defines components for describing data quality; specifies the components and content structure of a register for data quality measures; describes general procedures for evaluating the quality of geographic data; and establishes principles for reporting data quality.

The standard also defines a set of data quality measures for use in evaluating and reporting data quality. It is applicable to data producers providing quality information to describe and assess how well a dataset conforms to its product specification and to data users attempting to determine whether or not specific geographic data are of sufficient quality for their particular application.

The standard does not attempt to define minimum acceptable levels of quality for geographic data.

Implementation benefits

ISO 19157:2013 provides a standard way of describing the quality of geographic data. Such descriptions are useful when a producer has to evaluate how well a dataset meets the criteria described in its product specification. For example, if the producer outsourced the acquisition of the data, ISO 19157:2013 could be used to evaluate and describe the quality of the received data during acceptance testing.

Geographic data are increasingly shared and exchanged. As a result, geographic data are often used for purposes that differ from the purpose for which they were originally captured. Complete descriptions of the quality of a dataset encourage and facilitate the sharing, interchange and use of appropriate datasets.

Another benefit of implementing ISO 19157:2013 is that the quality information can assist a user who has to decide whether a specific dataset is appropriate for an intended use or application. If the user has to decide between two or more datasets, standardized quality descriptions simplify comparing the datasets. If ISO 19157:2013 is implemented, quality reports are expressed in a comparable way and there is a common understanding of the quality measures that have been used. A project to develop an XML schema implementation of ISO 19157:2013 has begun.

Implementation guidelines

ISO 19157:2013 cancels and replaces ISO/TS 19138:2006, ISO 19114:2003 and ISO 19113:2002. According to ISO 19157:2013, data quality comprises six elements: completeness, thematic accuracy, logical consistency, temporal quality, positional accuracy and usability. Each element comprises a number of sub-elements, for example, completeness (commission and omission), logical consistency (conceptual, domain, format, topological), etc. These elements are used to describe data quality, i.e. how well a specific dataset meets the criteria for the different elements set forth in its product specification or user requirements. Evaluation against the criteria is done either quantitatively or subjectively (non-quantitatively). The latter case applies if a detailed data product specification does not exist or if the data product specification lacks quantitative measures and descriptors. Three metaquality elements – confidence, ‘representativity’ and homogeneity – provide quantitative and qualitative statements about the evaluation against the criteria and its result.

Quality information can be provided for different units of data, e.g. a dataset series, a dataset or a subset of a dataset with common characteristics. A data quality unit comprises a scope and data quality elements. The scope specifies the spatial and/or temporal extent and/or the common characteristic(s) of the unit for which the quality information is provided.
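The relationship between a data quality unit, its scope and its elements can be sketched as a small data structure. This is only an illustration: the class and field names below are assumptions made for this example and are not the class names used in the conceptual model of the standard. The sample values are taken from the topographic dataset example used later in this chapter.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Scope:
    """Part of the data the quality information applies to (hypothetical structure)."""
    description: str                       # common characteristic(s) of the unit
    spatial_extent: Optional[str] = None   # e.g. a named area, if restricted
    temporal_extent: Optional[str] = None  # e.g. a time period, if restricted


@dataclass
class DataQualityUnit:
    """A scope together with the data quality elements evaluated for it."""
    name: str
    scope: Scope
    elements: List[str] = field(default_factory=list)


unit = DataQualityUnit(
    name="Topographic dataset",
    scope=Scope(description="All features in the dataset"),
    elements=[
        "Completeness (commission)",
        "Completeness (omission)",
        "Thematic accuracy (correct classification)",
    ],
)

for element in unit.elements:
    print(f"{unit.name} [{unit.scope.description}]: {element}")
```

Keeping the scope separate from the elements makes it possible to report several elements for the same subset of data without repeating the scope description.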

In ISO 19157:2013, quality-related information provided by the purpose, usage and lineage of geographic data conforms to ISO 19115-1:2014 (described in chapter 11).

ISO 19157:2013 specifies four conformance classes, i.e. the standard can be implemented for four different quality aspects of geo-spatial datasets, each briefly described below.

1. Implementing a data quality evaluation process conforming to ISO 19157:2013

A data quality evaluation process conforming to ISO 19157:2013 comprises four steps:

  • Step 1 - Specify the data quality units to be evaluated. Study the data product specification to identify applicable data quality units and their scope. For each data quality unit, identify the applicable data quality element(s). See the table "Example: Data quality units".
  • Step 2 - Specify the data quality measures to be used to describe the quality of each data quality element of a data quality unit. The requirements in the data product specification provide guidance on applicable data quality measures. See the table "Example: Data quality measures". The data quality measures in the table are from the list of standardized data quality measures in ISO 19157:2013. It is also possible to describe user-defined quality measures (see further below) and to maintain a collection of such measures in a catalogue or register.
  • Step 3 - Specify the data quality evaluation procedures, i.e. the evaluation method(s) to be applied. The method can be direct (based on inspection of the items in the dataset) or indirect (based on external knowledge, such as lineage metadata). Direct evaluation is further classified by the source against which the evaluation is done: internal if only the data in the dataset is evaluated or external if there is reference to external data (e.g. satellite imagery or ground truth). ISO 19157:2013 includes guidance on how to sample data for evaluation.
  • Step 4 - Determine the output of the data quality evaluation, i.e. perform the data quality evaluation described in Steps 1-3 above. Additional results may be produced by aggregating or by deriving from existing results without carrying out a new evaluation. How to report the results of the data quality evaluation is described elsewhere in this chapter.
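The difference between direct internal and direct external evaluation in Step 3 can be illustrated with a minimal sketch. It assumes that features can be matched by a simple identifier, which is a simplification made for this example (in practice, matching against reference data usually involves geometry and attributes). The function names and the sample identifiers are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def count_duplicates(feature_ids: Iterable[str]) -> int:
    """Direct internal: uses only the dataset itself.

    Counts every feature instance that repeats an identifier already present.
    """
    counts = Counter(feature_ids)
    return sum(n - 1 for n in counts.values() if n > 1)


def count_excess_and_missing(
    feature_ids: Iterable[str], reference_ids: Iterable[str]
) -> Tuple[int, int]:
    """Direct external: compares the dataset with external reference data.

    Returns (number of excess items, number of missing items).
    """
    dataset = set(feature_ids)
    reference = set(reference_ids)
    excess = len(dataset - reference)    # in the dataset, not in the reference
    missing = len(reference - dataset)   # in the reference, not in the dataset
    return excess, missing


# Hypothetical identifiers, for illustration only.
dataset_ids = ["a", "b", "b", "c", "x"]
reference_ids = ["a", "b", "c", "d", "e"]

duplicates = count_duplicates(dataset_ids)
excess, missing = count_excess_and_missing(dataset_ids, reference_ids)
print(duplicates, excess, missing)
```

The duplicate count needs no data beyond the dataset, whereas the excess and missing counts cannot be computed without the reference data; this is the distinction between the internal and external methods.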

Example: Data quality units


{| class="wikitable"
|-
|'''Data quality unit'''
|'''Scope'''
|'''Data quality elements'''
|-
|Topographic dataset
|All features in the dataset
|Completeness (commission and omission), thematic accuracy (correct classification)
|-
|Street network
|Street features in the entire dataset
|Logical consistency (topological consistency)
|}

Example: Data quality measures


{| class="wikitable"
|-
|'''Data quality unit'''
|'''Data quality element'''
|'''Data quality measure'''
|'''Method'''
|-
|Topographic dataset
|Completeness (commission)
|Measure 1: Excess item<br/>Measure 2: Number of excess items<br/>Measure 3: Number of duplicate feature instances
|Direct external<br/>Direct external<br/>Direct internal
|-
|Topographic dataset
|Completeness (omission)
|Measure 1: Missing item<br/>Measure 2: Number of missing items
|Direct external<br/>Direct external
|-
|Topographic dataset
|Thematic accuracy (correct classification)
|Measure 1: Number of incorrectly classified features<br/>Measure 2: Misclassification rate
|Direct external<br/>Direct external
|-
|Street network
|Logical consistency (topological consistency)
|Measure 1: Number of missing connections due to undershoots<br/>Measure 2: Number of missing connections due to overshoots<br/>Measure 3: Number of invalid self-intersect errors<br/>Measure 4: Number of invalid self-overlap errors
|Direct internal<br/>Direct internal<br/>Direct internal<br/>Direct internal
|}

2. Implementing data quality metadata conforming to ISO 19157:2013

Data quality metadata describes the quality of geographic data. ISO 19157:2013 specifies a conceptual model of the different components to be used when describing the quality of geographic data. The figure "Overview of the components to be used to describe data quality" provides an overview of the components and their relationships to each other. A data dictionary, including definitions for all the components, is provided in the standard. Data quality metadata conforming to ISO 19157:2013 conforms to this conceptual model and is reported in conformance with ISO 19115:2003 and ISO 19115-2:2009.

Overview of the components to be used to describe data quality (Source: ISO 19157:2013)

3. Implementing data quality reports conforming to ISO 19157:2013

The first (and obvious) requirement is that the quality report comprises quality metadata conforming to ISO 19157:2013 (see 2. above), i.e. it includes sections on all appropriate aspects of quality and the description of components follows the rules defined in the standard. Additional information can be added to the report, but the structure of the report is not prescribed. The table "Example: Section of a data quality report" shows a section of a data quality report for the quality evaluation process described above.

Example: Section of a data quality report


{| class="wikitable"
|-
|'''Data quality unit'''
|'''Data quality element'''
|'''Data quality measure'''
|'''Result'''
|-
|Topographic dataset
|Completeness (commission)
|Measure 2: Number of excess items<br/>Measure 3: Number of duplicate feature instances
|1,036<br/>153
|-
|Topographic dataset
|Completeness (omission)
|Measure 2: Number of missing items
|697
|-
|Topographic dataset
|Thematic accuracy (correct classification)
|Measure 1: Number of incorrectly classified features<br/>Measure 2: Misclassification rate
|8,774<br/>10%
|-
|Street network
|Logical consistency (topological consistency)
|Measure 1: Number of missing connections due to undershoots<br/>Measure 2: Number of missing connections due to overshoots<br/>Measure 3: Number of invalid self-intersect errors<br/>Measure 4: Number of invalid self-overlap errors
|139<br/>57<br/>11<br/>6
|}
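A rate such as the misclassification rate is derived from a count and the number of items evaluated. The sketch below shows the arithmetic only. The report section above does not state how many features were evaluated, so the numbers used here are hypothetical and are not related to the example dataset; the function name is also an assumption made for this illustration.

```python
def misclassification_rate(incorrect: int, evaluated: int) -> float:
    """Return the share of incorrectly classified features as a percentage.

    incorrect: number of features classified incorrectly
    evaluated: number of features that were checked (all features, or a sample)
    """
    if evaluated <= 0:
        raise ValueError("evaluated must be a positive number of features")
    if not 0 <= incorrect <= evaluated:
        raise ValueError("incorrect must lie between 0 and evaluated")
    return 100.0 * incorrect / evaluated


# Hypothetical sample: 25 of 200 checked features were misclassified.
rate = misclassification_rate(incorrect=25, evaluated=200)
print(f"Misclassification rate: {rate:.1f}%")
```

Reporting the count and the rate together, as in the report section above, lets a reader judge both the absolute number of errors and their proportion.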

4. Implementing data quality measures conforming to ISO 19157:2013

A data quality measure conforming to ISO 19157:2013 is structurally and semantically well defined, and is described and modelled as specified in the standard. Such a measure is described by at least an identifier, a name, an element name, a definition and a value type. Optional descriptors are an alias, a description, a value structure, an example, a basic measure and one or more source references and/or parameters. Note that full inspection is most appropriate for small populations or for tests that can be accomplished by automated means. For larger populations, checking a representative part of the data and reporting the quality result as a percentage rate is more appropriate and practical.
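The mandatory and optional descriptors listed above can be sketched as a record with a simple completeness check, for example when maintaining user-defined measures in a register. The class name, the field names and the sample values (including the identifier) are assumptions made for this illustration; they are not identifiers or definitions taken from the standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataQualityMeasure:
    # Mandatory descriptors
    identifier: str
    name: str
    element_name: str
    definition: str
    value_type: str
    # Optional descriptors
    alias: Optional[str] = None
    description: Optional[str] = None
    value_structure: Optional[str] = None
    example: Optional[str] = None
    basic_measure: Optional[str] = None
    source_references: List[str] = field(default_factory=list)
    parameters: List[str] = field(default_factory=list)

    def missing_mandatory(self) -> List[str]:
        """Return the names of mandatory descriptors that are empty."""
        mandatory = ("identifier", "name", "element_name", "definition", "value_type")
        return [f for f in mandatory if not str(getattr(self, f)).strip()]


# Hypothetical register entry, for illustration only.
measure = DataQualityMeasure(
    identifier="example-1",
    name="Number of duplicate feature instances",
    element_name="Completeness (commission)",
    definition="Count of feature instances that duplicate another instance",
    value_type="Integer",
)
incomplete = DataQualityMeasure("example-2", "Unnamed measure", "Completeness (omission)", "", "Integer")

print(measure.missing_mandatory())
print(incomplete.missing_mandatory())
```

A check of this kind only verifies that the mandatory descriptors are present; whether a definition is semantically adequate still has to be reviewed by a person.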