Data Quality Reporting

 

Data Quality Reports (DQRs) are categorized by the type of information they provide, and each carries a severity category for the quality issue it describes. Two categories affect how the data can be used:

  • Incorrect – indicates the values are inaccurate and should not be used.
  • Suspect – indicates the values show signs of an underlying issue and require additional screening before use.

For both categories, if a method of correcting the data exists, it is described in the DQR text.

Situations may also arise during instrument operation that do not affect the quality of the data but are worth knowing about when using the data. Such situations may be documented as DQRs set to the category “Transparent.” We urge the data user to read these for useful background information.

The final step in ordering data from Data Discovery allows users to control how Incorrect and Suspect DQRs affect their data order. By default, all Incorrect DQRs are applied automatically: the affected data are replaced with missing value indicators. The user may turn this default filtering off. Data flagged as Suspect, by contrast, are not replaced with missing values by default; the user must select that option explicitly. The user does not currently have the ability to select or deselect individual DQRs before the data are filtered. To apply individual DQRs to a data order, the user must either apply the DQR information manually or use the DQR web service. DQRs categorized as Transparent are never used to filter data values during data ordering.
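The filtering rules described above can be illustrated with a minimal Python sketch. It is not the code used by Data Discovery. The DQR record layout, the function name filter_order, the example DQR identifiers, and the missing value indicator of -9999.0 are all assumptions made for illustration; check your own data files for the actual missing value indicator.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumed missing value indicator; the real value depends on the data file.
MISSING = -9999.0


@dataclass
class DQR:
    """Hypothetical, simplified DQR record used only in this sketch."""
    dqr_id: str
    category: str  # "Incorrect", "Suspect", or "Transparent"
    start: datetime
    end: datetime


def filter_order(times, values, dqrs, apply_incorrect=True, apply_suspect=False):
    """Return a copy of values with flagged samples replaced by MISSING.

    The defaults mirror the ordering behavior described in the text:
    Incorrect filtering is on, Suspect filtering is off, and Transparent
    DQRs are never used for filtering.
    """
    active = set()
    if apply_incorrect:
        active.add("Incorrect")
    if apply_suspect:
        active.add("Suspect")

    filtered = list(values)
    for i, t in enumerate(times):
        if any(d.category in active and d.start <= t <= d.end for d in dqrs):
            filtered[i] = MISSING
    return filtered


# Six hourly samples and three illustrative DQRs.
times = [datetime(2020, 1, 1, hour) for hour in range(6)]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
dqrs = [
    DQR("D-A", "Incorrect", datetime(2020, 1, 1, 1), datetime(2020, 1, 1, 2)),
    DQR("D-B", "Suspect", datetime(2020, 1, 1, 4), datetime(2020, 1, 1, 4)),
    DQR("D-C", "Transparent", datetime(2020, 1, 1, 0), datetime(2020, 1, 1, 5)),
]

default_result = filter_order(times, values, dqrs)
with_suspect = filter_order(times, values, dqrs, apply_suspect=True)
unfiltered = filter_order(times, values, dqrs, apply_incorrect=False)
```

With the defaults, only the two samples covered by the Incorrect DQR are replaced. Turning on apply_suspect also replaces the sample covered by the Suspect DQR, and the Transparent DQR never changes any value.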

The DQR web service has been developed to help users apply DQRs after downloading data. Using the web service requires only a few additional lines of code in the user’s analysis, in whichever programming language the user works in.
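As a rough sketch of what applying individual DQRs after download might look like, the Python example below assumes the DQR records have already been retrieved and parsed into dictionaries. The web service request and its response format are not shown because they are not specified here. The field names ("id", "start", "end"), the function name apply_selected_dqrs, the example identifiers, and the missing value indicator of -9999.0 are all hypothetical.

```python
from datetime import datetime

# Assumed missing value indicator; the real value depends on the data file.
MISSING = -9999.0


def apply_selected_dqrs(times, values, records, selected_ids):
    """Replace samples covered by the chosen DQRs with MISSING.

    records is a list of dicts with hypothetical keys "id", "start", and
    "end". Only records whose "id" is in selected_ids are applied, which is
    the per-DQR control that is not available when ordering data.
    """
    chosen = [r for r in records if r["id"] in selected_ids]
    result = []
    for t, v in zip(times, values):
        flagged = any(r["start"] <= t <= r["end"] for r in chosen)
        result.append(MISSING if flagged else v)
    return result


# Illustrative records, standing in for parsed web service output.
records = [
    {"id": "D-1", "start": datetime(2021, 6, 1, 0), "end": datetime(2021, 6, 1, 1)},
    {"id": "D-2", "start": datetime(2021, 6, 1, 3), "end": datetime(2021, 6, 1, 4)},
]

times = [datetime(2021, 6, 1, hour) for hour in range(5)]
values = [10.0, 11.0, 12.0, 13.0, 14.0]

# Apply only the second DQR and leave the first one unapplied.
only_second = apply_selected_dqrs(times, values, records, {"D-2"})
```

Passing a set of identifiers keeps the selection explicit, so a user can apply one DQR, inspect the result, and then decide whether to apply others.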