Frequently Asked Questions

	The answers given below to the Frequently Asked Questions are intended to be brief rather than comprehensive. For more details, see the documentation provided under Project info.

	Why the ECA&D project?
	What basic data, series and stations are used?
	What is the difference between downloadable and non-downloadable data?
	Why more than one definition for min., mean & max. temperature, etc.?
	What is the difference between blend and non-blend?
	Why doesn't ECA&D use the WMO station numbers as IDs?
	What quality control and homogeneity procedures are applied?
	Why are values slightly different from the file that I downloaded earlier?
	How to obtain daily data that are not available for download at this website?
	How are the smoothed lines in the indices plots calculated?
	What procedure is used to calculate the trends?
	Why do some stations not appear on the trend map, although a time series plot is available?
	Why do some values differ from the values I obtained from the national meteorological services (NMHSs)?


	Why the ECA&D project?
	The objective of ECA&D is to combine collation of daily series of observations at meteorological stations, quality control, analysis of extremes and dissemination of both the daily data and the analysis results. Integrating these activities in one project has proved essential for success. New versions of the daily dataset will be issued at regular intervals. The network of participants is gradually being extended to countries in the Middle East and North Africa.

	What basic data, series and stations are used?
	The ECA dataset consists of daily station series obtained from climatological divisions of National Meteorological and Hydrological Services and station series maintained by observatories and research centres throughout Europe and the Mediterranean. For details of the individual data providers see the participants list. A comprehensive overview of all available data is provided in the data dictionary. The series are quality controlled and flags (“OK”, “suspect” or “missing”) are attached to individual data. Homogeneity testing has resulted in the classification of series into “useful”, “doubtful” or “suspect”. Note that these categories only hold for the particular time intervals for which the tests were applied. It is recommended to use the results of the homogeneity tests for selecting appropriate series and time intervals. The series have not been homogenized in the sense that values are changed.

	What is the difference between downloadable and non-downloadable data?
	Part of the daily data in ECA&D is for stations which are labelled “non-downloadable”. This indicates that the daily data for these stations are not available from this website, but may be available from the data provider itself. “Non-downloadable” daily data are used together with “downloadable” daily data to calculate derived value-added products, such as indices of extremes or daily maps of gridded data (E-OBS). The derived products are made available irrespective of the “non-downloadable”/“downloadable” status of the daily data these products are based on. See our data policy for more details.

	Why more than one definition for min., mean & max. temperature, etc.?
	Different countries estimate daily average temperatures using different methods and formulae. Also, the time intervals for observing minimum and maximum temperature differ and so does the time interval for 24h accumulated rainfall. Each series is therefore labeled with the appropriate element id.

	What is the difference between blend and non-blend?
	The series collected from participating countries generally do not contain data for the most recent years. This is partly due to the time that is needed for data quality control and archiving at the home institutions of the participants, and partly the result of the efforts required to include the data in the ECA database. To make available for each station a time series that is as complete as possible, we have included an automated update procedure that relies on the daily data from SYNOP messages that are distributed in near real time over the Global Telecommunication System (GTS). In this procedure the gaps in a daily series are also infilled with observations from nearby stations, provided that they are within 12.5 km distance and that height differences are less than 25 m. The download options under daily data allow you to select blend or non-blend. The non-blended series are the series as provided by the participants. The blended series have undergone the process described above. If a blended series is chosen, information on the underlying series that are used in the blending process is provided. The datasets that can be downloaded from the website make use of only the downloadable station series. The derived ECA&D products, such as indices of extremes or E-OBS, make use of a blended dataset made from both the downloadable and non-downloadable station series (see our data policy). Results created from the downloaded datasets may therefore differ from those given elsewhere on this website.
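
	The neighbour criterion described above can be sketched as follows. This is only an illustration: the text gives the two thresholds (12.5 km and 25 m) but not how the distance is measured, so the great-circle (haversine) distance, the Earth radius and all function and field names below are assumptions.

```python
import math

MAX_DISTANCE_KM = 12.5      # threshold stated in the FAQ
MAX_HEIGHT_DIFF_M = 25.0    # threshold stated in the FAQ
EARTH_RADIUS_KM = 6371.0    # assumption: mean Earth radius

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km (assumed metric, not confirmed by the source)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def can_infill_from(target, neighbour):
    """True if the neighbour meets both the distance and the height criterion."""
    dist = great_circle_km(target["lat"], target["lon"],
                           neighbour["lat"], neighbour["lon"])
    height_diff = abs(target["height_m"] - neighbour["height_m"])
    return dist <= MAX_DISTANCE_KM and height_diff < MAX_HEIGHT_DIFF_M

# Hypothetical stations, for illustration only
station = {"lat": 52.10, "lon": 5.18, "height_m": 2.0}
nearby = {"lat": 52.15, "lon": 5.18, "height_m": 10.0}    # roughly 5.6 km away
too_far = {"lat": 52.30, "lon": 5.18, "height_m": 10.0}   # roughly 22 km away
too_high = {"lat": 52.15, "lon": 5.18, "height_m": 40.0}  # 38 m height difference
```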

	Why doesn't ECA&D use the WMO station numbers as IDs?
	WMO station numbers are not used as the unique identifier for the daily ECA series, because not all stations with data have been assigned WMO numbers.

	What quality control and homogeneity procedures are applied?
	Series of the best possible quality are provided for ECA&D by the participating institutions. In addition, common quality control procedures are applied to all series using various algorithms (see Project info > ATBD). These quality control procedures lead to flags (“OK”, “suspect” or “missing”) assigned to individual data. Although the data have been validated carefully, some errors may remain undetected. The risk of such errors is greatest in the recent data that stem from synoptical messages, because these data did not undergo the validation process in the participating institutions. Apart from errors on individual days, changes in observation practices may have introduced inhomogeneities of non-climatic origin in long time series. These inhomogeneities may severely affect the assessment of changes in extremes. To evaluate the homogeneity of the time series in ECA&D, a two-step testing procedure was followed (see Project info > ATBD). First, four common homogeneity tests were applied to evaluate the daily series in fixed time periods using the following testing variables: (1) the annual mean of the diurnal temperature range DTR (= maximum temperature - minimum temperature), (2) the annual mean of the absolute day-to-day differences of the diurnal temperature range vDTR, (3) the annual wet day count RR1 (threshold 1 mm), (4) the annual number of snow days SD1 (threshold 1 cm), (5) the annual mean of daily sea level pressure PP, (6) the annual sunshine duration SS, (7) the annual mean of daily average relative humidity RH, and (8) the annual mean of daily average cloud cover CC. Second, the test results were condensed for each series into three classes: useful, doubtful and suspect. The four common homogeneity tests are the Standard Normal Homogeneity test, the Buishand Range test, the Pettitt test and the von Neumann ratio test. Note that the above homogeneity analysis is subject to further research, as there is no well-established testing procedure for daily data.
	How to apply the test results is also an open question, and the answer depends on the particular application. For the indices of extremes analysed in ECA&D we have chosen to present trend results only for the series that are useful or doubtful, but in other cases other choices may be made (see e.g. the publications section). There is a clear need for additional research on techniques for homogenisation of daily data in order to create high quality daily datasets for the assessment of extremes without abandoning entire series or throwing out real extremes. This is of particular importance in areas where the density of stations with long daily data series is already low.
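
	To give an impression of what one of the four tests measures, the statistic of the von Neumann ratio test can be sketched as below. This is a minimal illustration of the standard textbook definition (the sum of squared successive differences divided by the sum of squared deviations from the mean), not the ECA&D implementation; the function name and the example series are hypothetical. For a homogeneous series the expected value of the ratio is 2, and a series with a break tends to give a lower value. The critical values, which depend on the series length, are not reproduced here.

```python
def von_neumann_ratio(values):
    """Von Neumann ratio of a series of annual values (standard definition)."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared differences between successive years
    numerator = sum((values[i] - values[i + 1]) ** 2 for i in range(n - 1))
    # Sum of squared deviations from the series mean
    denominator = sum((v - mean) ** 2 for v in values)
    return numerator / denominator

# Hypothetical annual values of a testing variable with an abrupt shift halfway
series_with_break = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
ratio = von_neumann_ratio(series_with_break)  # 0.5, well below 2
```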

	Why are values slightly different from the file that I downloaded earlier?
	All the files on this website are frequently updated to include the latest available observations. Updating includes not only adding the most recent data, but also including any late reports for earlier dates. In addition, the older series may have changed because of improved data quality control or data archaeology by the data providing institutions.

	How to obtain daily data that are not available for download at this website?
	The ECA&D website makes available all daily series for which the conditions of use allow publication. For some stations, we are only allowed to use the daily series for the analysis of extremes within the ECA&D project without releasing them. These stations do appear in the data dictionary and the indices section of the website as well as in the publications, but they are absent from the daily data section. These non-downloadable series, and sometimes additional series, might be available from the data provider directly. Please direct your inquiries about obtaining these data to the ECA&D Project Team.

	How are the smoothed lines in the indices plots calculated?
	The red smoothed line in the plots is calculated using the LOWESS smoother function with parameters f=1/5 and iter=3, using open-source Fortran code from W. S. Cleveland (wsc@research.bell-labs.com), Bell Laboratories, Murray Hill NJ 07974. References: Cleveland, W.S. (1979). Robust locally weighted regression and smoothing scatterplots. J. Amer. Statist. Assoc., 74, 829-836. Cleveland, W.S. (1981). LOWESS: A program for smoothing scatterplots by robust locally weighted regression. The American Statistician, 35, 54.
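
	For readers who want to see how such a smoother works, below is a simplified pure-Python sketch of a LOWESS-type fit: a local weighted linear regression with tricube neighbourhood weights, followed by robustness iterations with bisquare weights. It is not Cleveland's Fortran code and is not guaranteed to reproduce the plots on this website exactly. The parameter names f and iterations mirror the f=1/5 and iter=3 settings mentioned above; everything else (function name, stopping rule, fallbacks) is an assumption of this sketch.

```python
from statistics import median

def lowess(x, y, f=1/5, iterations=3):
    """Simplified LOWESS: local linear fits plus robustness iterations."""
    n = len(x)
    r = min(n, max(2, int(f * n + 1e-7)))     # number of points in each local window
    x_range = max(x) - min(x)
    robust = [1.0] * n                        # robustness weights, all 1 in the first pass
    fitted = list(y)
    for it in range(iterations + 1):
        for i in range(n):
            dist = [abs(xj - x[i]) for xj in x]
            h = sorted(dist)[r - 1]           # distance to the r-th nearest point
            # Tricube neighbourhood weight times the current robustness weight
            w = [robust[j] * (1 - (d / h) ** 3) ** 3 if h > 0 and d < h else 0.0
                 for j, d in enumerate(dist)]
            sw = sum(w)
            if sw <= 0:
                fitted[i] = y[i]              # no usable neighbours: keep the raw value
                continue
            xm = sum(wj * xj for wj, xj in zip(w, x)) / sw
            ym = sum(wj * yj for wj, yj in zip(w, y)) / sw
            sxx = sum(wj * (xj - xm) ** 2 for wj, xj in zip(w, x)) / sw
            if sxx ** 0.5 > 0.001 * x_range:  # enough spread in x for a local line
                sxy = sum(wj * (xj - xm) * (yj - ym)
                          for wj, xj, yj in zip(w, x, y)) / sw
                fitted[i] = ym + sxy / sxx * (x[i] - xm)
            else:
                fitted[i] = ym                # otherwise use the local weighted mean
        if it == iterations:
            break
        resid = [abs(yi - fi) for yi, fi in zip(y, fitted)]
        s = median(resid)
        if 6 * s <= 1e-7 * sum(resid) / n:    # residuals effectively zero: stop early
            break
        # Bisquare robustness weights: large residuals are down-weighted
        robust = [(1 - (e / (6 * s)) ** 2) ** 2 if e < 6 * s else 0.0 for e in resid]
    return fitted

# Hypothetical annual index values, for illustration only
years = list(range(1950, 2000))
index_values = [5 + 0.1 * (yr - 1950) for yr in years]
smoothed = lowess(years, index_values)
```

	With f=1/5, each local fit uses the nearest fifth of the data points, so for a 50-year series every smoothed value is based on a window of 10 years.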

	What procedure is used to calculate the trends?
	Trends are calculated as a least-squares linear fit using NAG's E02ADF routine. References: the Numerical Algorithms Group website (http://www.nag.co.uk/numeric/FL/manual/html/FLlibrarymanual.asp) and the references given in the NAG Fortran Library Routine Document E02ADF.
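
	As an illustration of what a least-squares linear fit computes, the sketch below uses the closed-form ordinary least-squares formulas for slope and intercept. It is a stand-in for, not a call to, the NAG routine, and the function name and example numbers are hypothetical.

```python
def linear_trend(years, values):
    """Ordinary least-squares straight line through (year, value) pairs.

    Returns (slope, intercept); the slope is the trend per year.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical index series that rises by 0.5 units per year
years = list(range(1901, 1911))
values = [3.0 + 0.5 * (yr - 1901) for yr in years]
slope, intercept = linear_trend(years, values)
trend_per_decade = 10 * slope
```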

	Why do some stations not appear on the trend map, although a time series plot is available?
	For a trend value to be calculated, a station must hold valid index data for at least 70% of the period for which the trend is calculated. For example, for a trend period 1901-2000 (100 years), at least 70 years must have valid data. Also, the homogeneity test result for the underlying series must be 'useful' or 'doubtful' for this period. If the test result is 'suspect' or less than 70% of the trend period holds valid index data, the trend for that station is not calculated and therefore not plotted on the trend map. A time series plot is produced if any valid index data are available for the station in question, with the only restriction that an index value for an individual year is calculated only if no more than 3% of the days are missing.
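
	The selection rules above can be summarised in a short sketch. The thresholds (70%, 3%) and the class names come from the text; the function names and the data layout (a dictionary mapping each year to an index value, or None if missing) are assumptions made for illustration.

```python
def trend_is_plotted(index_by_year, start_year, end_year, homogeneity_class):
    """Apply the 70% coverage rule and the homogeneity rule for the trend map."""
    n_years = end_year - start_year + 1
    n_valid = sum(1 for year in range(start_year, end_year + 1)
                  if index_by_year.get(year) is not None)
    enough_data = n_valid * 100 >= 70 * n_years   # integer form of "at least 70%"
    return enough_data and homogeneity_class in ("useful", "doubtful")

def yearly_index_is_calculated(n_missing_days, n_days_in_year):
    """An index value for a year requires that no more than 3% of days are missing."""
    return n_missing_days * 100 <= 3 * n_days_in_year

# Hypothetical station with valid index values for 1931-2000 only (70 of 100 years)
index_by_year = {year: 1.0 for year in range(1931, 2001)}
plotted = trend_is_plotted(index_by_year, 1901, 2000, "useful")          # True
not_plotted = trend_is_plotted(index_by_year, 1901, 2000, "suspect")     # False
```

	In a 365-day year, the 3% limit means that at most 10 days may be missing for the index value of that year to be calculated.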

	Why do some values differ from the values I obtained from the national meteorological services (NMHSs)?
	ECA&D makes use of two kinds of data sources: data that are issued by the national meteorological offices or other participants (the so-called participant data) and data from synoptical messages. The difference between the two is that participant data are generally validated, whereas synoptical data are not. In ECA&D, synoptical messages are used temporarily to extend the data series and keep them as up to date as possible. As soon as participant data become available, they replace the synoptical data. Non-validated synoptical data can be distinguished from validated participant data by the first digit of the source ID (SOUID) given in each data file: a source ID starting with 9 represents non-validated synoptical data, whereas a source ID starting with 1 indicates validated participant data.
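
	A minimal sketch of this check is given below. Only the meaning of the leading digits 9 and 1 comes from the text above; the function name, the returned labels and the example SOUID values are hypothetical.

```python
def source_kind(souid):
    """Classify a source ID (SOUID) by its first digit."""
    first_digit = str(souid).strip()[:1]
    if first_digit == "9":
        return "synoptical (not validated)"
    if first_digit == "1":
        return "participant (validated)"
    return "unknown"   # leading digits other than 1 or 9 are not described in the FAQ

# Hypothetical SOUID values, for illustration only
kinds = [source_kind(s) for s in (100123, 900456, "  900789")]
```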