The CAMELS data set: catchment attributes and meteorology for large-sample studies

Abstract. We present a new data set of attributes for 671 catchments in the contiguous USA (CONUS). This complements the daily hydrometeorological time series provided by Newman et al. (2015b) and opens new opportunities to explore how the interplay between landscape attributes shapes hydrological processes and catchment behavior. To produce this extension, we synthesized diverse and complementary data sets to describe topography, climate, hydrology, soil and vegetation characteristics at the catchment scale. The spatial variations among basins over the CONUS are discussed and compared using a series of maps. The large number of catchments, combined with the diversity of their geophysical characteristics, makes this new data set well suited for large-sample studies and comparative hydrology. An essential feature that differentiates this data set from similar ones is that it both provides quantitative estimates of diverse catchment attributes and involves assessments of the limitations of the data and methods used to compute those attributes. This data set will be publicly available and we encourage the community to further extend it. The hydrometeorological time series provided by Newman et al. (2015b) together with the catchment attributes introduced in this paper constitute the CAMELS data set: Catchment Attributes and MEteorology for Large-sample Studies.

The CAMELS data set: catchment attributes and meteorology for large-sample studies

Hydrology and Earth System Sciences ◽

10.5194/hess-21-5293-2017 ◽

2017 ◽

Vol 21 (10) ◽

pp. 5293-5313 ◽

Cited By ~ 64

Author(s):

Nans Addor ◽

Andrew J. Newman ◽

Naoki Mizukami ◽

Martyn P. Clark

Keyword(s):

Time Series ◽

Data Sets ◽

Catchment Scale ◽

Data Set ◽

Meteorological Forcing ◽

Large Sample ◽

Source Data ◽

Daily Time Series ◽

Comparative Hydrology ◽

Daily Time

Abstract. We present a new data set of attributes for 671 catchments in the contiguous United States (CONUS) minimally impacted by human activities. This complements the daily time series of meteorological forcing and streamflow provided by Newman et al. (2015b). To produce this extension, we synthesized diverse and complementary data sets to describe six main classes of attributes at the catchment scale: topography, climate, streamflow, land cover, soil, and geology. The spatial variations among basins over the CONUS are discussed and compared using a series of maps. The large number of catchments, combined with the diversity of the attributes we extracted, makes this new data set well suited for large-sample studies and comparative hydrology. In comparison to the similar Model Parameter Estimation Experiment (MOPEX) data set, this data set relies on more recent data, it covers a wider range of attributes, and its catchments are more evenly distributed across the CONUS. This study also involves assessments of the limitations of the source data sets used to compute catchment attributes, as well as detailed descriptions of how the attributes were computed. The hydrometeorological time series provided by Newman et al. (2015b, https://doi.org/10.5065/D6MW2F4D) together with the catchment attributes introduced in this paper (https://doi.org/10.5065/D6G73C3Q) constitute the freely available CAMELS data set, which stands for Catchment Attributes and MEteorology for Large-sample Studies.

The CAMELS-CL dataset: catchment attributes and meteorology for large sample studies – Chile dataset

10.5194/hess-2018-23 ◽

2018 ◽

Cited By ~ 2

Author(s):

Camila Alvarez-Garreton ◽

Pablo A. Mendoza ◽

Juan Pablo Boisier ◽

Nans Addor ◽

Mauricio Galleguillos ◽

...

Keyword(s):

Time Series ◽

South America ◽

The United States ◽

Data Sets ◽

Human Intervention ◽

Data Set ◽

Large Sample ◽

Daily Streamflow ◽

The Andes ◽

Hydrological Signatures

Abstract. We introduce the first catchment data set for large sample studies in Chile (South America). The data set includes 516 catchments and provides catchment boundaries, daily streamflow records and basin-averaged time series of the following hydrometeorological variables: 1) daily precipitation retrieved from four gridded sources; 2) daily maximum, minimum and mean temperature; 3) daily potential evapotranspiration (PET); 4) 8-day accumulated PET; and 5) daily snow water equivalent. In addition to the hydro-meteorological time series, we use diverse data sets to extract key landscape attributes characterizing climatic, hydrological, topographic, geological and land cover features. We also describe the degree of anthropic intervention within the catchments by relying on publicly available water rights data for the country. The information is synthesized in 64 catchment attributes describing the landscape and water use characteristics of each catchment. To facilitate the use of the dataset presented here and promote common standards in large-sample studies, we computed most catchment attributes introduced by Addor et al. (2017) in their Catchment Attributes and MEteorology for Large-sample Studies dataset (CAMELS dataset) created for the United States, and proposed several others. Following this nomenclature, we named our dataset CAMELS-CL, which stands for CAMELS dataset in Chile. Based on the constructed dataset, we analysed the main spatial patterns of catchment attributes and the relationships between them. In general, the topographic attributes were explained by the Andes Cordillera; climatic attributes revealed the basic features of Chilean climate; and hydrological signatures revealed the leading patterns of catchment hydrologic responses, resulting from complex, non-linear process interactions across a range of spatiotemporal scales, enhanced by heterogeneities in topography, soils, vegetation, geology and other landscape properties. Further, we analysed human influence on catchment behaviour by relating hydrological signatures to a novel human intervention attribute. Our findings reveal that larger human intervention results in decreased annual flows and runoff ratios, decreased elasticity of runoff with respect to precipitation, and decreased flashiness of runoff, especially in drier catchments. CAMELS-CL provides unprecedented information in South America, a continent largely underrepresented in large-sample studies. The proximity of the Andes means that this dataset includes high-elevation catchments, which are generally poorly represented world-wide due to data scarcity. The CAMELS-CL dataset can be used to address a myriad of applications, including catchment classification and regionalization studies, the modelling of water availability under different management scenarios, the characterisation of drought history and projections, and the exploration of climate change impacts on hydrological processes. This effort is part of an international initiative to create multi-national large-sample data sets freely available to the community.

LOI and GONG Low-Degree Rotational Splittings

Symposium - International Astronomical Union ◽

10.1017/s0074180900238515 ◽

1998 ◽

Vol 185 ◽

pp. 167-168

Author(s):

T. Appourchaux ◽

M.C. Rabello-Soares ◽

L. Gizon

Keyword(s):

Time Series ◽

The Other ◽

Data Sets ◽

Data Set ◽

Low Degree ◽

Fourier Spectra

Two different data sets have been used to derive low-degree rotational splittings. One data set comes from the Luminosity Oscillations Imager (LOI) of VIRGO on board SOHO; the observations start on 27 March 1996 and end on 26 March 1997, and consist of intensity time series of 12 pixels (Appourchaux et al., 1997, Sol. Phys., 170, 27). The other data set was kindly made available by the GONG project; the observations start on 26 August 1995 and end on 21 August 1996, and consist of complex Fourier spectra of velocity time series for l = 0–9. For the GONG data, the contamination of l = 1 from the spatial aliases of l = 6 and l = 9 required some cleaning. To achieve this, we applied the inverse of the leakage matrix of l = 1, 6 and 9 to the original Fourier spectra of the same degrees; cleaning of all 3 degrees was achieved simultaneously (Appourchaux and Gizon, 1997, these proceedings).
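
The cleaning step described above can be illustrated with a short sketch. It assumes the contamination is a linear mixing, observed = leakage × true, with one row of spectra per degree. The matrix values, the array shapes and the function name `clean_spectra` are hypothetical and chosen only for illustration; they are not taken from the GONG analysis.

```python
import numpy as np

def clean_spectra(observed, leakage):
    """Undo a linear mixing between degrees, assuming observed = leakage @ true."""
    # Solving the linear system gives the same result as applying
    # inv(leakage) to the observed spectra, without forming the inverse.
    return np.linalg.solve(leakage, observed)

# Hypothetical 3x3 leakage matrix for degrees l = 1, 6, 9 (illustrative values).
leakage = np.array([
    [1.00, 0.20, 0.10],
    [0.20, 1.00, 0.05],
    [0.10, 0.05, 1.00],
])

# Synthetic complex spectra: one row per degree, one column per frequency bin.
rng = np.random.default_rng(0)
true_spectra = rng.normal(size=(3, 8)) + 1j * rng.normal(size=(3, 8))

observed = leakage @ true_spectra          # contaminated spectra
cleaned = clean_spectra(observed, leakage) # all three degrees cleaned at once
```

Because the three degrees are handled in one linear solve, all of them are cleaned simultaneously, which mirrors the statement in the abstract.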

An assessment of Bayesian bias estimator for numerical weather prediction

Nonlinear Processes in Geophysics ◽

10.5194/npg-15-1013-2008 ◽

2008 ◽

Vol 15 (6) ◽

pp. 1013-1022 ◽

Cited By ~ 2

Author(s):

J. Son ◽

D. Hou ◽

Z. Toth

Keyword(s):

Time Series ◽

Numerical Weather Prediction ◽

Sampling Error ◽

Weather Prediction ◽

Training Data ◽

Statistical Characteristics ◽

Forecast Errors ◽

Data Sets ◽

Data Set ◽

Numerical Weather

Abstract. Various statistical methods are used to process operational Numerical Weather Prediction (NWP) products with the aim of reducing forecast errors and they often require sufficiently large training data sets. Generating such a hindcast data set for this purpose can be costly and a well designed algorithm should be able to reduce the required size of these data sets. This issue is investigated with the relatively simple case of bias correction, by comparing a Bayesian algorithm of bias estimation with the conventionally used empirical method. As available forecast data sets are not large enough for a comprehensive test, synthetically generated time series representing the analysis (truth) and forecast are used to increase the sample size. Since these synthetic time series retained the statistical characteristics of the observations and operational NWP model output, the results of this study can be extended to real observation and forecasts and this is confirmed by a preliminary test with real data. By using the climatological mean and standard deviation of the meteorological variable in consideration and the statistical relationship between the forecast and the analysis, the Bayesian bias estimator outperforms the empirical approach in terms of the accuracy of the estimated bias, and it can reduce the required size of the training sample by a factor of 3. This advantage of the Bayesian approach is due to the fact that it is less liable to the sampling error in consecutive sampling. These results suggest that a carefully designed statistical procedure may reduce the need for the costly generation of large hindcast datasets.
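
To make the contrast between the two approaches concrete, here is a minimal sketch. It is not the estimator of the paper: it uses the textbook normal-normal conjugate update as a stand-in, and the prior mean, prior variance and error variance are assumed to be known (in practice they might come from climatological statistics). All function and parameter names are hypothetical.

```python
import numpy as np

def empirical_bias(forecast, analysis):
    """Conventional estimate: sample mean of forecast minus analysis."""
    errors = np.asarray(forecast, dtype=float) - np.asarray(analysis, dtype=float)
    return float(errors.mean())

def bayesian_bias(forecast, analysis, prior_mean, prior_var, error_var):
    """Posterior mean and variance of the bias under a normal prior
    and normally distributed forecast errors with known variance."""
    errors = np.asarray(forecast, dtype=float) - np.asarray(analysis, dtype=float)
    n = errors.size
    precision = 1.0 / prior_var + n / error_var
    post_mean = (prior_mean / prior_var + errors.sum() / error_var) / precision
    return float(post_mean), 1.0 / precision

# Small synthetic training sample (assumed true bias of 0.5, error std of 2.0).
rng = np.random.default_rng(1)
analysis = rng.normal(loc=10.0, scale=3.0, size=10)
forecast = analysis + 0.5 + rng.normal(scale=2.0, size=10)

emp = empirical_bias(forecast, analysis)
post_mean, post_var = bayesian_bias(forecast, analysis,
                                    prior_mean=0.0, prior_var=1.0, error_var=4.0)
```

With a small sample the posterior mean is pulled toward the prior, which is one way a Bayesian estimate can be less sensitive to sampling error than the plain sample mean; as the sample grows, the two estimates converge.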

Two decades of satellite observations of AOD over mainland China using ATSR-2, AATSR and MODIS/Terra: data set evaluation and large-scale patterns

Atmospheric Chemistry and Physics ◽

10.5194/acp-18-1573-2018 ◽

2018 ◽

Vol 18 (3) ◽

pp. 1573-1592 ◽

Cited By ~ 48

Author(s):

Gerrit de Leeuw ◽

Larisa Sogacheva ◽

Edith Rodriguez ◽

Konstantinos Kourtidis ◽

Aristeidis K. Georgoulias ◽

...

Keyword(s):

Time Series ◽

European Space Agency ◽

Mainland China ◽

Satellite Observations ◽

Data Sets ◽

Data Set ◽

Modis Aod ◽

The Difference ◽

Along Track ◽

Aerosol Properties

Abstract. The retrieval of aerosol properties from satellite observations provides their spatial distribution over a wide area in cloud-free conditions. As such, they complement ground-based measurements by providing information over sparsely instrumented areas, albeit that significant differences may exist in both the type of information obtained and the temporal information from satellite and ground-based observations. In this paper, information from different types of satellite-based instruments is used to provide a 3-D climatology of aerosol properties over mainland China, i.e., vertical profiles of extinction coefficients from the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP), a lidar flying aboard the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) satellite and the column-integrated extinction (aerosol optical depth – AOD) available from three radiometers: the European Space Agency (ESA)'s Along-Track Scanning Radiometer version 2 (ATSR-2), Advanced Along-Track Scanning Radiometer (AATSR) (together referred to as ATSR) and NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) aboard the Terra satellite, together spanning the period 1995–2015. AOD data are retrieved from ATSR using the ATSR dual view (ADV) v2.31 algorithm, while for MODIS Collection 6 (C6) the AOD data set used is the one obtained by merging the AODs from the dark target (DT) and deep blue (DB) algorithms, further referred to as the DTDB merged AOD product. These data sets are validated and differences are compared using Aerosol Robotic Network (AERONET) version 2 L2.0 AOD data as reference. The results show that, over China, ATSR slightly underestimates the AOD and MODIS slightly overestimates the AOD. Consequently, ATSR AOD is overall lower than that from MODIS, and the difference increases with increasing AOD. The comparison also shows that neither of the ATSR and MODIS AOD data sets is better than the other one everywhere. However, ATSR ADV has limitations over bright surfaces which the MODIS DB was designed for. To allow for comparison of MODIS C6 results with previous analyses where MODIS Collection 5.1 (C5.1) data were used, the difference between the C6 and C5.1 merged DTDB data sets from MODIS/Terra over China is also briefly discussed. The AOD data sets show strong seasonal differences and the seasonal features vary with latitude and longitude across China. Two-decadal AOD time series, averaged over all of mainland China, are presented and briefly discussed. Using the 17 years of ATSR data as the basis, and MODIS/Terra to follow the temporal evolution in recent years when the environmental satellite Envisat was lost, requires a comparison of the data sets for the overlapping period to show their complementarity. ATSR precedes the MODIS time series between 1995 and 2000 and shows a distinct increase in the AOD over this period. The two data series show similar variations during the overlapping period between 2000 and 2011, with minima and maxima in the same years. MODIS extends this time series beyond the end of the Envisat period in 2012, showing decreasing AOD.

Uncertainty of Climatol adjustment algorithm for daily time series of additive climate variables

10.5194/egusphere-egu2020-5365 ◽

2020 ◽

Author(s):

Oleg Skrynyk ◽

Enric Aguilar ◽

José A. Guijarro ◽

Sergiy Bubin

Keyword(s):

Time Series ◽

Climate Model ◽

Data Sets ◽

Climate Variables ◽

The European Union ◽

Data Set ◽

Raw Data ◽

Climate Signal ◽

Daily Time Series ◽

Daily Time

Before using climatological time series in research studies, it is necessary to quality-control and homogenize them in order to remove possible artefacts (inhomogeneities) usually present in the raw data sets. In the vast majority of cases, the homogenization procedure improves the consistency of the data, which can then be verified by statistical comparison of the raw and homogenized time series. However, a new question then arises: how far are the homogenized data from the true climate signal or, in other words, what errors could still be present in homogenized data?

The main objective of our work is to estimate the uncertainty produced by the adjustment algorithm of the widely used Climatol homogenization software when homogenizing daily time series of the additive climate variables. We focused our efforts on the minimum and maximum air temperature. In order to achieve our goal we used a benchmark data set created by the INDECIS* project. The benchmark contains clean data, extracted from an output of the Royal Netherlands Meteorological Institute Regional Atmospheric Climate Model (version 2) driven by Hadley Global Environment Model 2 - Earth System, and inhomogeneous data, created by introducing realistic breaks and errors.

The statistical evaluation of discrepancies between the homogenized (by means of Climatol with predefined break points) and clean data sets was performed using both a set of standard parameters and a metric introduced in our work. All the metrics used clearly identify the main features of errors (systematic and random) present in the homogenized time series. We calculated the metrics for every time series (only over adjusted segments) as well as their averaged values as measures of uncertainties in the whole data set.

In order to determine how the two key parameters of the raw data collection, namely the length of time series and station density, influence the calculated measures of the adjustment error, we gradually decreased the length of the period and the number of stations in the area under study. The total number of cases considered was 56, including 7 time periods (1950-2005, 1954-2005, …, 1974-2005) and 8 different quantities of stations (100, 90, …, 30). Additionally, in order to find out how stable the calculated metrics are for each of the 56 cases and to determine their confidence intervals, we performed 100 random permutations in the introduced inhomogeneity time series and repeated our calculations. The total number of homogenization exercises performed was thus 5600 for each of the two climate variables.

Lastly, the calculated metrics were compared with the corresponding values obtained for raw time series. The comparison showed some substantial improvement of the metric values after homogenization in each of the 56 cases considered (for both variables).

*INDECIS is a part of ERA4CS, an ERA-NET initiated by JPI Climate, and funded by FORMAS (SE), DLR (DE), BMWFW (AT), IFD (DK), MINECO (ES), ANR (FR) with co-funding by the European Union (Grant 690462). The work has been partially supported by the Ministry of Education and Science of Kazakhstan (Grant BR05236454) and Nazarbayev University (Grant 090118FD5345).

3D ray+Born migration/inversion—Part 2: Application to the SEG/EAGE overthrust experiment

Geophysics ◽

10.1190/1.1598129 ◽

2003 ◽

Vol 68 (4) ◽

pp. 1357-1370 ◽

Cited By ~ 20

Author(s):

Stéphane Operto ◽

Gilles Lambaré ◽

Pascal Podvin ◽

Philippe Thierry

Keyword(s):

Spatial Resolution ◽

Seismic Data ◽

Velocity Model ◽

Nonuniform Distribution ◽

Data Sets ◽

Data Set ◽

Quantitative Estimates ◽

3D Processing ◽

Type Data ◽

New Formula

The SEG/EAGE overthrust model is a synthetic onshore velocity model that was used to generate several large synthetic seismic data sets using acoustic finite‐difference modeling. From this database, several realistic subdata sets were extracted and made available for testing 3D processing methods. For example, classic onshore‐type data‐acquisition geometries are available such as a swath acquisition, which is characterized by a nonuniform distribution of long offsets with azimuth and midpoints. In this paper, we present an application of 2.5D and 3D ray‐Born migration/inversion to several classical data sets from the SEG/EAGE overthrust experiment. The method is formulated as a linearized inversion of the scattered wavefield. The method allows quantitative estimates of short wavelength components of the velocity model. First, we apply a 3D migration/inversion formula formerly developed for marine acquisitions to the swath data set. The migrated sections exhibit significant amplitude artifacts and acquisition footprints, also revealed by the shape of the local spatial resolution filters. From the analysis of these spatial resolution filters, we propose a new formula significantly improving the migrated dip section. We also present 3D migrated results for the strike section and a small 3D target containing a channel. Finally, the applications demonstrate that the ray+Born migration formula must be adapted to the acquisition geometry to obtain reliable estimates of the true amplitude of the model perturbations. This adaptation is relatively straightforward in the frame of the ray+Born formalism and can be guided by the analysis of the resolution operator.

Streamflow indices to identify catchment drivers of hydrograph

10.5194/hess-2021-307 ◽

2021 ◽

Author(s):

Jeenu Mathai ◽

Pradeep Mujumdar

Keyword(s):

Large Scale ◽

Catchment Scale ◽

Regional Variability ◽

Data Set ◽

Temporal Asymmetry ◽

Hydrologic Processes ◽

Statistical Assumptions ◽

Comparative Hydrology ◽

Time Irreversibility ◽

Characteristic Features

Abstract. Streamflow indices are flow descriptors that quantify the streamflow dynamics, which are usually determined for a specific basin and are distinct from other basin features. The flow descriptors are appropriate for large-scale and comparative hydrology studies, independent of statistical assumptions and can distinguish signals that indicate basin behavior over time. In this paper, the characteristic features of the hydrograph's temporal asymmetry due to its different underlying hydrologic processes are primarily highlighted. Streamflow indices linked to each limb of the hydrograph within the time-irreversibility paradigm are distinguished with respect to the processes driving the rising and falling limbs. Various streamflow indices relating to the rising and falling limbs are then correlated with catchment attributes such as climate, topography, vegetation, geology and soil. Finally, the key attributes governing rising and falling limbs are identified. The novelty of the work lies in differentiating hydrographs by their time irreversibility property and offering an alternative way to recognize primary drivers of streamflow hydrographs. A set of streamflow indices at the catchment scale for 671 basins in the Contiguous United States (CONUS) is presented here. These streamflow indices complement the catchment attributes provided earlier (Addor et al., 2017) for the CAMELS data set. A series of spatial maps describing the streamflow indices and their regional variability over the CONUS is presented in this study.

An end-to-end Novel Forecasting Model for Crime Prediction based on Big Data

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.f9153.038620 ◽

2020 ◽

Vol 8 (6) ◽

pp. 3704-3708

Keyword(s):

Time Series ◽

Big Data ◽

Data Analytics ◽

Linear Time ◽

Big Data Analytics ◽

Series Data ◽

Data Sets ◽

Data Set ◽

Main Category ◽

Crime Prediction

Big data analytics is the field concerned with analysing and processing information from data sets that are too large or complex to be handled by conventional data-processing methods. It is used to analyse data and helps predict the most likely outcome from the data sets. Big data analytics can be very useful in predicting crime and in suggesting the best possible response to it. In this system we use a past crime data set to identify patterns, and from those patterns we predict the range of an incident. The range of the incident is determined by the decision model, and the prediction is made according to that range. The data sets are nonlinear and take the form of time series, so the system uses the Prophet model algorithm, which is used to analyse non-linear time series data. The Prophet model has three main components: trend, seasonality, and holidays. This system will help the crime cell predict possible incidents according to the pattern found by the algorithm, and it also helps deploy the right number of resources to highly marked areas where incidents are most likely to occur. The system will enhance crime prediction and help the crime department use its resources more efficiently.

The SPARC water vapour assessment II: Comparison of stratospheric and lower mesospheric water vapour time series observed from satellites

10.5194/amt-2018-33 ◽

2018 ◽

Author(s):

Farahnaz Khosrawi ◽

Stefan Lossow ◽

Gabriele P. Stiller ◽

Karen H. Rosenlof ◽

Joachim Urban ◽

...

Keyword(s):

Time Series ◽

Water Vapour ◽

Data Sets ◽

Data Set ◽

The Future ◽

Modelling Studies ◽

The Difference ◽

The Tropics ◽

The Antarctic ◽

Satellite Instruments

Abstract. Time series of stratospheric and lower mesospheric water vapour using 33 data sets from 15 different satellite instruments were compared in the framework of the second SPARC (Stratosphere-troposphere Processes And their Role in Climate) water vapour assessment (WAVAS-II). This comparison aimed to provide a comprehensive overview of the typical uncertainties in the observational database that can be considered in the future in observational and modelling studies addressing e.g. stratospheric water vapour trends. The time series comparisons are presented for three latitude bands, the Antarctic (80°–70° S), the tropics (15° S–15° N) and the northern hemisphere mid-latitudes (50° N–60° N), at four different altitudes (0.1, 3, 10 and 80 hPa) covering the stratosphere and lower mesosphere. The combined temporal coverage of observations from the 15 satellite instruments allowed the time period 1986–2014 to be considered. In addition to the qualitative comparison of the time series, the agreement of the data sets is assessed quantitatively in the form of the spread (i.e. the difference between the maximum and minimum volume mixing ratio among the data sets), the (Pearson) correlation coefficient and the drift (i.e. linear changes of the difference between time series over time). Generally, good agreement between the time series was found in the middle stratosphere while larger differences were found in the lower mesosphere and near the tropopause. Concerning the latitude bands, the largest differences were found in the Antarctic while the best agreement was found for the tropics. From our assessment we find that all data sets can be considered in the future in observational and modelling studies addressing e.g. stratospheric and lower mesospheric water vapour variability and trends when data set specific characteristics (e.g. a drift) and restrictions (e.g. temporal and spatial coverage) are taken into account.
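
The three agreement measures named in the abstract can be sketched in a few lines. This is only an illustration of the definitions as stated there, assuming complete, co-located time series on a common time axis; the function names and the synthetic values are hypothetical, and the handling of gaps or differing sampling used in the assessment itself is not represented.

```python
import numpy as np

def spread(series):
    """Max minus min across data sets at each time step.
    `series` has shape (n_datasets, n_times)."""
    series = np.asarray(series, dtype=float)
    return series.max(axis=0) - series.min(axis=0)

def pearson(a, b):
    """Pearson correlation coefficient between two time series."""
    return float(np.corrcoef(a, b)[0, 1])

def drift(time, a, b):
    """Slope of a least-squares line fitted to the difference a - b,
    i.e. the linear change of the difference per unit of `time`."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    slope, _intercept = np.polyfit(time, diff, 1)
    return float(slope)

# Two synthetic series (arbitrary mixing-ratio units) on a yearly time axis.
t = np.arange(10, dtype=float)
a = 4.0 + 0.10 * t
b = 4.0 + 0.05 * t

s = spread([a, b])   # grows with time as the series diverge
r = pearson(a, b)    # both series are linear in t, so they correlate perfectly
d = drift(t, a, b)   # slope of the difference a - b
```

Note that two series can be perfectly correlated and still drift apart, which is why the correlation and the drift are reported as separate measures.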
