Measurement of high-quality diffraction data with a Nonius KappaCCD diffractometer: finding the optimal experimental parameters

2003 ◽  
Vol 36 (3) ◽  
pp. 931-939 ◽  
Author(s):  
Henning Osholm Sørensen ◽  
Sine Larsen

The influence of the different experimental parameters on the quality of the diffraction data collected on tetrafluoroterephthalonitrile (TFT) with a Nonius KappaCCD instrument has been examined. Data sets measured with different scan widths (0.25°, 0.50°, 1.0°) and scan times (70 s/° and 140 s/°) were compared with a highly redundant data set collected with an Enraf–Nonius CAD4 point-detector diffractometer. As part of this analysis it was investigated how the parameters employed during the data reduction performed with the EvalCCD and SORTAV programs affect the quality of the data. The KappaCCD data sets did not show any significant contamination from λ/2 radiation and possess good internal consistency with low Rint values. Decreasing the scan width seems to increase the standard uncertainties, which conversely are improved by an increase in the scan time. The suitability of the KappaCCD instrument to measure data to be used in charge density studies was also examined by performing a charge density data collection with the KappaCCD instrument. The same multipole model was used in the refinement of these data and of the CAD4 data. The two refinements gave almost identical parameters and residual electron densities. The topological analysis of the resulting static electron densities shows that the bond critical points have the same characteristics.
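Although the article centres on measurement strategy, the internal-consistency index it leans on is easy to state: Rint sums the absolute deviations of symmetry-equivalent intensities from their group mean, normalized by the total measured intensity. A minimal Python sketch of that bookkeeping follows; grouping reflections by their raw (h, k, l) indices is a simplification that ignores the symmetry operations a real data-reduction program would apply.

```python
import numpy as np
from collections import defaultdict

def r_int(hkl, intensities):
    """R_int = sum |I_i - <I>| / sum I_i over reflections measured
    more than once; low values indicate good internal consistency."""
    groups = defaultdict(list)
    for key, i_obs in zip(hkl, intensities):
        groups[tuple(key)].append(i_obs)
    num = den = 0.0
    for obs in groups.values():
        if len(obs) < 2:
            continue  # unique reflections do not contribute
        obs = np.asarray(obs)
        num += np.sum(np.abs(obs - obs.mean()))
        den += obs.sum()
    return num / den if den else float("nan")

# toy example: two equivalents of one reflection plus a unique one
print(r_int([(1, 0, 0), (1, 0, 0), (0, 2, 0)], [100.0, 104.0, 50.0]))
```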

2009 ◽  
Vol 42 (6) ◽  
pp. 1110-1121 ◽  
Author(s):  
B. Dittrich ◽  
C. B. Hübschle ◽  
J. J. Holstein ◽  
F. P. A. Fabbiani

The limiting factor for charge-density studies is crystal quality. Although area detection and low temperatures enable redundant data collection, only compounds that form well diffracting single crystals without disorder are amenable to these studies. If thermal motion and electron density ρ(r) were de-convoluted, multipole parameters could also be refined with lower-resolution data, such as those commonly collected for macromolecules. Using the invariom database for first refining conventional parameters (x, y, z and atomic displacement parameters), de-convolution can be achieved. In a subsequent least-squares refinement of multipole parameters only, information on the charge density becomes accessible also for data not fulfilling charge-density requirements. A critical aspect of this procedure is the missing information on the correlation between refined and non-refined parameters. This correlation is investigated in detail by comparing a full multipole refinement on high-resolution and a blocked refinement on `normal-resolution' data sets of ciprofloxacin hexahydrate. Topological properties and dipole moments are shown to be in excellent agreement for the two refinements. A `normal-resolution' data set of ciprofloxacin hydrochloride 1.4-hydrate is also evaluated in this manner.
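The critical caveat raised here, that a blocked refinement loses the correlation between the refined and the frozen parameter blocks, can be illustrated on a toy problem. The sketch below is not the invariom procedure itself: it fits a two-parameter exponential with scipy either jointly or in two blocks, showing how freezing one block while refining the other ignores inter-parameter correlation.

```python
import numpy as np
from scipy.optimize import least_squares

# toy data from a model whose two parameters are strongly correlated
rng = np.random.default_rng(0)
x = np.linspace(0.1, 2.0, 50)
y = 3.0 * np.exp(-1.5 * x) + rng.normal(0.0, 0.02, x.size)
model = lambda a, b: a * np.exp(-b * x)

# full refinement: both parameters vary, so correlations are handled
full = least_squares(lambda p: y - model(p[0], p[1]), x0=[1.0, 1.0])

# blocked refinement: refine a with b frozen, then b with a frozen
step1 = least_squares(lambda p: y - model(p[0], 1.0), x0=[1.0])
step2 = least_squares(lambda p: y - model(step1.x[0], p[0]), x0=[1.0])

print("full   :", full.x)                   # joint estimate
print("blocked:", step1.x[0], step2.x[0])   # ignores inter-block correlation
```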


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Xiaolan Chen ◽  
Hui Yang ◽  
Guifen Liu ◽  
Yong Zhang

Abstract. Background: Nucleosome organization is involved in many regulatory activities in various organisms. However, studies integrating nucleosome organization in mammalian genomes are very limited, mainly owing to the lack of comprehensive data quality control (QC) assessment and the uneven quality of public data sets. Results: The NUCOME is a database focused on filtering qualified nucleosome organization reference landscapes covering various cell types in human and mouse based on QC metrics. The filtering strategy guarantees the quality of the nucleosome organization reference landscapes and exempts users from redundant data set selection and processing. The NUCOME database provides a standardized, qualified data source and informative nucleosome organization features at a whole-genome scale and on the level of individual loci. Conclusions: The NUCOME provides valuable data resources for integrative analyses focused on nucleosome organization. The NUCOME is freely available at http://compbio-zhanglab.org/NUCOME.


2006 ◽  
Vol 39 (2) ◽  
pp. 262-266 ◽  
Author(s):  
R. J. Davies

Synchrotron sources offer high-brilliance X-ray beams which are ideal for spatially and time-resolved studies. Large amounts of wide- and small-angle X-ray scattering data can now be generated rapidly, for example, during routine scanning experiments. Consequently, the analysis of the large data sets produced has become a complex and pressing issue. Even relatively simple analyses become difficult when a single data set can contain many thousands of individual diffraction patterns. This article reports on a new software application for the automated analysis of scattering intensity profiles. It is capable of batch-processing thousands of individual data files without user intervention. Diffraction data can be fitted using a combination of background functions and non-linear peak functions. To complement the batch-wise operation mode, the software includes several specialist algorithms to ensure that the results obtained are reliable. These include peak tracking, artefact removal, function elimination and spread-estimate fitting. Furthermore, as well as non-linear fitting, the software can calculate integrated intensities and selected orientation parameters.
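The article describes a dedicated application rather than published code, but the core step it automates, fitting a non-linear peak function on top of a background function for each file and seeding the next fit with the previous result (peak tracking), can be sketched in a few lines. The file pattern, Gaussian-plus-linear profile model and starting values below are illustrative assumptions.

```python
import glob
import numpy as np
from scipy.optimize import curve_fit

def peak_plus_background(q, amp, q0, sigma, b0, b1):
    """Gaussian peak on a linear background, a common profile model."""
    return amp * np.exp(-0.5 * ((q - q0) / sigma) ** 2) + b0 + b1 * q

def fit_profile(q, intensity, p0):
    popt, pcov = curve_fit(peak_plus_background, q, intensity, p0=p0)
    return popt, np.sqrt(np.diag(pcov))  # parameters and spread estimates

# batch mode: each result seeds the next file's fit ("peak tracking")
p0 = [1000.0, 1.5, 0.05, 10.0, 0.0]
for path in sorted(glob.glob("scan_*.dat")):   # hypothetical file pattern
    q, intensity = np.loadtxt(path, unpack=True)
    p0, perr = fit_profile(q, intensity, p0)
    print(path, p0, perr)
```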


2011 ◽  
pp. 24-32 ◽  
Author(s):  
Nicoleta Rogovschi ◽  
Mustapha Lebbah ◽  
Younès Bennani

Most traditional clustering algorithms are limited to handling data sets that contain either continuous or categorical variables, yet data sets with mixed types of variables are common in the data mining field. In this paper we introduce a weighted self-organizing map for the clustering, analysis and visualization of mixed (continuous/binary) data. The weights and prototypes are learned simultaneously, ensuring an optimized clustering of the data: the higher a variable's weight, the more the clustering algorithm takes into account the information carried by that variable. The learning of these topological maps is combined with a weighting process over the different variables, computing weights that influence the quality of the clustering. We illustrate the power of this method with data sets taken from a public data set repository: a handwritten digit data set, the Zoo data set and three other mixed data sets. The results show a good quality of the topological ordering and homogeneous clustering.
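A minimal sketch of the central ingredient, a per-variable weighted distance between a mixed observation and a map prototype, is given below. The 0/1 mismatch for binary variables and squared error for continuous ones are plausible choices for a continuous/binary map, not necessarily the exact cost used by the authors.

```python
import numpy as np

def weighted_mixed_distance(x, w, alpha, is_binary):
    """Distance between a data point x and a prototype w where each
    variable j carries a learned weight alpha[j]; binary variables use
    a 0/1 mismatch, continuous ones a squared difference."""
    d_bin = (x != w).astype(float)   # mismatch term for binary variables
    d_num = (x - w) ** 2             # squared error for continuous ones
    per_var = np.where(is_binary, d_bin, d_num)
    return float(np.sum(alpha * per_var))

x = np.array([0.2, 1.0, 0.0])             # mixed observation
w = np.array([0.25, 1.0, 1.0])            # map-cell prototype
alpha = np.array([0.5, 2.0, 1.0])         # learned variable weights
is_binary = np.array([False, True, True])
print(weighted_mixed_distance(x, w, alpha, is_binary))
```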


2017 ◽  
Vol 6 (3) ◽  
pp. 71 ◽  
Author(s):  
Claudio Parente ◽  
Massimiliano Pepe

The purpose of this paper is to investigate the impact of weights in pan-sharpening methods applied to satellite images. Different sets of weights have been considered and compared in the IHS and Brovey methods. The first data set assigns the same weight to each band, while the second uses weights obtained from the spectral radiance response; these two data sets are the most common in pan-sharpening applications. The third data set results from a new method, which computes the first-order moment of inertia of each band, taking into account the spectral response. To test the impact of the weights in the different data sets, WorldView-3 satellite images have been considered. In particular, two different scenes (the first of an urban landscape, the second of a rural landscape) have been investigated. The quality of the pan-sharpened images has been analysed with three different quality indexes: root mean square error (RMSE), relative average spectral error (RASE) and Erreur Relative Globale Adimensionnelle de Synthèse (ERGAS).
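As an illustration of the weighted Brovey variant discussed here, the sketch below scales each multispectral band by the ratio of the panchromatic band to a weighted band sum, and includes the RMSE index used in the evaluation. It assumes the multispectral stack is already co-registered and upsampled to the panchromatic grid; the second set of weights is an arbitrary stand-in for spectral-response weights, and RASE and ERGAS are omitted for brevity.

```python
import numpy as np

def brovey(ms, pan, weights):
    """Weighted Brovey pan-sharpening: each multispectral band is scaled
    by the ratio of the panchromatic band to the weighted band sum.
    ms is a (bands, H, W) stack already resampled to the pan grid."""
    w = np.asarray(weights, dtype=float)[:, None, None]
    synthetic = np.sum(w * ms, axis=0)            # weighted intensity image
    ratio = pan / np.maximum(synthetic, 1e-6)     # guard against zeros
    return ms * ratio[None, :, :]

def rmse(reference, fused):
    return float(np.sqrt(np.mean((reference - fused) ** 2)))

# toy comparison of two weightings on random data
ms, pan = np.random.rand(4, 64, 64), np.random.rand(64, 64)
equal = brovey(ms, pan, [0.25, 0.25, 0.25, 0.25])     # equal weights
spectral = brovey(ms, pan, [0.30, 0.30, 0.25, 0.15])  # assumed weights
print(rmse(equal, spectral))  # difference induced by the weighting alone
```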


2005 ◽  
Vol 5 (7) ◽  
pp. 1835-1841 ◽  
Author(s):  
S. Noël ◽  
M. Buchwitz ◽  
H. Bovensmann ◽  
J. P. Burrows

Abstract. A first validation of water vapour total column amounts derived from measurements of the SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY) in the visible spectral region has been performed. For this purpose, SCIAMACHY water vapour data have been determined for the year 2003 using an extended version of the Differential Optical Absorption Spectroscopy (DOAS) method, called Air Mass Corrected DOAS (AMC-DOAS). The SCIAMACHY results are compared with corresponding water vapour measurements by the Special Sensor Microwave Imager (SSM/I) and with model data from the European Centre for Medium-Range Weather Forecasts (ECMWF). Confirming previous results, SCIAMACHY-derived water vapour columns are typically slightly lower than both SSM/I and ECMWF data, especially over ocean areas. However, these deviations are much smaller than the observed scatter of the data, which is caused by the differing temporal and spatial sampling and resolution of the data sets. For example, the overall difference with ECMWF data is only −0.05 g/cm², whereas the typical scatter is of the order of 0.5 g/cm². Both values show almost no variation over the year. In addition, first monthly means of SCIAMACHY water vapour data have been computed. The quality of these monthly means is currently limited by the availability of calibrated SCIAMACHY spectra. Nevertheless, first comparisons with ECMWF data show that SCIAMACHY (and similar instruments) can provide a new independent global water vapour data set.
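The distinction drawn here between a small systematic bias (about −0.05 g/cm²) and a much larger scatter (about 0.5 g/cm²) is simply the mean and standard deviation of the collocated differences. The toy sketch below reproduces that bookkeeping on synthetic numbers; it is not the AMC-DOAS retrieval.

```python
import numpy as np

# collocated total-column water vapour (g/cm^2) from two sources,
# standing in for SCIAMACHY retrievals and ECMWF fields (toy arrays)
rng = np.random.default_rng(1)
wv_a = rng.normal(2.0, 0.8, 10000)
wv_b = wv_a + rng.normal(-0.05, 0.5, wv_a.size)

diff = wv_b - wv_a
print("bias   :", diff.mean())  # small systematic offset (~ -0.05)
print("scatter:", diff.std())   # dominated by sampling/resolution (~ 0.5)
```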


2007 ◽  
Vol 62 (5) ◽  
pp. 696-704 ◽  
Author(s):  
Diana Förster ◽  
Armin Wagner ◽  
Christian B. Hübschle ◽  
Carsten Paulmann ◽  
Peter Luger

Abstract. The charge density of the tripeptide L-alanyl-glycyl-L-alanine was determined from three X-ray data sets measured with different experimental setups and under different conditions. Two of the data sets were measured with synchrotron radiation (beamline F1 of Hasylab/DESY, Germany, and beamline X10SA of the SLS, Paul Scherrer Institute, Switzerland) at temperatures around 100 K, while a third data set was measured under home laboratory conditions (Mo Kα radiation) at a low temperature of 20 K. The multipole refinement strategy used to derive the experimental charge density was the same in all cases, so that the obtained charge density properties could be compared directly. While the general analysis of the three data sets suggested a small preference for one of the synchrotron data sets (Hasylab F1), a comparison of topological and atomic properties gave no indication of a preference for any of the three data sets. It follows that even the 4 h data set measured at the SLS performed as well as the data sets with substantially longer exposure times.


Author(s):  
MUSTAPHA LEBBAH ◽  
YOUNÈS BENNANI ◽  
NICOLETA ROGOVSCHI

This paper introduces a probabilistic self-organizing map for topographic clustering, analysis and visualization of multivariate binary data, or of categorical data using binary coding. We propose a probabilistic formalism dedicated to binary data in which cells are represented by a Bernoulli distribution. Each cell is characterized by a prototype with the same binary coding as used in the data space and by the probability of being different from this prototype. The proposed learning algorithm, a Bernoulli self-organizing map, is an application of the standard EM algorithm. We illustrate the power of this method with six data sets taken from a public data set repository. The results show a good quality of the topological ordering and homogeneous clustering.
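The cell model described here, a Bernoulli distribution per cell fitted by EM, can be sketched as a plain Bernoulli mixture. The topographic neighborhood smoothing that makes it a self-organizing map is omitted, so this is a simplified analogue of the authors' algorithm, not the algorithm itself.

```python
import numpy as np

def bernoulli_em(X, k, n_iter=50, seed=0):
    """EM for a Bernoulli mixture over binary data X of shape (n, d)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = rng.uniform(0.25, 0.75, size=(k, d))  # Bernoulli parameters
    pi = np.full(k, 1.0 / k)                      # mixing proportions
    for _ in range(n_iter):
        # E-step: responsibilities from the Bernoulli log-likelihoods
        log_p = (X[:, None, :] * np.log(theta) +
                 (1 - X[:, None, :]) * np.log(1 - theta)).sum(axis=2)
        log_r = np.log(pi) + log_p
        log_r -= log_r.max(axis=1, keepdims=True)   # numerical stability
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update parameters from the weighted counts
        nk = r.sum(axis=0)
        theta = np.clip((r.T @ X) / nk[:, None], 1e-4, 1 - 1e-4)
        pi = nk / n
    return theta, pi

X = (np.random.rand(200, 10) > 0.5).astype(float)
theta, pi = bernoulli_em(X, k=4)
```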


2016 ◽  
Vol 25 (3) ◽  
pp. 431-440 ◽  
Author(s):  
Archana Purwar ◽  
Sandeep Kumar Singh

Abstract. Ensuring the quality of data is an important task in data mining, and the validity of mining algorithms is reduced if the data are not of good quality. The quality of data can be assessed in terms of missing values (MV) as well as noise present in the data set. Various imputation techniques have been studied for missing values, but little attention has been paid to noise in earlier work. Moreover, to the best of our knowledge, no one has used density-based spatial clustering of applications with noise (DBSCAN) for MV imputation. This paper proposes a novel technique, density-based imputation (DBSCANI), built on density-based clustering to deal with incomplete values in the presence of noise. The density-based clustering algorithm proposed by Kriegel and co-workers groups objects according to their density in spatial databases: the high-density regions are known as clusters, and the low-density regions contain the noise objects in the data set. Extensive experiments have been performed on the Iris data set from the life-science domain and on Jain's (2D) data set from the shape data sets. The performance of the proposed method is evaluated using the root mean square error (RMSE) and compared with the existing K-means imputation (KMI). Results show that our method is more noise resistant than KMI on the data sets under study.
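A simplified reading of the DBSCANI idea can be sketched with scikit-learn's DBSCAN: cluster the complete rows, treat label −1 as noise, and impute each incomplete row from the mean of its nearest non-noise cluster. The eps/min_samples values and the nearest-cluster assignment rule below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def dbscan_impute(X, eps=0.5, min_samples=5):
    """Impute NaNs using DBSCAN clusters built on the complete rows;
    noise rows (label -1) are excluded from the cluster means."""
    complete = ~np.isnan(X).any(axis=1)
    labels = np.full(X.shape[0], -1)
    labels[complete] = DBSCAN(eps=eps,
                              min_samples=min_samples).fit_predict(X[complete])
    X_imp = X.copy()
    for i in np.where(~complete)[0]:
        obs = ~np.isnan(X[i])
        best, best_d = None, np.inf
        # assign the row to the nearest cluster using its observed dims
        for c in set(labels[labels >= 0]):
            centre = X[labels == c].mean(axis=0)
            d = np.linalg.norm(X[i, obs] - centre[obs])
            if d < best_d:
                best, best_d = c, d
        if best is not None:
            X_imp[i, ~obs] = X[labels == best][:, ~obs].mean(axis=0)
    return X_imp

def rmse(true_vals, imputed_vals):
    return float(np.sqrt(np.mean((true_vals - imputed_vals) ** 2)))

X = np.random.rand(100, 4)
X[::10, 0] = np.nan                       # knock out some values
X_imp = dbscan_impute(X, eps=0.4, min_samples=4)
```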


2016 ◽  
Author(s):  
Brecht Martens ◽  
Diego G. Miralles ◽  
Hans Lievens ◽  
Robin van der Schalie ◽  
Richard A. M. de Jeu ◽  
...  

Abstract. The Global Land Evaporation Amsterdam Model (GLEAM) is a set of algorithms dedicated to the estimation of terrestrial evaporation and root-zone soil moisture from satellite data. Ever since its development in 2011, the model has been regularly revised, aiming at the optimal incorporation of new satellite-observed geophysical variables and at improving the representation of physical processes. In this study, the next version of this model (v3) is presented. Key changes relative to the previous version include: (1) a revised formulation of the evaporative stress, (2) an optimized drainage algorithm, and (3) a new soil moisture data assimilation system. GLEAM v3 is used to produce three new data sets of terrestrial evaporation and root-zone soil moisture, including a 35-year data set spanning the period 1980–2014 (v3.0a, based on satellite-observed soil moisture, vegetation optical depth and snow water equivalent, reanalysis air temperature and radiation, and a multi-source precipitation product) and two fully satellite-based data sets. The latter two share most of their forcing, except for the vegetation optical depth and soil moisture products, which are based on observations from different passive and active C- and L-band microwave sensors (European Space Agency Climate Change Initiative data sets) for the first data set (v3.0b, spanning the period 2003–2015) and on observations from the Soil Moisture and Ocean Salinity satellite for the second data set (v3.0c, spanning the period 2011–2015). These three data sets are described in detail, compared against analogous data sets generated using the previous version of GLEAM (v2), and validated against measurements from 64 eddy-covariance towers and 2338 soil moisture sensors across a broad range of ecosystems. Results indicate that the quality of the v3 soil moisture is consistently better than that of v2: average correlations against in situ surface soil moisture measurements increase from 0.61 to 0.64 in the case of the v3.0a data set, and the representation of soil moisture in the second layer improves as well, with correlations increasing from 0.47 to 0.53. Similar improvements are observed for the two fully satellite-based data sets. Despite regional differences, the quality of the evaporation fluxes remains overall similar to that obtained using the previous version of GLEAM, with average correlations against eddy-covariance measurements between 0.78 and 0.80 for the three data sets. These global data sets of terrestrial evaporation and root-zone soil moisture are now openly available at http://GLEAM.eu and may be used for large-scale hydrological applications, climate studies and research on land-atmosphere feedbacks.
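The validation statistic quoted throughout, average correlation against in situ measurements, amounts to a per-site Pearson correlation over the jointly valid time steps, averaged across the sensor network. A minimal sketch of that calculation, not GLEAM's own validation code:

```python
import numpy as np

def site_correlation(model, in_situ):
    """Pearson correlation on the jointly valid (non-NaN) time steps
    of a modelled and an in situ time series."""
    ok = ~(np.isnan(model) | np.isnan(in_situ))
    return float(np.corrcoef(model[ok], in_situ[ok])[0, 1])

def network_mean_correlation(pairs):
    """Average the per-site correlations over a network of sites,
    given (model, in_situ) array pairs."""
    return float(np.mean([site_correlation(m, o) for m, o in pairs]))
```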

