Pixel-level Signal Modelling with Spatial Correlation for Two-Colour Microarrays

Author(s):  
Claus T Ekstrøm ◽  
Søren Bak ◽  
Mats Rudemo

Statistical models for spot shapes and signal intensities are used in image analysis of laser scans of microarrays. Most models have essentially been based on the assumption of independent pixel intensity values, but models that allow for spatial correlation among neighbouring pixels can accommodate errors in the microarray slide and should improve the model fit. Five spatial correlation structures, exponential, Gaussian, linear, rational quadratic and spherical, are compared for a dataset with 50-mer two-colour oligonucleotide microarrays and 452 probes for selected Arabidopsis genes. Substantial improvement in model fit is obtained for all five correlation structures compared to the model with independent pixel values, and the Gaussian and the spherical models seem to be slightly better than the other three models. We also conclude that for the data set analysed the correlation seems negligible for non-neighbouring pixels.
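
As a rough illustration of the five correlation structures named above, the sketch below writes each one as a function of the distance d between two pixels and a range parameter rho. These are common textbook parameterisations and are an assumption here: the abstract does not state the exact forms, or whether a nugget term was used. The function name `spatial_correlation` is hypothetical.

```python
import numpy as np

def spatial_correlation(d, rho, kind):
    """Correlation between two pixels at distance d (assumed forms, no nugget).

    rho > 0 is a range parameter; r = d / rho is the scaled distance.
    """
    r = np.asarray(d, dtype=float) / rho
    if kind == "exponential":
        return np.exp(-r)
    if kind == "gaussian":
        return np.exp(-r ** 2)
    if kind == "linear":
        # Drops linearly to zero at d = rho and stays zero beyond it.
        return np.where(r < 1.0, 1.0 - r, 0.0)
    if kind == "rational_quadratic":
        return 1.0 / (1.0 + r ** 2)
    if kind == "spherical":
        # Also exactly zero for d >= rho.
        return np.where(r < 1.0, 1.0 - 1.5 * r + 0.5 * r ** 3, 0.0)
    raise ValueError(f"unknown correlation structure: {kind}")
```

The linear and spherical structures vanish exactly beyond the range rho, which is one way a model can express the finding that correlation is negligible for non-neighbouring pixels.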

2008 ◽  
Vol 67 (1) ◽  
pp. 51-60 ◽  
Author(s):  
Stefano Passini

The relation between authoritarianism and social dominance orientation was analyzed, with authoritarianism measured using a three-dimensional scale. The implicit multidimensional structure (authoritarian submission, conventionalism, authoritarian aggression) of Altemeyer’s (1981, 1988) conceptualization of authoritarianism is inconsistent with its one-dimensional methodological operationalization. The dimensionality of authoritarianism was investigated using confirmatory factor analysis in a sample of 713 university students. As hypothesized, the three-factor model fit the data significantly better than the one-factor model. Regression analyses revealed that only authoritarian aggression was related to social dominance orientation. That is, only intolerance of deviance was related to high social dominance, whereas submissiveness was not.


Author(s):  
Parisa Torkaman

The generalized inverted exponential distribution is introduced as a lifetime model with good statistical properties. This paper considers the estimation of its probability density function and cumulative distribution function with five different estimation methods: uniformly minimum variance unbiased (UMVU), maximum likelihood (ML), least squares (LS), weighted least squares (WLS) and percentile (PC) estimators. The performance of these estimation procedures is compared by numerical simulation on the basis of the mean squared error (MSE). The simulation studies show that the UMVU estimator performs better than the others, and that when the sample size is large enough the ML and UMVU estimators are almost equivalent and more efficient than the LS, WLS and PC estimators. Finally, a real data set is analyzed.
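
To make the estimation problem concrete, here is a minimal sketch of the density, distribution function and an ML plug-in step. It assumes the usual two-parameter form F(x) = 1 - (1 - exp(-lam/x))^alpha for x > 0, and it assumes that the scale `lam` is known so that the ML estimate of `alpha` has a closed form. Neither assumption is confirmed by the abstract, and all function names are hypothetical.

```python
import numpy as np

def gied_pdf(x, alpha, lam):
    """Density of the (assumed) generalized inverted exponential form, x > 0."""
    x = np.asarray(x, dtype=float)
    e = np.exp(-lam / x)
    return alpha * lam / x ** 2 * e * (1.0 - e) ** (alpha - 1.0)

def gied_cdf(x, alpha, lam):
    """Distribution function F(x) = 1 - (1 - exp(-lam/x))**alpha."""
    x = np.asarray(x, dtype=float)
    return 1.0 - (1.0 - np.exp(-lam / x)) ** alpha

def gied_sample(n, alpha, lam, rng):
    """Draw n values by inverting the distribution function."""
    u = rng.uniform(size=n)
    return -lam / np.log(1.0 - (1.0 - u) ** (1.0 / alpha))

def ml_alpha(x, lam):
    """ML estimate of alpha when lam is known (assumption of this sketch)."""
    x = np.asarray(x, dtype=float)
    return -x.size / np.sum(np.log(1.0 - np.exp(-lam / x)))
```

Plugging `ml_alpha(x, lam)` into `gied_pdf` and `gied_cdf` gives ML estimates of the two functions; the other four estimators compared in the paper are not sketched here.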


2020 ◽  
Vol 27 (4) ◽  
pp. 329-336 ◽  
Author(s):  
Lei Xu ◽  
Guangmin Liang ◽  
Baowen Chen ◽  
Xu Tan ◽  
Huaikun Xiang ◽  
...  

Background: Cell lytic enzymes are highly evolved proteins that can destroy the cell structure and kill bacteria. Unlike antibiotics, cell lytic enzymes do not cause serious drug resistance in pathogenic bacteria, a problem that is becoming more severe with antibiotic use. Cell lytic enzymes are therefore a good choice for treating bacterial infections, and their study aims at finding an efficient treatment. Cell lytic enzymes include endolysins and autolysins, which differ in the purpose for which the cell wall is broken. Identifying the type of a cell lytic enzyme is meaningful for the study of cell wall enzymes. Objective: Our aim in this article is to predict the type of a cell lytic enzyme. Because cell lytic enzymes help kill bacteria, studying their type is meaningful; however, detecting the type by experimental methods is time consuming. We therefore propose an efficient computational method for predicting the type of a cell lytic enzyme. Method: We propose a computational method for distinguishing endolysins from autolysins. First, a data set containing 27 endolysins and 41 autolysins is built. Each protein is then represented by its tripeptide composition, and the features with the larger confidence degree are selected. Finally, a support vector machine classifier is trained on the labeled vectors, and the learned classifier is used to predict the type of cell lytic enzyme. Results: The experimental results show that the overall accuracy of the proposed method reaches 97.06% when 44 features are selected. Compared with Ding's method, our method improves the overall accuracy by nearly 4.5% in relative terms ((97.06-92.9)/92.9). The performance of the proposed method is stable when the number of selected features is between 40 and 70.
The overall accuracy of the tripeptide optimal feature set is 94.12%, and the overall accuracy of Chou's amphiphilic PseAAC method is 76.2%; the tripeptide optimal feature set thus improves the overall accuracy by nearly 18 percentage points. Conclusion: The paper proposes an efficient method for identifying endolysins and autolysins, using a support vector machine to predict the type of cell lytic enzyme. The experimental results show that the overall accuracy of the proposed method is 94.12%, which is better than some existing methods. In conclusion, the selected 44 features can improve the overall accuracy of identifying the type of cell lytic enzyme, and the support vector machine performs better than other classifiers when using the selected feature set on the benchmark data set.
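
The tripeptide representation mentioned in the Method section can be sketched as follows. This is a generic illustration, assuming the 20 standard amino acids and a simple normalised count of overlapping tripeptides, which gives a vector of 20^3 = 8000 features. The paper's feature selection and SVM training steps are not reproduced, and the names below are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (assumption)
TRIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]
INDEX = {tri: i for i, tri in enumerate(TRIPEPTIDES)}

def tripeptide_composition(sequence):
    """Return the 8000 tripeptide frequencies of a protein sequence.

    Overlapping windows of length 3 are counted; windows containing a
    non-standard residue are skipped. Frequencies sum to 1 when at least
    one valid window exists, otherwise the vector is all zeros.
    """
    counts = [0] * len(TRIPEPTIDES)
    total = 0
    for i in range(len(sequence) - 2):
        j = INDEX.get(sequence[i:i + 3])
        if j is not None:
            counts[j] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRIPEPTIDES)
    return [c / total for c in counts]
```

With only 68 proteins and 8000 candidate features, some form of feature selection such as the confidence-based step described in the abstract is needed before training a classifier.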


1995 ◽  
Vol 3 (3) ◽  
pp. 133-142 ◽  
Author(s):  
M. Hana ◽  
W.F. McClure ◽  
T.B. Whitaker ◽  
M. White ◽  
D.R. Bahler

Two artificial neural network models were used to estimate the nicotine content of tobacco: (i) a back-propagation network and (ii) a linear network. The back-propagation network consisted of an input layer, an output layer and one hidden layer. The linear network consisted of an input layer and an output layer. Both networks used the generalised delta rule for learning. The performance of both networks was compared with that of the multiple linear regression (MLR) method of calibration. The nicotine content in tobacco samples was estimated for two different data sets. Data set A contained 110 near infrared (NIR) spectra, each consisting of reflected energy at eight wavelengths. Data set B consisted of 200 NIR spectra, each with 840 spectral data points. The fast Fourier transform was applied to data set B in order to compress each spectrum into 13 Fourier coefficients. For data set A, the linear regression model gave the best results, followed by the back-propagation network and then the linear network. The true performance of the linear regression model was better than that of the back-propagation and linear networks by 14.0% and 18.1%, respectively. For data set B, the back-propagation network gave the best result, followed by MLR and the linear network, which gave almost the same results. The true performance of the back-propagation network model was better than that of the MLR and linear network models by 35.14%.
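
The compression step for data set B can be illustrated with a short sketch. It assumes that the 13 retained values are the lowest-frequency complex coefficients of a real-input FFT; the abstract does not say which coefficients were kept or how they were encoded, so this is only one plausible reading. Function names are hypothetical.

```python
import numpy as np

def compress_spectrum(spectrum, n_coeffs=13):
    """Keep the n_coeffs lowest-frequency coefficients of the real FFT."""
    return np.fft.rfft(np.asarray(spectrum, dtype=float))[:n_coeffs]

def reconstruct_spectrum(coeffs, n_points):
    """Rebuild an n_points spectrum, treating the dropped coefficients as zero."""
    full = np.zeros(n_points // 2 + 1, dtype=complex)
    full[:coeffs.size] = coeffs
    return np.fft.irfft(full, n=n_points)
```

A smooth 840-point spectrum whose energy sits in the low frequencies is reproduced closely by `reconstruct_spectrum(compress_spectrum(s), 840)`, which is why a small number of coefficients can serve as network inputs.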


e-Polymers ◽  
2007 ◽  
Vol 7 (1) ◽  
Author(s):  
Wenbo Luo ◽  
Said Jazouli ◽  
Toan Vu-Khanh

The creep behavior of a commercial grade polycarbonate was investigated in this study. Ten different constant stresses ranging from 8 MPa to 50 MPa were applied to the specimen, and the resultant creep strains were measured at room temperature. It was found that the creep could be modeled linearly below 15 MPa, and nonlinearly above 15 MPa. Different nonlinear viscoelastic models have been briefly reviewed and used to fit the test data. It is shown that the Findley model is a special case of the Schapery model, and both the Findley model and the simplified multiple integral representation are suitable for properly describing the creep behavior of the polycarbonate investigated in this paper; however, the Findley model fit the data better than the simplified multiple integral with three terms.


2010 ◽  
Vol 3 (1) ◽  
pp. 95-103 ◽  
Author(s):  
M. Rivas Casado ◽  
D. Parsons ◽  
N. Magan ◽  
R. Weightman ◽  
P. Battilani ◽  
...  

The heterogeneous three-dimensional spatial distribution of mycotoxins has proven to be one of the main limitations for the design of effective sampling protocols. Current sample collection protocols for mycotoxins have been designed to estimate the mean concentration and fail to characterise the spatial distribution of the mycotoxin concentration due to the aggregation of the incremental samples. Geostatistical techniques have been successfully applied to overcome similar problems in many research areas. However, little work has been developed on the use of geostatistics for the design of sampling protocols for mycotoxins. This paper focuses on the analysis of the two and three-dimensional spatial structure of fumonisins B1 (FB1) and B2 (FB2) in maize in a bulk store using a geostatistical approach and on how results help determine the number and location of incremental samples to be collected. The spatial correlation between FB1 and FB2, as well as between the number of kernels infected and the level of contamination was investigated. For this purpose, a bed of maize was sampled at different depths to generate a unique three-dimensional data set of FB1 and FB2. The analysis found no clear evidence of spatial structure in either the two-dimensional or three-dimensional analyses. The number of Fusarium infected kernels was not a good indicator for the prediction of fumonisin concentration and there was no spatial correlation between the concentrations of the two fumonisins.
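
A basic tool for the kind of spatial-structure analysis described above is the empirical semivariogram. The sketch below uses the classical (Matheron) estimator with distance bins, which is an assumption: the abstract does not say which estimator or binning the authors used. It works for two-dimensional or three-dimensional sample coordinates, and the function name is hypothetical.

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """Classical semivariogram: half the mean squared difference per distance bin.

    coords: (n, 2) or (n, 3) sample locations; values: (n,) concentrations.
    Returns one value per bin; bins with no sample pairs are NaN.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)  # each pair of samples once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    gamma = np.full(len(bin_edges) - 1, np.nan)
    for b in range(len(bin_edges) - 1):
        mask = (dist >= bin_edges[b]) & (dist < bin_edges[b + 1])
        if mask.any():
            gamma[b] = 0.5 * sqdiff[mask].mean()
    return gamma
```

A semivariogram that is roughly flat across distance bins is consistent with a finding of no clear spatial structure, whereas one that rises with distance suggests spatially correlated concentrations.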


F1000Research ◽  
2020 ◽  
Vol 8 ◽  
pp. 2024
Author(s):  
Joshua P. Zitovsky ◽  
Michael I. Love

Allelic imbalance occurs when the two alleles of a gene are differentially expressed within a diploid organism and can indicate important differences in cis-regulation and epigenetic state across the two chromosomes. Because of this, the ability to accurately quantify the proportion at which each allele of a gene is expressed is of great interest to researchers. This becomes challenging in the presence of small read counts and/or sample sizes, which can cause estimators for allelic expression proportions to have high variance. Investigators have traditionally dealt with this problem by filtering out genes with small counts and samples. However, this may inadvertently remove important genes that have truly large allelic imbalances. Another option is to use pseudocounts or Bayesian estimators to reduce the variance. To this end, we evaluated the accuracy of four different estimators, the latter two of which are Bayesian shrinkage estimators: maximum likelihood, adding a pseudocount to each allele, approximate posterior estimation of GLM coefficients (apeglm) and adaptive shrinkage (ash). We also wrote C++ code to quickly calculate ML and apeglm estimates and integrated it into the apeglm package. The four methods were evaluated on two simulations and one real data set. Apeglm consistently performed better than ML according to a variety of criteria, and generally outperformed use of pseudocounts as well. Ash also performed better than ML in one of the simulations, but in the other performance was more mixed. Finally, when compared to five other packages that also fit beta-binomial models, the apeglm package was substantially faster and more numerically reliable, making our package useful for quick and reliable analyses of allelic imbalance. Apeglm is available as an R/Bioconductor package at http://bioconductor.org/packages/apeglm.
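
The two simplest estimators compared above can be written in a few lines. This sketch assumes a pseudocount of 1 added to each allele, which the abstract does not specify, and it does not attempt to reproduce apeglm or ash. Function names are hypothetical.

```python
def ml_proportion(alt, ref):
    """ML estimate of the allelic proportion: alt reads over total reads."""
    total = alt + ref
    return alt / total if total > 0 else float("nan")

def pseudocount_proportion(alt, ref, pseudo=1.0):
    """Add `pseudo` to each allele count before taking the proportion."""
    return (alt + pseudo) / (alt + ref + 2.0 * pseudo)
```

With three reads that all come from one allele, the ML estimate is 1.0 while the pseudocount estimate is 0.8, which shows how the pseudocount pulls estimates from small counts toward 0.5 and so lowers their variance.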


2019 ◽  
Vol 8 (2S11) ◽  
pp. 3523-3526

This paper describes an efficient algorithm for classification of large data sets. While many classification algorithms exist, they are not suitable for larger contents and different data sets. Various ELM algorithms for working with large data sets are available in the literature. However, the existing algorithms use a fixed activation function, which may lead to deficiencies when working with large data. In this paper, we propose a novel ELM that uses a sigmoid activation function. The experimental evaluations demonstrate that our ELM-S algorithm performs better than ELM, SVM and other state-of-the-art algorithms on large data sets.
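
For readers unfamiliar with extreme learning machines (ELM), the sketch below shows the generic technique with a sigmoid hidden layer: the hidden weights are drawn at random and left fixed, and only the output weights are solved for by least squares. It is a textbook-style illustration and not the paper's ELM-S algorithm, whose details the abstract does not give. Function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_fit(X, Y, n_hidden, rng):
    """Fit a single-hidden-layer ELM.

    X: (n, d) inputs; Y: (n, k) one-hot targets.
    Hidden weights W and biases b are random and never trained;
    output weights beta are the least-squares solution via the pseudoinverse.
    """
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = sigmoid(X @ W + b)
    beta = np.linalg.pinv(H) @ Y
    return W, b, beta

def elm_predict(X, model):
    """Return class scores; take argmax over columns for predicted labels."""
    W, b, beta = model
    return sigmoid(X @ W + b) @ beta
```

Because training reduces to one pseudoinverse, the cost is dominated by the size of the hidden-layer matrix, which is the usual argument for using ELM on large data sets.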


2011 ◽  
Vol 48-49 ◽  
pp. 102-105
Author(s):  
Guo Zhen Cheng ◽  
Dong Nian Cheng ◽  
He Lei

Detecting network traffic anomalies is very important for network security. However, existing detection suffers from a high false alarm rate and a low detection rate, and cannot perform real-time detection in the backbone very well, because of the nonlinearity, nonstationarity and self-similarity of network traffic. We therefore propose a novel detection method, EMD-DS, and prove that it can efficiently reduce the mean error rate of anomaly detection after EMD. On the KDD CUP 1999 intrusion detection evaluation data set, this detector detects 85.1% of attacks at a low false alarm rate, which is better than some other systems.

