An Improved Comparison of Chemometric Analyses for the Identification of Acids and Bases With Colorimetric Sensor Arrays

2018 ◽  
Vol 10 (2) ◽  
pp. 36 ◽  
Author(s):  
Michael James Kangas ◽  
Christina L Wilson ◽  
Raychelle M Burks ◽  
Jordyn Atwater ◽  
Rachel M Lukowicz ◽  
...  

Colorimetric sensor arrays incorporating red, green, and blue (RGB) image analysis use value changes from multiple sensors for the identification and quantification of various analytes. RGB data can be easily obtained using image analysis software such as ImageJ. Subsequent chemometric analysis is becoming a key component of colorimetric array RGB data analysis, though the literature contains mainly principal component analysis (PCA) and hierarchical cluster analysis (HCA). Seeking to expand the chemometric methods toolkit for array analysis, we compared the performance of nine chemometric methods for the task of classifying 631 solutions (0.1 to 3 M) of acetic acid, malonic acid, lysine, and ammonia using an eight-sensor colorimetric array. PCA and linear discriminant analysis (LDA) were effective for visualizing the dataset. For classification, LDA, k-nearest neighbors (KNN), soft independent modelling by class analogy (SIMCA), recursive partitioning and regression trees (RPART), and hit quality index (HQI) were very effective, with each method classifying compounds with over 90% correct assignments. Support vector machines (SVM) and partial least squares – discriminant analysis (PLS-DA) struggled, with ~85% and 39% correct assignments, respectively. Additional mathematical treatments of the data set, such as incrementally increasing the exponents, did not improve the performance of LDA and KNN. Literature precedent indicates that the most common methods for analyzing colorimetric arrays are PCA, LDA, HCA, and KNN. To our knowledge, this is the first report comparing and contrasting several more diverse chemometric methods on the same colorimetric array data.
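
As an illustration of one of the simpler classifiers named in this abstract, the hit quality index can be sketched as a squared-cosine similarity between an unknown response vector and each entry in a reference library. This is only a minimal sketch, assuming the common spectral-library definition of HQI; the abstract does not say which variant the authors used, and the function names and RGB-change values below are hypothetical.

```python
from math import isclose

def hit_quality_index(unknown, reference):
    """Squared cosine similarity between two response vectors (range 0 to 1)."""
    dot = sum(u * r for u, r in zip(unknown, reference))
    norm_u = sum(u * u for u in unknown)
    norm_r = sum(r * r for r in reference)
    if norm_u == 0 or norm_r == 0:
        return 0.0  # an all-zero vector matches nothing
    return (dot * dot) / (norm_u * norm_r)

def classify_by_hqi(unknown, library):
    """Return the label of the library entry with the highest HQI."""
    return max(library, key=lambda label: hit_quality_index(unknown, library[label]))

# Hypothetical RGB-change vectors (made-up numbers, not data from the paper).
library = {
    "acid": [40.0, -10.0, 5.0],
    "base": [-30.0, 20.0, 15.0],
}
unknown = [38.0, -12.0, 4.0]
label = classify_by_hqi(unknown, library)
```

Because the dot product is squared, this form of HQI ignores the sign of the correlation, so a real application would need to check that this behaviour suits the sensor responses.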

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Heping Li ◽  
Yu Ren ◽  
Fan Yu ◽  
Dongliang Song ◽  
Lizhe Zhu ◽  
...  

To improve the reliability of Raman-based tumor detection and analytical methodologies, an ex vivo Raman spectral investigation was conducted to identify distinct compositional information of healthy (H), ductal carcinoma in situ (DCIS), and invasive ductal carcinoma (IDC) tissue. Then, principal component analysis-linear discriminant analysis (PCA-LDA) and principal component analysis-support vector machine (PCA-SVM) models were constructed for distinguishing spectral features among different tissue groups. Spectral analysis highlighted differences in levels of unsaturated and saturated lipids, carotenoids, protein, and nucleic acid between healthy and cancerous tissue and variations in the levels of nucleic acid, protein, and phenylalanine between DCIS and IDC. Both classification models were found to be extremely efficient at discriminating tissue pathological types, with 99% accuracy for PCA-LDA and 100%, 100%, and 96.7% for PCA-SVM based on linear, polynomial, and radial basis function (RBF) kernels, respectively, while the PCA-SVM algorithm greatly simplified the complexity of calculation without sacrificing performance. The present study demonstrates that Raman spectroscopy combined with multivariate analysis technology has considerable potential for improving the efficiency and performance of breast cancer diagnosis.
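
The three SVM kernels compared in this abstract can be written down in a few lines. The sketch below uses the standard textbook forms of the linear, polynomial, and RBF kernels; the hyperparameter defaults (`degree`, `gamma`, `coef0`) are assumptions chosen for illustration, since the abstract does not report the values used.

```python
from math import exp

def linear_kernel(x, y):
    """Plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """(gamma * <x, y> + coef0) ** degree; all defaults are illustrative."""
    return (gamma * linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=1.0):
    """exp(-gamma * ||x - y||^2); gamma is an illustrative default."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-gamma * sq_dist)

# Two hypothetical principal-component score vectors.
k_lin = linear_kernel([1.0, 2.0], [3.0, 4.0])
k_poly = polynomial_kernel([1.0, 2.0], [3.0, 4.0], degree=2)
k_rbf = rbf_kernel([1.0, 2.0], [1.0, 2.0])
```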


2010 ◽  
Vol 08 (06) ◽  
pp. 995-1011 ◽  
Author(s):  
HAO ZHENG ◽  
HONGWEI WU

Metagenomics is an emerging field in which the power of genomic analysis is applied to an entire microbial community, bypassing the need to isolate and culture individual microbial species. Assembly of metagenomic DNA fragments is very much like the overlap-layout-consensus procedure for assembling isolated genomes, but is augmented by an additional binning step to differentiate scaffolds, contigs and unassembled reads into various taxonomic groups. In this paper, we employed n-mer oligonucleotide frequencies as the features and developed a hierarchical classifier (PCAHIER) for binning short (≤ 1,000 bp) metagenomic fragments. Principal component analysis was used to reduce the high dimensionality of the feature space. The hierarchical classifier consists of four layers of local classifiers that are implemented based on linear discriminant analysis. These local classifiers are responsible for binning prokaryotic DNA fragments into superkingdoms, fragments of the same superkingdom into phyla, fragments of the same phylum into genera, and fragments of the same genus into species, respectively. We evaluated the performance of PCAHIER by using our own simulated data sets as well as the widely used simHC synthetic metagenome data set from the IMG/M system. The effectiveness of PCAHIER was demonstrated through comparisons against a non-hierarchical classifier and two existing binning algorithms (TETRA and PhyloPythia).
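
The n-mer frequency features mentioned in this abstract can be sketched as follows. This is a minimal illustration rather than the PCAHIER implementation: the choice of `n`, the handling of ambiguous bases, and the function name `nmer_frequencies` are all assumptions, and the abstract does not say whether reverse-complement n-mers were merged.

```python
from collections import Counter
from itertools import product

def nmer_frequencies(sequence, n=4):
    """Return relative frequencies of all 4**n DNA n-mers in a fixed order."""
    sequence = sequence.upper()
    # Count every overlapping window of length n.
    counts = Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))
    nmers = ["".join(p) for p in product("ACGT", repeat=n)]
    # Windows containing other symbols (e.g. N) are ignored in the total.
    total = sum(counts[m] for m in nmers)
    if total == 0:
        return [0.0] * len(nmers)
    return [counts[m] / total for m in nmers]

# Toy fragment; with n=2 the feature vector has 16 entries (AA, AC, ..., TT).
features = nmer_frequencies("ACGTACGTAC", n=2)
```

A vector like this grows as 4**n, which is why a dimensionality reduction step such as PCA is applied before classification.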


2014 ◽  
pp. 61-67
Author(s):  
A. Amari ◽  
N. El Bari ◽  
B. Bouchikhi

An electronic nose system employing an array of six inexpensive commercial tin dioxide gas sensors (Figaro Engineering Inc., Japan) was used to analyse the freshness states of anchovies. Fresh anchovies were stored in a refrigerator at 4 ± 1°C over a period of 15 days. Electronic nose measurements need no sample preparation, and the results indicated that the spoilage process of anchovies could be followed using this technique. Conductance responses to volatile compounds produced during storage of anchovy were monitored, and the results were analysed by multivariate analysis methods. Principal component analysis (PCA) and linear discriminant analysis (LDA) were used to investigate whether the electronic nose was able to distinguish among different freshness states (fresh, moderated and non-fresh samples). Loadings analysis was used to identify the sensors responsible for discrimination in the current pattern file. The support vector machines (SVM) method was then applied to the new subset, containing only the selected sensors, to confirm that a subset of a few sensors can be chosen to explain all the variance. The results show that the electronic nose can successfully discriminate between different freshness states using LDA. Some sensors have the highest influence in the current pattern file for the electronic nose. The SVM model applied to the new subset of sensors showed good performance.


2019 ◽  
Vol 8 (2) ◽  
pp. 6198-6203

The manufacturing industry faces many problems in predicting customer behavior and customer groups in order to match its output with profit. Organizations find it difficult to identify customer behavior for the purpose of guiding product design so as to increase profit. Predicting the customer group is a challenging task for all organizations because of the growing number of entrepreneurs. This motivates the use of machine learning algorithms to cluster customer groups and predict customer demand, which helps the decision-making process in manufacturing products. This paper attempts to predict the customer group for the wine data set extracted from the UCI Machine Learning Repository. The wine data set is subjected to dimensionality reduction with principal component analysis and linear discriminant analysis. A performance analysis is carried out with various classification algorithms, and a comparative study is made using performance metrics such as accuracy, precision, recall, and F-score. Experimental results show that, after dimensionality reduction, the kernel SVM and Random Forest classifiers applied to the two-component LDA-reduced wine data set are effective, with an accuracy of 100%, compared to the other classifiers.
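
The four performance metrics named in this abstract can be computed directly from true and predicted labels. The following is a minimal sketch using the usual one-versus-rest definitions for a chosen class; the function name and the label lists are hypothetical and are not results from the paper.

```python
def classification_metrics(y_true, y_pred, positive):
    """Accuracy plus precision, recall and F-score for one class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}

# Made-up labels for three classes; class 1 is treated as the positive class.
y_true = [1, 1, 1, 2, 2, 3]
y_pred = [1, 1, 2, 2, 2, 3]
metrics = classification_metrics(y_true, y_pred, positive=1)
```

For a multi-class problem, the per-class values would typically be averaged over all classes; the averaging scheme used in the paper is not stated in the abstract.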


2003 ◽  
Vol 17 (2) ◽  
pp. 261-269 ◽  
Author(s):  
Han Witjes ◽  
Mark Rijpkema ◽  
Marinette van der Graaf ◽  
Willem Melssen ◽  
Arend Heerschap ◽  
...  

Agriculture ◽  
2021 ◽  
Vol 11 (2) ◽  
pp. 109
Author(s):  
Diding Suhandy ◽  
Meinilwita Yulia

Postharvest processing factors, including cherry processing methods, strongly influence the final quality of coffee beverages, especially the composition of several coffee metabolites such as glucose, fructose, the amino acid glutamic acid, and chlorogenic acids (CGA), as well as trigonelline content. In this research, UV spectroscopy combined with chemometrics was used to classify ground roasted Lampung robusta specialty coffee according to differences in the cherry processing methods. A total of 360 samples of Lampung robusta specialty coffee, each weighing 1 g, were prepared from three different cherry processing methods: 100 samples of pure dry coffee (DRY), 100 samples of pure semi-dry coffee (SMD), 100 samples of pure wet coffee (WET) and 60 samples of adulterated coffee (ADT) (SMD coffee adulterated with DRY and WET coffee). All samples were extracted using a standard protocol as explained in previous works. A low-cost benchtop UV-visible spectrometer (Genesys™ 10S UV-Vis, Thermo Scientific, Waltham, MA, USA) was used to obtain UV spectral data in the interval of 190–400 nm using the fast scanning mode. Using the first three principal components (PCs), with a total of 93% of explained variance, there was a clear separation between samples. The samples were clustered into four possible groups according to differences in cherry processing methods: dry, semi-dry, wet, and adulterated. Four supervised classification methods, partial least squares–discriminant analysis (PLS-DA), principal component analysis–linear discriminant analysis (PCA-LDA), linear discriminant analysis (LDA) and support vector machine classification (SVMC), were selected to classify the Lampung robusta specialty coffee according to differences in the cherry processing methods. PCA-LDA was the best classification method, with 91.7% classification accuracy in prediction. PLS-DA, LDA and SVMC gave accuracies of 56.7%, 80.0% and 85.0%, respectively. The present research suggests that UV spectroscopy combined with chemometrics will be highly useful in Lampung robusta specialty coffee authentication.


Sensors ◽  
2022 ◽  
Vol 22 (1) ◽  
pp. 367
Author(s):  
Janez Lapajne ◽  
Matej Knapič ◽  
Uroš Žibrat

Hyperspectral imaging is a popular tool used for non-invasive plant disease detection. Data acquired with it usually consist of many correlated features; hence most of the acquired information is redundant. Dimensionality reduction methods are used to transform the data sets from high-dimensional to low-dimensional (in this study, to one or a few features). We chose six dimensionality reduction methods (partial least squares, linear discriminant analysis, principal component analysis, RandomForest, ReliefF, and Extreme gradient boosting) and tested their efficacy on a hyperspectral data set of potato tubers. The extracted or selected features were passed to a support vector machine classifier and evaluated. Tubers were divided into two groups, healthy and infested with Meloidogyne luci. The results show that all dimensionality reduction methods enabled successful identification of inoculated tubers. The best and most consistent results were obtained using linear discriminant analysis, with 100% accuracy on both inside and outside images of potato tubers. Classification success was generally higher in the outside data set than in the inside data set. Nevertheless, accuracy was above 60% in all cases.
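
Reducing a two-class data set to a single feature with linear discriminant analysis, as described in this abstract, can be illustrated with Fisher's discriminant direction w = S_w^{-1}(m_1 - m_0). The sketch below is restricted to two input features so that the 2x2 matrix inverse can be written out by hand; the toy points and all function names are hypothetical, and it shows only the reduction step, not the SVM classifier that follows it in the study.

```python
def mean_vector(rows):
    """Column-wise mean of a list of 2-element samples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def within_class_scatter(groups):
    """Sum of the 2x2 scatter matrices of each class around its own mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in groups:
        m = mean_vector(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class0, class1):
    """Fisher discriminant direction w = S_w^{-1} (m1 - m0) for two features."""
    s = within_class_scatter([class0, class1])
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    m0, m1 = mean_vector(class0), mean_vector(class1)
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

def project(rows, w):
    """Project each sample onto w, giving one feature per sample."""
    return [r[0] * w[0] + r[1] * w[1] for r in rows]

# Made-up two-band reflectance values, not data from the study.
healthy = [[1.0, 2.0], [2.0, 1.0], [2.0, 3.0], [3.0, 2.0]]
infested = [[5.0, 6.0], [6.0, 5.0], [6.0, 7.0], [7.0, 6.0]]
w = fisher_direction(healthy, infested)
z_healthy = project(healthy, w)
z_infested = project(infested, w)
```

With real hyperspectral data the number of bands is far larger than two, so a numerical linear algebra library would be needed in place of the hand-written inverse.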


2020 ◽  
Vol 15 ◽  
Author(s):  
Mohanad Mohammed ◽  
Henry Mwambi ◽  
Bernard Omolo

Background: Colorectal cancer (CRC) is the third most common cancer among women and men in the USA, and recent studies have shown an increasing incidence in less developed regions, including Sub-Saharan Africa (SSA). We developed a hybrid (DNA mutation and RNA expression) signature and assessed its predictive properties for the mutation status and survival of CRC patients. Methods: Publicly available microarray and RNASeq data from 54 matched formalin-fixed paraffin-embedded (FFPE) samples from the Affymetrix GeneChip and RNASeq platforms were used to obtain differentially expressed genes between mutant and wild-type samples. We applied the support vector machine, artificial neural network, random forest, k-nearest neighbor, naïve Bayes, negative binomial linear discriminant analysis (NBLDA), and Poisson linear discriminant analysis algorithms for classification. A Cox proportional hazards model was used for survival analysis. Results: Compared to the genelist from each of the individual platforms, the hybrid genelist had the highest accuracy, sensitivity, specificity, and AUC for mutation status across all the classifiers and is prognostic for survival in patients with CRC. The NBLDA method was the best performer on the RNASeq data, while the SVM method was the most suitable classifier for CRC across the two data types. Nine genes were found to be predictive of survival. Conclusion: This signature could be useful in clinical practice, especially for colorectal cancer diagnosis and therapy. Future studies should determine the effectiveness of integration in cancer survival analysis and its application to unbalanced data, where the classes are of different sizes, as well as to data with multiple classes.

