Design of optical linear discriminant filter for classification of subwavelength concave and convex defects on dielectric substrates

Abstract: miRNAs (microRNAs) are small, endogenous, noncoding RNAs of about 22 nucleotides. Cumulative evidence from biological experiments shows that miRNAs play a fundamental role in various biological processes. Therefore, the classification of miRNAs is a critical problem in computational biology. Due to the short length of mature miRNAs, many researchers work on precursor miRNAs (pre-miRNAs), which have longer sequences and more structural features. Pre-miRNAs can be divided into two groups, mirtrons and canonical miRNAs, according to differences in their biogenesis. Compared to mirtrons, canonical miRNAs are more conserved and easier to identify. Many existing pre-miRNA classification methods rely on manual feature extraction. Moreover, these methods focus on either the sequential structure or the spatial structure of pre-miRNAs. To overcome the limitations of previous models, we propose a nucleotide-level hybrid deep learning method that combines a CNN and an LSTM network. The prediction resulted in 0.943 (95% CI ± 0.014) accuracy, 0.935 (95% CI ± 0.016) sensitivity, 0.948 (95% CI ± 0.029) specificity, 0.925 (95% CI ± 0.016) F1 score, and 0.880 (95% CI ± 0.028) Matthews correlation coefficient (MCC). Compared to the closest results, our proposed method achieved the best accuracy, F1 score, and MCC, which were 2.51%, 1.00%, and 2.43% higher than the closest ones, respectively. The mean sensitivity ranked first, alongside Linear Discriminant Analysis. The results indicate that hybrid CNN and LSTM networks can be employed to achieve better performance for pre-miRNA classification. In future work, we will investigate new classification models that deliver better performance in terms of all the evaluation criteria.
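
The evaluation criteria named in this abstract can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical, and the sketch assumes every denominator is non-zero.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion counts.

    Assumes at least one positive and one negative sample, and at least one
    positive prediction, so that no denominator below is zero.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient: correlation between truth and prediction.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to illustrate the calculation.
metrics = binary_metrics(tp=90, fp=10, tn=95, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, MCC uses all four counts symmetrically, which is why it is often reported alongside accuracy when the two classes are imbalanced.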

Classification of Tea Aromas Using Multi-Nanoparticle Based Chemiresistor Arrays

Sensors ◽

10.3390/s19112547 ◽

2019 ◽

Vol 19 (11) ◽

pp. 2547 ◽

Cited By ~ 4

Author(s):

Tuo Gao ◽

Yongchen Wang ◽

Chengwu Zhang ◽

Zachariah A. Pittman ◽

Alexandra M. Oliveira ◽

...

Keyword(s):

Chemical Sensor ◽

Sensor Arrays ◽

Machine Learning Algorithms ◽

Single Type ◽

Multivariate Techniques ◽

Integrated Sensor ◽

Resistance Change ◽

Linear Discriminant ◽

Lift Off

Nanoparticle based chemical sensor arrays with four types of organo-functionalized gold nanoparticles (AuNPs) were introduced to classify 35 different teas, including black teas, green teas, and herbal teas. Integrated sensor arrays were made using microfabrication methods including photolithography and lift-off processing. Different types of nanoparticle solutions were drop-cast on separate active regions of each sensor chip. Sensor responses, expressed as the ratio of resistance change to baseline resistance (ΔR/R0), were used as input data to discriminate different aromas by statistical analysis using multivariate techniques and machine learning algorithms. With five-fold cross validation, linear discriminant analysis (LDA) gave 99% accuracy for classification of all 35 teas, and 98% and 100% accuracy for separate datasets of herbal teas, and black and green teas, respectively. We find that classification accuracy improves significantly by using multiple types of nanoparticles compared to single type nanoparticle arrays. The results suggest a promising approach to monitor the freshness and quality of tea products.
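
The sensor response ΔR/R0 and the five-fold split described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the resistance readings, the sample count of 20, and the helper names `relative_response` and `k_fold_indices` are all hypothetical, and the folds are taken in order without the shuffling or stratification a real study would normally apply.

```python
def relative_response(baseline, exposed):
    """Return the relative resistance change (R - R0) / R0 for one sensor."""
    return (exposed - baseline) / baseline


def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Hypothetical readings (ohms) from four differently functionalized sensors.
baseline = [1000.0, 2000.0, 1500.0, 800.0]
exposed = [1050.0, 1980.0, 1530.0, 808.0]
features = [relative_response(r0, r) for r0, r in zip(baseline, exposed)]
print(features)

# Hypothetical dataset of 20 measurements split into five folds.
folds = list(k_fold_indices(20, k=5))
for train, test in folds:
    print(len(train), len(test))
```

Each measurement yields one feature per nanoparticle type, so a chip with four types produces a four-element feature vector for the classifier.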

Classification of two phase flows using linear discriminant analysis and expectation maximization clustering of video footage

International Journal of Multiphase Flow ◽

10.1016/j.ijmultiphaseflow.2011.11.011 ◽

2012 ◽

Vol 40 ◽

pp. 106-112 ◽

Cited By ~ 7

Author(s):

B. Ameel ◽

K. De Kerpel ◽

H. Canière ◽

C. T’Joen ◽

H. Huisseune ◽

...

Keyword(s):

Discriminant Analysis ◽

Linear Discriminant Analysis ◽

Expectation Maximization ◽

Two Phase Flows ◽

Two Phase ◽

Linear Discriminant ◽

Video Footage

Classification of Homo sapiens gene behavior using linear discriminant analysis fused with minimum entropy mapping

Medical & Biological Engineering & Computing ◽

10.1007/s11517-021-02324-y ◽

2021 ◽

Vol 59 (3) ◽

pp. 673-691

Author(s):

Joyshri Das ◽

Soma Barman (Mandal)

Keyword(s):

Discriminant Analysis ◽

Linear Discriminant Analysis ◽

Homo Sapiens ◽

Minimum Entropy ◽

Linear Discriminant

Comparison of Linear Discriminant Analysis, Support Vector Machines and Naive Bayes Methods in the Classification of Neonatal Hyperspectral Signatures

2021 29th Signal Processing and Communications Applications Conference (SIU) ◽

10.1109/siu53274.2021.9477861 ◽

2021 ◽

Author(s):

Mucahit Cihan ◽

Murat Ceylan

Keyword(s):

Support Vector Machines ◽

Discriminant Analysis ◽

Linear Discriminant Analysis ◽

Naive Bayes ◽

Support Vector ◽

Linear Discriminant ◽

Bayes Methods ◽

Vector Machines ◽

Hyperspectral Signatures

Linear Discriminant Analysis of the wavelet domain features for automatic classification of human chromosomes

2008 9th International Conference on Signal Processing ◽

10.1109/icosp.2008.4697261 ◽

2008 ◽

Cited By ~ 2

Author(s):

M. Javan Roshtkhari ◽

S. Kamaledin Setarehdan

Keyword(s):

Discriminant Analysis ◽

Linear Discriminant Analysis ◽

Automatic Classification ◽

Wavelet Domain ◽

Human Chromosomes ◽

Linear Discriminant

Identification and Classification of Fungal Colonies in Moldy Paddy Based on Computer Vision

Transactions of the ASABE ◽

10.13031/trans.12797 ◽

2018 ◽

Vol 61 (5) ◽

pp. 1497-1504

Author(s):

Zhenjie Wang ◽

Ke Sun ◽

Lihui Du ◽

Jian Yuan ◽

Kang Tu ◽

...

Keyword(s):

Computer Vision ◽

Discriminant Analysis ◽

Texture Features ◽

Fungal Species ◽

Projection Algorithm ◽

Support Vector ◽

Linear Discriminant ◽

Sequential Selection ◽

The Individual

Abstract. In this study, computer vision was used for the identification and classification of fungi on moldy paddy. To develop a rapid and efficient method for the classification of common fungal species found in stored paddy, computer vision was used to acquire images of individual colonies of growing fungi for three consecutive days. After image processing, the color, shape, and texture features were acquired and used in a subsequent discriminant analysis. Both linear (i.e., linear discriminant analysis and partial least squares discriminant analysis) and nonlinear (i.e., random forest and support vector machine [SVM]) pattern recognition models were employed for the classification of fungal colonies, and the results were compared. The results indicate that when using all of the features for three consecutive days, the performance of the nonlinear tools was superior to that of the linear tools, especially in the case of the SVM models, which achieved an accuracy of 100% on the calibration sets and an accuracy of 93.2% to 97.6% on the prediction sets. After sequential selection of projection algorithm, ten common features were selected for building the classification models. The results showed that the SVM model achieved an overall accuracy of 95.6%, 98.3%, and 99.0% on the prediction sets on days 2, 3, and 4, respectively. This work demonstrated that computer vision with several features is suitable for the identification and classification of fungi on moldy paddy based on the form of the individual colonies at an early growth stage during paddy storage. Keywords: Classification, Computer vision, Fungal colony, Feature selection, SVM.

A comprehensive simulation study on classification of RNA-Seq data

10.7287/peerj.preprints.2761 ◽

2017 ◽

Author(s):

Gokmen Zararsiz ◽

Dinçer Göksülük ◽

Selçuk Korkmaz ◽

Vahap Eldem ◽

Gözde Ertürk Zararsız ◽

...

Keyword(s):

Discriminant Analysis ◽

Sample Size ◽

Linear Discriminant Analysis ◽

Differential Expression ◽

Simulation Study ◽

Rna Seq ◽

Linear Discriminant ◽

Number Of Genes ◽

Expression Rate

RNA sequencing (RNA-Seq) is a powerful technique for the gene-expression profiling of organisms that uses the capabilities of next-generation sequencing technologies. Developing gene-expression-based classification algorithms is an emerging, powerful method for diagnosis, disease classification, and monitoring at the molecular level, as well as for providing potential markers of diseases. Most of the statistical methods proposed for the classification of gene-expression data are either based on a continuous scale (e.g., microarray data) or require a normal distribution assumption. Hence, these methods cannot be directly applied to RNA-Seq data, because RNA-Seq data violate both the data-structure and the distributional assumptions. However, it is possible to apply these algorithms to RNA-Seq data with appropriate modifications. One way is to develop count-based classifiers, such as Poisson linear discriminant analysis (PLDA) and negative binomial linear discriminant analysis (NBLDA). Another way is to bring the data hierarchically closer to microarrays and apply microarray-based classifiers. In this study, we compared several classifiers including PLDA with and without power transformation, NBLDA, single SVM, bagging SVM (bagSVM), classification and regression trees (CART), and random forests (RF). We also examined the effect of several parameters, such as overdispersion, sample size, number of genes, number of classes, differential-expression rate, and the transformation method, on model performance. A comprehensive simulation study was conducted and the results were compared with the results of two miRNA and two mRNA experimental datasets. The results revealed that increasing the sample size, differential-expression rate, and number of genes, and decreasing the dispersion parameter and number of groups, lead to an increase in classification accuracy. As in differential-expression studies, the classification of RNA-Seq data requires careful attention when handling data overdispersion. We conclude that, as a count-based classifier, the power-transformed PLDA and, as microarray-based classifiers, vst- or rlog-transformed RF and SVM classifiers may be good choices for classification. An R/BIOCONDUCTOR package, MLSeq, is freely available at https://www.bioconductor.org/packages/release/bioc/html/MLSeq.html .
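
To illustrate what a count-based classifier does, the sketch below scores a sample by its Poisson log-likelihood under each class and picks the highest. It is a deliberately simplified stand-in in the spirit of PLDA, not the PLDA algorithm itself and not the MLSeq implementation: it omits the shrinkage and power transformation mentioned in the abstract, uses total read depth as the size factor, and assumes every sample has a non-zero total count. The function names, the pseudo-count, and the toy counts are all hypothetical.

```python
import math


def fit_poisson_classifier(counts, labels, pseudo=0.5):
    """Estimate per-class relative gene rates and class priors from raw counts."""
    classes = sorted(set(labels))
    n_genes = len(counts[0])
    rates, priors = {}, {}
    for k in classes:
        rows = [x for x, y in zip(counts, labels) if y == k]
        depth = sum(sum(x) for x in rows)  # total reads observed in class k
        gene_totals = [sum(x[j] for x in rows) for j in range(n_genes)]
        # The pseudo-count keeps every rate strictly positive, so log() is defined.
        rates[k] = [(g + pseudo) / (depth + pseudo * n_genes) for g in gene_totals]
        priors[k] = len(rows) / len(counts)
    return rates, priors


def predict_poisson(x, rates, priors):
    """Return the class with the highest Poisson log-posterior for count vector x."""
    depth = sum(x)  # size factor: total reads in this sample (assumed > 0)

    def score(k):
        # The log(x_j!) term is identical for all classes and is dropped.
        log_lik = sum(
            xj * math.log(depth * r) - depth * r for xj, r in zip(x, rates[k])
        )
        return math.log(priors[k]) + log_lik

    return max(rates, key=score)


# Toy counts for three genes in four samples; purely illustrative.
counts = [[30, 5, 2], [25, 7, 3], [4, 28, 6], [6, 33, 5]]
labels = ["A", "A", "B", "B"]
rates, priors = fit_poisson_classifier(counts, labels)
print(predict_poisson([20, 4, 1], rates, priors))
print(predict_poisson([3, 25, 4], rates, priors))
```

Because the Poisson model ties the variance to the mean, overdispersed data fit it poorly, which is the motivation the abstract gives for power transformation and for negative binomial variants.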

Analysis of EPID Transmission Fluence Maps Using Machine Learning Models and CNN for Identifying Position Errors in the Treatment of GO Patients

Frontiers in Oncology ◽

10.3389/fonc.2021.721591 ◽

2021 ◽

Vol 11 ◽

Author(s):

Guyu Dai ◽

Xiangbin Zhang ◽

Wenjie Liu ◽

Zhibin Li ◽

Guangyu Wang ◽

...

Keyword(s):

Machine Learning ◽

Error Type ◽

Imaging Device ◽

Linear Discriminant ◽

Position Errors ◽

Dose Monitoring ◽

Type 3

Purpose: To find a suitable method for analyzing electronic portal imaging device (EPID) transmission fluence maps for the identification of position errors in the in vivo dose monitoring of patients with Graves’ ophthalmopathy (GO). Methods: Position errors combining 0-, 2-, and 4-mm errors in the left-right (LR), anterior-posterior (AP), and superior-inferior (SI) directions in the delivery of 40 GO patient radiotherapy plans to a human head phantom were simulated and EPID transmission fluence maps were acquired. Dose difference (DD) and structural similarity (SSIM) maps were calculated to quantify changes in the fluence maps. Three types of machine learning (ML) models that utilize radiomics features of the DD maps (ML 1 models), features of the SSIM maps (ML 2 models), and features of both DD and SSIM maps (ML 3 models) as inputs were used to perform three types of position error classification, namely a binary classification of the isocenter error (type 1), three binary classifications of LR, SI, and AP direction errors (type 2), and an eight-element classification of the combined LR, SI, and AP direction errors (type 3). A convolutional neural network (CNN) was also used to classify position errors using the DD and SSIM maps as input. Results: The best-performing ML 1 model was XGBoost, which achieved accuracies of 0.889, 0.755, 0.778, 0.833, and 0.532 in the type 1, type 2-LR, type 2-AP, type 2-SI, and type 3 classification, respectively. The best ML 2 model was XGBoost, which achieved accuracies of 0.856, 0.731, 0.736, 0.949, and 0.491, respectively. The best ML 3 model was the linear discriminant classifier (LDC), which achieved accuracies of 0.903, 0.792, 0.870, 0.931, and 0.671, respectively. The CNN achieved classification accuracies of 0.925, 0.833, 0.875, 0.949, and 0.689, respectively. Conclusion: ML models and CNN using combined DD and SSIM maps can analyze EPID transmission fluence maps to identify position errors in the treatment of GO patients. Further studies with large sample sizes are needed to improve the accuracy of CNN.
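
The two comparison quantities named in the Methods can be sketched as follows. This is a simplified illustration, not the authors' code: the study computes SSIM as a map, which implies local windows, whereas the sketch returns a single global SSIM value for the whole image. The function names, the tiny 2×2 "fluence maps", and the stabilizing constants `k1=0.01` and `k2=0.03` (conventional defaults in the SSIM literature) are assumptions.

```python
def dose_difference(reference, measured):
    """Return the pixel-wise difference map (measured - reference)."""
    return [
        [m - r for r, m in zip(ref_row, meas_row)]
        for ref_row, meas_row in zip(reference, measured)
    ]


def global_ssim(reference, measured, data_range=1.0, k1=0.01, k2=0.03):
    """Return one SSIM value computed over the whole image (no sliding window)."""
    x = [v for row in reference for v in row]
    y = [v for row in measured for v in row]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Small constants stabilize the ratio when means or variances are near zero.
    c1, c2 = (k1 * data_range) ** 2, (k2 * data_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical normalized fluence maps: a reference and a laterally shifted copy.
reference = [[0.2, 0.4], [0.6, 0.8]]
shifted = [[0.4, 0.2], [0.8, 0.6]]
print(dose_difference(reference, shifted))
print(global_ssim(reference, reference))
print(global_ssim(reference, shifted))
```

The difference map captures changes in magnitude at each pixel, while SSIM responds to changes in structure, which is consistent with the abstract's finding that combining both inputs gave the best ML and CNN results.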

Linear discriminant analysis based on gas chromatographic measurements for geographical prediction of USA medical domestic cannabis

Acta Chromatographica ◽

10.1556/1326.2020.00782 ◽

2020 ◽

Author(s):

Ramia Z. Al Bakain ◽

Yahya S. Al-Degs ◽

James V. Cizdziel ◽

Mahmoud A. Elsohly

Keyword(s):

Discriminant Analysis ◽

Linear Discriminant Analysis ◽

Principal Components ◽

Supervised Classification ◽

Geographical Origin ◽

Active Components ◽

Linear Discriminant ◽

Sample Extraction ◽

Analytical Range

Abstract: Fifty-four domestically produced cannabis samples obtained from different USA states were quantitatively assayed by GC–FID to detect 22 active components: 15 terpenoids and 7 cannabinoids. The profiles of the selected compounds were used as inputs for grouping samples by geographical origin and for building a geographical prediction model using Linear Discriminant Analysis (LDA). The proposed sample extraction and chromatographic separation were satisfactory for selecting 22 active ingredients over a wide analytical range between 5.0 and 1,000 µg/mL. Analysis of the GC profiles by Principal Component Analysis (PCA) retained three significant variables for grouping (Δ9-THC, CBN, and CBC), and modest discrimination of samples based on their geographical origin was reported. PCA was able to separate many samples from Oregon and Vermont, while a mixed classification was observed for the rest of the samples. Using LDA as a supervised classification method, excellent separation of cannabis samples was attained, allowing classification of new samples not included in the model. Using two principal components and LDA, the GC–FID profiles correctly predicted the geographical origin of 100% of Washington samples, 86% of both Oregon and Vermont samples, and 71% of Ohio samples.
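
Supervised classification on two component scores, as described above, can be sketched with a two-class Fisher linear discriminant. This is a minimal illustration rather than the authors' model: it handles only two classes and two features, places the decision threshold at the midpoint of the class means (which assumes equal priors), and the data points standing in for principal-component scores are invented. All function names are hypothetical.

```python
def mean_vector(rows):
    """Return the mean of a list of 2-D points."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter_matrix(rows, mu):
    """Return the 2x2 scatter matrix of points around their mean mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_fisher_lda(class_a, class_b):
    """Return the projection vector w and threshold separating two classes."""
    mu_a, mu_b = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter_matrix(class_a, mu_a), scatter_matrix(class_b, mu_b)
    # Pooled within-class scatter; assumed non-singular for this sketch.
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det], [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]]
    # Fisher direction: w = Sw^{-1} (mu_a - mu_b).
    w = [
        inv[0][0] * diff[0] + inv[0][1] * diff[1],
        inv[1][0] * diff[0] + inv[1][1] * diff[1],
    ]
    midpoint = [(mu_a[0] + mu_b[0]) / 2, (mu_a[1] + mu_b[1]) / 2]
    threshold = w[0] * midpoint[0] + w[1] * midpoint[1]
    return w, threshold


def predict(x, w, threshold):
    """Assign x to class 'A' if its projection exceeds the threshold, else 'B'."""
    return "A" if w[0] * x[0] + w[1] * x[1] > threshold else "B"


# Invented 2-D scores standing in for two principal components per sample.
class_a = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.5], [1.2, 2.2]]
class_b = [[4.0, 0.5], [4.5, 1.0], [5.0, 0.8], [4.2, 0.3]]
w, threshold = fit_fisher_lda(class_a, class_b)
print(predict([1.3, 2.0], w, threshold))
print(predict([4.6, 0.6], w, threshold))
```

PCA picks directions of maximum variance without using the labels, whereas the discriminant direction uses the labels directly, which is one reason the abstract reports better separation from LDA than from PCA alone.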
