Design of optical linear discriminant filter for classification of subwavelength concave and convex defects on dielectric substrates

Author(s):  
Jun-ichiro Sugisaka ◽  
Takashi Yasui ◽  
Koichi Hirayama
2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Abdulkadir Tasdelen ◽  
Baha Sen

Abstract. miRNAs (or microRNAs) are small, endogenous, noncoding RNAs of about 22 nucleotides. Cumulative evidence from biological experiments shows that miRNAs play a fundamental role in various biological processes. Therefore, the classification of miRNAs is a critical problem in computational biology. Because mature miRNAs are short, many researchers work instead on precursor miRNAs (pre-miRNAs), which have longer sequences and more structural features. Pre-miRNAs can be divided into two groups, mirtrons and canonical miRNAs, according to differences in their biogenesis. Compared to mirtrons, canonical miRNAs are more conserved and easier to identify. Many existing pre-miRNA classification methods rely on manual feature extraction. Moreover, these methods focus on either the sequential structure or the spatial structure of pre-miRNAs. To overcome the limitations of previous models, we propose a nucleotide-level hybrid deep learning method that combines a CNN with an LSTM network. The prediction resulted in 0.943 (95% CI ± 0.014) accuracy, 0.935 (95% CI ± 0.016) sensitivity, 0.948 (95% CI ± 0.029) specificity, 0.925 (95% CI ± 0.016) F1 score, and 0.880 (95% CI ± 0.028) Matthews Correlation Coefficient (MCC). Compared with the closest competing results, the proposed method achieved the best accuracy, F1 score, and MCC, which were 2.51%, 1.00%, and 2.43% higher, respectively. Its mean sensitivity also ranked first, on par with Linear Discriminant Analysis. The results indicate that hybrid CNN and LSTM networks can be employed to achieve better performance for pre-miRNA classification. In future work, we will investigate new classification models that deliver better performance on all the evaluation criteria.


Sensors ◽  
2019 ◽  
Vol 19 (11) ◽  
pp. 2547 ◽  
Author(s):  
Tuo Gao ◽  
Yongchen Wang ◽  
Chengwu Zhang ◽  
Zachariah A. Pittman ◽  
Alexandra M. Oliveira ◽  
...  

Nanoparticle based chemical sensor arrays with four types of organo-functionalized gold nanoparticles (AuNPs) were introduced to classify 35 different teas, including black teas, green teas, and herbal teas. Integrated sensor arrays were made using microfabrication methods including photolithography and lift-off processing. Different types of nanoparticle solutions were drop-cast on separate active regions of each sensor chip. Sensor responses, expressed as the ratio of resistance change to baseline resistance (ΔR/R0), were used as input data to discriminate different aromas by statistical analysis using multivariate techniques and machine learning algorithms. With five-fold cross validation, linear discriminant analysis (LDA) gave 99% accuracy for classification of all 35 teas, and 98% and 100% accuracy for separate datasets of herbal teas, and black and green teas, respectively. We find that classification accuracy improves significantly by using multiple types of nanoparticles compared to single type nanoparticle arrays. The results suggest a promising approach to monitor the freshness and quality of tea products.
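The sensor response described above, the ratio of resistance change to baseline resistance (ΔR/R0), can be sketched as follows. This is an illustrative example only: the function names and the resistance values are hypothetical, and the array is assumed to have one reading per nanoparticle type.

```python
def relative_response(baseline_resistance, exposed_resistance):
    """Return the relative response ΔR/R0 for a single sensor element."""
    return (exposed_resistance - baseline_resistance) / baseline_resistance


def feature_vector(baselines, exposed):
    """Build one feature vector (one ΔR/R0 value per sensor element).

    `baselines` and `exposed` are equal-length sequences of resistances
    measured before and during exposure to the sample aroma.
    """
    if len(baselines) != len(exposed):
        raise ValueError("baselines and exposed must have the same length")
    return [relative_response(r0, r) for r0, r in zip(baselines, exposed)]


# Hypothetical readings (in ohms) for a four-element array.
features = feature_vector([100.0, 200.0, 50.0, 80.0], [105.0, 190.0, 50.0, 88.0])
print(features)
```

Normalizing by the baseline makes responses comparable across elements with different absolute resistances, so the resulting vectors can be passed directly to a classifier such as LDA.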


2018 ◽  
Vol 61 (5) ◽  
pp. 1497-1504
Author(s):  
Zhenjie Wang ◽  
Ke Sun ◽  
Lihui Du ◽  
Jian Yuan ◽  
Kang Tu ◽  
...  

Abstract. In this study, computer vision was used for the identification and classification of fungi on moldy paddy. To develop a rapid and efficient method for the classification of common fungal species found in stored paddy, computer vision was used to acquire images of individual colonies of growing fungi for three consecutive days. After image processing, the color, shape, and texture features were acquired and used in a subsequent discriminant analysis. Both linear (i.e., linear discriminant analysis and partial least squares discriminant analysis) and nonlinear (i.e., random forest and support vector machine [SVM]) pattern recognition models were employed for the classification of fungal colonies, and the results were compared. The results indicate that when using all of the features for three consecutive days, the performance of the nonlinear tools was superior to that of the linear tools, especially in the case of the SVM models, which achieved an accuracy of 100% on the calibration sets and an accuracy of 93.2% to 97.6% on the prediction sets. After sequential selection of projection algorithm, ten common features were selected for building the classification models. The results showed that the SVM model achieved an overall accuracy of 95.6%, 98.3%, and 99.0% on the prediction sets on days 2, 3, and 4, respectively. This work demonstrated that computer vision with several features is suitable for the identification and classification of fungi on moldy paddy based on the form of the individual colonies at an early growth stage during paddy storage. Keywords: Classification, Computer vision, Fungal colony, Feature selection, SVM.


2017 ◽  
Author(s):  
Gokmen Zararsiz ◽  
Dinçer Göksülük ◽  
Selçuk Korkmaz ◽  
Vahap Eldem ◽  
Gözde Ertürk Zararsız ◽  
...  

RNA sequencing (RNA-Seq) is a powerful technique for the gene-expression profiling of organisms that uses the capabilities of next-generation sequencing technologies. Developing gene-expression-based classification algorithms is an emerging, powerful method for diagnosis, disease classification, and monitoring at the molecular level, as well as for providing potential markers of diseases. Most of the statistical methods proposed for the classification of gene-expression data are either based on a continuous scale (e.g., microarray data) or require a normal distribution assumption. Hence, these methods cannot be directly applied to RNA-Seq data, since they violate both data structure and distributional assumptions. However, it is possible to apply these algorithms to RNA-Seq data with appropriate modifications. One way is to develop count-based classifiers, such as Poisson linear discriminant analysis (PLDA) and negative binomial linear discriminant analysis (NBLDA). Another way is to bring the data hierarchically closer to microarrays and apply microarray-based classifiers. In this study, we compared several classifiers, including PLDA with and without power transformation, NBLDA, single SVM, bagging SVM (bagSVM), classification and regression trees (CART), and random forests (RF). We also examined the effect of several parameters, such as overdispersion, sample size, number of genes, number of classes, differential-expression rate, and the transformation method, on model performance. A comprehensive simulation study was conducted, and the results were compared with the results of two miRNA and two mRNA experimental datasets. The results revealed that increasing the sample size, differential-expression rate, and number of genes, and decreasing the dispersion parameter and number of groups, lead to an increase in classification accuracy. Similar to differential-expression studies, the classification of RNA-Seq data requires careful attention when handling data overdispersion.
We conclude that, as a count-based classifier, the power-transformed PLDA and, as microarray-based classifiers, vst- or rlog-transformed RF and SVM classifiers may be a good choice for classification. An R/Bioconductor package, MLSeq, is freely available at https://www.bioconductor.org/packages/release/bioc/html/MLSeq.html .
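To make the notion of overdispersion concrete, the sketch below estimates a dispersion parameter for one gene's read counts by the method of moments. It assumes the common negative binomial parameterization in which the variance equals μ + φμ², so that φ = 0 recovers the Poisson case. This is an illustration under that assumption, not the estimator used in the study or in MLSeq; the function name and the counts are hypothetical.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Method-of-moments dispersion estimate for a list of read counts.

    Assumes variance = mu + phi * mu**2 (negative binomial). Returns 0.0
    when the sample variance does not exceed the mean (no overdispersion)
    or when the mean is zero.
    """
    mu = mean(counts)
    if mu == 0:
        return 0.0
    var = variance(counts)  # unbiased sample variance
    return max(0.0, (var - mu) / mu ** 2)


# Hypothetical counts for a single gene across five samples.
print(moment_dispersion([0, 2, 4, 10, 24]))  # variance far above the mean
print(moment_dispersion([3, 3, 3, 3]))       # variance below the mean
```

A Poisson-based classifier implicitly takes φ to be zero, which is why strongly overdispersed data call for either a transformation or a negative binomial model.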


2021 ◽  
Vol 11 ◽  
Author(s):  
Guyu Dai ◽  
Xiangbin Zhang ◽  
Wenjie Liu ◽  
Zhibin Li ◽  
Guangyu Wang ◽  
...  

Purpose: To find a suitable method for analyzing electronic portal imaging device (EPID) transmission fluence maps for the identification of position errors in the in vivo dose monitoring of patients with Graves’ ophthalmopathy (GO).
Methods: Position errors combining 0-, 2-, and 4-mm errors in the left-right (LR), anterior-posterior (AP), and superior-inferior (SI) directions in the delivery of 40 GO patient radiotherapy plans to a human head phantom were simulated, and EPID transmission fluence maps were acquired. Dose difference (DD) and structural similarity (SSIM) maps were calculated to quantify changes in the fluence maps. Three types of machine learning (ML) models that utilize radiomics features of the DD maps (ML 1 models), features of the SSIM maps (ML 2 models), and features of both DD and SSIM maps (ML 3 models) as inputs were used to perform three types of position error classification, namely a binary classification of the isocenter error (type 1), three binary classifications of LR, SI, and AP direction errors (type 2), and an eight-element classification of the combined LR, SI, and AP direction errors (type 3). A convolutional neural network (CNN) was also used to classify position errors using the DD and SSIM maps as input.
Results: The best-performing ML 1 model was XGBoost, which achieved accuracies of 0.889, 0.755, 0.778, 0.833, and 0.532 in the type 1, type 2-LR, type 2-AP, type 2-SI, and type 3 classification, respectively. The best ML 2 model was XGBoost, which achieved accuracies of 0.856, 0.731, 0.736, 0.949, and 0.491, respectively. The best ML 3 model was linear discriminant classifier (LDC), which achieved accuracies of 0.903, 0.792, 0.870, 0.931, and 0.671, respectively. The CNN achieved classification accuracies of 0.925, 0.833, 0.875, 0.949, and 0.689, respectively.
Conclusion: ML models and CNN using combined DD and SSIM maps can analyze EPID transmission fluence maps to identify position errors in the treatment of GO patients. Further studies with large sample sizes are needed to improve the accuracy of CNN.
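The two comparison quantities named in the Methods, dose difference and structural similarity, can be sketched in a few lines. The example below is a simplified illustration, not the study's pipeline: it computes a pixel-wise difference map and a single global SSIM value over the whole image using the standard SSIM formula, whereas an SSIM map is normally computed over local sliding windows. The function names, the tiny 2×2 arrays, and the `dynamic_range` default are assumptions made for the example.

```python
def dose_difference(reference, measured):
    """Pixel-wise difference map (measured minus reference) of two 2-D lists."""
    return [
        [m - r for r, m in zip(ref_row, mea_row)]
        for ref_row, mea_row in zip(reference, measured)
    ]


def global_ssim(x, y, dynamic_range=1.0):
    """Single-window SSIM over two equally sized 2-D lists.

    Uses the standard constants C1 = (0.01 L)^2 and C2 = (0.03 L)^2,
    where L is the dynamic range of the pixel values.
    """
    xs = [v for row in x for v in row]
    ys = [v for row in y for v in row]
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    var_x = sum((v - mu_x) ** 2 for v in xs) / n
    var_y = sum((v - mu_y) ** 2 for v in ys) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / n
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical normalized fluence values; the second map is a shifted pattern.
reference = [[0.2, 0.8], [0.4, 0.6]]
shifted = [[0.8, 0.2], [0.6, 0.4]]
print(dose_difference(reference, shifted))
print(global_ssim(reference, reference), global_ssim(reference, shifted))
```

The difference map captures changes in magnitude, while SSIM responds to changes in spatial pattern, which fits the reported result that combining both maps gave the best ML models.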


Author(s):  
Ramia Z. Al Bakain ◽  
Yahya S. Al-Degs ◽  
James V. Cizdziel ◽  
Mahmoud A. Elsohly

Abstract. Fifty-four domestically produced cannabis samples obtained from different US states were quantitatively assayed by GC–FID to detect 22 active components: 15 terpenoids and 7 cannabinoids. The profiles of the selected compounds were used as inputs for grouping samples by geographical origin and for building a geographical prediction model using Linear Discriminant Analysis (LDA). The proposed sample extraction and chromatographic separation were satisfactory for the 22 selected active ingredients, with a wide analytical range of 5.0 to 1,000 µg/mL. Analysis of the GC profiles by Principal Component Analysis (PCA) retained three significant variables for grouping (Δ9-THC, CBN, and CBC), and modest discrimination of samples by geographical origin was observed. PCA was able to separate many of the Oregon and Vermont samples, while mixed classification was observed for the remaining samples. Using LDA as a supervised classification method, excellent separation of cannabis samples was attained, allowing the classification of new samples not included in the model. Using two principal components and LDA with GC–FID profiles correctly predicted the geographical origin of 100% of Washington samples, 86% of both Oregon and Vermont samples, and 71% of Ohio samples.
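Per-origin prediction rates like those quoted at the end of this abstract are per-class accuracies: for each true origin, the fraction of its samples assigned to the correct class. A minimal sketch follows; the function name and the toy labels are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Return {label: fraction of that label's samples predicted correctly}."""
    totals = Counter(true_labels)
    correct = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )
    return {label: correct[label] / totals[label] for label in totals}


# Toy labels for illustration only.
true_origin = ["origin_1", "origin_1", "origin_2", "origin_2", "origin_2", "origin_2"]
predicted = ["origin_1", "origin_1", "origin_2", "origin_2", "origin_2", "origin_1"]
print(per_class_accuracy(true_origin, predicted))
```

Reporting accuracy per class rather than as one overall figure shows which origins the model separates well and which it confuses, which matters when the classes have different sample counts.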

