Deep learning based on stacked sparse autoencoder applied to viral genome classification of SARS-CoV-2 virus

2021 ◽  
Author(s):  
Gracielly G. F. Coutinho ◽  
Gabriel B. M. Câmara ◽  
Raquel de M. Barbosa ◽  
Marcelo A. C. Fernandes

Since December 2019, the world has been intensely affected by the COVID-19 pandemic, caused by the SARS-CoV-2 virus, first identified in Wuhan, China. When a novel virus is identified, early elucidation of the taxonomic classification and origin of its genomic sequence is essential for strategic planning, containment, and treatment. Deep learning techniques have been used successfully in many viral classification problems associated with the diagnosis of viral infections, metagenomics, and phylogenetic analysis. This work proposes an efficient viral genome classifier for the SARS-CoV-2 virus using a deep neural network (DNN) based on the stacked sparse autoencoder (SSAE) technique. We performed four experiments to provide different levels of taxonomic classification of the SARS-CoV-2 virus. Confusion matrices are presented for the validation and test sets, and the ROC curve for the validation set. In all experiments, the SSAE technique performed well. We explored the use of image representations of the complete genome sequences as the SSAE input to classify SARS-CoV-2. For that, a dataset based on a k-mer image representation, with k=6, was used. The results indicate that this deep learning technique is applicable to genome classification problems.
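As a rough illustration of the k-mer image idea mentioned in the abstract, the sketch below counts every 6-mer in a sequence and arranges the 4^6 = 4096 normalized counts in a 64 x 64 grid. The lexicographic, row-major layout, the max-count normalization, and the names `kmer_image` and `KMER_INDEX` are assumptions made for this example; the abstract does not say how the authors map k-mers to pixels.

```python
from itertools import product

K = 6  # k-mer length stated in the abstract
ALPHABET = "ACGT"
SIDE = 2 ** K  # 4**6 = 4096 k-mers fit a 64 x 64 grid

# Assumed layout: k-mers in lexicographic order, filled row by row.
KMER_INDEX = {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=K))}


def kmer_image(sequence):
    """Return a SIDE x SIDE grid of k-mer counts scaled to [0, 1]."""
    seq = sequence.upper()
    counts = [0] * len(KMER_INDEX)
    for i in range(len(seq) - K + 1):
        # k-mers containing N or other ambiguous symbols are skipped
        idx = KMER_INDEX.get(seq[i:i + K])
        if idx is not None:
            counts[idx] += 1
    peak = max(counts) or 1  # avoid division by zero for empty input
    flat = [c / peak for c in counts]
    return [flat[r * SIDE:(r + 1) * SIDE] for r in range(SIDE)]
```

A grid built this way has the same shape for every genome regardless of its length, which is what makes it usable as a fixed-size network input.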

Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6661
Author(s):  
Lars Schmarje ◽  
Johannes Brünger ◽  
Monty Santarossa ◽  
Simon-Martin Schröder ◽  
Rainer Kiko ◽  
...  

Deep learning has been successfully applied to many classification problems including underwater challenges. However, a long-standing issue with deep learning is the need for large and consistently labeled datasets. Although current approaches in semi-supervised learning can decrease the required amount of annotated data by a factor of 10 or even more, this line of research still uses distinct classes. For underwater classification, and uncurated real-world datasets in general, clean class boundaries can often not be given due to a limited information content in the images and transitional stages of the depicted objects. This leads to different experts having different opinions and thus producing fuzzy labels which could also be considered ambiguous or divergent. We propose a novel framework for handling semi-supervised classifications of such fuzzy labels. It is based on the idea of overclustering to detect substructures in these fuzzy labels. We propose a novel loss to improve the overclustering capability of our framework and show the benefit of overclustering for fuzzy labels. We show that our framework is superior to previous state-of-the-art semi-supervised methods when applied to real-world plankton data with fuzzy labels. Moreover, we acquire 5 to 10% more consistent predictions of substructures.


2019 ◽  
Author(s):  
Ismael Araujo ◽  
Juan Gamboa ◽  
Adenilton Silva

Recognizing patterns that are usually imperceptible to human beings has been one of the main advantages of machine learning algorithms. Deep Learning techniques have shown promise for classification problems, especially those related to image classification. The classification of gases detected by an artificial nose is another area where Deep Learning techniques can be used to seek classification improvements. Succeeding in a classification task can bring many advantages to quality control, as well as to accident prevention. This work presents some Deep Learning models created specifically for the task of gas classification.


Author(s):  
Gurjit S. Randhawa ◽  
Maximillian P.M. Soltysiak ◽  
Hadi El Roz ◽  
Camila P.E. de Souza ◽  
Kathleen A. Hill ◽  
...  

As of February 20, 2020, the 2019 novel coronavirus (renamed to COVID-19) spread to 30 countries with 2130 deaths and more than 75500 confirmed cases. COVID-19 is being compared to the infamous SARS coronavirus, which resulted, between November 2002 and July 2003, in 8098 confirmed cases worldwide with a 9.6% death rate and 774 deaths. Though COVID-19 has a death rate of 2.8% as of 20 February, the 75752 confirmed cases in a few weeks (December 8, 2019 to February 20, 2020) are alarming, with cases likely being under-reported given the comparatively longer incubation period. Such outbreaks demand elucidation of taxonomic classification and origin of the virus genomic sequence, for strategic planning, containment, and treatment. This paper identifies an intrinsic COVID-19 genomic signature and uses it together with a machine learning-based alignment-free approach for an ultra-fast, scalable, and highly accurate classification of whole COVID-19 genomes. The proposed method combines supervised machine learning with digital signal processing for genome analyses, augmented by a decision tree approach to the machine learning component, and a Spearman’s rank correlation coefficient analysis for result validation. These tools are used to analyze a large dataset of over 5000 unique viral genomic sequences, totalling 61.8 million bp. Our results support a hypothesis of a bat origin and classify COVID-19 as Sarbecovirus, within Betacoronavirus. Our method achieves high levels of classification accuracy and discovers the most relevant relationships among over 5,000 viral genomes within a few minutes, ab initio, using raw DNA sequence data alone, and without any specialized biological knowledge, training, gene or genome annotations. This suggests that, for novel viral and pathogen genome sequences, this alignment-free whole-genome machine-learning approach can provide a reliable real-time option for taxonomic classification.
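To make the combination of digital signal processing and alignment-free comparison more concrete, here is a minimal sketch: each genome is turned into a numeric signal, its Fourier magnitude spectrum is computed, and two spectra are compared with a correlation-based distance. The purine/pyrimidine encoding, the Pearson-based distance, the truncation to the shorter spectrum, and the function names are all assumptions chosen for illustration; the abstract does not specify these details, and a real pipeline would use a fast Fourier transform rather than this direct O(n^2) sum.

```python
import cmath
import math

# Assumed numeric encoding (purine = -1, pyrimidine = +1); not taken from the abstract.
ENCODING = {"A": -1.0, "G": -1.0, "C": 1.0, "T": 1.0}


def magnitude_spectrum(sequence):
    """Magnitude of the discrete Fourier transform of the encoded sequence."""
    signal = [ENCODING.get(base, 0.0) for base in sequence.upper()]
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * f * t / n) for t, x in enumerate(signal)))
        for f in range(n)
    ]


def pearson_distance(a, b):
    """(1 - Pearson correlation) / 2, so perfectly correlated spectra give 0."""
    n = min(len(a), len(b))  # naive length handling: truncate to the shorter spectrum
    a, b = a[:n], b[:n]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return (1 - cov / norm) / 2 if norm else 0.0
```

A matrix of such pairwise distances could then feed a supervised classifier; no sequence alignment is needed at any step, which is the property the abstract emphasizes.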


2016 ◽  
Vol 2016 ◽  
pp. 1-12 ◽  
Author(s):  
Hongmei Liu ◽  
Lianfeng Li ◽  
Jian Ma

The main challenge of fault diagnosis lies in finding good fault features. A deep learning network has the ability to automatically learn good characteristics from input data in an unsupervised fashion, and its unique layer-wise pretraining and fine-tuning using the backpropagation strategy can solve the difficulties of training deep multilayer networks. Stacked sparse autoencoders or other deep architectures have shown excellent performance in speech recognition, face recognition, text classification, image recognition, and other application domains. Thus far, however, there have been very few research studies on deep learning in fault diagnosis. In this paper, a new rolling bearing fault diagnosis method that is based on short-time Fourier transform and stacked sparse autoencoder is first proposed; this method analyzes sound signals. After spectrograms are obtained by short-time Fourier transform, stacked sparse autoencoder is employed to automatically extract the fault features, and softmax regression is adopted as the method for classifying the fault modes. The proposed method, when applied to sound signals that are obtained from a rolling bearing test rig, is compared with empirical mode decomposition, Teager energy operator, and stacked sparse autoencoder when using vibration signals to verify the performance and effectiveness of the proposed method.
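The two ends of the pipeline described above, the short-time Fourier transform that produces the spectrogram and the softmax that turns class scores into probabilities, can be sketched in a few lines. This is only an illustration under assumed settings: the Hann window, the frame length of 8, the hop of 4, and the function names are choices made for this example and are not taken from the paper, and the stacked sparse autoencoder that sits between the two steps is omitted.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=8, hop=4):
    """Spectrogram: one magnitude spectrum per Hann-windowed frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1)) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([
            abs(sum(x * cmath.exp(-2j * cmath.pi * f * n / frame_len) for n, x in enumerate(frame)))
            for f in range(frame_len // 2 + 1)  # keep non-negative frequencies only
        ])
    return frames


def softmax(scores):
    """Turn per-class scores into probabilities (numerically stable form)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

In a full system, each spectrogram would be flattened or tiled into the autoencoder input, and the softmax would be applied to the scores computed from the learned features, one score per fault mode.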


2018 ◽  
Vol 19 (S7) ◽  
Author(s):  
Antonino Fiannaca ◽  
Laura La Paglia ◽  
Massimo La Rosa ◽  
Giosue’ Lo Bosco ◽  
Giovanni Renda ◽  
...  
