Predicting environmentally responsive transgenerational differential DNA methylated regions (epimutations) in the genome using a hybrid deep-machine learning approach

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Pegah Mavaie ◽  
Lawrence Holder ◽  
Daniel Beck ◽  
Michael K. Skinner

Abstract Background: Deep learning is an active field of bioinformatics artificial intelligence that is useful for solving many biological problems, including predicting altered epigenetics such as DNA methylation regions. Deep learning (DL) can learn an informative representation that addresses the need to define relevant features. However, deep learning models are computationally expensive and require large training datasets to achieve good classification performance. Results: One approach to addressing these challenges is to use a less complex deep learning network for feature selection and machine learning (ML) for classification. In the current study, we introduce a hybrid DL-ML approach that uses a deep neural network to extract molecular features and a non-DL classifier to predict environmentally responsive transgenerational differential DNA methylated regions (DMRs), termed epimutations, from the extracted DL-based features. Sperm epimutations induced by various environmental toxicants through epigenetic transgenerational inheritance were used to train the model on the rat genome DNA sequence, and the model was then used to predict transgenerational DMRs (epimutations) across the entire genome. Conclusion: The approach was also used to predict potential DMRs in the human genome. Experimental results show that the hybrid DL-ML approach outperforms deep learning and traditional machine learning methods.
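The core idea of the hybrid approach described above is to let a compact deep network learn a feature representation and hand those features to a conventional classifier. The sketch below illustrates this pattern on synthetic one-hot-encoded sequence windows; the network sizes, window length, and the choice of a random forest as the non-DL classifier are illustrative assumptions, not the authors' exact configuration.

```python
# Hybrid DL-ML sketch: a small neural network learns features from
# one-hot-encoded DNA windows; a random forest classifies from those features.
# Synthetic data and all hyperparameters are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_windows, window_len = 2000, 100                      # hypothetical genomic windows
X = rng.integers(0, 4, size=(n_windows, window_len))   # A/C/G/T encoded as 0-3
y = rng.integers(0, 2, size=n_windows)                 # 1 = DMR (epimutation), 0 = background
X_onehot = np.eye(4, dtype=np.float32)[X].reshape(n_windows, -1)

X_tr, X_te, y_tr, y_te = train_test_split(X_onehot, y, test_size=0.25, random_state=0)

# A less complex network: one hidden layer acts as the feature extractor.
feature_dim = 32
extractor = nn.Sequential(nn.Linear(window_len * 4, 128), nn.ReLU(),
                          nn.Linear(128, feature_dim), nn.ReLU())
head = nn.Linear(feature_dim, 2)                       # temporary head used only for training
model = nn.Sequential(extractor, head)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

xb = torch.from_numpy(X_tr)
yb = torch.from_numpy(y_tr).long()
for _ in range(50):                                    # short training loop for the sketch
    opt.zero_grad()
    loss = loss_fn(model(xb), yb)
    loss.backward()
    opt.step()

# Extract DL-based features and train the non-DL classifier on them.
with torch.no_grad():
    F_tr = extractor(torch.from_numpy(X_tr)).numpy()
    F_te = extractor(torch.from_numpy(X_te)).numpy()
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(F_tr, y_tr)
print("hybrid DL-ML accuracy:", clf.score(F_te, y_te))
```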

2021 ◽  
Author(s):  
Yerang Park ◽  
Young Jae Kim ◽  
Woong Ju ◽  
Kyehyun Nam ◽  
Soonyung Kim ◽  
...  

Abstract Human and material resources are scarce in countries with a high rate of cervical cancer, such as developing countries. In such an environment, the introduction of automatic diagnostic technology that can replace specialists is urgent. Identifying the best of the known methods can accelerate the adoption of computer-aided diagnostic tools for cervical cancer. In this paper, we investigate which approach, machine learning or deep learning, has higher classification performance in diagnosing cervical cancer. Using 4,119 images, cervical cancer was classified into positive or negative classes using ResNet-50 for deep learning and XGB, SVM, and RF for machine learning. In both experiments, square images cropped to the vaginal wall region were used. For machine learning, 10 major features were extracted from a total of 300 features. All tests were validated by 5-fold cross-validation, and receiver operating characteristic (ROC) analysis yielded the following AUCs: ResNet-50 0.97 (95% CI 0.949-0.976), XGB 0.82 (95% CI 0.797-0.851), SVM 0.84 (95% CI 0.801-0.854), RF 0.79 (95% CI 0.804-0.856). Deep learning was 0.15 points higher (p < 0.05) than the average (0.82) of the three machine learning methods. We propose a better algorithm among the previously known or newly proposed algorithms for the diagnosis of cervical cancer using cervicography images.
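The machine-learning arm of such a comparison typically runs a handful of classical models over pre-extracted image features and scores them with cross-validated ROC AUC. A minimal sketch under the assumption that features are already available as a table (feature extraction from the cervicography images is not shown, and GradientBoostingClassifier stands in for XGBoost):

```python
# 5-fold cross-validated ROC AUC for several classical classifiers,
# mirroring the ML side of the comparison. Synthetic features stand in
# for features extracted from cervicography images.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# 10 selected features per image, positive vs. negative class (illustrative).
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

models = {
    "GBM (stand-in for XGB)": GradientBoostingClassifier(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
    "RF": RandomForestClassifier(n_estimators=300, random_state=0),
}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    print(f"{name}: AUC {auc.mean():.3f} +/- {auc.std():.3f}")
```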


Author(s):  
Yuejun Liu ◽  
Yifei Xu ◽  
Xiangzheng Meng ◽  
Xuguang Wang ◽  
Tianxu Bai

Background: Medical imaging plays an important role in the diagnosis of thyroid diseases. In the field of machine learning, multi-dimensional deep learning algorithms are widely used in image classification and recognition and have achieved great success. Objective: A method based on multi-dimensional deep learning is employed for the auxiliary diagnosis of thyroid diseases from SPECT images, and the performances of different deep learning models are evaluated and compared. Methods: Thyroid SPECT images of three types were collected: hyperthyroidism, normal, and hypothyroidism. In pre-processing, the thyroid region of interest is segmented and the data samples are augmented. Four models, namely a standard CNN, Inception, VGG16, and an RNN, are used to evaluate deep learning methods. Results: The deep-learning-based methods have good classification performance, with accuracy of 92.9%-96.2% and AUC of 97.8%-99.6%. The VGG16 model has the best performance, with an accuracy of 96.2% and AUC of 99.6%. In particular, the VGG16 model with a changing learning rate works best. Conclusion: The standard CNN, Inception, VGG16, and RNN models are efficient for the classification of thyroid diseases from SPECT images. The accuracy of the assisted diagnostic method based on deep learning is higher than that of other methods reported in the literature.
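The best-performing configuration reported, VGG16 with a changing learning rate, is typically implemented by swapping the final classification layer for a three-class head and stepping the learning rate down during training. A minimal PyTorch sketch under those assumptions, with dummy tensors in place of actual SPECT images:

```python
# VGG16 adapted to 3 classes (hyperthyroidism / normal / hypothyroidism)
# with a step-wise learning-rate schedule. Dummy tensors replace SPECT images.
import torch
import torch.nn as nn
from torchvision import models

model = models.vgg16(weights=None)            # no pretrained weights in this sketch
model.classifier[6] = nn.Linear(4096, 3)      # three thyroid classes

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
loss_fn = nn.CrossEntropyLoss()

images = torch.randn(4, 3, 224, 224)          # stand-in for a batch of SPECT ROIs
labels = torch.randint(0, 3, (4,))

for epoch in range(2):                        # tiny loop just to show the schedule
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
    scheduler.step()                          # the "changing learning rate"
    print(epoch, scheduler.get_last_lr())
```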


Sensors ◽  
2019 ◽  
Vol 19 (1) ◽  
pp. 210 ◽  
Author(s):  
Zied Tayeb ◽  
Juri Fedjaev ◽  
Nejla Ghaboosi ◽  
Christoph Richter ◽  
Lukas Everding ◽  
...  

Non-invasive, electroencephalography (EEG)-based brain-computer interfaces (BCIs) for motor imagery translate the subject’s motor intention into control signals by classifying the EEG patterns caused by different imagination tasks, e.g., hand movements. This type of BCI has been widely studied and used as an alternative mode of communication and environmental control for disabled patients, such as those suffering from a brainstem stroke or a spinal cord injury (SCI). Notwithstanding the success of traditional machine learning methods in classifying EEG signals, these methods still rely on hand-crafted features. The extraction of such features is a difficult task due to the high non-stationarity of EEG signals, which is a major cause of the stagnating progress in classification performance. Remarkable advances in deep learning allow end-to-end learning without any feature engineering, which could benefit BCI motor imagery applications. We developed three deep learning models for decoding motor imagery movements directly from raw EEG signals without any manual feature engineering: (1) a long short-term memory network (LSTM); (2) a spectrogram-based convolutional neural network (CNN); and (3) a recurrent convolutional neural network (RCNN). Results were evaluated on our own publicly available EEG dataset collected from 20 subjects and on an existing dataset known as the 2b EEG dataset from “BCI Competition IV”. Overall, better classification performance was achieved with the deep learning models than with state-of-the-art machine learning techniques, which could chart a route ahead for developing new robust techniques for EEG signal decoding. We underpin this point by demonstrating the successful real-time control of a robotic arm using our CNN-based BCI.
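Of the three models, the LSTM is the simplest to sketch: it consumes raw multi-channel EEG as a time series and emits a class score per trial, with no hand-crafted features. A minimal PyTorch sketch under assumed dimensions (3 EEG channels, 1,000 samples per trial, two imagery classes), not the authors' exact architecture:

```python
# LSTM decoding motor imagery directly from raw EEG, no feature engineering.
# Channel count, trial length, and network size are illustrative assumptions.
import torch
import torch.nn as nn

class EEGLSTM(nn.Module):
    def __init__(self, n_channels=3, hidden=64, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_channels, hidden_size=hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):                 # x: (batch, time, channels)
        _, (h_n, _) = self.lstm(x)        # final hidden state summarizes the trial
        return self.fc(h_n[-1])

model = EEGLSTM()
trials = torch.randn(8, 1000, 3)          # 8 trials, 1000 time steps, 3 channels
labels = torch.randint(0, 2, (8,))        # e.g., left- vs. right-hand imagery
loss = nn.CrossEntropyLoss()(model(trials), labels)
loss.backward()                           # one training step of the sketch
print(loss.item())
```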


Author(s):  
Bhanu Chander

Artificial intelligence (AI) is defined as machines that can do everything a human being can do and produce better results; in other words, AI implies that data can produce a solution on its own. Within the AI field, machine learning (ML) offers a wide variety of algorithms that produce more accurate results. As technology improves, increasing amounts of data become available. But with ML and classical AI, it is very difficult to extract high-level, abstract features from raw data, and it is moreover hard to know which features should be extracted. Finally, we now have deep learning; these algorithms are modeled on how human brains process data. Deep learning is a particular kind of machine learning that provides flexibility and great power, attempting to learn multiple levels of representation through the operations of multiple layers. A brief overview of deep learning, along with platforms, models, autoencoders, CNNs, RNNs, and applications, is given. Deep learning will have many more successes in the near future because it requires very little engineering by hand.


Author(s):  
Yogita Hande ◽  
Akkalashmi Muddana

Presently, the internet is advancing toward widespread growth, and the static nature of traditional networks limits their capacity to cope with organizational business needs. The new network architecture, software defined networking (SDN), appeared to address these challenges and provides distinctive features. However, the programmable and centralized approach of SDN faces new security challenges, which demand innovative security mechanisms such as intrusion detection systems (IDSs). IDSs for SDN are currently designed with a machine learning approach; however, a deep learning approach is also being explored to achieve better efficiency and accuracy. In this article, an overview of SDN with its security concerns and IDSs as a security solution is given. A survey of existing security solutions designed to secure SDN, and a comparative study of various IDS approaches based on deep learning models and machine learning methods, are discussed. Finally, we describe future directions for SDN security.


Atmosphere ◽  
2019 ◽  
Vol 10 (5) ◽  
pp. 251 ◽  
Author(s):  
Wael Ghada ◽  
Nicole Estrella ◽  
Annette Menzel

Rain microstructure parameters assessed by disdrometers are commonly used to classify rain into convective and stratiform. However, different types of disdrometer result in different values for these parameters. This in turn potentially deteriorates the quality of rain type classifications. Thies disdrometer measurements at two sites in Bavaria in southern Germany were combined with cloud observations to construct a set of clear convective and stratiform intervals. This reference dataset was used to study the performance of classification methods from the literature based on the rain microstructure. We also explored the possibility of improving the performance of these methods by tuning the decision boundary. We further identified highly discriminant rain microstructure parameters and used these parameters in five machine-learning classification models. Our results confirm the potential of achieving high classification performance by applying the concepts of machine learning compared to already available methods. Machine-learning classification methods provide a concrete and flexible procedure that is applicable regardless of the geographical location or the device. The suggested procedure for classifying rain types is recommended prior to studying rain microstructure variability or any attempts at improving radar estimations of rain intensity.
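Tuning the decision boundary of an existing classification rule, as the authors do, amounts to sweeping the threshold applied to a score and picking the value that maximizes a chosen metric on the reference dataset. A small sketch with a logistic regression over synthetic stand-ins for rain microstructure features (the feature set, metric, and threshold grid are assumptions):

```python
# Threshold tuning for convective vs. stratiform classification.
# Synthetic stand-ins for rain microstructure features (e.g., rain rate,
# median drop diameter); the threshold grid is an illustrative assumption.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import matthews_corrcoef
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=4, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

clf = LogisticRegression().fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]     # probability of "convective"

best_t, best_mcc = 0.5, -1.0
for t in np.linspace(0.1, 0.9, 81):        # sweep the decision boundary
    mcc = matthews_corrcoef(y_te, scores >= t)
    if mcc > best_mcc:
        best_t, best_mcc = t, mcc
print(f"best threshold {best_t:.2f}, MCC {best_mcc:.3f}")
```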


BMC Genomics ◽  
2019 ◽  
Vol 20 (S11) ◽  
Author(s):  
Tianle Ma ◽  
Aidong Zhang

Abstract Background: Comprehensive molecular profiling of various cancers and other diseases has generated vast amounts of multi-omics data. Each type of -omics data corresponds to one feature space, such as gene expression, miRNA expression, DNA methylation, etc. Integrating multi-omics data can link different layers of molecular feature spaces and is crucial to elucidating the molecular pathways underlying various diseases. Machine learning approaches to mining multi-omics data hold great promise for uncovering intricate relationships among molecular features. However, due to the “big p, small n” problem (i.e., small sample sizes with high-dimensional features), training a large-scale generalizable deep learning model with multi-omics data alone is very challenging. Results: We developed a method called Multi-view Factorization AutoEncoder (MAE) with network constraints that can seamlessly integrate multi-omics data and domain knowledge such as molecular interaction networks. Our method learns feature and patient embeddings simultaneously with deep representation learning. Both feature representations and patient representations are subject to certain constraints, specified as regularization terms in the training objective. By incorporating domain knowledge into the training objective, we implicitly introduce a good inductive bias into the machine learning model, which helps improve model generalizability. We performed extensive experiments on the TCGA datasets and demonstrated the power of integrating multi-omics data and biological interaction networks using our proposed method to predict target clinical variables. Conclusions: To alleviate overfitting when deep learning is applied to multi-omics data with the “big p, small n” problem, it is helpful to incorporate biological domain knowledge into the model as inductive biases. It is very promising to design machine learning models that facilitate the seamless integration of large-scale multi-omics data and biomedical domain knowledge for uncovering intricate relationships among molecular and clinical features.
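The essence of the method is to factorize each -omics matrix into patient embeddings and feature embeddings with an autoencoder, while a graph regularizer keeps the embeddings of interacting molecules close. A single-view sketch of that idea follows; the encoder size, the random interaction graph, and the regularization weight are assumptions, not the published architecture:

```python
# Single-view factorization autoencoder sketch with a network (graph Laplacian)
# constraint on feature embeddings. Data, graph, and sizes are illustrative.
import torch
import torch.nn as nn

n_patients, n_features, k = 100, 50, 8
X = torch.randn(n_patients, n_features)                # one -omics view

# Random symmetric interaction network over features (stand-in for, e.g., a PPI graph).
A = (torch.rand(n_features, n_features) < 0.05).float()
A = ((A + A.t()) > 0).float()
L = torch.diag(A.sum(1)) - A                           # graph Laplacian

encoder = nn.Sequential(nn.Linear(n_features, k))      # patient embeddings Z = encoder(X)
W = nn.Parameter(torch.randn(n_features, k) * 0.1)     # feature embeddings
opt = torch.optim.Adam(list(encoder.parameters()) + [W], lr=1e-2)

lam = 0.1                                              # regularization weight (assumed)
for step in range(200):
    Z = encoder(X)
    X_hat = Z @ W.t()                                  # reconstruction from both embeddings
    recon = ((X - X_hat) ** 2).mean()
    graph_penalty = torch.trace(W.t() @ L @ W) / n_features  # smoothness over the network
    loss = recon + lam * graph_penalty
    opt.zero_grad(); loss.backward(); opt.step()
print(recon.item(), graph_penalty.item())
```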


2019 ◽  
Vol 9 (23) ◽  
pp. 5003 ◽  
Author(s):  
Francesco Zola ◽  
Jan Lukas Bruse ◽  
Maria Eguimendia ◽  
Mikel Galar ◽  
Raul Orduna Urrutia

The Bitcoin network is not only vulnerable to cyber-attacks but also currently represents the most frequently used cryptocurrency for concealing illicit activities. Typically, Bitcoin activity is monitored by decreasing the anonymity of its entities using machine-learning-based techniques that consider the whole blockchain. This entails two issues: first, it increases the complexity of the analysis, requiring higher effort, and second, it may hide network micro-dynamics important for detecting short-term changes in entity behavioral patterns. The aim of this paper is to address both issues by performing a “temporal dissection” of the Bitcoin blockchain, i.e., dividing it into smaller temporal batches to achieve entity classification. The idea is that a machine learning model trained on a certain time interval (batch) should achieve good classification performance when tested on another batch if entity behavioral patterns are similar. We apply cascading machine learning principles, a type of ensemble learning applying stacking techniques, introducing a “k-fold cross-testing” concept across batches of varying size. Results show that the blockchain batch size used for entity classification could be reduced for certain classes (Exchange, Gambling, and eWallet), as classification rates did not vary significantly with batch size, suggesting that behavioral patterns did not change significantly over time. Mixer and Market class detection, however, can be negatively affected. A deeper analysis of Mining Pool behavior showed that models trained on recent data perform better than models trained on older data, suggesting that “typical” Mining Pool behavior may be better represented by recent data. This work provides a first step towards uncovering entity behavioral changes via temporal dissection of blockchain data.
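“K-fold cross-testing” across temporal batches, as used here, trains a model on one batch and evaluates it on every other batch, so the resulting matrix shows how well entity behavior in one period transfers to another. A compact sketch with synthetic entity features (the batch count, classifier, and metric are assumptions):

```python
# Cross-testing across temporal batches: train on one batch, test on all others.
# Synthetic entity features stand in for Bitcoin address/transaction features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=3000, n_features=12, n_classes=3,
                           n_informative=6, random_state=2)
n_batches = 5
batches = np.array_split(np.arange(len(X)), n_batches)   # ordered "temporal" batches

results = np.zeros((n_batches, n_batches))
for i, train_idx in enumerate(batches):
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    for j, test_idx in enumerate(batches):
        if i == j:
            continue
        results[i, j] = accuracy_score(y[test_idx], clf.predict(X[test_idx]))

print(np.round(results, 3))   # row i = trained on batch i, column j = tested on batch j
```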


2019 ◽  
Author(s):  
Abdul Karim ◽  
Vahid Riahi ◽  
Avinash Mishra ◽  
Abdollah Dehzangi ◽  
M. A. Hakim Newton ◽  
...  

Abstract Representing molecules in the form of only one type of features and using those features to predict their activities is one of the most important approaches for machine-learning-based chemical activity prediction. For molecular activities such as quantitative toxicity, the prediction performance depends on the type of features extracted and the machine learning approach used. In such cases, using one type of features and one machine learning model restricts the prediction performance to the specific representation and model used. In this paper, we study quantitative toxicity prediction and propose a machine learning model for it. Our model uses an ensemble of heterogeneous predictors instead of the typically used homogeneous predictors. The predictors we use vary in either the type of features used or the deep learning architecture employed. Each of these predictors presumably has its own strengths and weaknesses in terms of toxicity prediction. Our motivation is to build a combined model that utilizes different types of features and architectures to obtain better collective performance that could go beyond the performance of each individual predictor. We use six predictors in our model and test the model on four standard quantitative toxicity benchmark datasets. Experimental results show that our model outperforms the state-of-the-art toxicity prediction models in 8 out of 12 accuracy measures. Our experiments show that ensembling heterogeneous predictors improves performance over single predictors and over homogeneous ensembles of single predictors. The results show that each data representation or deep-learning-based predictor has its own strengths and weaknesses; thus, employing a model that ensembles multiple heterogeneous predictors could go beyond the individual performance of each data representation or predictor type.
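The combined model described here pools predictions from heterogeneous base regressors, each built on a different representation or model family, and aggregates them. A reduced sketch with two synthetic feature views and three base regressors (the views, model choices, and plain averaging rule are assumptions, not the authors' six-predictor setup):

```python
# Heterogeneous ensemble for quantitative toxicity regression: base predictors
# differ in feature view and model family; predictions are averaged.
# Feature views and model choices are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=1200, n_features=40, noise=5.0, random_state=3)
view_a, view_b = X[:, :20], X[:, 20:]           # two different molecular "views"
idx_tr, idx_te = train_test_split(np.arange(len(y)), test_size=0.25, random_state=3)

predictors = [
    (view_a, RandomForestRegressor(n_estimators=200, random_state=0)),
    (view_b, GradientBoostingRegressor(random_state=0)),
    (X,      MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)),
]

preds = []
for view, model in predictors:
    model.fit(view[idx_tr], y[idx_tr])
    p = model.predict(view[idx_te])
    preds.append(p)
    print(type(model).__name__, "RMSE:",
          round(mean_squared_error(y[idx_te], p) ** 0.5, 2))

ensemble = np.mean(preds, axis=0)               # simple average of heterogeneous predictors
print("Ensemble RMSE:", round(mean_squared_error(y[idx_te], ensemble) ** 0.5, 2))
```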


2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Xiaoshuo Li ◽  
Wenjun Tan ◽  
Pan Liu ◽  
Qinghua Zhou ◽  
Jinzhu Yang

Novel coronavirus pneumonia (NCP) has become a global pandemic disease, and computed tomography (CT)-based image analysis and recognition is one of the important tools for clinical diagnosis. In order to assist medical personnel in achieving an efficient and fast diagnosis of patients with novel coronavirus pneumonia, this paper proposes an assisted diagnosis algorithm based on ensemble deep learning. The method combines stacked generalization ensemble learning with the VGG16 deep learning network to form a cascade classifier; the information constituting the cascade classifier comes from multiple subsets of the training set, each of which is used to collect deviation information about the generalization behavior of the dataset, and this deviation information feeds the cascade classifier. The algorithm was experimentally validated for classifying patients with novel coronavirus pneumonia, patients with common pneumonia (CP), and normal controls, and it achieved a prediction accuracy of 93.57%, sensitivity of 94.21%, specificity of 93.93%, precision of 89.40%, and F1-score of 91.74% over the three categories. The results show that the method proposed in this paper has good classification performance and can significantly improve the performance of deep neural networks on multi-category prediction tasks.
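Stacked generalization combines base learners trained on different subsets of the training data by feeding their out-of-fold predictions to a meta-classifier. The sketch below shows the pattern with scikit-learn's StackingClassifier on synthetic three-class features; the paper's VGG16 base networks on CT images are replaced here by small classical stand-ins to keep the example self-contained.

```python
# Stacked generalization sketch: base learners' out-of-fold predictions become
# inputs to a meta-classifier. Classical models stand in for the paper's
# VGG16-based learners; synthetic features stand in for CT image features.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Three classes: NCP, common pneumonia (CP), and normal control.
X, y = make_classification(n_samples=1500, n_features=30, n_classes=3,
                           n_informative=10, random_state=4)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=4)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
                ("svc", SVC(probability=True, random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,                      # out-of-fold predictions feed the meta-learner
)
stack.fit(X_tr, y_tr)
print("stacked accuracy:", round(stack.score(X_te, y_te), 3))
```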

