AttCRISPR: a spacetime interpretable model for prediction of sgRNA on-target activity

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Li-Ming Xiao ◽  
Yun-Qi Wan ◽  
Zhen-Ran Jiang

Abstract Background A growing number of Cas9 variants with higher specificity are being developed to avoid off-target effects, which has produced a significant volume of experimental data. Conventional machine learning performs poorly on these datasets, while deep learning-based methods often lack interpretability, forcing researchers to trade off accuracy against interpretability. A method is needed that matches deep learning-based methods in performance while offering interpretability comparable to that of conventional machine learning methods. Results To overcome these problems, we propose AttCRISPR, an intrinsically interpretable deep learning-based method for predicting on-target activity. The advantage of AttCRISPR lies in using an ensemble learning strategy to stack available encoding-based and embedding-based methods with strong interpretability. In comparisons with state-of-the-art methods on the WT-SpCas9, eSpCas9(1.1), and SpCas9-HF1 datasets, AttCRISPR achieves average Spearman values of 0.872, 0.867, and 0.867, respectively, on these public datasets, which is superior to the other methods. Furthermore, thanks to two attention modules, one spatial and one temporal, AttCRISPR has good interpretability. Through these modules, we can understand the decisions made by AttCRISPR at both the global and local levels without additional post hoc explanation techniques. Conclusion With the trained models, we reveal the preference for each position-dependent nucleotide on the sgRNA (short guide RNA) sequence in each dataset at the global level. At the local level, we demonstrate that the interpretability of AttCRISPR can be used to guide researchers in designing sgRNAs with higher activity.
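
The Spearman values quoted above are rank correlations between predicted and measured sgRNA activity. As a rough illustration only (this is not the AttCRISPR code, and the function names and toy numbers are assumptions), a minimal pure-Python sketch of the metric could look like this:

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(predicted, measured):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(predicted), average_ranks(measured))


# Hypothetical activity scores for four guides (illustrative values only).
predicted_activity = [0.10, 0.40, 0.35, 0.80]
measured_activity = [0.20, 0.50, 0.30, 0.90]
rho = spearman(predicted_activity, measured_activity)
```

Because only the ordering matters, a model can score a high Spearman value even if its predicted activities are on a different scale from the measured ones.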

2021 ◽  
pp. 1-12
Author(s):  
Mukul Kumar ◽  
Nipun Katyal ◽  
Nersisson Ruban ◽  
Elena Lyakso ◽  
A. Mary Mekala ◽  
...  

Differentiating emotions in oral communication has long played an important role in emotion-based studies, and various algorithms have been proposed to classify kinds of emotion. However, there is no measure of the fidelity of the emotion under consideration, primarily because most readily available annotated datasets are produced by actors rather than generated in real-world scenarios. Therefore, the predicted emotion lacks an important aspect called authenticity, that is, whether an emotion is actual or stimulated. In this research work, we have developed a hybrid convolutional neural network algorithm based on transfer learning and style transfer to classify both the emotion and its fidelity. The model is trained on features extracted from a dataset that contains stimulated as well as actual utterances. We have compared the developed algorithm with conventional machine learning and deep learning techniques using metrics such as accuracy, precision, recall, and F1 score. The developed model performs much better than the conventional machine learning and deep learning models. The research aims to dive deeper into human emotion and build a model that understands it as humans do; the model achieves precision, recall, and F1 score values of 0.994, 0.996, and 0.995 for speech authenticity and 0.992, 0.989, and 0.99 for speech emotion classification, respectively.
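
For readers less familiar with the metrics named above, the following is a minimal sketch of how precision, recall, and F1 relate to one another. The function names and the confusion counts are hypothetical; only the two precision/recall pairs at the end come from the abstract.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
p, r = precision_recall(tp=90, fp=10, fn=30)
f1 = f1_score(p, r)

# Precision/recall pairs reported in the abstract.
f1_authenticity = f1_score(0.994, 0.996)
f1_emotion = f1_score(0.992, 0.989)
```

Under this definition the reported pairs give F1 values that round to 0.995 and 0.99, which agrees with the figures quoted in the abstract.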


Author(s):  
Nourhan Mohamed Zayed ◽  
Heba A. Elnemr

Deep learning (DL) is a special type of machine learning that attains great potency and flexibility by learning to represent raw input data as a nested hierarchy of concepts and representations. DL consists of more layers than conventional machine learning, which permits higher levels of abstraction and improved prediction from data; more abstract representations are computed in terms of less abstract ones. The goal of this chapter is to present an intensive survey of the existing literature on DL techniques over recent years, especially in the field of medical image analysis. All of these techniques and algorithms have their points of interest and constraints. Thus, the chapter discusses the various techniques and transformations previously reported in the literature for the design and use of DL methods from a medical image analysis perspective. The authors provide future research directions in the DL area, set out trends, and identify challenges in the medical imaging field. Furthermore, as the number of medical applications increases, extended study and investigation in the DL area becomes very significant.


2020 ◽  
Vol 12 (12) ◽  
pp. 5074
Author(s):  
Jiyoung Woo ◽  
Jaeseok Yun

Spam posts in web forum discussions cause user inconvenience and lower the value of the web forum as an open source of user opinion. Because the importance of a web post is evaluated in terms of the number of involved authors, such noise distorts the results of opinion analysis by adding unnecessary data. This work proposes an automatic detection model for spam posts in web forums using both conventional machine learning and deep learning. To automatically differentiate between normal posts and spam, evaluators were asked to recognize spam posts in advance. To construct the machine learning-based model, linguistic text features were extracted from posted content using text mining techniques, and supervised learning was performed to distinguish content noise from normal posts. For the deep learning model, raw text both including and excluding special characters was used. A comparative analysis of deep neural networks using two different recurrent neural network (RNN) models, the simple RNN and the long short-term memory (LSTM) network, was also performed. Furthermore, the proposed model was applied to two web forums. The experimental results indicate that the deep learning model affords significant improvements in accuracy over conventional machine learning with text features. The accuracy of the proposed model using LSTM reaches 98.56%, and the precision and recall of the noise class reach 99% and 99.53%, respectively.


Author(s):  
Roopa B. Hegde ◽  
Vidya Kudva ◽  
Keerthana Prasad ◽  
Brij Mohan Singh ◽  
Shyamala Guruvare

2021 ◽  
Vol 5 (1) ◽  
pp. 34-42
Author(s):  
Refika Sultan Doğan ◽  
Bülent Yılmaz

Abstract Determination of polyp types requires tissue biopsy during colonoscopy followed by histopathological examination of the microscopic images, which is tremendously time-consuming and costly. The first aim of this study was to design a computer-aided diagnosis system to classify polyp types using colonoscopy images (optical biopsy) without the need for tissue biopsy. For this purpose, two different approaches were designed based on conventional machine learning (ML) and deep learning. Firstly, classification was performed using a random forest approach with features obtained from the histogram of gradients descriptor. Secondly, a simple convolutional neural network (CNN) based architecture was built and trained on the colonoscopy images containing colon polyps. The performances of these approaches on two-category (adenoma & serrated vs. hyperplastic) and three-category (adenoma vs. hyperplastic vs. serrated) classifications were investigated. Furthermore, the effect of imaging modality on the classification was examined using white-light and narrow band imaging systems. The performance of these approaches was compared with the results obtained by 3 novice and 4 expert doctors. Two-category classification results showed that the conventional ML approach performed significantly better than the simple CNN based approach in both narrow band and white-light imaging modalities. The accuracy reached almost 95% for white-light imaging. This performance surpassed the correct classification rate of all 7 doctors. Additionally, the results of the second task (three-category) indicated that the simple CNN architecture outperformed both the conventional ML based approaches and the doctors. This study shows the feasibility of using conventional machine learning or deep learning based approaches for automatic classification of colon polyp types on colonoscopy images.


2019 ◽  
Vol 58 (01) ◽  
pp. 031-041 ◽  
Author(s):  
Sara Rabhi ◽  
Jérémie Jakubowicz ◽  
Marie-Helene Metzger

Objective The objective of this article was to compare the performance of deep learning and conventional machine learning (ML) methods for detecting health care-associated infection (HAI) in French medical reports. Methods The corpus consisted of different types of medical reports (discharge summaries, surgery reports, consultation reports, etc.). A total of 1,531 medical text documents were extracted and deidentified in three French university hospitals. Each of them was labeled as presence (1) or absence (0) of HAI. We started by normalizing the records using a list of preprocessing techniques. We calculated an overall performance metric, the F1 Score, to compare a deep learning method (convolutional neural network [CNN]) with the most popular conventional ML models (Bernoulli and multi-naïve Bayes, k-nearest neighbors, logistic regression, random forests, extra-trees, gradient boosting, support vector machines). We applied Bayesian hyperparameter optimization to each model based on its HAI identification performance. We included the text representation as an additional hyperparameter for each model, using four different text representations (bag of words, term frequency–inverse document frequency, word2vec, and GloVe). Results CNN outperforms all other conventional ML algorithms for HAI classification. The best F1 Score of 97.7% ± 3.6% and the best area under the curve score of 99.8% ± 0.41% were achieved when CNN was directly applied to the processed clinical notes without a pretrained word2vec embedding. Through receiver operating characteristic curve analysis, we could achieve a good balance between false notifications (with a specificity equal to 0.937) and system detection capability (with a sensitivity equal to 0.962) using Youden's index as the reference. Conclusions The main drawback of CNNs is their opacity. To address this issue, we investigated the activation values of the CNN's inner layers to visualize the most meaningful phrases in a document. This method could be used to build a phrase-based medical assistant algorithm to help the infection control practitioner select relevant medical records. Our study demonstrated that the deep learning approach outperforms other classification learning algorithms for automatically identifying HAIs in medical reports.
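
Youden's index, mentioned above as the reference for picking the operating point on the ROC curve, is J = sensitivity + specificity − 1. The sketch below is illustrative only: the function names and the toy scores are assumptions, and it presumes both classes are present in the labels. It is not the authors' pipeline.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic for one operating point."""
    return sensitivity + specificity - 1.0


def best_threshold(scores, labels):
    """Return (threshold, J) maximising Youden's J over the observed scores.

    A document is predicted positive when its score is >= threshold.
    Assumes labels contain at least one 1 and at least one 0.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = (None, -1.0)
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        j = youden_index(tp / positives, tn / negatives)
        if j > best[1]:  # keep the lowest threshold among ties
            best = (t, j)
    return best


# J at the operating point reported in the abstract.
j_reported = youden_index(sensitivity=0.962, specificity=0.937)

# Hypothetical classifier scores and HAI labels, for illustration only.
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_labels = [0, 0, 1, 1]
threshold, j_toy = best_threshold(toy_scores, toy_labels)
```

For the sensitivity and specificity quoted in the abstract, J comes out at about 0.899.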


Sensors ◽  
2020 ◽  
Vol 20 (24) ◽  
pp. 7078
Author(s):  
Yueting Wang ◽  
Minzan Li ◽  
Ronghua Ji ◽  
Minjuan Wang ◽  
Lihua Zheng

Visible-near-infrared (Vis-NIR) spectroscopy is one of the most important methods for non-destructive and rapid detection of soil total nitrogen (STN) content. To find a practical way to build an STN content prediction model, three conventional machine learning methods and one deep learning approach are investigated, and their predictive performances are compared and analyzed using a public dataset called LUCAS Soil (19,019 samples). The three conventional machine learning methods are ordinary least square estimation (OLSE), random forest (RF), and extreme learning machine (ELM), while for the deep learning method, three different convolutional neural network (CNN) structures incorporating the Inception module are constructed and investigated. To clarify the effectiveness of different pre-treatments for predicting STN content, the three conventional machine learning methods are combined with four pre-processing approaches (baseline correction, smoothing, dimensional reduction, and feature selection), and the combinations are investigated, compared, and analyzed. The results indicate that the baseline-corrected and smoothed ELM model reaches practical precision (coefficient of determination (R²) = 0.89, root mean square error of prediction (RMSEP) = 1.60 g/kg, and residual prediction deviation (RPD) = 2.34). Among the three differently structured CNN models, the one with more 1 × 1 convolutions performs better (R² = 0.93, RMSEP = 0.95 g/kg, and RPD = 3.85 in the optimal case). In addition, to evaluate the influence of dataset characteristics on the model, the LUCAS dataset was divided into subsets according to dataset size, organic carbon (OC) content, and country. The results show that the deep learning method is more effective and practical than conventional machine learning methods and that, given enough data samples, it can be used to build a robust STN content prediction model with high accuracy for the same type of soil with similar agricultural treatment.
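
The three figures of merit quoted above (R², RMSEP, RPD) can be computed from measured and predicted values as sketched below. This is a minimal illustration under stated assumptions: RPD is taken as the sample standard deviation of the measured values divided by RMSEP, which is one common convention and may differ from the authors' exact definition. The function name and the toy data are hypothetical.

```python
import math


def regression_metrics(measured, predicted):
    """Return (R2, RMSEP, RPD) for paired measured and predicted values.

    Assumes at least two samples, non-constant measured values, and
    at least one non-zero prediction error (otherwise RPD is undefined).
    """
    n = len(measured)
    mean = sum(measured) / n
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    r2 = 1.0 - ss_res / ss_tot            # coefficient of determination
    rmsep = math.sqrt(ss_res / n)         # root mean square error of prediction
    sd = math.sqrt(ss_tot / (n - 1))      # sample standard deviation (assumed convention)
    rpd = sd / rmsep                      # residual prediction deviation
    return r2, rmsep, rpd


# Hypothetical STN values in g/kg, for illustration only.
measured_stn = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted_stn = [1.1, 1.9, 3.2, 3.8, 5.0]
r2, rmsep, rpd = regression_metrics(measured_stn, predicted_stn)
```

Since RPD scales inversely with RMSEP for a fixed dataset, the lower RMSEP of the CNN model goes hand in hand with its higher RPD.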


2020 ◽  
Vol 3 (3) ◽  
pp. 202-213
Author(s):  
Lu Chen ◽  
Chunchao Xia ◽  
Huaiqiang Sun

ABSTRACT Deep learning (DL) is a recently proposed subset of machine learning methods that has gained extensive attention in the academic world, breaking benchmark records in areas such as visual recognition and natural language processing. Unlike conventional machine learning algorithms, DL is able to learn useful representations and features directly from raw data through hierarchical nonlinear transformations. Because of its ability to detect abstract and complex patterns, DL has been used in neuroimaging studies of psychiatric disorders, which are characterized by subtle and diffuse alterations. Here, we provide a brief review of recent advances and associated challenges in neuroimaging studies that apply DL to psychiatric disorders. The results of these studies indicate that DL could be a powerful tool in assisting the diagnosis of psychiatric diseases. We conclude our review by clarifying the main promises and challenges of DL application in psychiatric disorders, and possible directions for future research.


Author(s):  
Zhewei Ye ◽  
Qinjue Yi

At present, beam pumping units are the most extensively applied component in rod pumping systems, and analysis of the indicator diagram of a rod pump is an important means of judging its downhole working condition. However, manual study and judgment of the indicator diagram has low efficiency, large error, and poor immediacy, and it is difficult to apply the conclusions promptly and accurately to adjust the operating parameters of the pumping units. Moreover, expert systems rely on expert experience, and conventional machine learning requires manual pre-selection of geometric features such as moments and vector curves, which reduces recognition accuracy when similar indicator diagrams appear. To address these technical defects, this paper proposes a deep-learning convolutional neural network (CNN) model based on AlexNet. Automatic recognition of the indicator diagram is thus realized, and, building on previous studies, the model simplifies the network structure and takes into account 15 common downhole working conditions of the pumping unit. In this model, a batch normalization (BN) layer is used to replace the local response normalization (LRN) and dropout layers, and all kinds of indicator diagrams are put into the same model framework for automatic identification. Experimental application to measured data shows that the model not only has a short training time but also achieves a working-condition diagnosis accuracy of 96.05%, which can address the deficiencies of manual identification, expert systems, and conventional machine learning to a certain extent. A deep-learning CNN can provide a new reference for fast working-condition diagnosis from indicator diagrams, making indicator-diagram judgment timely and accurate, and thus can provide a direct basis for parameter adjustment of pumping units.
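
As background for the batch normalization (BN) layer mentioned above, the sketch below shows the training-mode BN transform for a single feature across a mini-batch: subtract the batch mean, divide by the batch standard deviation, then apply a learnable scale and shift. It is a generic illustration, not the authors' network; the function name, the parameter defaults, and the toy activations are assumptions.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise one feature across a mini-batch using batch statistics.

    gamma and beta stand in for the learnable scale and shift;
    eps avoids division by zero when the batch variance is tiny.
    """
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n  # biased (population) variance
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]


# Hypothetical activations of one channel for a mini-batch of four samples.
activations = [1.0, 2.0, 3.0, 4.0]
normalised = batch_norm(activations)
```

With the default gamma and beta, the output has approximately zero mean and unit variance. At inference time, BN layers typically use running averages of the mean and variance rather than per-batch statistics, which this sketch does not model.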

