DNA sequences performs as natural language processing by exploiting deep learning algorithm for the identification of N4-methylcytosine

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Abdul Wahab ◽  
Hilal Tayara ◽  
Zhenyu Xuan ◽  
Kil To Chong

Abstract: N4-methylcytosine is a biochemical alteration of DNA that affects genetic operations, such as gene expression, genomic imprinting, chromosome stability, and cell development, without modifying the DNA nucleotides. In the proposed work, a computational model, 4mCNLP-Deep, uses a word embedding approach as the vector formulation and a deep learning based CNN algorithm to predict 4mC and non-4mC sites on the C. elegans genome dataset. A range of settings, such as the corpus k-mer size and the number of folds in k-fold cross-validation, was explored in the experiments to obtain the best performance. 4mCNLP-Deep outperforms the state-of-the-art predictor on five evaluation metrics: accuracy (ACC) of 0.9354, Matthews correlation coefficient (MCC) of 0.8608, specificity (Sp) of 0.8996, sensitivity (Sn) of 0.9563, and area under the curve (AUC) of 0.9731, using a 3-mer corpus word2vec and 3-fold cross-validation, with increments of 1.1%, 0.6%, 0.58%, 0.77%, and 4.89%, respectively. Finally, we developed an online webserver, http://nsclbio.jbnu.ac.kr/tools/4mCNLP-Deep/, so that experimental researchers can obtain results easily.
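The idea of treating a DNA sequence as text can be illustrated with a minimal sketch. The abstract states only that a 3-mer corpus is fed to word2vec; the overlapping, stride-1 tokenization below and the function name `kmer_tokens` are assumptions made for illustration, not the authors' published code.

```python
def kmer_tokens(sequence: str, k: int = 3) -> list[str]:
    """Split a DNA sequence into overlapping k-mers ("words").

    Assumption: a sliding window with stride 1. The abstract does not
    say whether the k-mers overlap.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    sequence = sequence.upper()
    # A sequence of length n yields n - k + 1 overlapping k-mers.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    # Each k-mer plays the role of a word, and the token list the role
    # of a sentence, when training a word-embedding model.
    print(kmer_tokens("ATGCCGTA", k=3))
    # → ['ATG', 'TGC', 'GCC', 'CCG', 'CGT', 'GTA']
```

The resulting token lists could then be passed to any word-embedding trainer as sentences.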

Author(s):  
Usman Ahmed ◽  
Jerry Chun-Wei Lin ◽  
Gautam Srivastava

Deep learning methods have led to state-of-the-art medical applications, such as image classification and segmentation. Data-driven deep learning applications can help stakeholders to collaborate. However, a limited labelled data set restricts the ability of a deep learning algorithm to generalize from one domain to another. To handle this problem, meta-learning helps to learn from a small set of data. We propose a meta learning-based image segmentation model that combines the learning of state-of-the-art models and uses it to achieve domain adaptation and high accuracy. We also propose a preprocessing algorithm to increase the usability of the segmented parts and remove noise from a new test image. The proposed model achieves 0.94 precision and 0.92 recall, an improvement of 3.3% over state-of-the-art algorithms.


A great deal of research has gone into natural language processing and into state-of-the-art deep learning algorithms that help convert English text unambiguously into a data structure without loss of meaning. The advent of neural networks that learn word representations as vectors has also helped to revolutionize automatic feature extraction from text corpora. Combining word embeddings with a deep learning algorithm such as a convolutional neural network has improved accuracy in text classification. In this era of the Internet of Things, with voluminous amounts of data overwhelming users, determining the veracity of the data is a very challenging task. There are many truth discovery algorithms in the literature that help resolve the conflicts that arise from multiple sources of data. These algorithms help estimate the trustworthiness of the data and the reliability of the sources. In this paper, a convolution-based truth discovery approach with multitasking is proposed to estimate the genuineness of the data for a given text corpus. The proposed algorithm was tested on analysing the genuineness of the Quora questions dataset, and experimental results showed improved accuracy and speed over other existing approaches.


2014 ◽  
Vol 641-642 ◽  
pp. 1287-1290
Author(s):  
Lan Zhang ◽  
Yu Feng Nie ◽  
Zhen Hai Wang

A deep neural network, as part of a deep learning algorithm, is a state-of-the-art approach for finding higher-level representations of input data, and it has been applied successfully to many practical and challenging learning problems. The primary goal of deep learning is to use large amounts of data to help solve a given machine learning task. We propose a methodology for image de-noising based on this model and train it on a large image database to obtain the experimental output. The results show the robustness and efficiency of our algorithm.


2020 ◽  
Vol 12 (4) ◽  
pp. 68-81
Author(s):  
Xin Zheng ◽  
Jun Li ◽  
Qingrong Wu

With the explosive growth of we-media today, personalized recommendation plays an increasingly important role in helping users find their target articles in vast amounts of data. Deep learning, meanwhile, has shown good results in image processing, computer vision, natural language processing, and other fields, but it has seen relatively little application to we-media article recommendation. Drawing on the new features of we-media articles, this paper puts forward a recommendation algorithm for we-media articles based on a topic model, Latent Dirichlet Allocation (LDA), and a deep learning algorithm, Recurrent Neural Networks (RNNs). Experiments on real datasets show that the combined method outperforms traditional collaborative filtering recommendation and non-personalized recommendation methods.


Author(s):  
L. Meyer ◽  
F. Lemarchand ◽  
P. Sidiropoulos

Abstract. The accurate split of large areas of land into discrete fields is a crucial step for several agriculture-related remote sensing pipelines. This work aims to fully automate this tedious and resource-demanding process using a state-of-the-art deep learning algorithm with only a single Sentinel-2 image as input. The Mask R-CNN, which has built its success on instance segmentation of objects from everyday life, is adapted for the field boundary detection problem. Such a model automatically generates closed geometries without any heavy post-processing. When tested with satellite imagery from Denmark, this tailored model correctly predicts field boundaries with an overall accuracy of 0.79. It also generalises robustly across geographies, reaching an overall accuracy of 0.71 when used over areas in France.


Agronomy ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 2364
Author(s):  
Ali Mirzazadeh ◽  
Afshin Azizi ◽  
Yousef Abbaspour-Gilandeh ◽  
José Luis Hernández-Hernández ◽  
Mario Hernández-Hernández ◽  
...  

Estimation of crop damage plays a vital role in the management of fields in the agriculture sector. An accurate measure of it provides key guidance to support agricultural decision-making systems. The objective of the study was to propose a novel technique for classifying damaged crops based on a state-of-the-art deep learning algorithm. To this end, a dataset of rapeseed field images was gathered from the field after bird attacks. The dataset consisted of three classes: undamaged, partially damaged, and fully damaged crops. VGG16 and ResNet50, as pre-trained deep convolutional neural networks, were used to classify these classes. The overall classification accuracy reached 93.7% and 98.2% for the VGG16 and ResNet50 algorithms, respectively. The results indicated that a deep neural network is highly capable of distinguishing and categorizing different image-based datasets of rapeseed. The findings also revealed the great potential of deep learning-based models for classifying other damaged crops.


2020 ◽  
Author(s):  
Shufang Wu ◽  
Zhencheng Fang ◽  
Jie Tan ◽  
Mo Li ◽  
Chunhui Wang ◽  
...  

Background: Prokaryotic viruses, referred to as phages, can be divided into virulent and temperate phages. Distinguishing virulent and temperate phage-derived sequences in metavirome data is important for understanding their role in interactions with bacterial hosts and in the regulation of microbial communities. However, there is no experimental or computational approach that effectively classifies sequences of these two types in culture-independent metavirome data. We present a new computational method, DeePhage, which can directly and rapidly judge each read or contig as a virulent or temperate phage-derived fragment. Findings: DeePhage uses a “one-hot” encoding to give an overall and detailed representation of DNA sequences. Sequence signatures are detected via a deep learning algorithm, namely a convolutional neural network, to extract valuable local features. DeePhage performs better than the most closely related method, PHACTS. The accuracy of DeePhage in five-fold validation reaches as high as 88%, nearly 30% higher than PHACTS. Evaluation on a real metavirome shows that DeePhage annotated 54.4% of reliable contigs, while PHACTS annotated 44.5%. Running on the same machine, DeePhage reduces computational time relative to PHACTS by a factor of 810. In addition, we propose a new strategy to explore phage transformations in the microbial community by direct detection of temperate viral fragments from metagenome and metavirome data. The detectable transformation of temperate phages provides new insight into potential treatments for human disease. Conclusions: DeePhage is the first tool that can rapidly and efficiently identify two kinds of phage fragments, especially for metagenomics analysis, with satisfactory performance. DeePhage is freely available via http://cqb.pku.edu.cn/ZhuLab/DeePhage or https://github.com/shufangwu/DeePhage.
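A “one-hot” encoding of DNA can be sketched as follows. This is a generic illustration rather than DeePhage's actual code: the column order `ACGT`, the all-zero row for ambiguous bases such as `N`, and the function name `one_hot` are assumptions.

```python
ALPHABET = "ACGT"  # assumed column order; not taken from the abstract


def one_hot(sequence: str) -> list[list[int]]:
    """Encode a DNA sequence as one row of four 0/1 values per base.

    Assumption: any character outside ACGT (for example N) is encoded
    as an all-zero row.
    """
    rows = []
    for base in sequence.upper():
        row = [0] * len(ALPHABET)
        index = ALPHABET.find(base)
        if index >= 0:
            row[index] = 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    for base, row in zip("ACGN", one_hot("ACGN")):
        print(base, row)
```

A matrix of this shape (sequence length by 4) is the kind of input a convolutional network can scan for local sequence signatures.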


2021 ◽  
Vol 8 ◽  
Author(s):  
Edoardo Cipolletta ◽  
Maria Chiara Fiorentino ◽  
Sara Moccia ◽  
Irene Guidotti ◽  
Walter Grassi ◽  
...  

Objectives: This study aims to develop an automatic deep-learning algorithm, based on Convolutional Neural Networks (CNNs), for ultrasound informative-image selection of hyaline cartilage at the metacarpal head level. The performance of the algorithm and that of three beginner sonographers were compared with an expert assessment, which was considered the gold standard. Methods: The study was divided into two steps. In the first, an automatic deep-learning algorithm for image selection was developed using 1,600 ultrasound (US) images of the metacarpal head cartilage (MHC) acquired in 40 healthy subjects using a very high-frequency probe (up to 22 MHz). The task of the algorithm was to identify US images defined as informative, meaning that they show enough information to fulfill the Outcome Measure in Rheumatology US definition of healthy hyaline cartilage. The algorithm relied on the VGG16 CNN, which was fine-tuned to classify US images as informative or non-informative. A repeated leave-four-subjects-out cross-validation was performed using the expert sonographer assessment as the gold standard. In the second step, the expert assessed the ability of the algorithm and of the beginner sonographers to obtain informative US images of the MHC. Results: The VGG16 CNN showed excellent performance in the first step, with a mean area under the receiver operating characteristic curve (AUC), computed among the 10 models obtained from cross-validation, of 0.99 ± 0.01. The model that reached the best AUC on the testing set, which we named “MHC identifier 1,” was then evaluated by the expert sonographer. The agreement between the algorithm and the expert sonographer was almost perfect [Cohen's kappa: 0.84 (95% confidence interval: 0.71–0.98)], whereas the agreement between the expert and the beginner sonographers using conventional assessment was moderate [Cohen's kappa: 0.63 (95% confidence interval: 0.49–0.76)]. The conventional acquisition of US images by beginner sonographers required 6.0 ± 1.0 min, whereas US videoclip acquisition by a beginner sonographer lasted only 2.0 ± 0.8 min. Conclusion: This study paves the way for the automatic identification of informative US images for assessing MHC. This may redefine the reliability of US in the evaluation of MHC integrity, especially in terms of intrareader reliability, and may support beginner sonographers during US training.
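Cohen's kappa, the agreement statistic reported above, corrects the observed agreement between two raters for the agreement expected by chance. The following is a minimal sketch of the standard unweighted formula; the example labels are invented for illustration and are not data from the study.

```python
from collections import Counter


def cohens_kappa(rater_a: list, rater_b: list) -> float:
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one label only")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical labels: 1 = informative image, 0 = non-informative image.
    expert = [1, 1, 1, 0, 0, 0, 1, 0]
    algorithm = [1, 1, 0, 0, 0, 1, 1, 0]
    print(cohens_kappa(expert, algorithm))
```

In this toy example the raters agree on 6 of 8 items (0.75) while chance agreement is 0.5, which gives a kappa of 0.5.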


2018 ◽  
Vol 4 ◽  
pp. e167 ◽  
Author(s):  
Iam Palatnik de Sousa

A learning algorithm is proposed for the task of Arabic Handwritten Character and Digit recognition. The architecture consists of an ensemble of different Convolutional Neural Networks. The proposed training algorithm uses a combination of adaptive gradient descent in the first epochs and regular stochastic gradient descent in the last epochs, to facilitate convergence. Different validation strategies are tested, namely Monte Carlo Cross-Validation and K-fold Cross-Validation. Hyper-parameter tuning was done using the MADbase digits dataset. State-of-the-art validation and testing classification accuracies were achieved, with average values of 99.74% and 99.47% respectively. The same algorithm was then trained and tested with the AHCD character dataset, also yielding state-of-the-art validation and testing classification accuracies: 98.60% and 98.42% respectively.
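The two validation strategies named above differ in how the validation indices are drawn. The sketch below contrasts them using only the standard library. The fold count, validation fraction, seeds, and function names are assumptions, since the abstract does not state the split sizes used.

```python
import random
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]  # (training indices, validation indices)


def k_fold_splits(n_samples: int, k: int, seed: int = 0) -> Iterator[Split]:
    """K-fold: shuffle once, then each sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal disjoint folds
    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


def monte_carlo_splits(n_samples: int, n_repeats: int,
                       val_fraction: float = 0.2, seed: int = 0) -> Iterator[Split]:
    """Monte Carlo: draw an independent random split on every repeat,
    so a sample may be validated several times or never."""
    rng = random.Random(seed)
    n_val = max(1, round(n_samples * val_fraction))
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        yield indices[n_val:], indices[:n_val]


if __name__ == "__main__":
    for training, validation in k_fold_splits(10, k=5):
        print("k-fold validation fold:", sorted(validation))
    for training, validation in monte_carlo_splits(10, n_repeats=3):
        print("Monte Carlo validation set:", sorted(validation))
```

K-fold guarantees full coverage of the data with a fixed number of models, while Monte Carlo cross-validation lets the number of repeats and the split ratio be chosen independently.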

