Hybrid Deep Neural Network for Handling Data Imbalance in Precursor MicroRNA

2021 ◽  
Vol 9 ◽  
Author(s):  
Elakkiya R. ◽  
Deepak Kumar Jain ◽  
Ketan Kotecha ◽  
Sharnil Pandya ◽  
Sai Siddhartha Reddy ◽  
...  

Over the last decade, the field of bioinformatics has grown rapidly, and robust bioinformatics tools will play a vital role in future progress. Researchers in the field conduct extensive research to extract knowledge from the available biological data. The creation of massive amounts of imbalanced data has given rise to several bioinformatics problems. The classification of precursor microRNA (pre-miRNA) from imbalanced RNA genome data is one such problem. Studies have shown that pre-miRNAs can act as oncogenes or tumor suppressors in various cancer types. This paper introduces a Hybrid Deep Neural Network framework (H-DNN) for the classification of pre-miRNA in imbalanced data. The proposed H-DNN framework integrates Deep Artificial Neural Networks (Deep ANN) and Deep Decision Tree Classifiers. The Deep ANN extracts meaningful features, and the Deep Decision Tree Classifier classifies the pre-miRNA accurately. H-DNN was evaluated on genomes of animals, plants, humans, and Arabidopsis with imbalance ratios up to 1:5000, and on virus genomes with a ratio of 1:400. Experimental results showed an accuracy of more than 99% in all cases, and the time complexity of the proposed H-DNN is also much lower than that of existing approaches.
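
At imbalance ratios as extreme as 1:5000, overall accuracy is usually read alongside per-class metrics such as sensitivity and specificity. The sketch below is not the H-DNN method; it only works through the arithmetic for a hypothetical trivial classifier that labels every sequence as negative. The function name and the sample counts are illustrative assumptions.

```python
def majority_baseline_metrics(n_positive: int, n_negative: int) -> dict:
    """Metrics for a hypothetical classifier that predicts 'negative' for every sample."""
    total = n_positive + n_negative
    true_negatives = n_negative  # every negative sample is labelled correctly
    true_positives = 0           # no positive sample is ever detected
    return {
        "accuracy": (true_positives + true_negatives) / total,
        "sensitivity": true_positives / n_positive,
        "specificity": true_negatives / n_negative,
    }


# One positive (pre-miRNA) sample for every 5000 negatives, as in a 1:5000 ratio.
metrics = majority_baseline_metrics(n_positive=1, n_negative=5000)
print(metrics)
```

Under these assumptions the baseline's accuracy is 5000/5001 while its sensitivity is zero, which is why a classifier for such data is typically judged on more than accuracy.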

Author(s):  
S. Neelakandan ◽  
D. Paulraj

People communicate their views, arguments and emotions about their everyday lives on social media (SM) platforms (e.g. Twitter and Facebook). Twitter is an international micro-blogging service built around brief messages called tweets. Freestyle writing, incorrect grammar, typographical errors and abbreviations are some of the noise that occurs in the text. Sentiment analysis (SA) of tweets posted by users and opinion mining (OM) of customer reviews are popular research topics. The texts are gathered from users’ tweets by means of OM and automatic SA centered on ternary classification, namely positive, neutral and negative. Ascertaining sentiment in Twitter data is very challenging for researchers because of the limited size, misspellings, unstructured nature, abbreviations and slang of the text. This paper proposes an efficient SA and Sentiment Classification (SC) of Twitter data using the Gradient Boosted Decision Tree (GBDT) classifier. First, the Twitter data undergoes pre-processing. Next, the pre-processed data is processed using HDFS MapReduce. Features are then extracted from the processed data, and efficient features are selected using the Improved Elephant Herd Optimization (I-EHO) technique. Score values are calculated for each of the chosen features and given to the classifier. Finally, the GBDT classifier classifies the data as negative, positive, or neutral. Experimental results are analyzed and contrasted with other conventional techniques to show the superior performance of the proposed method.
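
The abstract does not say which pre-processing steps are applied to the tweets. As a rough illustration only, the sketch below assumes a common set of steps (dropping URLs and user mentions, removing the hash sign, lower-casing, collapsing whitespace); the function name `clean_tweet` and the sample tweet are hypothetical and not taken from the paper.

```python
import re


def clean_tweet(text: str) -> str:
    """Apply an assumed, minimal set of tweet clean-up steps."""
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = text.replace("#", " ")              # keep the hashtag word, drop the sign
    text = text.lower()
    return " ".join(text.split())              # collapse runs of whitespace


sample = "Loving the new phone!! http://t.co/xyz @Brand #happy"
print(clean_tweet(sample))
```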


Water ◽  
2020 ◽  
Vol 12 (8) ◽  
pp. 2249
Author(s):  
Ghorban Mahtabi ◽  
Barkha Chaplot ◽  
Hazi Mohammad Azamathulla ◽  
Mahesh Pal

This paper presents a decision tree classification of hydraulic jumps over rough beds based on the approach Froude number, Fr1. Specifically, 581 datasets from the literature were analyzed. Of these, 280 datasets were for natural rough beds and 301 were for artificial rough beds. The data were divided into four classes based on the energy losses. To compare the performance of the decision tree classifier (J48), a multi-layer neural network (NN) was used. The results suggest an improved performance in terms of classification accuracy by the J48 algorithm in comparison to the NN classifier. The classifier model had only four leaves and achieved an accuracy of 91.56%. Furthermore, classification results showed that the first class (A) of hydraulic jump over the rough beds is approximately similar to that for the smooth bed. Moreover, in the next three classes (B, C, and D), upper values of Fr1 decreased with respect to the smooth bed classes. Lastly, in class D, the upper value of Fr1 reduced to 7.45, which indicates that the shear stress (i.e., the energy loss) grows sharply with increasing Fr1. Put simply, bed roughness effectively increases the energy dissipation with an increase in Fr1.
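
The approach Froude number is conventionally defined as Fr1 = V1 / sqrt(g y1), where V1 and y1 are the flow velocity and depth just upstream of the jump. The abstract does not restate this definition, so the sketch below assumes it; the input values are illustrative and do not come from the paper's datasets.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def approach_froude_number(velocity_m_s: float, depth_m: float) -> float:
    """Fr1 = V1 / sqrt(g * y1) for the flow approaching the jump."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return velocity_m_s / math.sqrt(G * depth_m)


# Illustrative supercritical approach flow: 4 m/s at a depth of 5 cm.
fr1 = approach_froude_number(velocity_m_s=4.0, depth_m=0.05)
print(f"Fr1 = {fr1:.2f}")
```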


Author(s):  
V. Jinubala ◽  
P. Jeyakumar

Aims: To classify rice pest data based on weather attributes using a machine learning approach, a decision tree classifier, and to validate the performance results through comparison with other existing techniques. Design: Rice pest classification using the C5.0 algorithm. Methodology: We collected rice pest data from crop fields in various regions of the state of Maharashtra, India. The dataset contains the name of the region (Taluk), period (week), pest data, temperature, rainfall, and relative humidity. The data were collected from 39 taluks within four districts in different weeks of 2013-2014. Weather information plays a vital role in this analysis because pest infestation varies with the weather in all regions. The pests considered in this research are Yellow Stem borer, Gall midge, Leaf folder, and Planthopper. The collected dataset is given as input to the classifier, with 75% of the data used for training and 25% used for testing. Results: The proposed C5.0 algorithm performed better than the other techniques in classifying the rice pest dataset based on weather attributes. It achieved 88.99% accuracy, 78.81% sensitivity, and 89.11% specificity. Compared with the other methods, the C5.0 algorithm improved accuracy by 1.3 to 8.5%, sensitivity by 2.4 to 9%, and specificity by 0.8 to 7.8%. Conclusion: Early detection of pests and pest-based diseases is essential to avoid major crop losses. The proposed classification model is designed to classify the level of pest infestation based on weather attributes, as the level of infestation caused by rice pests varies with weather conditions. The C5.0 algorithm classified the rice pest data based on the weather attributes in the dataset.
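
The 75%/25% division of the dataset can be sketched as a shuffled hold-out split. The abstract does not say whether the split was random or stratified, so the shuffling, the seed and the helper name `train_test_split` below are assumptions, and the integer records stand in for the real weekly pest observations.

```python
import random


def train_test_split(records, train_fraction=0.75, seed=0):
    """Shuffle a copy of the records and cut it into training and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


records = list(range(100))  # placeholder records, not real pest data
train, test = train_test_split(records)
print(len(train), len(test))
```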


2020 ◽  
Vol 2020 ◽  
pp. 1-13 ◽  
Author(s):  
Majid Nour ◽  
Kemal Polat

Hypertension (high blood pressure) is a common and important disease, and its early detection is significant for early treatment. Hypertension is defined as systolic blood pressure higher than 140 mmHg or diastolic blood pressure higher than 90 mmHg. In this paper, to detect hypertension types from personal information and features, four machine learning (ML) methods, namely the C4.5 decision tree classifier (DTC), random forest, linear discriminant analysis (LDA), and linear support vector machine (LSVM), have been used and compared with each other. We are the first in the literature to classify hypertension types using classification algorithms based on personal data. To examine how the results vary with the classifier type, four different classifier algorithms were selected for this problem. The hypertension dataset has eight features that describe the hypertension status: sex, age, height (cm), weight (kg), systolic blood pressure (mmHg), diastolic blood pressure (mmHg), heart rate (bpm), and BMI (kg/m²). There are four classes: normal (healthy), prehypertension, stage-1 hypertension, and stage-2 hypertension. The classification accuracies obtained on the hypertension dataset are 99.5%, 99.5%, 96.3%, and 92.7% using the C4.5 decision tree classifier, random forest, LDA, and LSVM, respectively. The results show that ML methods could be confidently used in the automatic determination of hypertension types.
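
Two quantities in this abstract are simple enough to write down directly: the BMI feature, computed from height and weight, and the stated threshold rule for hypertension. The sketch below is only that; it does not model the four classes, whose boundaries the abstract does not give, and the function names are illustrative.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2 from weight in kg and height in cm."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def exceeds_hypertension_threshold(systolic_mmhg: float, diastolic_mmhg: float) -> bool:
    """Threshold rule as stated in the abstract: above 140 systolic or above 90 diastolic."""
    return systolic_mmhg > 140 or diastolic_mmhg > 90


print(round(bmi(weight_kg=70, height_cm=175), 2))
print(exceeds_hypertension_threshold(150, 80))
```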


2018 ◽  
Vol 7 (3.6) ◽  
pp. 154
Author(s):  
S K. Sajan ◽  
M Germanus Alex

Breast cancer is a major threat that humans face irrespective of geographical limits. Awareness of breast cancer has increased during the last decade, and many preventive measures are in practice to detect breast cancer before symptoms are felt. Mammography is a screening methodology currently in practice. In this paper, the mammogram image is analyzed using an automated system designed to classify the mammogram image as normal or malignant. This process involves image enhancement and image segmentation at the preprocessing level. The histogram equalization technique is used to transform low-contrast regions of the mammogram into regions with higher contrast, and the Fuzzy C Means (FCM) algorithm is used to segment the mammogram image into regions suitable for further analysis. After enhancement and segmentation, classification is done using three classification algorithms: a decision tree classifier, a Neural Network classifier and a Support Vector Machine (SVM). The performance of the classification algorithms is evaluated on speed, flexibility, robustness, scalability, interpretability and time complexity, and also on accuracy, sensitivity and specificity. The results obtained in classification are compared with other classification algorithms. It is found that the neural network classifier produces better results than the other classifiers. The average accuracy in diagnosis by the Neural Network classifier is around 91%. It is also found that the decision tree approach is more flexible and easier to use than the other approaches.
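
Histogram equalization remaps grey levels through the image's cumulative histogram so that a narrow intensity range is stretched over the full scale. The sketch below shows the textbook global form on a tiny list-of-lists image; whether the paper uses this global variant or a local one is not stated, and the 2x2 image is a made-up example rather than mammogram data.

```python
def equalize_histogram(image, levels=256):
    """Globally equalize a grayscale image given as rows of ints in [0, levels - 1]."""
    pixels = [p for row in image for p in row]
    histogram = [0] * levels
    for p in pixels:
        histogram[p] += 1

    # Cumulative count of pixels at or below each grey level.
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)  # cumulative count at the darkest level present
    if total == cdf_min:
        return [row[:] for row in image]     # constant image: nothing to stretch

    # Map each level so the darkest present level goes to 0 and the brightest to levels - 1.
    mapping = [max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min))) for c in cdf]
    return [[mapping[p] for p in row] for row in image]


low_contrast = [[100, 101], [102, 103]]
equalized = equalize_histogram(low_contrast)
print(equalized)
```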


Oncogene ◽  
2021 ◽  
Author(s):  
Dvir Netanely ◽  
Stav Leibou ◽  
Roma Parikh ◽  
Neta Stern ◽  
Hananya Vaknine ◽  
...  

Cutaneous melanoma tumors are heterogeneous and show diverse responses to treatment. Identification of robust molecular biomarkers for classifying melanoma tumors into clinically distinct and homogenous subtypes is crucial for improving the diagnosis and treatment of the disease. In this study, we present a classification of melanoma tumors into four subtypes with different survival profiles based on three distinct gene expression signatures: keratin, immune, and melanogenesis. The melanogenesis expression pattern includes several genes that are characteristic of the melanosome organelle and correlates with worse survival, suggesting the involvement of melanosomes in melanoma aggression. We experimentally validated the secretion of melanosomes into surrounding tissues by melanoma tumors, which potentially affects the lethality of metastasis. We propose a simple molecular decision tree classifier for predicting a tumor’s subtype based on representative genes from the three identified signatures. Key predictor genes were experimentally validated on melanoma samples taken from patients with varying survival outcomes. Our three-pattern approach for classifying melanoma tumors can contribute to advancing the understanding of melanoma variability and promote accurate diagnosis, prognostication, and treatment.

