Salient Features, data and algorithms for microRNA screening from plants: A review on the gains and pitfalls of machine learning techniques

2020 ◽  
Vol 15 ◽  
Author(s):  
Garima Ayachit ◽  
Inayatullah Shaikh ◽  
Himanshu Pandya ◽  
Jayashankar Das

The era of Big Data and high-throughput genomic technology has given scientists a clear view of plant genomic profiles. However, it has also created a massive need for computational tools and strategies to interpret this data. Amid this huge data inflow, machine learning (ML) approaches are emerging as the most promising for analysing heterogeneous and unstructured biological datasets. With applications extending to healthcare and agriculture, ML approaches are proving useful for microRNA (miRNA) screening as well. Identification of miRNAs is a crucial step towards understanding post-transcriptional gene regulation and miRNA-related pathology. The use of ML tools is becoming indispensable in analysing such data and identifying species-specific, non-conserved miRNAs. However, these techniques have their own benefits and lacunae. In this review, we discuss the current scenario and pitfalls of ML-based tools for plant miRNA identification and provide some insights into the important features, the need for deep learning models, and the directions in which further studies are needed.

Prediction of diseases is one of the challenging tasks in the healthcare domain. Conventionally, heart diseases were diagnosed by experienced medical professionals and cardiologists with the help of medical and clinical tests. With conventional methods, even experienced medical professionals struggled to predict the disease with sufficient accuracy. In addition, manually analysing and extracting useful knowledge from archived disease data is time-consuming and often infeasible. The advent of machine learning techniques enables the prediction of various diseases in the healthcare domain. Machine learning algorithms are trained on existing historical data, and the resulting prediction models are used to make predictions on unseen raw data. For the past two decades, machine learning techniques have been extensively employed for disease prediction. Although machine learning algorithms can learn from huge volumes of historical data, that data is typically stored in data marts and data warehouses built on traditional database technologies such as Oracle OnLine Analytical Processing (OLAP), and these conventional database technologies cannot handle data that is very large, unstructured, or arriving at high speed. In this context, big data tools and technologies play a major role in storing huge data and facilitating its processing. In this paper, an approach is proposed for the prediction of heart diseases using the Support Vector Machine algorithm in a Spark environment. The Support Vector Machine is basically a binary classifier that handles both linearly and non-linearly separable input data. With the help of different kernel functions, it maps non-linear data into a space where a separating hyperplane can be found. Spark is a distributed big data processing platform whose distinctive feature is keeping and processing huge data in memory. The proposed approach is tested with a benchmark dataset from the UCI repository, and the results are discussed.
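The role of kernel functions mentioned in the abstract above can be illustrated with a minimal sketch. It does not reproduce the paper's Spark-based approach; the function names (`poly2_kernel`, `poly2_feature_map`, `rbf_kernel`) and the `gamma` default are assumptions chosen for illustration. It shows that a degree-2 polynomial kernel equals an ordinary dot product in an explicitly expanded feature space, which is why a kernel lets a linear separator act on non-linear data.

```python
import math
from typing import Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def poly2_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Homogeneous degree-2 polynomial kernel: k(x, y) = (x . y)^2."""
    return dot(x, y) ** 2


def poly2_feature_map(x: Tuple[float, float]) -> Tuple[float, float, float]:
    """Explicit feature map for the 2-D case, chosen so that
    dot(phi(x), phi(y)) == (x . y)^2."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float = 0.5) -> float:
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2).
    gamma = 0.5 is an arbitrary illustrative default."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


if __name__ == "__main__":
    x, y = (1.0, 2.0), (3.0, 0.5)
    # The kernel value and the dot product in the expanded space agree.
    print(poly2_kernel(x, y), dot(poly2_feature_map(x), poly2_feature_map(y)))
    print(rbf_kernel(x, y))
```

The kernel never builds the expanded vector, which is what keeps the computation cheap when the implicit feature space is large.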


Author(s):  
Shriya Salunkhe ◽  
Kiran Bhowmick ◽  

In recent years, multi-label classification has become common. Multi-label classification associates a collection of labels with a single instance and can be viewed as a variation of single-label classification. With large datasets, the difficulty is that instances are of different kinds and each may belong to more than one class. This paper discusses various machine learning strategies for classifying multi-label data. Many studies have addressed the grouping of multiple labels. Here we compare classification techniques that follow two approaches: the adapted algorithm approach and the problem transformation method. We use multinomial naive Bayes and logistic regression, and we apply several evaluation metrics to assess the differences between them. Better classification methods are also discussed.
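As a hedged illustration of the problem transformation approach named above, the sketch below uses binary relevance, one common transformation that splits a multi-label dataset into one binary dataset per label. The abstract does not say which transformation the authors used, so this choice, the function name `binary_relevance_split`, and the toy data are all assumptions.

```python
from typing import Dict, List, Sequence, Set, Tuple


def binary_relevance_split(
    instances: Sequence[str],
    label_sets: Sequence[Set[str]],
) -> Dict[str, List[Tuple[str, int]]]:
    """Turn one multi-label dataset into one binary dataset per label.

    For each label, every instance gets target 1 if it carries that
    label and 0 otherwise. Any binary classifier can then be trained
    independently on each of the resulting datasets.
    """
    if len(instances) != len(label_sets):
        raise ValueError("instances and label_sets must have the same length")

    # Collect every label that appears anywhere in the data.
    all_labels = sorted(set().union(*label_sets))

    return {
        label: [
            (instance, int(label in labels))
            for instance, labels in zip(instances, label_sets)
        ]
        for label in all_labels
    }


if __name__ == "__main__":
    # Hypothetical toy data: three documents with zero to two labels each.
    docs = ["doc_a", "doc_b", "doc_c"]
    labels = [{"sports", "politics"}, {"sports"}, set()]
    for label, dataset in binary_relevance_split(docs, labels).items():
        print(label, dataset)
```

Binary relevance ignores correlations between labels; adapted algorithms, the other approach the abstract mentions, instead modify the learner to handle label sets directly.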


2020 ◽  
Author(s):  
Tahani Daghistani ◽  
Huda AlGhamdi ◽  
Riyad Alshammari ◽  
Raed H. AlHazme

Outpatients who fail to attend their appointments have a negative impact on healthcare outcomes. Healthcare organizations therefore face new opportunities, one of which is to improve the quality of healthcare. The main challenge is predictive analysis using techniques capable of handling the huge data generated. We propose a big data framework for identifying outpatients' no-shows via feature engineering and machine learning (MLlib) on the Spark platform. The aim of this paper is to explore factors that affect the no-show rate, which can then be used to formulate predictions using big data machine learning techniques.


Author(s):  
Divya Ebenezer Nathaniel ◽  
Sonia Panesar

With advances in the field of Artificial Intelligence, computers are being made more intelligent, enabling them to reason and make accurate predictions. Machine learning, a subfield of Artificial Intelligence, is used in numerous research works. Various analysts feel that the enormous data generated in the field of biology has to be sorted in an intelligent way to yield the best model. There are numerous kinds of machine learning techniques, such as unsupervised, semi-supervised, supervised, reinforcement, evolutionary and deep learning. These learning approaches are used to classify huge data at a rapid pace. This paper discusses the wide spectrum of biology, the process of pre-processing data, and the most suitable machine learning model for each area.


Author(s):  
Jayashree M. Kudari

Developments in machine learning techniques for classification and regression have made it possible to detect sophisticated patterns in data from various domains. In biomedical applications, enormous amounts of medical data are produced and collected to predict the type and stage of a disease. Detection and prediction of diseases such as diabetes, lung cancer, brain cancer, heart disease, and liver disease require extensive testing, which increases the size of patient medical data. Robust prediction of a patient's disease from a huge data set is an important topic in this chapter. The challenge in applying a machine learning method is to select the best algorithm within the disease prediction framework. This chapter selects robust machine learning algorithms for various diseases by using case studies. This usually involves analyzing each dimension of the disease independently, checking whether the measured value lies within the expected limits in order to monitor the condition of the disease.


2017 ◽  
Vol 10 (3) ◽  
pp. 660-663
Author(s):  
L. Dhanapriya ◽  
Dr. S. MANJU

With recent developments in information technology, the volume of data has surpassed the zettabyte scale, and improving business efficiency by strengthening predictive ability through efficient analysis of this data has emerged as an issue in current society. The market now needs methods capable of extracting valuable information from large data sets. Big data has recently become a focus of attention, and using machine learning techniques to extract valuable information from huge data with complex structures has become an urgent problem to resolve. The aim of this work is to provide a better understanding of machine learning techniques for discovering interesting patterns and to introduce some machine learning algorithms in order to explore the developing trend.


2021 ◽  
Vol 9 (1) ◽  
pp. 207-214
Author(s):  
Krishna Kumar Yadav, Dr. Anurag Sharma, Dr. Abhishek Badholia

Over the past few decades, cardiovascular (heart-related) disease has been the cause of a large number of deaths and has emerged as a life-threatening disease not only in India but all around the world. Correct treatment and timely diagnosis of this disease therefore call for a feasible, accurate and reliable system. Machine learning techniques and methods are applied to various medical datasets to automate the analysis of large and complex data. In recent times, many researchers have used various machine learning techniques to assist the healthcare industry, which in turn helps professionals in diagnosing heart-related disease. This paper presents a survey of various models that use such techniques and algorithms, along with an analysis of their performance. Among researchers, a few very popular models based on supervised learning algorithms are Random Forest (RF), Decision Tree (DT), Naïve Bayes, ensemble models, K-Nearest Neighbour (KNN) and Support Vector Machine (SVM).


2006 ◽  
Author(s):  
Christopher Schreiner ◽  
Kari Torkkola ◽  
Mike Gardner ◽  
Keshu Zhang

2020 ◽  
Vol 12 (2) ◽  
pp. 84-99
Author(s):  
Li-Pang Chen

In this paper, we investigate the analysis and prediction of time-dependent data. We focus on four different stocks selected from the Yahoo Finance historical database. To build models and predict the future stock price, we consider three different machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN) and Support Vector Regression (SVR). By treating close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors in the machine learning methods, it can be shown that the prediction accuracy is improved.
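One way the six predictors listed above could be arranged for a learning method is sketched below. The sliding-window layout, the choice of the next day's close as the target, the field names in `FEATURES`, and the function name `make_supervised` are assumptions for illustration; the abstract does not describe the paper's actual preprocessing.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical field names for the six predictors named in the abstract.
FEATURES = ("open", "high", "low", "close", "adj_close", "volume")


def make_supervised(
    rows: Sequence[Dict[str, float]],
    window: int = 2,
) -> Tuple[List[List[float]], List[float]]:
    """Build (X, y) pairs from daily rows ordered oldest to newest.

    Each input vector concatenates all six predictors over the previous
    `window` days; the target is the close price of the following day.
    """
    if window < 1:
        raise ValueError("window must be at least 1")

    X: List[List[float]] = []
    y: List[float] = []
    for t in range(window, len(rows)):
        # Flatten the previous `window` days into one feature vector.
        features = [rows[i][name] for i in range(t - window, t) for name in FEATURES]
        X.append(features)
        y.append(rows[t]["close"])
    return X, y


if __name__ == "__main__":
    # Synthetic rows for demonstration only, not real market data.
    rows = [
        {"open": d, "high": d + 1, "low": d - 1, "close": d + 0.5,
         "adj_close": d + 0.5, "volume": 100 * d}
        for d in (1.0, 2.0, 3.0, 4.0)
    ]
    X, y = make_supervised(rows, window=2)
    print(len(X), len(X[0]), y)
```

The resulting `X` and `y` could be passed to any regressor; sequence models such as LSTM would typically keep the window as a separate time axis instead of flattening it.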

