Machine Learning for Causal Inference in Biological Networks: Perspectives of This Challenge

In the era of explosion in biological data, machine learning techniques are becoming more popular in life sciences, including biology and medicine. This research note examines the rise and fall of the most commonly used machine learning techniques in life sciences over the past three decades.

Combining Correlation-Based Feature and Machine Learning for Sensory Evaluation of Saigon Beer

International Journal of Knowledge and Systems Science ◽

10.4018/ijkss.2020040104 ◽

2020 ◽

Vol 11 (2) ◽

pp. 71-85

Author(s):

Nhat-Vinh Lu ◽

Trong-Nhan Vuong ◽

Duy-Tai Dinh

Keyword(s):

Machine Learning ◽

Sensory Evaluation ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Methods ◽

Feature Selection Technique ◽

Machine Learning Methods ◽

Learning Techniques ◽

Correlation Based Feature Selection ◽

Positive Results

Sensory evaluation plays an important role in the food and consumer goods industry. In recent years, the application of machine learning techniques to support food sensory evaluation has become popular. Many different machine learning methods have been applied and produced positive results in this field. In this article, the authors propose a new method to support sensory evaluation on multiple criteria based on the use of a correlation-based feature selection technique, combined with machine learning methods such as linear regression, multilayer perceptron, support vector machine, and random forest. Experimental results are based on considering the correlation between physicochemical components and sensory factors on the Saigon beer dataset.
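The abstract does not say how the correlation-based feature selection is computed. As a minimal sketch, assuming the commonly used merit heuristic (mean feature-target correlation rewarded, mean feature-feature correlation penalized) with a greedy forward search, it could look like the following. The function names and the toy values are hypothetical and are not taken from the article or from the Saigon beer dataset.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def cfs_merit(columns, target, subset):
    """Merit = k * r_cf / sqrt(k + k * (k - 1) * r_ff), with absolute correlations."""
    k = len(subset)
    r_cf = sum(abs(pearson(columns[j], target)) for j in subset) / k
    if k == 1:
        return r_cf  # no feature-feature pairs to penalize
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(columns, target):
    """Greedy forward search: keep adding the feature that most improves the merit."""
    selected, best = [], 0.0
    remaining = list(range(len(columns)))
    while remaining:
        score, j = max((cfs_merit(columns, target, selected + [j]), j) for j in remaining)
        if score <= best:
            break  # no candidate improves the merit, so stop
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best


# Hypothetical toy data: three "physicochemical" columns and one sensory score.
toy_columns = [
    [1, 2, 3, 4, 5, 6],
    [1.0, 2.3, 2.9, 4.2, 4.8, 6.3],
    [3, 1, 2, 3, 1, 2],
]
toy_target = [1, 2, 3, 4, 5, 6]
selected, merit = cfs_forward_select(toy_columns, toy_target)
```

The selected column indices would then be passed to whichever regressor is being compared (linear regression, multilayer perceptron, support vector machine, or random forest).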

Identification of Duplication in Questions Posed on Knowledge Sharing Platform Quora using Machine Learning Techniques

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.l3017.1081219 ◽

2019 ◽

Vol 8 (12) ◽

pp. 2444-2451

Keyword(s):

Machine Learning ◽

Question Answering ◽

Contextual Information ◽

Machine Learning Techniques ◽

Learning Methods ◽

Machine Learning Methods ◽

Comparison Methods ◽

Learning Techniques ◽

Letter Comparison ◽

Lower Accuracy

Quora, an online question-answering platform, has many duplicate questions, i.e., questions that convey the same meaning. Since it is open to all users, anyone can pose a question any number of times, which increases the count of duplicate questions. This paper uses a dataset of question pairs (taken from the Quora website) in different columns, with an indication of whether each pair of questions is a duplicate or not. Traditional comparison methods such as Sequence matcher perform a letter-by-letter comparison without understanding the contextual information, and hence give lower accuracy. Machine learning methods predict the similarity using features extracted from the context. Both the traditional methods and the machine learning methods were compared in this study. The features for the machine learning methods are extracted using two Bag of Words models, Count-Vectorizer and TFIDF-Vectorizer. Among the traditional comparison methods, Sequence matcher gave the highest accuracy, 65.29%. Among the machine learning methods, XGBoost gave the highest accuracy: 80.89% when Count-Vectorizer is used and 80.12% when TFIDF-Vectorizer is used.
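To illustrate the contrast between letter-by-letter comparison and a word-count representation, here is a minimal sketch using only the Python standard library. It is not the pipeline from the paper: the paper feeds bag-of-words features to classifiers such as XGBoost, whereas this sketch simply compares raw word counts with cosine similarity. The function names and example questions are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] from difflib's SequenceMatcher."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity between whitespace-tokenized word-count vectors."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


# Hypothetical question pair: same words, different order.
q1 = "how do I learn python"
q2 = "python how do I learn"
char_score = char_similarity(q1, q2)  # penalized by the reordering
word_score = bow_cosine(q1, q2)       # word order is ignored
```

Because a bag of words ignores order, the reordered pair scores as identical at the word level while the character-level score drops, which is one reason the two families of methods can disagree.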

Business Processes, Dynamic Contexts, Learning

Encyclopedia of Business Analytics and Optimization ◽

10.4018/978-1-4666-5202-6.ch037 ◽

2014 ◽

pp. 407-417

Author(s):

Michael M. Richter

Keyword(s):

Machine Learning ◽

Business Processes ◽

Machine Learning Techniques ◽

Underlying Structure ◽

Learning Methods ◽

Machine Learning Methods ◽

Open World ◽

Learning Techniques

In this article we present relations between complex business processes and machine learning techniques. The processes considered here are mostly related to planning. Planning takes place in preparing many decisions, and it often encounters a rapidly changing context that constitutes an open world. The underlying structure and preconditions of the processes are quite often not known, and hence the processes are regarded as stochastic. One can only observe the processes. Such observations deliver data, and these data contain some knowledge about the processes in a hidden form. As a consequence, machine learning methods are involved here. The idea is to give business persons an overview of quite different machine learning techniques so that they can select suitable ones. We provide a number of examples of business processes that we use for illustration.

Algebraic Shortcuts for Leave-One-Out Cross-Validation in Supervised Network Inference

10.1101/242321 ◽

2018 ◽

Author(s):

Michiel Stock ◽

Tapio Pahikkala ◽

Antti Airola ◽

Willem Waegeman ◽

Bernard De Baets

Keyword(s):

Machine Learning ◽

Biological Networks ◽

Regulatory Networks ◽

Network Inference ◽

Cross Validation ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Ligand Interaction ◽

Learning Techniques ◽

Leave One Out

Motivation: Supervised machine learning techniques have traditionally been very successful at reconstructing biological networks, such as protein-ligand interaction, protein-protein interaction and gene regulatory networks. Recently, much emphasis has been placed on the correct evaluation of such supervised models. It is vital to distinguish between using the model to either predict new interactions in a given network or to predict interactions for a new vertex not present in the original network. Specific cross-validation schemes need to be used to assess the performance in such different prediction settings. Results: We present a series of leave-one-out cross-validation shortcuts to rapidly estimate the performance of state-of-the-art kernel-based network inference techniques. Availability: The machine learning techniques with the algebraic shortcuts are implemented in the RLScore software package.
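The abstract does not spell out the shortcuts themselves. As a minimal sketch of the general idea, the textbook leave-one-out identity for regularized least squares recovers every held-out residual from a single fit: the residual for point i equals e_i / (1 - h_ii), where e_i is the ordinary residual and h_ii is the leverage. The example below uses one-feature ridge regression without an intercept so that it stays in plain Python. It is not the method of the paper and not the RLScore API, and all function names are hypothetical.

```python
def ridge_fit(xs, ys, lam):
    """One-feature ridge regression without intercept: w = sum(x*y) / (sum(x^2) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def loo_residuals_naive(xs, ys, lam):
    """Refit n times, each time leaving one point out and predicting it."""
    residuals = []
    for i in range(len(xs)):
        w = ridge_fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], lam)
        residuals.append(ys[i] - w * xs[i])
    return residuals


def loo_residuals_shortcut(xs, ys, lam):
    """Fit once, then use e_i / (1 - h_ii) with leverage h_ii = x_i^2 / (sum(x^2) + lam)."""
    a = sum(x * x for x in xs) + lam  # lam > 0 keeps every leverage below 1
    w = ridge_fit(xs, ys, lam)
    return [(y - w * x) / (1 - x * x / a) for x, y in zip(xs, ys)]


# Hypothetical toy data, given as lists so that slicing works in the naive version.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.2, 1.9, 3.2, 3.9]
naive = loo_residuals_naive(xs, ys, lam=0.5)
fast = loo_residuals_shortcut(xs, ys, lam=0.5)
```

The naive version performs n fits while the shortcut performs one. The same kind of saving is what makes leave-one-out evaluation practical for kernel-based models on larger networks.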

Adoption of machine learning techniques in Ecology and Earth Science

10.7287/peerj.preprints.1720v1 ◽

2016 ◽

Cited By ~ 1

Author(s):

Anne E Thessen

Keyword(s):

Machine Learning ◽

Earth Science ◽

Collaborative Work ◽

Machine Learning Techniques ◽

Learning Methods ◽

Data Set ◽

Machine Learning Methods ◽

Learning Techniques ◽

Expert Annotation ◽

Natural Scientists

The natural sciences, such as ecology and earth science, study complex interactions between biotic and abiotic systems in order to infer understanding and make predictions. Machine-learning-based methods have an advantage over traditional statistical methods in studying these systems because the former do not impose unrealistic assumptions (such as linearity), are capable of inferring missing data, and can reduce long-term expert annotation burden. Thus, a wider adoption of machine learning methods in ecology and earth science has the potential to greatly accelerate the pace and quality of science. Despite these advantages, machine learning techniques have not seen widespread adoption in ecology and earth science. This is largely due to 1) a lack of communication and collaboration between the machine learning research community and natural scientists, 2) a lack of easily accessible tools and services, and 3) the requirement for a robust training and test data set. These impediments can be overcome through financial support for collaborative work and the development of tools and services facilitating ML use. Natural scientists who have not yet used machine learning methods can be introduced to these techniques through Random Forest, a method that is easy to implement and performs well. This manuscript will 1) briefly describe several popular ML methods and their application to ecology and earth science, 2) discuss why ML methods are underutilized in natural science, and 3) propose solutions for barriers preventing wider ML adoption.

Using Machine Learning to Advance Early Warning Systems: Promise and Pitfalls

Teachers College Record ◽

10.1177/016146812012201403 ◽

2020 ◽

Vol 122 (14) ◽

pp. 1-30

Author(s):

James Soland ◽

Benjamin Domingue ◽

David Lang

Keyword(s):

Machine Learning ◽

High School ◽

At Risk ◽

Early Warning ◽

Early Warning Systems ◽

Machine Learning Techniques ◽

Dropping Out ◽

Learning Methods ◽

Machine Learning Methods ◽

Learning Techniques

Background/Context: Early warning indicators (EWI) are often used by states and districts to identify students who are not on track to finish high school, and provide supports/interventions to increase the odds the student will graduate. While EWI are diverse in terms of the academic behaviors they capture, research suggests that indicators like course failures, chronic absenteeism, and suspensions can help identify students in need of additional supports. In parallel with the expansion of administrative data that have made early versions of EWI possible, new machine learning methods have been developed. These methods are data-driven and often designed to sift through thousands of variables with the purpose of identifying the best predictors of a given outcome. While applications of machine learning techniques to identify students at risk of high school dropout have obvious appeal, few studies consider the benefits and limitations of applying those models in an EWI context, especially as they relate to questions of fairness and equity. Focus of Study: In this study, we will provide applied examples of how machine learning can be used to support EWI selection. The purpose is to articulate the broad risks and benefits of using machine learning methods to identify students who may be at risk of dropping out. We focus on dropping out given its salience in the EWI literature, but also anticipate generating insights that will be germane to EWI used for a variety of outcomes. Research Design: We explore these issues by using several hypothetical examples of how ML techniques might be used to identify EWI. For example, we show results from decision tree algorithms used to identify predictors of dropout that use simulated data. Conclusions/Recommendations: Generally, we argue that machine learning techniques have several potential benefits in the EWI context. For example, some related methods can help create clear decision rules for which students are a dropout risk, and their predictive accuracy can be higher than for more traditional, regression-based models. At the same time, these methods often require additional statistical and data management expertise to be used appropriately. Further, the black-box nature of machine learning algorithms could invite their users to interpret results through the lens of preexisting biases about students and educational settings.
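To make the idea of a clear decision rule concrete, here is a minimal sketch of a depth-one decision tree (a decision stump) fitted to made-up data. It searches every feature and threshold for the single rule "flag the student if the value exceeds the threshold" with the fewest misclassifications. The indicator names, the simulated rows, and the function name are hypothetical and do not come from the study, which may use different tree algorithms and split criteria.

```python
def best_stump(rows, labels):
    """Return (feature_index, threshold, errors) for the best rule 'flag if value > threshold'."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            # Count students whose flag disagrees with their observed outcome.
            errors = sum((r[j] > t) != bool(y) for r, y in zip(rows, labels))
            if best is None or errors < best[2]:
                best = (j, t, errors)
    return best


# Simulated, hypothetical records: (days absent, course failures); label 1 = dropped out.
rows = [(2, 0), (3, 1), (5, 0), (12, 2), (15, 1), (20, 3), (4, 2)]
labels = [0, 0, 0, 1, 1, 1, 0]
feature, threshold, errors = best_stump(rows, labels)
```

A rule of this form is easy to communicate, but a threshold learned from one data set can encode patterns specific to that data, which ties back to the fairness and interpretation concerns raised in the abstract.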

Intelligent health risk prediction systems using machine learning: a review

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i3.12654 ◽

2018 ◽

Vol 7 (3) ◽

pp. 1019 ◽

Cited By ~ 6

Author(s):

Mr Santosh A. Shinde ◽

Dr P. Raja Rajeswari

Keyword(s):

Machine Learning ◽

Machine Learning Techniques ◽

Research Review ◽

Learning Methods ◽

Research Directions ◽

Machine Learning Methods ◽

Experience Machine ◽

Learning Techniques ◽

Prediction Systems ◽

Mother Earth

Humans are considered to be the most intelligent species on Earth and are inherently health conscious. Over the centuries, mankind has developed various proven healthcare systems. To automate the process and predict diseases more accurately, machine learning methods are gaining popularity in the research community. Machine learning methods facilitate the development of intelligence in a machine, so that it can perform better in the future using learned experience. Applying machine learning methods to electronic health record datasets could provide valuable information and prediction of health risks. The aims of this research review paper are four-fold: i) to serve as a guideline for researchers who are new to the machine learning area and want to contribute to it, ii) to provide a state-of-the-art survey of machine learning, iii) to review the application of machine learning techniques in health prediction, and iv) to provide further research directions required for health prediction systems using machine learning.

The rise and fall of machine learning methods in biomedical research

F1000Research ◽

10.12688/f1000research.13016.2 ◽

2018 ◽

Vol 6 ◽

pp. 2012 ◽

Cited By ~ 5

Author(s):

Hashem Koohy

Keyword(s):

Machine Learning ◽

Biomedical Research ◽

Life Sciences ◽

Biological Data ◽

Research Note ◽

Machine Learning Techniques ◽

Learning Methods ◽

The Past ◽

Machine Learning Methods ◽

Learning Techniques

In the era of explosion in biological data, machine learning techniques are becoming more popular in life sciences, including biology and medicine. This research note examines the rise and fall of the most commonly used machine learning techniques in life sciences over the past three decades.
