<title>Pipelining machine learning algorithms for knowledge discovery</title>

Traditional Chinese Medicine (TCM) has attracted more and more attention due to its remarkable effects on treating diseases, and Chinese herbal medicine (CHM) is an important partition of TCM, rich in natural active ingredients. Researchers are trying multiple analytical methods to dig out more valuable information about CHM and reveal the principle of TCM. Machine learning is playing an important role in the studies. Knowledge discovery of CHM using machine learning mainly includes quality control of CHM, network pharmacology in CHM, and medical prescriptions composed by CHM, aiming to understand TCM better, provide more efficiency methods in the production of CHM and find novel treatment of disease not curable nowadays. In this paper, we summarized the basic idea of frequently used classification and clustering machine learning algorithms, introduced pre-processing algorithms commonly used to simplify and accelerate machine learning procedure, presented current status of machine learning algorithms’ applications in knowledge discovery of CHM, discussed challenges and future trends of machine learning’s application in CHM. It is believed that the paper provides a valuable insight for the starters trying to apply machine learning in the study of CHM and catch up the recent status of related researches.

Download Full-text

Knowledge Discovery in Geographical Sciences—A Systematic Survey of Various Machine Learning Algorithms for Rainfall Prediction

10.1007/978-981-16-2597-8_51 ◽

2021 ◽

pp. 593-608

Author(s):

Sheikh Amir Fayaz ◽

Majid Zaman ◽

Muheet Ahmed Butt

Keyword(s):

Machine Learning ◽

Knowledge Discovery ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Systematic Survey ◽

Rainfall Prediction

Download Full-text

Knowledge Discovery of Global Landslides Using Automated Machine Learning Algorithms

IEEE Access ◽

10.1109/access.2021.3115043 ◽

2021 ◽

Vol 9 ◽

pp. 131400-131419

Author(s):

Fahim K. Sufi ◽

Musleh Alsulami

Keyword(s):

Machine Learning ◽

Knowledge Discovery ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Automated Machine Learning

Download Full-text

Knowledge Discovery for Higher Education Student Retention Based on Data Mining: Machine Learning Algorithms and Case Study in Chile

Entropy ◽

10.3390/e23040485 ◽

2021 ◽

Vol 23 (4) ◽

pp. 485 ◽

Cited By ~ 1

Author(s):

Carlos A. Palacios ◽

José A. Reyes-Suárez ◽

Lorena A. Bearzotti ◽

Víctor Leiva ◽

Carolina Marchant

Keyword(s):

Higher Education ◽

Machine Learning ◽

Data Mining ◽

Random Forest ◽

Student Retention ◽

Knowledge Discovery ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector

Data mining is employed to extract useful information and to detect patterns from often large data sets, closely related to knowledge discovery in databases and data science. In this investigation, we formulate models based on machine learning algorithms to extract relevant information predicting student retention at various levels, using higher education data and specifying the relevant variables involved in the modeling. Then, we utilize this information to help the process of knowledge discovery. We predict student retention at each of three levels during their first, second, and third years of study, obtaining models with an accuracy that exceeds 80% in all scenarios. These models allow us to adequately predict the level when dropout occurs. Among the machine learning algorithms used in this work are: decision trees, k-nearest neighbors, logistic regression, naive Bayes, random forest, and support vector machines, of which the random forest technique performs the best. We detect that secondary educational score and the community poverty index are important predictive variables, which have not been previously reported in educational studies of this type. The dropout assessment at various levels reported here is valid for higher education institutions around the world with similar conditions to the Chilean case, where dropout rates affect the efficiency of such institutions. Having the ability to predict dropout based on student’s data enables these institutions to take preventative measures, avoiding the dropouts. In the case study, balancing the majority and minority classes improves the performance of the algorithms.

Download Full-text

Supplemental Material for One Model to Rule Them All? Using Machine Learning Algorithms to Determine the Number of Factors in Exploratory Factor Analysis

Psychological Methods ◽

10.1037/met0000262.supp ◽

2020 ◽

Keyword(s):

Machine Learning ◽

Factor Analysis ◽

Exploratory Factor Analysis ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Number Of Factors

Download Full-text

Forecasting US movies box office performances in Turkey using machine learning algorithms

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189120 ◽

2020 ◽

Vol 39 (5) ◽

pp. 6579-6590

Author(s):

Sandy Çağlıyor ◽

Başar Öztayşi ◽

Selime Sezgin

Keyword(s):

Machine Learning ◽

Global Economy ◽

Learning Algorithms ◽

Forecast Model ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

High Stakes ◽

Box Office ◽

Industry Forecast ◽

The Impact

The motion picture industry is one of the largest industries worldwide and has significant importance in the global economy. Considering the high stakes and high risks in the industry, forecast models and decision support systems are gaining importance. Several attempts have been made to estimate the theatrical performance of a movie before or at the early stages of its release. Nevertheless, these models are mostly used for predicting domestic performances and the industry still struggles to predict box office performances in overseas markets. In this study, the aim is to design a forecast model using different machine learning algorithms to estimate the theatrical success of US movies in Turkey. From various sources, a dataset of 1559 movies is constructed. Firstly, independent variables are grouped as pre-release, distributor type, and international distribution based on their characteristic. The number of attendances is discretized into three classes. Four popular machine learning algorithms, artificial neural networks, decision tree regression and gradient boosting tree and random forest are employed, and the impact of each group is observed by compared by the performance models. Then the number of target classes is increased into five and eight and results are compared with the previously developed models in the literature.

Download Full-text

Intelligent system of English composition scoring model based on improved machine learning algorithm

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189235 ◽

2020 ◽

pp. 1-11

Author(s):

Jie Liu ◽

Lin Lin ◽

Xiufang Liang

Keyword(s):

Machine Learning ◽

Evaluation System ◽

Intelligent System ◽

Learning Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Assessment System ◽

English Composition ◽

Region Extraction ◽

Constraint Model

The online English teaching system has certain requirements for the intelligent scoring system, and the most difficult stage of intelligent scoring in the English test is to score the English composition through the intelligent model. In order to improve the intelligence of English composition scoring, based on machine learning algorithms, this study combines intelligent image recognition technology to improve machine learning algorithms, and proposes an improved MSER-based character candidate region extraction algorithm and a convolutional neural network-based pseudo-character region filtering algorithm. In addition, in order to verify whether the algorithm model proposed in this paper meets the requirements of the group text, that is, to verify the feasibility of the algorithm, the performance of the model proposed in this study is analyzed through design experiments. Moreover, the basic conditions for composition scoring are input into the model as a constraint model. The research results show that the algorithm proposed in this paper has a certain practical effect, and it can be applied to the English assessment system and the online assessment system of the homework evaluation system algorithm system.

Download Full-text

The Unlearnable Checkerboard Pattern

Communications of the Blyth Institute ◽

10.33014/issn.2640-5652.1.2.holloway.1 ◽

2019 ◽

Vol 1 (2) ◽

pp. 78-80

Author(s):

Eric Holloway

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Checkerboard Pattern ◽

Simple Task

Detecting some patterns is a simple task for humans, but nearly impossible for current machine learning algorithms. Here, the "checkerboard" pattern is examined, where human prediction nears 100% and machine prediction drops significantly below 50%.

Download Full-text