Fish survival prediction in an aquatic environment using random forest model

In practice, it is very difficult for fish farmers to select the most suitable fish species for aquaculture in a specific aquatic environment. The main goal of this research is to build a machine learning model that can predict the most suitable fish species for a given aquatic environment. In this paper, we use a model based on random forest (RF). To validate the model, we used a dataset of aquatic environments for 11 different fish species. To predict the fish species, we used characteristics of the aquatic environment including pH, temperature, and turbidity. As performance metrics, we measured accuracy, true positive (TP) rate, and the kappa statistic. Experimental results demonstrate that the proposed RF-based prediction model achieves an accuracy of 88.48%, a kappa statistic of 87.11%, and a TP rate of 88.5% on the tested dataset. In addition, we compare the proposed model with the state-of-the-art models J48, RF, k-nearest neighbor (k-NN), and classification and regression trees (CART). The proposed model outperforms the existing models, with higher accuracy, TP rate, and kappa statistic.
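
The three metrics named in this abstract can all be derived from a confusion matrix. The sketch below is only an illustration under stated assumptions: the function name `classification_metrics` and the 3-class matrix are hypothetical and are not taken from the paper, and the TP rate is assumed to be a class-size-weighted average of per-class recall. Under that assumption the weighted TP rate equals the overall accuracy, which would be consistent with the near-identical 88.48% and 88.5% figures reported above. Kappa is computed on a 0 to 1 scale, so the reported 87.11% would correspond to 0.8711.

```python
def classification_metrics(cm):
    """Return (accuracy, kappa, weighted_tp_rate) for a square confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_totals = [sum(row) for row in cm]                             # true count per class
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # predicted count per class

    # Observed agreement: fraction of samples on the diagonal.
    accuracy = sum(cm[i][i] for i in range(n)) / total

    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)

    # Cohen's kappa: agreement beyond chance, rescaled to at most 1.
    kappa = (accuracy - expected) / (1.0 - expected)

    # Per-class TP rate (recall), averaged with class-size weights.
    # This weighting is an assumption; the abstract does not say how TP rate was averaged.
    weighted_tp_rate = sum(
        (row_totals[i] / total) * (cm[i][i] / row_totals[i])
        for i in range(n)
        if row_totals[i] > 0
    )
    return accuracy, kappa, weighted_tp_rate


# Hypothetical 3-class confusion matrix (not data from the paper).
toy_cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
acc, kappa, tpr = classification_metrics(toy_cm)
print(f"accuracy={acc:.4f} kappa={kappa:.4f} weighted TP rate={tpr:.4f}")
```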

Land Cover Classification Using the Proposed Texture Model and Fuzzy k-NN Classifier

Optimization Techniques for Problem Solving in Uncertainty - Advances in Data Mining and Database Management, doi:10.4018/978-1-5225-5091-4.ch009, 2018, pp. 226-261

Author(s): Jenicka S.

Keyword(s): Nearest Neighbor, Confusion Matrix, Texture Feature, Image Texture, Kappa Statistics, Grey Level, Classification Problems, K Nearest Neighbor, Texture Model, Proposed Model

Texture features are decisive in pattern classification problems because they are derived not from the intensity of the current pixel alone but from the grey-level intensity variations between the current pixel and its neighbors. This chapter proposes a new texture model, the multivariate binary threshold pattern (MBTP), with five discrete levels (-9, -1, 0, 1, and 9) that characterize the grey-level intensity variations between the center pixel and its neighbors in the local neighborhood of each band in a multispectral image. Texture-based classification was performed with the proposed model using the fuzzy k-nearest neighbor (fuzzy k-NN) algorithm on IRS-P6 LISS-IV data, and the results were evaluated using the confusion matrix, classification accuracy, and Kappa statistics. The experiments show that the proposed model outperforms the other existing texture models chosen for comparison.
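
The abstract does not say which fuzzy k-NN variant was used, so the following is only a minimal sketch of one common formulation: each of the k nearest training samples votes for its class with a weight that decays with distance, and the votes are normalized into class memberships. Everything here is an assumption made for illustration: the function name `fuzzy_knn`, the fuzzifier `m`, crisp training labels, the toy 2-D points standing in for texture feature vectors, and the class names.

```python
import math


def fuzzy_knn(train, query, k=3, m=2.0):
    """Return {label: membership} for `query`; memberships sum to 1.

    train: list of (feature_vector, label) pairs with crisp labels.
    m:     fuzzifier (> 1); larger values flatten the distance weighting.
    """
    # Keep the k training samples closest to the query (Euclidean distance).
    neighbors = sorted((math.dist(x, query), label) for x, label in train)[:k]

    votes = {}
    total = 0.0
    for distance, label in neighbors:
        # Inverse-distance weight; the small constant avoids division by zero
        # when the query coincides with a training sample.
        weight = 1.0 / (distance ** (2.0 / (m - 1.0)) + 1e-12)
        votes[label] = votes.get(label, 0.0) + weight
        total += weight

    # Normalize the accumulated weights into class memberships.
    return {label: weight / total for label, weight in votes.items()}


# Hypothetical training set: two land-cover classes in a toy 2-D feature space.
toy_train = [
    ((0.0, 0.0), "water"),
    ((0.0, 1.0), "water"),
    ((5.0, 5.0), "urban"),
    ((5.0, 6.0), "urban"),
]
memberships = fuzzy_knn(toy_train, (0.5, 0.5), k=3)
predicted = max(memberships, key=memberships.get)
print(predicted, memberships)
```

Unlike plain k-NN, the result is a graded membership per class rather than a single vote count, which can be useful for pixels that lie between classes.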

Evaluation of Three Different Machine Learning Methods for Object-Based Artificial Terrace Mapping—A Case Study of the Loess Plateau, China

Remote Sensing, doi:10.3390/rs13051021, 2021, Vol 13 (5), pp. 1021

Author(s): Hu Ding, Jiaming Na, Shangjing Jiang, Jie Zhu, Kai Liu, ...

Keyword(s): Machine Learning, Random Forest, Loess Plateau, Water Conservation, Nearest Neighbor, Gradient Boosting, K Nearest Neighbor, The Loess Plateau, Object Based, Extreme Gradient Boosting

Artificial terraces are of great importance for agricultural production and soil and water conservation. Automatic high-accuracy mapping of artificial terraces is the basis of monitoring and related studies. Previous research achieved artificial terrace mapping based on high-resolution digital elevation models (DEMs) or imagery. Because contextual information is important for terrace mapping, object-based image analysis (OBIA) combined with machine learning (ML) is widely used. However, selecting an appropriate classifier is critical to the terrace mapping task. In this study, the performance of an integrated framework using OBIA and ML for terrace mapping was tested. A catchment, Zhifanggou, in the Loess Plateau, China, was used as the study area. First, optimized image segmentation was conducted. Then, features from the DEMs and imagery were extracted, and the correlations between the features were analyzed and ranked for classification. Finally, three commonly used ML classifiers, namely extreme gradient boosting (XGBoost), random forest (RF), and k-nearest neighbor (KNN), were used for terrace mapping. Comparison with the ground truth, as delineated by field survey, indicated that random forest performed best, with a 95.60% overall accuracy (followed by 94.16% and 92.33% for XGBoost and KNN, respectively). The influence of class imbalance and feature selection is discussed. This work provides a credible framework for mapping artificial terraces.

Computational Intelligence-Based Model for Mortality Rate Prediction in COVID-19 Patients

International Journal of Environmental Research and Public Health, doi:10.3390/ijerph18126429, 2021, Vol 18 (12), pp. 6429

Author(s): Irfan Ullah Khan, Nida Aslam, Malak Aljabri, Sumayh S. Aljameel, Mariam Moataz Aly Kamaleldin, ...

Keyword(s): Mortality Rate, Computational Intelligence, Nearest Neighbor, Gradient Boosting, K Nearest Neighbor, Detection And Identification, Proposed Model, Extreme Gradient Boosting, The World, Detection And Diagnosis

The COVID-19 outbreak is currently one of the biggest challenges facing countries around the world. Millions of people have lost their lives due to COVID-19. Therefore, the accurate early detection and identification of severe COVID-19 cases can reduce the mortality rate and the likelihood of further complications. Machine Learning (ML) and Deep Learning (DL) models have been shown to be effective in the detection and diagnosis of several diseases, including COVID-19. This study used ML algorithms, namely Decision Tree (DT), Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and K-Nearest Neighbor (KNN), and a DL model (six layers with ReLU activation and an output layer with sigmoid activation) to predict the mortality rate in COVID-19 cases. Models were trained using data on confirmed COVID-19 patients from 146 countries. Comparative analysis was performed among the ML and DL models using a reduced feature set. The best results were achieved using the proposed DL model, with an accuracy of 0.97. Experimental results show that, with the reduced feature set, the proposed model improves on the baseline study in the literature.

Quantifying the Influence of Achievement Emotions for Student Learning in MOOCs

Journal of Educational Computing Research, doi:10.1177/0735633120967318, 2020, pp. 073563312096731

Author(s): Bowen Liu, Wanli Xing, Yifang Zeng, Yonghe Wu

Keyword(s): Random Forest, Nearest Neighbor, Online Courses, Learning Performance, Support Vector, K Nearest Neighbor, Achievement Emotions, Integrative Framework, Emotional Interaction, Performance Results

Massive Open Online Courses (MOOCs) have become a popular tool for worldwide learners. However, a lack of emotional interaction and support is an important reason for learners to abandon their learning and eventually results in poor learning performance. This study applied an integrative framework of achievement emotions to uncover their holistic influence on students’ learning by analyzing more than 400,000 forum posts from 13 MOOCs. Six machine-learning models were first built to automatically identify achievement emotions, including K-Nearest Neighbor, Logistic Regression, Naïve Bayes, Decision Tree, Random Forest, and Support Vector Machines. Results showed that Random Forest performed the best with a kappa of 0.83 and an ROC_AUC of 0.97. Then, multilevel modeling with the “Stepwise Build-up” strategy was used to quantify the effect of achievement emotions on students’ academic performance. Results showed that different achievement emotions influenced students’ learning differently. These findings allow MOOC platforms and instructors to provide relevant emotional feedback to students automatically or manually, thereby improving their learning in MOOCs.
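
For readers unfamiliar with the ROC_AUC figure quoted above, the sketch below shows one standard way to compute it for a binary problem: the AUC equals the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical. The study's emotion classification is presumably multi-class, and the abstract does not describe how AUC was aggregated across classes, so this binary version is an assumption made for illustration only.

```python
def roc_auc(labels, scores):
    """Binary ROC AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of classifier scores, higher meaning "more positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and scores (not data from the study).
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(toy_labels, toy_scores)
print(f"AUC = {auc:.4f}")
```

The pairwise form is quadratic in the number of samples; it is written this way for clarity, not for large datasets.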

Spatial modeling and susceptibility zonation of landslides using random forest, naïve bayes and K-nearest neighbor in a complicated terrain

Earth Science Informatics, doi:10.1007/s12145-021-00653-y, 2021

Author(s): Sherif Ahmed Abu El-Magd, Sk Ajim Ali, Quoc Bao Pham

Keyword(s): Random Forest, Spatial Modeling, Nearest Neighbor, Naive Bayes, Naïve Bayes, K Nearest Neighbor

Precision Pig Farming Image Analysis Using Random Forest and Boruta Predictive Big Data Analysis Using Neural Network and K-Nearest Neighbor

2021 2nd International Conference on Intelligent Engineering and Management (ICIEM), doi:10.1109/iciem51511.2021.9445328, 2021

Author(s): S. A. Shaik Mazhar, G. Suseendran

Keyword(s): Neural Network, Image Analysis, Big Data, Data Analysis, Random Forest, Nearest Neighbor, Big Data Analysis, K Nearest Neighbor, Pig Farming

A Novel Dynamic Hybridization Method for Best Feature Selection

International Journal of Applied Metaheuristic Computing, doi:10.4018/ijamc.2021040106, 2021, Vol 12 (2), pp. 85-99

Author(s): Nassima Dif, Zakaria Elberrichi

Keyword(s): Feature Selection, Nearest Neighbor, Optimization Problems, Learning Algorithm, Accuracy Score, Hybridization Method, K Nearest Neighbor, Feature Selection Problem, Combinatorial Optimization Problems, The Comparative Study

Hybrid metaheuristics have received a lot of attention lately for solving combinatorial optimization problems. The purpose of hybridization is to create cooperation between metaheuristics to obtain better solutions. Most proposed works have focused on static hybridization. The objective of this work is to propose a novel dynamic hybridization method (GPBD) that generates the most suitable sequential hybridization between the GA, PSO, BAT, and DE metaheuristics for each problem. The authors test this approach on the best feature selection problem using a wrapper strategy, performed on face image recognition datasets with the k-nearest neighbor (KNN) learning algorithm. The comparative study of the metaheuristics and their hybridization GPBD shows that the proposed approach achieved the best results. It was competitive with other filter approaches proposed in the literature and achieved a perfect accuracy score of 100% on the Orl10P, Pix10P, and PIE10P datasets.

Modeling Barrier Island Habitats Using Landscape Position Information

Remote Sensing, doi:10.3390/rs11080976, 2019, Vol 11 (8), pp. 976

Author(s): Nicholas M. Enwright, Lei Wang, Hongqing Wang, Michael J. Osland, Laura C. Feher, ...

Keyword(s): Machine Learning, Random Forest, Nearest Neighbor, Learning Algorithms, Barrier Island, Barrier Islands, Machine Learning Algorithms, Landscape Position, K Nearest Neighbor, Island Habitats

Barrier islands are dynamic environments because of their position along the marine–estuarine interface. Geomorphology influences habitat distribution on barrier islands by regulating exposure to harsh abiotic conditions. Researchers have identified linkages between habitat and landscape position, such as elevation and distance from shore, yet these linkages have not been fully leveraged to develop predictive models. Our aim was to evaluate the performance of commonly used machine learning algorithms, including K-nearest neighbor, support vector machine, and random forest, for predicting barrier island habitats using landscape position for Dauphin Island, Alabama, USA. Landscape position predictors were extracted from topobathymetric data. Models were developed for three tidal zones: subtidal, intertidal, and supratidal/upland. We used a contemporary habitat map to identify landscape position linkages for habitats, such as beach, dune, woody vegetation, and marsh. Deterministic accuracy, fuzzy accuracy, and hindcasting were used for validation. The random forest algorithm performed best for intertidal and supratidal/upland habitats, while the K-nearest neighbor algorithm performed best for subtidal habitats. A posteriori application of expert rules based on theoretical understanding of barrier island habitats enhanced model results. For the contemporary model, deterministic overall accuracy was nearly 70%, and fuzzy overall accuracy was over 80%. For the hindcast model, deterministic overall accuracy was nearly 80%, and fuzzy overall accuracy was over 90%. We found machine learning algorithms were well-suited for predicting barrier island habitats using landscape position. Our model framework could be coupled with hydrodynamic geomorphologic models for forecasting habitats with accelerated sea-level rise, simulated storms, and restoration actions.

Determinants of rental strategy: short-term vs long-term rental strategy

International Journal of Contemporary Hospitality Management, doi:10.1108/ijchm-03-2020-0185, 2020, Vol 13 (12), pp. 3873-3894

Author(s): Sina Shokoohyar, Ahmad Sobhani, Anae Sobhani

Keyword(s): Nearest Neighbor, Performance Metrics, Sharing Economy, Real Estate Market, Support Vector, Attractive Alternative, K Nearest Neighbor, Short Term, Content Type

Purpose: The short-term rental option enabled by accommodation sharing platforms is an attractive alternative to conventional long-term rental. The purpose of this study is to compare rental strategies (short-term vs long-term) and explore the main determinants of strategy selection.

Design/methodology/approach: Using logistic regression, this study predicts the rental strategy with the highest rate of return for a given property in the City of Philadelphia. The modeling result is then compared with those of other machine learning methods, including random forest, k-nearest neighbor, support vector machine, naïve Bayes and neural networks. The best model is finally selected based on different performance metrics that determine the prediction strength of the underlying models.

Findings: Based on an analysis of 2,163 properties, the results show that properties with more bedrooms, closer to historic attractions, and in neighborhoods with lower minority rates and a stronger nightlife vibe are more likely to have a higher return if they are rented out through short-term rental contracts. Additionally, property location is found to have a significant impact on the selection of the rental strategy, which underscores the well-known real estate maxim “location, location, location”.

Originality/value: The findings of this study contribute to the literature by determining the neighborhood and property characteristics that make a property more suitable for short-term rental than for long-term rental. This contribution is important because it helps differentiate short-term rentals from long-term rentals and supports a better understanding of the supply side of the sharing economy-based accommodation market.

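Klasifikasi Citra Alat Musik Tradisional dengan Metode k-Nearest Neighbor, Random Forest, dan Support Vector Machine [Classification of Traditional Musical Instrument Images Using the k-Nearest Neighbor, Random Forest, and Support Vector Machine Methods]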

JURNAL SISTEM INFORMASI BISNIS, doi:10.21456/vol9iss2pp185-191, 2019, Vol 9 (2), pp. 185

Author(s): Herry Sujaini

Keyword(s): Support Vector Machine, Random Forest, Nearest Neighbor, Support Vector, K Nearest Neighbor

In the last decade, non-parametric methods (machine-learning-based algorithms) have been increasingly used in a variety of applications based on digital image processing. This study aims to compare three non-parametric methods, namely k-Nearest Neighbor (kNN), Random Forest (RF), and Support Vector Machine (SVM), for classifying images of traditional musical instruments that are popular in Indonesia: angklung, djembe, gamelan, gong, gordang, kendang, kolintang, rebana, sasando, and serunai. In the classification experiments with kNN, RF, and SVM, kNN achieved the best accuracy. The average precision of the three methods was 92.1% for kNN, 85.4% for SVM, and 69.4% for RF.
