Pemetaan Lamun Mengunakan Machine Learning Dengan Citra Planetscope Di Nusa Lembongan

2021 ◽  
Vol 24 (3) ◽  
pp. 323-332
Author(s):  
Devica Natalia Br Ginting ◽  
Sanjiwana Arjasakusuma

Seagrass is a benthic-habitat community with tremendous benefits for the ecosystem; however, its existence has frequently been marginalized in recent decades. Seagrass beds function as a blue-carbon ecosystem that can absorb more carbon than terrestrial vegetation. It is therefore important to detect and map the distribution of seagrass beds in order to estimate their potential carbon uptake. Seagrass mapping can be performed efficiently using remote sensing imagery together with machine learning. This research examines the use of PlanetScope imagery (3.7 m spatial resolution) for seagrass mapping and assesses the effect of atmospheric, sun-glint, and water-column corrections on mapping accuracy. In addition, this study identifies changes in seagrass cover from 2016 to 2021 in Nusa Lembongan. The study applied tree-based machine learning methods, namely decision tree and random forest. The results showed that the best performance was obtained using raw PlanetScope data, with a model accuracy of 98% and a classification accuracy of 94% from the decision tree method. Based on decision tree mapping of PlanetScope data for 2016 and 2021, seagrass cover declined from 100.53 to 97.31 hectares.

Blood ◽  
2016 ◽  
Vol 128 (22) ◽  
pp. 3749-3749
Author(s):  
Youngil Koh ◽  
SuYeon Lee ◽  
Hong-Seok Yun ◽  
Sung-Soo Yoon ◽  
Inho Kim ◽  
...  

Abstract Introduction: ADAMTS13 activity level is crucial for differentiating thrombotic microangiopathies (TMAs). However, ADAMTS13 testing is not readily available on site in many parts of the world. Hence, we developed an algorithm that differentiates thrombotic thrombocytopenic purpura (TTP) from other TMAs using machine learning on laboratory results other than ADAMTS13. Methods: Two hundred eight adult patients with either TTP (N=64) or TMA other than TTP (N=144) (ADAMTS13 cutoff level of 10%) were classified using three machine learning techniques (decision tree, random forest, and neural network) on a set of 19 easily measured clinical variables such as fever, Hb, and ALT. Because no single clinical variable is strongly correlated with TTP (absolute correlation coefficients below 0.5), we applied machine learning algorithms. First, we divided the patient data into training, test, and validation sets. We then applied the three machine learning techniques; principal component analysis was also performed. Results: As single variables, platelet count, BUN, and total bilirubin were the three most important predictors for differentiating TTP from other TMAs, with an accuracy of 82%. The random forest method increased accuracy to 85%, with precision and recall of 0.828 and 0.832, respectively. The neural network did not outperform random forest without optimization. Conclusion: Machine learning seems promising for differentiating TTP from other TMAs when an ADAMTS13 value is not available. These algorithms could support physicians in tailoring the management of TMA. Correlation coefficients in our study. Scheme of the random forest method used in our study. Disclosures Lee: Samsung SDS: Employment. Yun: Samsung SDS: Employment.
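The single-variable result above (82% accuracy from platelet count, BUN, or total bilirubin alone) corresponds to a one-split decision stump, the building block of the decision tree and random forest models used. A minimal stdlib-only sketch of threshold search on one variable; all values below are invented for illustration, not the study's data:

```python
# Hedged sketch: a one-variable decision stump of the kind a decision tree
# builds from a single predictor such as platelet count. Values are
# invented for illustration; they are not the study's data.

def best_stump(values, labels):
    """Find the threshold on one variable that maximizes accuracy for
    the rule 'predict TTP (1) when value <= threshold'."""
    best = (None, 0.0)
    for t in sorted(set(values)):
        preds = [1 if v <= t else 0 for v in values]
        acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        if acc > best[1]:
            best = (t, acc)
    return best

# Toy platelet counts (x1000/uL); 1 = TTP, 0 = other TMA.
platelets = [12, 18, 25, 30, 90, 120, 150, 200]
is_ttp    = [1,  1,  1,  0,  0,   0,   1,   0]
threshold, accuracy = best_stump(platelets, is_ttp)
print(threshold, accuracy)   # -> 25 0.875
```

A full tree simply repeats this search recursively on each resulting subgroup, and a random forest aggregates many such trees grown on resampled data.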


2020 ◽  
Vol 4 (Supplement_1) ◽  
pp. 268-269
Author(s):  
Jaime Speiser ◽  
Kathryn Callahan ◽  
Jason Fanning ◽  
Thomas Gill ◽  
Anne Newman ◽  
...  

Abstract Advances in computational algorithms and the availability of large datasets with clinically relevant characteristics provide an opportunity to develop machine learning prediction models to aid in diagnosis, prognosis, and treatment of older adults. Some studies have employed machine learning methods for prediction modeling, but skepticism of these methods remains due to a lack of reproducibility and the difficulty of understanding the complex algorithms behind the models. We aim to provide an overview of two common machine learning methods: decision tree and random forest. We focus on these methods because they provide a high degree of interpretability. We discuss the underlying algorithms of decision tree and random forest methods and present a tutorial for developing prediction models for serious fall injury using data from the Lifestyle Interventions and Independence for Elders (LIFE) study. Decision tree is a machine learning method that produces a model resembling a flow chart. Random forest consists of a collection of many decision trees whose results are aggregated. In the tutorial example, we discuss evaluation metrics and interpretation for these models. Illustrated with data from the LIFE study, prediction models for serious fall injury were moderate at best (area under the receiver operating characteristic curve of 0.54 for decision tree and 0.66 for random forest). Machine learning methods may offer improved performance compared to traditional models for modeling outcomes in aging, but their use should be justified and output should be carefully described. Models should be assessed by clinical experts to ensure compatibility with clinical practice.
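The evaluation metric quoted above, the area under the receiver operating characteristic curve, can be computed without any library: it equals the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case (ties counted as one half). A stdlib-only sketch with made-up labels and risk scores:

```python
# Hedged sketch: rank-based AUC (equivalent to the Mann-Whitney U statistic),
# the metric reported for the LIFE fall-injury models. The labels and
# scores below are invented for illustration only.

def auc_from_scores(labels, scores):
    """AUC = probability a random positive outranks a random negative,
    counting ties as 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: 1 = serious fall injury, scores = model-predicted risk.
y = [1, 1, 0, 0, 1, 0]
risk = [0.9, 0.4, 0.3, 0.4, 0.7, 0.1]
print(round(auc_from_scores(y, risk), 3))   # -> 0.944
```

An AUC of 0.5 means the model ranks cases no better than chance, which is why the 0.54 decision-tree result above is described as moderate at best.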


2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Faizan Ullah ◽  
Qaisar Javaid ◽  
Abdu Salam ◽  
Masood Ahmad ◽  
Nadeem Sarwar ◽  
...  

Ransomware (RW) is a distinctive variety of malware that encrypts files or locks the user's system, holding the files hostage and causing huge financial losses to users. In this article, we propose a new model that extracts novel features from the RW dataset and classifies RW and benign files. The proposed model can detect a large number of RW samples from various families at runtime, scanning network, registry, and file-system activity throughout execution. An API-call series is used to represent the behavior-based features of RW. The technique extracts a fourteen-feature vector at runtime and analyzes it with online machine learning algorithms to predict RW. To validate effectiveness and scalability, we tested 78,550 recent malicious and benign samples and compared the model with random forest and AdaBoost; the testing accuracy reached 99.56%.
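The online-learning step described above updates a classifier one feature vector at a time rather than retraining on the full dataset. As a generic stand-in for the paper's unspecified online algorithm, a perceptron over a fourteen-feature vector can be sketched with the stdlib; the feature stream below is synthetic and the feature semantics are assumptions:

```python
# Hedged sketch: an incremental (online) linear classifier of the kind the
# runtime detector could apply to each 14-feature vector as it arrives.
# The perceptron update is a generic stand-in, not the paper's exact
# algorithm, and the feature values are invented.

import random

random.seed(0)
N_FEATURES = 14
w = [0.0] * N_FEATURES   # weights learned one sample at a time
b = 0.0

def predict(x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else 0          # 1 = ransomware, 0 = benign

def update(x, y, lr=0.1):
    """Perceptron rule: adjust weights only when the prediction is wrong."""
    global b
    err = y - predict(x)
    if err != 0:
        for i in range(N_FEATURES):
            w[i] += lr * err * x[i]
        b += lr * err

# Synthetic stream: ransomware samples score higher on feature 0
# (imagine a count of file-encryption API calls).
stream = []
for _ in range(200):
    y = random.randint(0, 1)
    x = [random.random() for _ in range(N_FEATURES)]
    x[0] += 4.0 * y
    stream.append((x, y))

for _ in range(3):               # a few incremental passes
    for x, y in stream:
        update(x, y)

correct = sum(predict(x) == y for x, y in stream)
print(correct / len(stream))
```

The key property for runtime detection is that each `update` touches only the current sample, so the model can keep learning as new process behavior is observed.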


In universities, student dropout is a major concern that reflects on the quality of the institution. Certain characteristics cause students to drop out, and a high dropout rate damages both the university's reputation and students' future careers. Student dropout analysis is therefore needed to improve academic planning and management, reduce dropout, and enhance the quality of the higher education system. Machine learning provides powerful methods for analyzing and predicting dropout, and is increasingly used for data mining diagnostics. This study uses a dataset from a representative university to develop a model for predicting student dropout. After examining previous studies, we observed that dropout detection can be approached with several methods; we used five detection models: decision tree, naïve Bayes, random forest classifier, SVM, and KNN. Analyzing the data with these techniques, we found the random forest classifier to be the most promising for predicting dropout rates, with a training accuracy of 94% and a testing accuracy of 86%.
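One of the five models named above, k-nearest neighbours, is simple enough to sketch with the stdlib under the same train/test evaluation protocol. The feature names and values are invented placeholders for student attributes (the actual dataset's features are not listed in the abstract):

```python
# Hedged sketch: a k-nearest-neighbours dropout classifier with a held-out
# test set, mirroring the train/test accuracy protocol above. The rows are
# invented; real features might include GPA, attendance, or fee arrears.

import math

def knn_predict(train, query, k=3):
    """Label a query point by majority vote of its k nearest neighbours."""
    nearest = sorted(train, key=lambda r: math.dist(r[0], query))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

# (features, dropped_out) rows; features = [gpa, attendance_rate]
train = [([3.5, 0.90], 0), ([3.8, 0.95], 0), ([3.2, 0.80], 0),
         ([1.9, 0.40], 1), ([2.1, 0.50], 1), ([1.5, 0.30], 1)]
test  = [([3.6, 0.85], 0), ([1.8, 0.45], 1)]

acc = sum(knn_predict(train, x) == y for x, y in test) / len(test)
print(acc)   # -> 1.0 on this toy split
```

The gap the study reports between training accuracy (94%) and testing accuracy (86%) is measured exactly this way: the model is fitted on one partition and scored on rows it never saw.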


2020 ◽  
Vol 0 (0) ◽  
pp. 0-0
Author(s):  
D-K. Kim ◽  
H-S. Lim ◽  
K.M. Eun ◽  
Y. Seo ◽  
J.K. Kim ◽  
...  

BACKGROUND: Neutrophils present as major inflammatory cells in refractory chronic rhinosinusitis with nasal polyps (CRSwNP), regardless of the endotype. However, their role in the pathophysiology of CRSwNP remains poorly understood. We investigated factors predicting the surgical outcomes of CRSwNP patients with focus on neutrophilic localization. METHODS: We employed machine-learning methods such as the decision tree and random forest models to predict the surgical outcomes of CRSwNP. Immunofluorescence analysis was conducted to detect human neutrophil elastase (HNE), Bcl-2, and Ki-67 in NP tissues. We counted the immunofluorescence-positive cells and divided them into three groups based on the infiltrated area, namely, epithelial, subepithelial, and perivascular groups. RESULTS: On machine learning, the decision tree algorithm demonstrated that the number of subepithelial HNE-positive cells, Lund-Mackay (LM) scores, and endotype (eosinophilic or non-eosinophilic) were the most important predictors of surgical outcomes in CRSwNP patients. Additionally, the random forest algorithm showed that, after ranking the mean decrease in the Gini index or the accuracy of each factor, the top three ranking factors associated with surgical outcomes were the LM score, age, and number of subepithelial HNE-positive cells. In terms of cellular proliferation, immunofluorescence analysis revealed that Ki-67/HNE-double positive and Bcl-2/HNE-double positive cells were significantly increased in the subepithelial area in refractory CRSwNP. CONCLUSION: Our machine-learning approach and immunofluorescence analysis demonstrated that subepithelial neutrophils in NP tissues had a high expression of Ki-67 and could serve as a cellular biomarker for predicting surgical outcomes in CRSwNP patients.


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Peter Appiahene ◽  
Yaw Marfo Missah ◽  
Ussiph Najim

The financial crisis that hit Ghana from 2015 to 2018 raised various issues with respect to the efficiency of banks and the safety of depositors' funds in the banking industry. As part of measures to improve the banking sector and restore customers' confidence, efficiency and performance analysis has become a pressing issue, because stakeholders need to detect the underlying causes of inefficiencies within the industry. Nonparametric methods such as Data Envelopment Analysis (DEA) have been suggested in the literature as a good measure of banks' efficiency and performance, and machine learning algorithms are well suited to estimating nonparametric and nonlinear problems. This paper combines DEA with three machine learning approaches to evaluate bank efficiency and performance using 444 Ghanaian bank branches as Decision-Making Units (DMUs). The results were compared with the corresponding efficiency ratings obtained from the DEA, and the prediction accuracies of the three machine learning models were compared. The results suggested that the decision tree (DT) with the C5.0 algorithm provided the best predictive model: it achieved 100% accuracy on the 134-sample holdout dataset (30% of the banks) with a P value of 0.00. The DT was followed closely by the random forest algorithm, with a predictive accuracy of 98.5% and a P value of 0.00, and finally the neural network (86.6% accuracy) with a P value of 0.66. The study concluded that banks in Ghana can use these results to predict their respective efficiencies. All experiments were performed within a simulation environment in RStudio using R code.


2019 ◽  
Vol 3 (Supplement_1) ◽  
Author(s):  
Leila Shinn ◽  
Yutong Li ◽  
Ruoqing Zhu ◽  
Aditya Mansharamani ◽  
Loretta Auvil ◽  
...  

Abstract Objectives To better understand host-microbe interactions, a more computationally intensive, multivariate machine learning approach must be utilized. Accordingly, we aimed to identify biomarkers with high predictive accuracy for dietary intake. Methods Data were aggregated from five randomized, controlled feeding studies in adults (n = 199) that provided avocados, almonds, broccoli, walnuts, or whole grain oats and whole grain barley. Fecal samples were collected during treatment and control periods of each study for DNA extraction. Subsequently, the 16S rRNA gene (V4 region) was amplified and sequenced. Sequence data were analyzed using DADA2 and QIIME2. Marginal screening using the Kruskal-Wallis test was performed on all species-level taxa to examine the differences between each of the 6 treatment groups and their respective control groups. The top 20 species from each diet were selected and pooled for multiclass classification using random forest. The resulting set of bacterial species was then reduced in a stepwise fashion and iteratively re-analyzed using the variable importance generated by random forest, to determine a compact feature set with minor loss of accuracy in predicting the food consumed. Results When all six foods were analyzed together using the top 20 species of each diet, oats and barley were frequently confused for each other, with 44% and 47% classification error, respectively, and the overall model accuracy was 66%. Collapsing oats and barley into one category, whole grains, reduced the classification error of the whole grain category to 6% and improved the overall model accuracy to 73%. Refitting the random forest with the top 30, 20, and 10 most important species resulted in correct identification of the 5 foods (avocados, almonds, broccoli, walnuts, and whole grains) 75%, 74%, and 70% of the time, respectively.
Conclusions These results reveal promise in accurately predicting foods consumed using bacterial species as biomarkers. Ongoing analyses include incorporation of metagenomic and metabolomic data into the models to improve predictive accuracy and utilize the multi-omics dataset to predict health status. Long-term, these approaches may inform diet-microbiota-tailored recommendations. Funding Sources This research was funded by The Foundation for Food and Agriculture Research, USDA, Hass Avocado Board, and USDA National Institute of Food and Agriculture, Hatch project 1009249.
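The marginal screening step in the methods above applies a Kruskal-Wallis test per taxon to rank species before the random forest stage. The H statistic can be computed from pooled ranks alone; a stdlib-only sketch (no tie correction) with invented relative abundances for one species across three hypothetical diet groups:

```python
# Hedged sketch: the Kruskal-Wallis H statistic used for marginal
# screening of species-level taxa. Abundance values are invented and
# assumed distinct (no tie correction is applied).

def kruskal_h(*groups):
    """Kruskal-Wallis H over k groups, computed from pooled ranks."""
    pooled = sorted(v for g in groups for v in g)
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # assumes no ties
    n = len(pooled)
    rank_sq = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    return 12.0 / (n * (n + 1)) * rank_sq - 3 * (n + 1)

# Toy relative abundances of one bacterial species per diet group.
avocado = [0.8, 0.9, 0.7]
walnut  = [0.4, 0.5, 0.3]
control = [0.1, 0.2, 0.6]
print(round(kruskal_h(avocado, walnut, control), 3))   # -> 5.6
```

Taxa with the largest H values differ most across groups, which is how the top 20 species per diet would be shortlisted before pooling them for the multiclass random forest.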


Chronic Kidney Disease (CKD) is a worldwide concern that affects roughly 10% of the world's adult population. For most people, early diagnosis of CKD is not possible, so modern computer-aided strategies are needed to make conventional CKD diagnosis more effective and precise. In this project, six machine learning techniques, namely Multilayer Perceptron Neural Network, Support Vector Machine, Naïve Bayes, K-Nearest Neighbor, Decision Tree, and Logistic Regression, were applied to the Chronic Kidney Disease dataset from the UCI Repository; ensemble algorithms such as AdaBoost, Gradient Boosting, Random Forest, Majority Voting, Bagging, and Weighted Average were then used to enhance model performance. The models were finely tuned to find the best hyperparameters. Performance was evaluated using Accuracy, Precision, Recall, F1-score, Matthews Correlation Coefficient, and the ROC-AUC curve. The experiment was first performed on the individual classifiers and then on the ensemble classifiers. Ensemble classifiers such as Random Forest and AdaBoost performed better, with 100% Accuracy, Precision, and Recall, compared with the best individual classifier, the Decision Tree algorithm, at 99.16% accuracy, 98.8% Precision, and 100% Recall.
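Among the ensemble techniques listed, majority voting is the simplest to illustrate: each base classifier casts a hard vote per patient and the most common label wins. A stdlib sketch with invented predictions standing in for three of the study's base models:

```python
# Hedged sketch: hard majority voting across base classifiers. The three
# prediction lists are invented; in the study they would come from fitted
# models such as Decision Tree, SVM, and Naive Bayes.

from collections import Counter

def majority_vote(*prediction_lists):
    """Per-sample hard vote across classifiers."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*prediction_lists)]

# 1 = CKD, 0 = not CKD, for five hypothetical test patients.
tree_preds = [1, 0, 1, 1, 0]
svm_preds  = [1, 0, 0, 1, 0]
nb_preds   = [1, 1, 1, 1, 0]
print(majority_vote(tree_preds, svm_preds, nb_preds))   # -> [1, 0, 1, 1, 0]
```

Voting can outperform any single member because each base model's independent errors (patients 2 and 3 above) are outvoted by the other two.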


2021 ◽  
Vol 11 ◽  
Author(s):  
Yanjie Zhao ◽  
Rong Chen ◽  
Ting Zhang ◽  
Chaoyue Chen ◽  
Muhetaer Muhelisa ◽  
...  

Background Differential diagnosis between benign and malignant breast lesions is of crucial importance for follow-up treatment. Recent developments in texture analysis and machine learning may lead to a new solution to this problem. Method This study enrolled a total of 265 patients (benign:malignant breast lesions = 71:194) diagnosed in our hospital who received magnetic resonance imaging between January 2014 and August 2017. Patients were randomly divided into training and validation groups (4:1), and two radiologists extracted texture features from the contrast-enhanced T1-weighted images. We performed five different feature selection methods, namely distance correlation, gradient boosting decision tree (GBDT), least absolute shrinkage and selection operator (LASSO), random forest (RF), and eXtreme gradient boosting (XGBoost), and five independent classification models were built based on the linear discriminant analysis (LDA) algorithm. Results All five models showed promising ability to discriminate malignant from benign breast lesions, with areas under the receiver operating characteristic (ROC) curve (AUCs) all above 0.830 in both training and validation groups. The model with the best discriminating ability was the combination LDA + GBDT; its sensitivity, specificity, AUC, and accuracy in the training group were 0.814, 0.883, 0.922, and 0.868, respectively. LDA + RF also showed promising results, with an AUC of 0.906 in the training group. Conclusion The evidence of this study, while preliminary, suggests that combining MRI texture analysis with the LDA algorithm can discriminate benign from malignant breast lesions. Further multicenter research would be of great help in validating this result.


Electronics ◽  
2021 ◽  
Vol 10 (14) ◽  
pp. 1677
Author(s):  
Ersin Elbasi ◽  
Ahmet E. Topcu ◽  
Shinu Mathew

COVID-19 is a community-acquired infection with symptoms that resemble those of influenza and bacterial pneumonia. Creating an infection control policy involving isolation, disinfection of surfaces, and identification of contagions is crucial in eradicating such pandemics. Incorporating social distancing could also help stop the spread of community-acquired infections like COVID-19. Social distancing entails maintaining certain distances between people and reducing the frequency of contact between people. Meanwhile, a significant increase in the development of different Internet of Things (IoT) devices has been seen together with cyber-physical systems that connect with physical environments. Machine learning is strengthening current technologies by adding new approaches to quickly and correctly solve problems utilizing this surge of available IoT devices. We propose a new approach using machine learning algorithms for monitoring the risk of COVID-19 in public areas. Extracted features from IoT sensors are used as input for several machine learning algorithms such as decision tree, neural network, naïve Bayes classifier, support vector machine, and random forest to predict the risks of the COVID-19 pandemic and calculate the risk probability of public places. This research aims to find vulnerable populations and reduce the impact of the disease on certain groups using machine learning models. We build a model to calculate and predict the risk factors of populated areas. This model generates automated alerts for security authorities in the case of any abnormal detection. Experimental results show that we have high accuracy with random forest of 97.32%, with decision tree of 94.50%, and with the naïve Bayes classifier of 99.37%. These algorithms indicate great potential for crowd risk prediction in public areas.
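The naïve Bayes classifier reported at 99.37% above models each sensor feature as an independent Gaussian per risk class. A stdlib-only sketch with invented IoT features (crowd density and average inter-person distance are assumptions, not the study's feature set):

```python
# Hedged sketch: a Gaussian naive Bayes risk classifier of the kind named
# above, fitted to invented IoT sensor readings. 1 = high risk, 0 = low.

import math
from collections import defaultdict

def fit(rows):
    """Per-class feature means/variances plus class priors."""
    by_class = defaultdict(list)
    for x, y in rows:
        by_class[y].append(x)
    model = {}
    for y, xs in by_class.items():
        n = len(xs)
        means = [sum(col) / n for col in zip(*xs)]
        vars_ = [sum((v - m) ** 2 for v in col) / n + 1e-9
                 for col, m in zip(zip(*xs), means)]
        model[y] = (n / len(rows), means, vars_)
    return model

def predict(model, x):
    def log_like(prior, means, vars_):
        ll = math.log(prior)
        for v, m, s2 in zip(x, means, vars_):
            ll += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return ll
    return max(model, key=lambda y: log_like(*model[y]))

# (crowd_density, avg_distance_m) -> risk label
data = [((0.90, 0.5), 1), ((0.80, 0.7), 1), ((0.85, 0.6), 1),
        ((0.20, 2.5), 0), ((0.30, 2.0), 0), ((0.25, 2.2), 0)]
model = fit(data)
print(predict(model, (0.88, 0.55)), predict(model, (0.22, 2.4)))   # -> 1 0
```

The class probabilities the same model yields (via normalized likelihoods) are what a monitoring system could use as the "risk probability" of a public place before raising an automated alert.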

