FPGA and GPU-based acceleration of ML workloads on Amazon cloud - A case study using gradient boosted decision tree library

Integration ◽  
2020 ◽  
Vol 70 ◽  
pp. 1-9
Author(s):  
Maxim Shepovalov ◽  
Venkatesh Akella
Author(s):  
Ellysia Jumin ◽  
Faridah Bte Basaruddin ◽  
Yuzainee Bte. Md Yusoff ◽  
Sarmad Dashti Latif ◽  
Ali Najah Ahmed

2020 ◽  
Vol 7 (2) ◽  
pp. 200
Author(s):  
Puji Santoso ◽  
Rudy Setiawan

One task in finance marketing is analyzing customer data to identify customers who are likely to take out credit again. The current practice treats every customer who has completed their credit installments as a marketing target, which drives up operational marketing costs. This research addresses the problem by designing a data mining application that predicts which credit customers have the potential to take out credit again with Mega Auto Finance. The Mega Auto Finance Fund Section in Kotim Regency was chosen as the case study, on the assumption that it faces the problem described above. The application uses classification as its data mining technique, with a decision tree as the classification method and C4.5 as the tree-building algorithm. The data processed in this study are the installment records of Mega Auto Finance loan customers for July 2018, in Microsoft Excel format. The result of this study is an application that helps the Mega Auto Finance Fund Section identify credit marketing targets in the future.
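
As a rough illustration of how a C4.5-style tree chooses a split, the sketch below computes the gain ratio (information gain divided by split information) for categorical attributes. The customer records and field names (`paid_on_time`, `tenor`, `reapplies`) are hypothetical and are not taken from the study's data; the abstract does not describe the actual attributes or implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """C4.5 split criterion: information gain divided by split information."""
    base = entropy([row[target] for row in rows])
    # Group the target labels by the value of the candidate attribute.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    split_info = -sum(
        len(p) / total * math.log2(len(p) / total) for p in partitions.values()
    )
    gain = base - remainder
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records; the field names are illustrative only.
customers = [
    {"paid_on_time": "yes", "tenor": "short", "reapplies": "yes"},
    {"paid_on_time": "yes", "tenor": "long", "reapplies": "yes"},
    {"paid_on_time": "no", "tenor": "short", "reapplies": "no"},
    {"paid_on_time": "no", "tenor": "long", "reapplies": "no"},
]

# The attribute with the highest gain ratio becomes the split at this node.
best = max(
    ["paid_on_time", "tenor"],
    key=lambda a: gain_ratio(customers, a, "reapplies"),
)
```

In this toy data `paid_on_time` separates the classes perfectly while `tenor` carries no information, so the former is selected. A full C4.5 implementation would also handle continuous attributes, missing values and pruning, which this sketch omits.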


2021 ◽  
Author(s):  
Thomas Weripuo Gyeera

The National Institute of Standards and Technology defines the fundamental characteristics of cloud computing as: on-demand computing, offered via the network, using pooled resources, with rapid elastic scaling and metered charging. The rapid dynamic allocation and release of resources on demand to meet heterogeneous computing needs is particularly challenging for data centres, which process a huge amount of data characterised by its high volume, velocity, variety and veracity (4Vs model). Data centres seek to regulate this by monitoring and adaptation, typically reacting to service failures after the fact. We present a real cloud test bed with the capabilities of proactively monitoring and gathering cloud resource information for making predictions and forecasts. This contrasts with the state-of-the-art reactive monitoring of cloud data centres. We argue that the behavioural patterns and Key Performance Indicators (KPIs) characterizing virtualized servers, networks, and database applications can best be studied and analysed with predictive models. Specifically, we applied the Boosted Decision Tree machine learning algorithm in making future predictions on the KPIs of a cloud server and virtual infrastructure network, yielding an R-Square of 0.9991 at a 0.2 learning rate. This predictive framework is beneficial for making short- and long-term predictions for cloud resources.


2018 ◽  
Vol 20 (3) ◽  
pp. 298-105 ◽  
Author(s):  
Shrawan Kumar Trivedi ◽  
Prabin Kumar Panigrahi

Purpose: Email spam classification is becoming a challenging area in the domain of text classification. Precise and robust classifiers are judged not only by classification accuracy but also by sensitivity (correctly classified legitimate emails) and specificity (correctly classified unsolicited emails), as captured by the false positive and false negative rates. This paper aims to present a comparative study of various decision tree classifiers (such as AD tree, decision stump and REP tree) with and without different boosting algorithms (bagging, boosting with re-sample and AdaBoost).

Design/methodology/approach: Artificial intelligence and text mining approaches have been incorporated in this study. Each decision tree classifier is tested on informative words/features selected from two publicly available data sets (SpamAssassin and LingSpam) using a greedy step-wise feature search method.

Findings: Without boosting, the REP tree provides the highest performance accuracy, with the AD tree ranking as the second-best performer. Decision stump is found to be the under-performing classifier of this study. With boosting, however, the combination of REP tree and AdaBoost compares favourably with other classification models. If false positive rate and performance accuracy are taken together, AD tree and REP tree with AdaBoost were both found to carry out an effective classification task. Greedy stepwise search has proven its worth in this study by selecting a subset of valuable features to identify the correct class of emails.

Research limitations/implications: This research is focussed on the classification of spam emails written in the English language only. The proposed models work with the content (words/features) of email data, which is mostly found in the body of the mail. Image spam has not been included in this study. Other messages, such as short message service or multi-media messaging service messages, were not included in this study.

Practical implications: In this research, a boosted decision tree approach has been proposed and used to classify email spam and ham files; this is found to be a highly effective approach in comparison with other state-of-the-art models used in other studies. This classifier may be tested for different applications and may provide new insights for developers and researchers.

Originality/value: A comparison of decision tree classifiers with and without ensemble methods has been presented for spam classification.
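
To make the two evaluation measures concrete, the following minimal sketch computes sensitivity and specificity from paired labels, following the abstract's convention that legitimate email ("ham") is the positive class. The function name and the sample labels are assumptions for illustration; they are not data or code from the study.

```python
def sensitivity_specificity(actual, predicted, positive="ham"):
    """Return (sensitivity, specificity) with `positive` as the positive class."""
    tp = fn = tn = fp = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1  # legitimate email kept as legitimate
            else:
                fn += 1  # legitimate email wrongly flagged as spam
        else:
            if p == positive:
                fp += 1  # spam wrongly let through as legitimate
            else:
                tn += 1  # spam correctly flagged
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


# Hypothetical labels for eight messages.
actual = ["ham", "ham", "ham", "ham", "spam", "spam", "spam", "spam"]
predicted = ["ham", "ham", "ham", "spam", "spam", "spam", "ham", "ham"]

sens, spec = sensitivity_specificity(actual, predicted)
```

With these labels, three of four legitimate emails are kept (sensitivity 0.75) and two of four spam emails are caught (specificity 0.5), which shows how a classifier can look acceptable on one measure while performing poorly on the other.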


Author(s):  
Heni Sulistiani ◽  
Ahmad Ari Aldino

During the pandemic, almost everyone has struggled to make a living. College students are one example: many have difficulty paying tuition fees to continue their studies. In response, Universitas Teknokrat Indonesia offers a tuition fee aid program to students with good academic performance. Because many variables are used to determine the grant, decisions are hard to make quickly and can take a very long time. A classification model is needed to help management decide which students should receive the grant. This study classifies grant recipients using the C4.5 decision tree algorithm, which determines whether a prospective student should be accepted as an awardee. The classification results were validated with ten-fold cross-validation, yielding accuracy, precision and recall of 87% each. This indicates that the model performs well enough to be implemented in a system.
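
The ten-fold cross-validation mentioned above can be sketched as follows. This is a minimal illustration that only generates the train/test index splits; it assumes contiguous, unshuffled folds, and the helper name `k_fold_indices` and the sample count are hypothetical. The abstract does not say how the study partitioned its data.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical data set of 25 student records split into ten folds.
folds = list(k_fold_indices(25, k=10))
```

Each record appears in exactly one test fold, so every sample is used once for validation and in the remaining rounds for training. The reported accuracy, precision and recall would then be averaged over the ten rounds.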


Author(s):  
Made Leo Radhitya ◽  
Agus Harjoko

One of the dangers at the beach is the rip current, which poses a significant risk to beachgoers. This paper proposes a method to predict the risk of rip current occurrence using a decision tree generated with the C4.5 algorithm. The output of the decision tree is the rip current occurrence risk. The case study for this research is a beach located on Rote Island, Rote Ndao, Nusa Tenggara Timur. The evaluation shows an accuracy of 0.84 and a precision of 0.61. The average recall is 0.68 and the average F-measure is 0.59, each on a scale from 0 to 1.

