Multi–GPU Implementation of Machine Learning Algorithm using CUDA and OpenCL

Author(s):  
Jan Masek ◽  
Radim Burget ◽  
Lukas Povoda ◽  
Malay Kishore Dutta

Modern graphics processing units (GPUs) are very useful for computing complex and time-consuming processes, offering high-performance computation at a good price. This paper deals with multi-GPU OpenCL and CUDA implementations of the k-Nearest Neighbor (k-NN) algorithm. It compares the performance of the OpenCL and CUDA implementations, each of which is suitable for a different number of attributes. The proposed CUDA algorithm achieves an acceleration of up to 880x compared with a single-threaded CPU version. The common k-NN was modified to be faster when a low number of neighbors k is set. The performance of the algorithm was verified with two dual-core NVIDIA GeForce GTX 690 GPUs and an Intel Core i7 3770 CPU at 4.1 GHz. Speed-up was measured for one, two, three, and four GPUs. We performed several tests with data sets containing up to 4 million elements with various numbers of attributes.
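To make the idea of a multi-device k-NN concrete, the following is a minimal CPU-only sketch in Python, not the paper's CUDA or OpenCL code. It assumes the training set is split into chunks, one per device, that each device returns its own k best candidates, and that the host merges them. The names `partial_knn`, `knn_classify`, and `n_devices` are illustrative assumptions.

```python
import heapq
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def partial_knn(chunk, query, k):
    """Return the k closest (distance, label) pairs within one data chunk."""
    scored = ((squared_distance(point, query), label) for point, label in chunk)
    return heapq.nsmallest(k, scored)


def knn_classify(data, query, k, n_devices=2):
    """Classify `query` after splitting `data` into `n_devices` chunks.

    Each chunk stands in for the share of the training set that one GPU
    would scan (an assumption of this sketch); the per-chunk candidates
    are merged on the host and the majority label wins.
    """
    chunks = [data[i::n_devices] for i in range(n_devices)]
    candidates = []
    for chunk in chunks:  # on real hardware these scans would run concurrently
        candidates.extend(partial_knn(chunk, query, k))
    nearest = heapq.nsmallest(k, candidates)
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up two-class data for illustration only.
train = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((1.0, 1.0), "b"),
         ((0.9, 1.1), "b"), ((0.2, 0.1), "a"), ((1.2, 0.9), "b")]
print(knn_classify(train, (0.15, 0.15), k=3))
```

Because every chunk returns its own k best candidates, the merged result matches a single scan of the full data set, whatever the number of chunks.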

2022 ◽  
Vol 2022 ◽  
pp. 1-15
Author(s):  
Chao Zhang ◽  
Peisi Zhong ◽  
Mei Liu ◽  
Qingjun Song ◽  
Zhongyuan Liang ◽  
...  

The K-Nearest Neighbor (KNN) algorithm is a classical machine learning algorithm. Most KNN algorithms are based on a single metric and do not further distinguish between repeated values in the range of K values, which can lead to a reduced classification effect and thus affect the accuracy of fault diagnosis. In this paper, a hybrid metric-based KNN algorithm is proposed to calculate a composite metric containing distance and direction information between test samples, which improves the discriminability of the samples. In the experiments, the hybrid metric KNN (HM-KNN) algorithm proposed in this paper is compared and validated with a variety of KNN algorithms based on a single distance metric on six data sets, and an HM-KNN application method is given for the forward gait stability control of a bipedal robot, where the abnormal motion is considered as a fault, and the distribution of zero moment points when the abnormal motion is generated is compared. The experimental results show that the algorithm has good data differentiation and generalization ability for different data sets, and it is feasible to apply it to the walking stability control of bipedal robots based on deep neural network control.
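One way to picture a composite of distance and direction information is sketched below in Python. This is not the HM-KNN formula, which the abstract does not state. It assumes a weighted sum of Euclidean distance and cosine dissimilarity; the weight `alpha` and the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_dissimilarity(a, b):
    """1 - cos(angle): 0 for parallel vectors, 2 for opposite ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 1.0  # direction undefined for a zero vector; treat as orthogonal
    return 1.0 - dot / norm


def hybrid_metric(a, b, alpha=0.5):
    """Hypothetical composite: alpha * distance + (1 - alpha) * direction term."""
    return alpha * euclidean(a, b) + (1.0 - alpha) * cosine_dissimilarity(a, b)


query = (1.0, 0.0)
p, q = (2.0, 0.0), (1.0, 1.0)
# Both candidates are equally far from the query in Euclidean terms,
# but only p points in the same direction as the query.
print(euclidean(query, p), euclidean(query, q))
print(hybrid_metric(query, p) < hybrid_metric(query, q))
```

The example shows the motivation given in the abstract: two neighbors that tie under a single distance metric can still be ranked once a direction term is added.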


2021 ◽  
pp. 1-17
Author(s):  
Ahmed Al-Tarawneh ◽  
Ja’afer Al-Saraireh

Twitter is one of the most popular platforms used to share and post ideas. Hackers and anonymous attackers use these platforms maliciously, and their behavior can be used to predict the risk of future attacks by gathering and classifying hackers’ tweets using machine-learning techniques. Previous approaches for detecting infected tweets are based on human effort or text analysis and are thus limited in capturing the hidden text between tweet lines. The main aim of this research paper is to enhance the efficiency of hacker detection for the Twitter platform using the complex networks technique with adapted machine learning algorithms. This work presents a methodology that collects a list of users, together with their followers, who share posts with similar interests from a hackers’ community on Twitter. The list is built based on a set of suggested keywords that are commonly used by hackers in their tweets. After that, a complex network is generated for all users to find relations among them in terms of network centrality, closeness, and betweenness. After extracting these values, a dataset of the most influential users in the hacker community is assembled. Subsequently, tweets belonging to users in the extracted dataset are gathered and classified into positive and negative classes. The output of this process is utilized with a machine learning process by applying different algorithms. This research builds and investigates an accurate dataset containing real users who belong to a hackers’ community. Correctly classified instances were measured for accuracy using the average values of K-nearest neighbor, Naive Bayes, Random Tree, and support vector machine techniques, demonstrating about 90% and 88% accuracy for cross-validation and percentage split, respectively. Consequently, the proposed network cyber Twitter model is able to detect hackers and determine whether tweets pose a risk to future institutions and individuals, providing early warning of possible attacks.


Mathematics ◽  
2021 ◽  
Vol 9 (8) ◽  
pp. 830
Author(s):  
Seokho Kang

k-nearest neighbor (kNN) is a widely used learning algorithm for supervised learning tasks. In practice, the main challenge when using kNN is its high sensitivity to its hyperparameter setting, including the number of nearest neighbors k, the distance function, and the weighting function. To improve the robustness to hyperparameters, this study presents a novel kNN learning method based on a graph neural network, named kNNGNN. Given training data, the method learns a task-specific kNN rule in an end-to-end fashion by means of a graph neural network that takes the kNN graph of an instance to predict the label of the instance. The distance and weighting functions are implicitly embedded within the graph neural network. For a query instance, the prediction is obtained by performing a kNN search from the training data to create a kNN graph and passing it through the graph neural network. The effectiveness of the proposed method is demonstrated using various benchmark datasets for classification and regression tasks.


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1797
Author(s):  
Ján Vachálek ◽  
Dana Šišmišová ◽  
Pavol Vašek ◽  
Jan Rybář ◽  
Juraj Slovák ◽  
...  

The article deals with aspects of identifying industrial products in motion based on their color. An automated robotic workplace with a conveyor belt, a robot, and an industrial color sensor is created for this purpose. Measured data are processed in a database and then statistically evaluated in the form of type A and type B standard uncertainties in order to obtain combined standard uncertainties. Based on the acquired data, control charts of RGB color components for identified products are created. The influence of product speed on the identification measuring process and on process stability is monitored. In case of identification uncertainty, i.e., when measured values are outside the limits of the control charts, the K-nearest neighbor machine learning algorithm is used. This algorithm, based on the Euclidean distances to the classified value, estimates its most accurate iteration. This results in a comprehensive system for identifying products moving on a conveyor belt, where, based on data collection and statistical analysis using machine learning, reliability for industrial use is demonstrated.
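The combination of type A and type B standard uncertainties can be sketched in Python as follows. This is not the article's evaluation procedure. It assumes uncorrelated components combined by root sum of squares, and a rectangular distribution for the type B term; the readings and the half-width of 1 count are made up.

```python
import math
import statistics


def type_a_uncertainty(readings):
    """Standard uncertainty of the mean from repeated readings: s / sqrt(n)."""
    return statistics.stdev(readings) / math.sqrt(len(readings))


def type_b_uncertainty(half_width):
    """Type B uncertainty, assuming a rectangular distribution of half-width a."""
    return half_width / math.sqrt(3)


def combined_uncertainty(u_a, u_b):
    """Root-sum-of-squares combination of two uncorrelated components."""
    return math.hypot(u_a, u_b)


red_channel = [201, 203, 199, 202, 200]  # made-up sensor readings
u_a = type_a_uncertainty(red_channel)
u_b = type_b_uncertainty(1.0)            # assumed resolution limit of +/-1 count
u_c = combined_uncertainty(u_a, u_b)
print(round(u_a, 3), round(u_b, 3), round(u_c, 3))
```

A control chart for one color component could then place its limits at the mean reading plus or minus a chosen multiple of `u_c`; that multiple is a design choice the abstract does not specify.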


2018 ◽  
Vol 7 (3) ◽  
pp. 1372
Author(s):  
Soudamini Hota ◽  
Sudhir Pathak

‘Sentiment’ literally means ‘Emotions’. Sentiment analysis, synonymous with opinion mining, is a type of data mining that refers to the analysis of data obtained from microblogging sites, social media updates, online news reports, user reviews, etc., in order to study the sentiments of people towards an event, organization, product, brand, person, etc. In this work, sentiment is classified into multiple classes. The proposed methodology, based on the KNN classification algorithm, shows an improvement over an existing methodology based on the SVM classification algorithm. The data used for analysis were taken from Twitter, the most popular microblogging site, and were extracted using Python’s Tweepy. The N-gram modeling technique was used for feature extraction, and the supervised machine learning algorithm k-nearest neighbor was used for sentiment classification. The performance of the proposed and existing techniques is compared in terms of accuracy, precision, and recall. It is concluded that the proposed technique performs better on all of these standard evaluation parameters.
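As an illustration of N-gram feature extraction, here is a minimal Python sketch. It is not the authors' pipeline. It assumes word-level n-grams over lowercased, whitespace-split text; the function names and the sample sentence are made up.

```python
from collections import Counter


def word_ngrams(text, n=2):
    """Return the word-level n-grams of `text` in order of appearance."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, n=2):
    """Bag-of-n-grams feature vector represented as a Counter."""
    return Counter(word_ngrams(text, n))


print(word_ngrams("Not a good phone", n=2))
print(ngram_counts("good phone good phone", n=2))
```

Bigrams such as "not a" keep some word-order context that single-word features lose, which is one common reason for using n-grams in sentiment tasks.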


Polymers ◽  
2021 ◽  
Vol 13 (21) ◽  
pp. 3811
Author(s):  
Iosif Sorin Fazakas-Anca ◽  
Arina Modrea ◽  
Sorin Vlase

This paper proposes a new method for calculating the monomer reactivity ratios for binary copolymerization based on the terminal model. The original optimization method involves a numerical integration algorithm and an optimization algorithm based on k-nearest neighbour non-parametric regression. The calculation method has been tested on simulated and experimental data sets at low (<10%), medium (10–35%) and high conversions (>40%), yielding reactivity ratios in good agreement with the usual methods such as intersection, Fineman–Ross, reverse Fineman–Ross, Kelen–Tüdös, extended Kelen–Tüdös and the error-in-variables method. The experimental data sets used in this comparative analysis are the copolymerization of 2-(N-phthalimido) ethyl acrylate with 1-vinyl-2-pyrrolidone for low conversion, of isoprene with glycidyl methacrylate for medium conversion, and of N-isopropylacrylamide with N,N-dimethylacrylamide for high conversion. The possibility of estimating experimental errors from a single experimental data set formed by n experimental data points is also shown.


2021 ◽  
Author(s):  
Ayesha Sania ◽  
Nicolo Pini ◽  
Morgan Nelson ◽  
Michael Myers ◽  
Lauren Shuffrey ◽  
...  

Abstract. Background: Missing data are a source of bias in epidemiologic studies. This is problematic in alcohol research, where data missingness is linked to drinking behavior. Methods: The Safe Passage study was a prospective investigation of prenatal drinking and fetal/infant outcomes (n=11,083). Daily alcohol consumption for the last reported drinking day and the 30 days prior was recorded using the Timeline Followback method. Of 3.2 million person-days, data were missing for 0.36 million. We imputed missing data using a machine learning algorithm, “K Nearest Neighbor” (K-NN). K-NN imputes missing values for a participant using data from the participants closest to it. Imputed values were weighted for the distances from nearest neighbors and matched for day of week. Validation was done on randomly deleted data for 5-15 consecutive days. Results: Data from 5 nearest neighbors and segments of 55 days provided imputed values with the least imputation error. After deleting data segments from first-trimester data with no missing days, there was no difference between actual and predicted values for 64% of deleted segments. For 31% of the segments, imputed data were within +/-1 drink/day of the actual values. Conclusions: K-NN can be used to impute missing data in longitudinal studies of alcohol use during pregnancy with high accuracy.
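Distance-weighted K-NN imputation can be sketched in Python as below. This is not the study's code. It assumes a root-mean-square distance over the days both participants reported, and inverse-distance weights; it omits the day-of-week matching the abstract mentions. All names and data are hypothetical.

```python
import math


def distance(a, b):
    """Root-mean-square difference over the days observed in both series."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def impute(target, donors, k=2):
    """Fill None entries in `target` from the k closest donor series."""
    ranked = sorted(donors, key=lambda d: distance(target, d))[:k]
    filled = list(target)
    for day, value in enumerate(target):
        if value is not None:
            continue
        num = den = 0.0
        for donor in ranked:
            if donor[day] is None:
                continue  # this donor cannot inform this day
            weight = 1.0 / (distance(target, donor) + 1e-9)  # closer donors count more
            num += weight * donor[day]
            den += weight
        if den > 0:
            filled[day] = num / den
    return filled


# Made-up drinks-per-day series; None marks a missing day.
target = [2, None, 0, 1]
donors = [[2, 3, 0, 1], [0, 0, 0, 0], [5, 5, 5, 5]]
print(impute(target, donors, k=2))
```

The small constant added to the distance avoids division by zero when a donor matches the target exactly on all shared days; in that case the exact match dominates the weighted average.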


Diagnostics ◽  
2021 ◽  
Vol 11 (10) ◽  
pp. 1870
Author(s):  
Yaghoub Pourasad ◽  
Esmaeil Zarouri ◽  
Mohammad Salemizadeh Parizi ◽  
Amin Salih Mohammed

Breast cancer is one of the main causes of death among women worldwide. Early detection of this disease helps reduce the number of premature deaths. This research aims to design a method for identifying and diagnosing breast tumors based on ultrasound images. For this purpose, six techniques have been applied to detect and segment ultrasound images. Image features are extracted using the fractal method. Moreover, k-nearest neighbor, support vector machine, decision tree, and Naïve Bayes classification techniques are used to classify images. Then, a convolutional neural network (CNN) architecture is designed to classify breast cancer directly from ultrasound images. The presented model achieves 99.8% accuracy on the training set. Regarding the test results, this diagnosis validation is associated with 88.5% sensitivity. Based on the findings of this study, it can be concluded that the proposed high-potential CNN algorithm can be used to diagnose breast cancer from ultrasound images. The second presented CNN model can identify the original location of the tumor. The results show 92% of the images in the high-performance region with an AUC above 0.6. The proposed model can identify the tumor’s location and volume by morphological operations as a post-processing algorithm. These findings can also be used to monitor patients and prevent the growth of the infected area.


Stock trading has been one of the most important parts of the financial world for decades. People investing in the share market analyze the financial history of a corporation and the news related to it, and study huge amounts of data, in order to predict its stock price trend. The right investment, i.e., buying and selling a company's stock at the right time, leads to monetary benefits and can make one a millionaire overnight. The stock market is an extremely volatile platform in which data are produced in huge quantities and are influenced by numerous disparate factors such as socio-political issues, financial activities like splits and dividends, news, and rumors. This work proposes a novel system, “IntelliFin”, to predict the share market trend. The system uses various stock market technical indicators along with the company's historical market data trends to predict share prices. It also determines the sentiment of a company's financial and socio-political news for a more accurate prediction. The system is implemented using two models. The first is a hybrid LSTM model optimized by an ADAM optimizer. The other is a hybrid ML model which integrates a Support Vector Regressor, a K-Nearest Neighbor classifier, an RF classifier, and a Linear Regressor using a Majority Voting algorithm. Both models employ an NLP-powered sentiment analyzer to account for news affecting stock prices. The models are trained continuously using Reinforcement Learning, implemented by the Q-Learning algorithm, to increase consistency and accuracy. The project aims to support inexperienced investors in the stock market and help them maximize their profit and minimize or eliminate losses. The developed system will also serve as a tool to aid professional investors in their decision making.
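The majority-voting step that integrates several models can be pictured with a short Python sketch. This is not the IntelliFin implementation. It assumes each model's output has already been reduced to a trend label such as "up" or "down"; the labels and the function name are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most models.

    Ties are broken in favor of the label that appears first in the input.
    """
    counts = Counter(predictions)
    return counts.most_common(1)[0][0]


# Hypothetical trend labels from four models for one trading day.
model_outputs = ["up", "down", "up", "up"]
print(majority_vote(model_outputs))
```

With an even number of voters, ties are possible, so a real system would need an explicit tie-breaking rule; the first-seen rule used here is only a placeholder.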

