An Efficient FPGA-Based Hardware Accelerator for Convex Optimization-Based SVM Classifier for Machine Learning on Embedded Platforms

Machine learning is becoming a cornerstone of smart and autonomous systems. Machine learning algorithms can be categorized into supervised learning (classification) and unsupervised learning (clustering). Among many classification algorithms, the Support Vector Machine (SVM) classifier is one of the most commonly used machine learning algorithms. By incorporating convex optimization techniques into the SVM classifier, we can further enhance the accuracy and classification process of the SVM by finding the optimal solution. Many machine learning algorithms, including SVM classification, are compute-intensive and data-intensive, requiring significant processing power. Furthermore, many machine learning algorithms have found their way into portable and embedded devices, which have stringent requirements. In this research work, we introduce a novel and efficient Field Programmable Gate Array (FPGA)-based hardware accelerator for a convex optimization-based SVM classifier for embedded platforms, considering the constraints associated with these platforms and the requirements of the applications running on these devices. We incorporate suitable mathematical kernels and decomposition methods to systematically solve the convex optimization for machine learning applications with a large volume of data. Our proposed architectures are generic, parameterized, and scalable; hence, without changing internal architectures, our designs can be used to process datasets of varying sizes, can be executed on different platforms, and can be utilized for various machine learning applications. We also introduce system-level architectures and techniques to facilitate real-time processing. Experiments are performed using two different benchmark datasets to evaluate the feasibility and efficiency of our hardware architecture in terms of timing, speedup, area, and accuracy. Our embedded hardware design achieves up to 79 times speedup compared to its embedded software counterpart, and can also achieve up to 100% classification accuracy.
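
To illustrate what "convex optimization-based SVM" means in software terms, the following is a minimal sketch, not the accelerator design described in the abstract. It assumes a linear soft-margin SVM and minimizes the convex objective 0.5*lam*||w||^2 + mean(hinge loss) with plain subgradient descent. The function names, hyperparameters, and toy data are all hypothetical; the abstract's kernels and decomposition methods are not reproduced here.

```python
# Hypothetical toy example: linear soft-margin SVM trained by subgradient
# descent on the convex objective 0.5*lam*||w||^2 + mean(hinge loss).

def train_linear_svm(xs, ys, lam=0.01, lr=0.1, epochs=200):
    """xs: list of feature lists; ys: labels in {-1, +1}."""
    dim = len(xs[0])
    n = len(xs)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        # Gradient of the regularization term.
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(xs, ys):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:
                # The hinge term is active only for margin violators.
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy, linearly separable data (assumed for illustration only).
xs = [[2.0, 2.0], [3.0, 3.0], [2.5, 3.5], [-2.0, -2.0], [-3.0, -1.0], [-1.5, -3.0]]
ys = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(xs, ys)
train_accuracy = sum(predict(w, b, x) == y for x, y in zip(xs, ys)) / len(xs)
print("weights:", w, "bias:", b, "training accuracy:", train_accuracy)
```

Because the objective is convex, any local minimum found by such a procedure is also a global one, which is the property the abstract relies on when it speaks of "finding the optimal solution".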

Tebyan: Fake News Detection System (Preprint)

DOI: 10.2196/preprints.35982 | 2021

Author(s): Lamya Alderywsh, Aseel Aldawood, Ashwag Alasmari, Farah Aldeijy, Ghadah Alqubisy, ...

Keyword(s): Machine Learning, Arab World, Detection System, Learning Algorithms, Performance Measure, Machine Learning Algorithms, SVM Classifier, Fake News, Typical Type, Performance Results

BACKGROUND: Fake news spreading via deceptive machine-generated text poses a serious threat in technologically advanced societies, including those in the Arab world. In the last decade, Arabic fake news identification has gained increased attention, and numerous detection approaches have shown some ability to find fake news across various data sources. Nevertheless, many existing approaches overlook recent advancements in fake news detection, in particular the incorporation of machine learning algorithms.

OBJECTIVE: The Tebyan project aims to address the problem of fake news by developing a detection system that employs machine learning algorithms to determine whether a news item is fake or real in the context of the Arab world.

METHODS: The project went through numerous phases, using an iterative methodology to develop the system and to contextualize fake news and misinformation with respect to society's information. The work consists of implementing the machine learning system in Python and collecting datasets of genuine and fake news. The study also assesses, through system testing approaches, how information-exchanging behaviors can minimize fake news and identify the best source for authenticating emerging news.

RESULTS: The main deliverable of this project is the Tebyan system, which allows users in the community to check the credibility of news in Arabic newspapers. The SVM classifier exhibited, on average, the highest performance, reaching 90% in every performance measure across sources. The second-best algorithm was the linear SVC, which also reached 90% in performance measures on the typical type of fake information found in these societies.

CONCLUSIONS: The study concludes that building the system with machine learning algorithms in Python allows rapid measurement of users' perceptions, letting them comment on and rate the credibility result and subscribe to news email services.

Machine Learning and Cryptographic Algorithms – Analysis and Design in Ransomware and Vulnerabilities Detection

DOI: 10.36227/techrxiv.13146866 | 2020

Author(s): Nandkumar Niture

Keyword(s): Machine Learning, Intelligent System, Learning Algorithms, Machine Learning Algorithms, System Level, Management Systems, System Response, Hard Part, Analysis And Design, Password Management

AI, deep learning, and machine learning algorithms are gaining ground in every application domain of information technology, including information security. The information security domain is known for traditional password management systems, auto-provisioning systems, and user information management systems. There is also a rising concern about application- and system-level security with ransomware: cyber-attacks on existing systems that demand a ransom are increasing every day. Ransomware is a class of malware whose goal is to seize data through an encryption mechanism and return it in exchange for a ransom. Ransomware attacks mainly target vulnerable systems that are exposed to the network with weak security measures. With the help of machine learning algorithms, the pattern of the attacks can be analyzed. This paper discusses a workaround solution that combines a machine learning model with a cryptographic algorithm to enhance the effectiveness of the system's response to possible attacks. The other, harder part of the problem is creating intelligence for organizations to prevent ransomware attacks with the help of intelligent password management and intelligent account provisioning. In this paper I elaborate on the analysis of machine learning algorithms for the intelligent ransomware detection problem; the later part of the paper covers the design of the algorithm.

Towards scaling Twitter for digital epidemiology of birth defects

npj Digital Medicine | DOI: 10.1038/s41746-019-0170-5 | 2019 | Vol 2 (1) | Cited By ~ 4

Author(s): Ari Z. Klein, Abeed Sarker, Davy Weissenbacher, Graciela Gonzalez-Hernandez

Keyword(s): Machine Learning, Social Media, Language Processing, Birth Defects, Birth Defect, Learning Algorithms, Class Imbalance, Machine Learning Algorithms, Supervised Machine Learning, SVM Classifier

Abstract Social media has recently been used to identify and study a small cohort of Twitter users whose pregnancies with birth defect outcomes—the leading cause of infant mortality—could be observed via their publicly available tweets. In this study, we exploit social media on a larger scale by developing natural language processing (NLP) methods to automatically detect, among thousands of users, a cohort of mothers reporting that their child has a birth defect. We used 22,999 annotated tweets to train and evaluate supervised machine learning algorithms—feature-engineered and deep learning-based classifiers—that automatically distinguish tweets referring to the user’s pregnancy outcome from tweets that merely mention birth defects. Because 90% of the tweets merely mention birth defects, we experimented with under-sampling and over-sampling approaches to address this class imbalance. An SVM classifier achieved the best performance for the two positive classes: an F1-score of 0.65 for the “defect” class and 0.51 for the “possible defect” class. We deployed the classifier on 20,457 unlabeled tweets that mention birth defects, which helped identify 542 additional users for potential inclusion in our cohort. Contributions of this study include (1) NLP methods for automatically detecting tweets by users reporting their birth defect outcomes, (2) findings that an SVM classifier can outperform a deep neural network-based classifier for highly imbalanced social media data, (3) evidence that automatic classification can be used to identify additional users for potential inclusion in our cohort, and (4) a publicly available corpus for training and evaluating supervised machine learning algorithms.
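
The under-sampling and over-sampling idea mentioned in the abstract can be sketched as follows. This is a minimal illustration using random resampling, which is an assumption: the abstract does not say which resampling variants were used. The label name "mention" and the toy class counts are hypothetical, chosen only to mimic a roughly 90% majority class.

```python
import random
from collections import Counter


def random_undersample(samples, labels, majority_label, seed=0):
    """Drop majority-class items until they match the largest other class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(c for lab, c in counts.items() if lab != majority_label)
    majority_idx = [i for i, lab in enumerate(labels) if lab == majority_label]
    keep = set(rng.sample(majority_idx, target))
    keep.update(i for i, lab in enumerate(labels) if lab != majority_label)
    idx = sorted(keep)
    return [samples[i] for i in idx], [labels[i] for i in idx]


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class items (with replacement) up to the majority count."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for lab, count in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == lab]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(lab)
    return out_samples, out_labels


# Hypothetical toy corpus: 90 "mention", 6 "defect", 4 "possible defect".
labels = ["mention"] * 90 + ["defect"] * 6 + ["possible defect"] * 4
tweets = [f"tweet {i}" for i in range(len(labels))]

_, under_labels = random_undersample(tweets, labels, majority_label="mention")
_, over_labels = random_oversample(tweets, labels)
under_counts = Counter(under_labels)
over_counts = Counter(over_labels)
print("under-sampled:", dict(under_counts))
print("over-sampled:", dict(over_counts))
```

In practice, resampling of this kind is normally applied to the training split only, so that the evaluation data keeps the original class distribution.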

Compendiums of cancer transcriptomes for machine learning applications

Scientific Data | DOI: 10.1038/s41597-019-0207-2 | 2019 | Vol 6 (1) | Cited By ~ 2

Author(s): Su Bin Lim, Swee Jin Tan, Wan-Teck Lim, Chwee Teck Lim

Keyword(s): Machine Learning, Learning Algorithms, Machine Learning Algorithms, Data Reuse, RNA Seq, Genomic Landscape, Source Data, Machine Learning Applications, Cancer Types, Data Source

Abstract There are massive transcriptome profiles in the form of microarray. The challenge is that they are processed using diverse platforms and preprocessing tools, requiring considerable time and informatics expertise for cross-dataset analyses. If there exists a single, integrated data source, data-reuse can be facilitated for discovery, analysis, and validation of biomarker-based clinical strategy. Here, we present merged microarray-acquired datasets (MMDs) across 11 major cancer types, curating 8,386 patient-derived tumor and tumor-free samples from 95 GEO datasets. Using machine learning algorithms, we show that diagnostic models trained from MMDs can be directly applied to RNA-seq-acquired TCGA data with high classification accuracy. Machine learning optimized MMD further aids to reveal immune landscape across various carcinomas critically needed in disease management and clinical interventions. This unified data source may serve as an excellent training or test set to apply, develop, and refine machine learning algorithms that can be tapped to better define genomic landscape of human cancers.

A Novel Method for Colorectal Cancer Screening Based on Circulating Tumor Cells and Machine Learning

Entropy | DOI: 10.3390/e23101248 | 2021 | Vol 23 (10) | pp. 1248

Author(s): Eleana Hatzidaki, Aggelos Iliopoulos, Ioannis Papasotiriou

Keyword(s): Colorectal Cancer, Machine Learning, Flow Cytometry, Cancer Screening, Tumor Cells, Circulating Tumor Cells, Learning Algorithms, Machine Learning Algorithms, SVM Classifier, Significant Information

Colorectal cancer is one of the most common types of cancer, and it can have a high mortality rate if left untreated or undiagnosed. The fact that CRC becomes symptomatic at advanced stages highlights the importance of early screening. The reference screening method for CRC is colonoscopy, an invasive, time-consuming procedure that requires sedation or anesthesia and is recommended from a certain age and above. The aim of this study was to build a machine learning classifier that can distinguish cancer from non-cancer samples. For this, circulating tumor cells were enumerated using flow cytometry. Their numbers were used as a training set for building an optimized SVM classifier that was subsequently used on a blind set. The SVM classifier’s accuracy on the blind samples was found to be 90.0%, sensitivity was 80.0%, specificity was 100.0%, precision was 100.0% and AUC was 0.98. Finally, in order to test the generalizability of our method, we also compared the performances of different classifiers developed by various machine learning models, using over-sampling datasets generated by the SMOTE algorithm. The results showed that SVM achieved the best performances according to the validation accuracy metric. Overall, our results demonstrate that CTCs enumerated by flow cytometry can provide significant information, which can be used in machine learning algorithms to successfully discriminate between healthy and colorectal cancer patients. The clinical significance of this method could be the development of a simple, fast, non-invasive cancer screening tool based on blood CTC enumeration by flow cytometry and machine learning algorithms.
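
The reported accuracy, sensitivity, specificity, and precision all derive from one confusion matrix. The sketch below shows that arithmetic. The counts are hypothetical: the abstract does not state the blind-set size, so a 20-sample set (8 true positives, 2 false negatives, 10 true negatives, 0 false positives) is assumed here only because it is consistent with the reported percentages.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute standard binary classification metrics from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # recall on the cancer class
        "specificity": tn / (tn + fp),   # recall on the non-cancer class
        "precision": tp / (tp + fp),
    }


# Hypothetical counts, assumed for illustration (not taken from the paper).
metrics = classification_metrics(tp=8, fn=2, tn=10, fp=0)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With zero false positives, specificity and precision are both 100% regardless of how many cancer samples are missed, which is why sensitivity is reported separately.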

Optimization Techniques to Solve Travelling Salesman Problem Using Machine Learning Algorithms

International Journal for Research in Applied Science and Engineering Technology | DOI: 10.22214/ijraset.2022.39822 | 2022 | Vol 10 (1) | pp. 274-279

Author(s): Prince Nathan S

Keyword(s): Machine Learning, Ant Colony Optimization, Travelling Salesman Problem, Learning Algorithms, Optimization Techniques, Machine Learning Algorithms, Ant Colony, Lasso Regression, Q Learning, The Right

Abstract: The Travelling Salesman Problem (TSP) is a very popular problem in the world of computer programming. It deals with the optimization of algorithms in an ever-changing scenario, and it grows more complex as the number of variables increases. The existing solutions to this problem are optimal only for a small and definite number of cases. They cannot take into consideration the various factors that arise when the problem is solved for the real world, where things change continuously. There is a need to adapt to these changes and find optimized solutions as the application goes on. The ability to adapt to any kind of data, whether static or ever-changing, and to understand and solve it, is a quality shown by machine learning algorithms. As machine learning advances, a good amount of research has addressed how to solve NP-hard problems using machine learning. This report is a survey of the types of machine learning algorithms that can be used to solve the TSP. Different approaches, such as Ant Colony Optimization and Q-learning, are explored and compared. Ant Colony Optimization uses the concept of ants following pheromone levels, which tell them where the most food is. It is widely used for the TSP, where the path with the most pheromone is chosen. Q-learning rewards an agent for taking the right action in the state it is in and compounds those rewards. It is based on the concept of exploitation, in which the agent keeps learning on its own to maximize its reward. This can be used for the TSP, where an agent is rewarded for finding a short path and rewarded more if the chosen path is the shortest. Keywords: LINEAR REGRESSION, LASSO REGRESSION, RIDGE REGRESSION, DECISION TREE REGRESSOR, MACHINE LEARNING, HYPERPARAMETER TUNING, DATA ANALYSIS
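
The Q-learning idea described above can be sketched as a small tabular learner. This is a simplified illustration, not the method evaluated in the survey: it assumes the state is only the current city (a full formulation would also track the set of visited cities), uses negative travel distance as the reward, and makes no claim of finding the optimal tour. The city coordinates and all hyperparameters are hypothetical.

```python
import math
import random


def tour_length(tour, dist):
    """Length of the closed tour, including the edge back to the start."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def q_learning_tsp(dist, episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    n = len(dist)
    q = [[0.0] * n for _ in range(n)]  # q[current_city][next_city]
    for _ in range(episodes):
        current, unvisited = 0, set(range(1, n))
        while unvisited:
            choices = sorted(unvisited)
            if rng.random() < epsilon:
                nxt = rng.choice(choices)                        # explore
            else:
                nxt = max(choices, key=lambda c: q[current][c])  # exploit
            reward = -dist[current][nxt]  # shorter hops earn larger rewards
            remaining = unvisited - {nxt}
            if remaining:
                future = max(q[nxt][c] for c in remaining)
            else:
                reward -= dist[nxt][0]    # cost of closing the tour
                future = 0.0
            q[current][nxt] += alpha * (reward + gamma * future - q[current][nxt])
            current, unvisited = nxt, remaining
    # Greedy rollout using the learned Q-values.
    tour, unvisited = [0], set(range(1, n))
    while unvisited:
        nxt = max(sorted(unvisited), key=lambda c: q[tour[-1]][c])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


# Hypothetical city coordinates, assumed for illustration.
cities = [(0, 0), (0, 2), (3, 2), (3, 0), (1, 1)]
dist = [[math.dist(a, b) for b in cities] for a in cities]

tour = q_learning_tsp(dist)
print("tour:", tour, "length:", round(tour_length(tour, dist), 3))
```

The epsilon parameter controls the balance between exploring random next cities and exploiting the current Q-values; the abstract emphasizes the exploitation side.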

Optimization Techniques for Mining Power Quality Data and Processing Unbalanced Datasets in Machine Learning Applications

Energies | DOI: 10.3390/en14020463 | 2021 | Vol 14 (2) | pp. 463

Author(s): Alvaro Furlani Bastos, Surya Santoso

Keyword(s): Machine Learning, Power Systems, Power Quality, Optimization Techniques, Machine Learning Algorithms, Quality Data, Successful Performance, Data Mining Approach, Machine Learning Applications, Learning Frameworks

In recent years, machine learning applications have received increasing interest from power system researchers. The successful performance of these applications is dependent on the availability of extensive and diverse datasets for the training and validation of machine learning frameworks. However, power systems operate at quasi-steady-state conditions for most of the time, and the measurements corresponding to these states provide limited novel knowledge for the development of machine learning applications. In this paper, a data mining approach based on optimization techniques is proposed for filtering root-mean-square (RMS) voltage profiles and identifying unusual measurements within triggerless power quality datasets. Then, datasets with equal representation between event and non-event observations are created so that machine learning algorithms can extract useful insights from the rare but important event observations. The proposed framework is demonstrated and validated with both synthetic signals and field data measurements.
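
The two steps in the abstract, flagging unusual RMS voltage measurements and then building a dataset with equal event and non-event representation, can be illustrated with a deliberately simplified sketch. It substitutes a fixed threshold for the paper's optimization-based filtering, which is not described in enough detail here to reproduce. The synthetic signal, the 32 samples per cycle, and the 10% tolerance are all assumptions.

```python
import math
import random


def cycle_rms(samples, samples_per_cycle):
    """RMS value of each complete, non-overlapping cycle of the waveform."""
    values = []
    for start in range(0, len(samples) - samples_per_cycle + 1, samples_per_cycle):
        window = samples[start:start + samples_per_cycle]
        values.append(math.sqrt(sum(v * v for v in window) / samples_per_cycle))
    return values


# Synthetic per-unit voltage: 20 cycles, with a sag to 60% in cycles 7 and 8.
SAMPLES_PER_CYCLE = 32
NOMINAL_RMS = 1.0
amplitude = NOMINAL_RMS * math.sqrt(2)
signal = []
for cycle in range(20):
    scale = 0.6 if cycle in (7, 8) else 1.0
    signal.extend(
        scale * amplitude * math.sin(2 * math.pi * k / SAMPLES_PER_CYCLE)
        for k in range(SAMPLES_PER_CYCLE)
    )

rms = cycle_rms(signal, SAMPLES_PER_CYCLE)

# Simple stand-in for event detection: RMS deviates more than 10% from nominal.
event_idx = [i for i, r in enumerate(rms) if abs(r - NOMINAL_RMS) > 0.1 * NOMINAL_RMS]
normal_idx = [i for i in range(len(rms)) if i not in event_idx]

# Balance the dataset: keep every event and an equal number of non-events.
rng = random.Random(0)
balanced_idx = sorted(event_idx + rng.sample(normal_idx, len(event_idx)))
print("event cycles:", event_idx, "balanced selection:", balanced_idx)
```

Most cycles sit at the nominal value and carry little new information, which is the quasi-steady-state problem the abstract describes; the balancing step keeps them from dominating the training data.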

Detailed Analysis of Intrusion Detection using Machine Learning Algorithms

International Journal of Recent Technology and Engineering - 2 | DOI: 10.35940/ijrte.a2127.059120 | 2020 | Vol 9 (1) | pp. 1894-1899 | Cited By ~ 1

Keyword(s): Machine Learning, Learning Algorithm, Learning Algorithms, Machine Learning Algorithms, SVM Classifier, Learning Approaches, Decision Tree Classifier, Internet Users, Tree Classifier, Challenging Tasks

The number of internet users has increased exponentially over the years, and intrusive activities have increased significantly as well. Detecting an intrusion attack in a system connected over a network is one of the most challenging tasks in today's world. A significant number of techniques based on machine learning approaches have been developed to detect these intrusion attacks. Even though these techniques are good, they are not good enough to detect all kinds of attacks. In this paper, different machine learning algorithms are analyzed on the NSL-KDD dataset, with pre-processing steps such as one-hot encoding, feature selection, and random sampling, to find the best-performing model for detecting these attacks. The attacks in the dataset are classified into four types: Probe, DoS, U2R, and R2L, while non-attack traffic is labeled Normal. The dataset is in two parts: KDD-Train and KDD-Test. The models are trained and tested on the dataset to measure accuracy and to understand and compare the performance of the different machine learning algorithms. The machine learning algorithms used are the Naive Bayes Classifier, Decision Tree Classifier, Random Forest Classifier, KNeighbours Classifier, Logistic Regression, SVM Classifier, and Voting Classifier. These techniques are compared according to their capability to detect the attacks. This comparison will help to find the algorithm that works best for detecting different kinds of intrusion attacks.
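
The one-hot encoding step mentioned above turns a categorical feature into a set of binary columns. The following is a minimal sketch in plain Python, not the paper's pre-processing code. It uses a protocol-type column with the values tcp, udp, and icmp as the example feature; the sample rows and the function names are hypothetical.

```python
def fit_categories(values):
    """Learn a stable, sorted list of categories from the training values."""
    return sorted(set(values))


def one_hot_encode(values, categories):
    """Encode each value as a binary vector with a single 1."""
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        encoded.append(row)
    return encoded


# Hypothetical sample of a categorical column from network connection records.
train_protocols = ["tcp", "udp", "tcp", "icmp", "tcp"]
test_protocols = ["udp", "icmp"]

categories = fit_categories(train_protocols)
train_encoded = one_hot_encode(train_protocols, categories)
test_encoded = one_hot_encode(test_protocols, categories)
print(categories)
print(train_encoded)
print(test_encoded)
```

The categories are learned from the training part and reused for the test part so that both end up with the same column layout. A value that appears only in the test data would raise a KeyError in this sketch and would need explicit handling.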

Performance analysis of machine learning algorithms trained on biased data

DOI: 10.5753/eniac.2021.18283 | 2021

Author(s): Renata Sendreti Broder, Lilian Berton

Keyword(s): Artificial Intelligence, Machine Learning, Logistic Regression, Ethical Issues, Learning Algorithms, Machine Learning Algorithms, Machine Learning Applications, Processing Algorithms, Real Person, Shed Light

The use of Artificial Intelligence and Machine Learning algorithms in everyday life is common nowadays in several areas, bringing many possibilities and benefits to society. However, since there is room for learning algorithms to make decisions, the range of related ethical issues was also expanded. There are many complaints about Machine Learning applications that identify some kind of bias, disadvantaging or favoring some group, with the possibility of causing harm to a real person. The present work aims to shed light on the existence of biases, analyzing and comparing the behavior of different learning algorithms – namely Decision Tree, MLP, Naive Bayes, Random Forest, Logistic Regression and SVM – when being trained using biased data. We employed pre-processing algorithms for mitigating bias provided by IBM's framework AI Fairness 360.
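
One common way to quantify the kind of group bias discussed above is to compare the rate of favorable predictions between two groups. The sketch below computes two standard measures, statistical parity difference and disparate impact, in plain Python. It does not use the AI Fairness 360 API and is not taken from the paper; the predictions, the group labels "a" and "b", and the choice of which group counts as privileged are hypothetical.

```python
def positive_rate(predictions, groups, group):
    """Share of favorable (1) predictions within one group."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    return sum(selected) / len(selected)


def statistical_parity_difference(predictions, groups, unprivileged, privileged):
    """Difference in favorable-outcome rates; 0 means parity."""
    return (positive_rate(predictions, groups, unprivileged)
            - positive_rate(predictions, groups, privileged))


def disparate_impact(predictions, groups, unprivileged, privileged):
    """Ratio of favorable-outcome rates; 1 means parity."""
    return (positive_rate(predictions, groups, unprivileged)
            / positive_rate(predictions, groups, privileged))


# Hypothetical model outputs: 1 is the favorable outcome.
predictions = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"]

spd = statistical_parity_difference(predictions, groups, unprivileged="b", privileged="a")
di = disparate_impact(predictions, groups, unprivileged="b", privileged="a")
print("statistical parity difference:", round(spd, 3))
print("disparate impact:", round(di, 3))
```

Comparing such measures before and after a bias-mitigation pre-processing step is one way to check whether the step changed how a trained classifier treats each group.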
