Detecting malicious URLs using binary classification through adaboost algorithm

Malicious Uniform Resource Locators (URLs) are a frequent and severe threat to cybersecurity. They are used to extract unsolicited information and to trick inexperienced end users into becoming victims of scams, causing billions in losses each year. It is crucial to identify such URLs and respond to them appropriately. Detection usually relies on blacklists. However, blacklists cannot be exhaustive and cannot recognize zero-day malicious URLs. To improve the detection of malicious URL indicators, machine learning techniques should be incorporated. This study treats the detection of malicious URLs as a binary classification problem and addresses it with machine learning, using the AdaBoost algorithm.
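
The abstract does not describe the features or the implementation, so the following is only a minimal sketch of the general technique: AdaBoost over one-feature decision stumps, written in plain Python. The lexical features in `url_features`, the toy URLs, and the number of boosting rounds are all illustrative assumptions, not the study's setup.

```python
import math

def url_features(url):
    """Hypothetical lexical features: length, dot count, digit count, '@' flag."""
    return [float(len(url)), float(url.count(".")),
            float(sum(ch.isdigit() for ch in url)), 1.0 if "@" in url else 0.0]

def stump_predict(row, feature, threshold, polarity):
    """Decision stump: vote `polarity` when the feature reaches the threshold."""
    return polarity if row[feature] >= threshold else -polarity

def train_adaboost(X, y, n_rounds=5):
    """Train AdaBoost on labels in {-1, +1}; returns a list of weighted stumps."""
    n = len(X)
    weights = [1.0 / n] * n
    model = []
    for _ in range(n_rounds):
        best = None  # (weighted error, feature, threshold, polarity)
        for feature in range(len(X[0])):
            for threshold in sorted({row[feature] for row in X}):
                for polarity in (1, -1):
                    err = sum(w for w, row, label in zip(weights, X, y)
                              if stump_predict(row, feature, threshold, polarity) != label)
                    if best is None or err < best[0]:
                        best = (err, feature, threshold, polarity)
        err, feature, threshold, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        model.append((alpha, feature, threshold, polarity))
        # Increase the weight of misclassified samples, decrease the rest.
        weights = [w * math.exp(-alpha * label * stump_predict(row, feature, threshold, polarity))
                   for w, row, label in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def predict(model, row):
    """Sign of the weighted stump votes: +1 = malicious, -1 = benign."""
    score = sum(alpha * stump_predict(row, f, t, p) for alpha, f, t, p in model)
    return 1 if score >= 0 else -1

# Toy, made-up training set (labels: +1 malicious, -1 benign).
urls = [
    "https://example.com",
    "https://www.python.org/docs",
    "https://en.wikipedia.org/wiki/URL",
    "http://192.0.2.10/login.php?user=admin@bank",
    "http://secure-login.example.com.account-verify.example/update?id=12345",
    "http://203.0.113.7/free-prize.exe",
]
labels = [-1, -1, -1, 1, 1, 1]
X = [url_features(u) for u in urls]
model = train_adaboost(X, labels)
train_predictions = [predict(model, row) for row in X]
```

On this toy set a single stump already separates the two classes, so the example only shows the mechanics; a realistic evaluation would need a labelled corpus and a held-out test split.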

ANALYSIS OF MACHINE LEARNING ALGORITHMS FOR THE BINARY CLASSIFICATION PROBLEM

CHERKASY UNIVERSITY BULLETIN: APPLIED MATHEMATICS. INFORMATICS, 2020, pp. 86-95
DOI: 10.31651/2076-5886-2019-2-86-95

Author(s): Oleksandr PISKUN

Keyword(s): Machine Learning, Binary Classification, Learning Algorithms, Classification Problem, Machine Learning Algorithms, Binary Classification Problem

Using Machine Learning with PySpark and MLib for Solving a Binary Classification Problem: Case of Searching for Exotic Particles

Recent Advances in Intuitionistic Fuzzy Logic Systems and Mathematics - Studies in Fuzziness and Soft Computing, 2020, pp. 109-118
DOI: 10.1007/978-3-030-53929-0_8

Author(s): Mourad Azhari, Abdallah Abarda, Badia Ettaki, Jamal Zerouaoui, Mohamed Dakkon

Keyword(s): Machine Learning, Binary Classification, Classification Problem, Exotic Particles, Problem Case, Binary Classification Problem

Advancing Stress Detection Methodology with Deep Learning Techniques Targeting UX Evaluation in AAL Scenarios: Applying Embeddings for Categorical Variables

Electronics, 2021, Vol 10 (13), pp. 1550
DOI: 10.3390/electronics10131550

Author(s): Alexandros Liapis, Evanthia Faliagka, Christos P. Antonopoulos, Georgios Keramidas, Nikolaos Voros

Keyword(s): Machine Learning, Deep Learning, User Experience, Electrodermal Activity, Binary Classification, Research Question, Classification Problem, Categorical Variables, Stress Detection, Software Failures

Physiological measurements have been widely used by researchers and practitioners to address the stress detection challenge. So far, various datasets for stress detection have been recorded and are available to the research community for testing and benchmarking. The majority of the available stress-related datasets were recorded while users were exposed to intense stressors, such as songs, movie clips, major hardware/software failures, image datasets, and gaming scenarios. However, it remains an open research question whether such datasets can be used to create models that effectively detect stress in different contexts. This paper investigates the performance of the publicly available physiological dataset WESAD (wearable stress and affect detection) in the context of user experience (UX) evaluation. More specifically, electrodermal activity (EDA) and skin temperature (ST) signals from WESAD were used to train three traditional machine learning classifiers and a simple feed-forward deep learning artificial neural network that combines continuous variables and entity embeddings. On the binary classification problem (stress vs. no stress), both training approaches (deep learning and traditional machine learning) achieved high accuracy (up to 97.4%). Regarding the stress detection effectiveness of the created models in another context, such as user experience (UX) evaluation, the results were quite impressive. More specifically, the deep learning model achieved rather high agreement when a user-annotated dataset was used for validation.

A robust multiobjective Harris’ Hawks Optimization algorithm for the binary classification problem

Knowledge-Based Systems, 2021, pp. 107219
DOI: 10.1016/j.knosys.2021.107219

Author(s): Tansel Dokeroglu, Ayça Deniz, Hakan Ezgi Kiziloz

Keyword(s): Optimization Algorithm, Binary Classification, Classification Problem, Binary Classification Problem

Confidence interval for micro-averaged F1 and macro-averaged F1 scores

Applied Intelligence, 2021
DOI: 10.1007/s10489-021-02635-5

Author(s): Kanae Takahashi, Kouji Yamamoto, Aya Kuchiba, Tatsuki Koyama

Keyword(s): Binary Classification, Classification Problem, Classification Problems, Summary Measure, Medical Field, Predictive Values, Binary Classification Problem, Multi Class Classification, Sensitivity Specificity, Measures Of Performance

Binary classification problems are common in the medical field, where sensitivity, specificity, accuracy, and negative and positive predictive values are often used as performance measures for a binary predictor. In computer science, a classifier is usually evaluated with precision (positive predictive value) and recall (sensitivity). As a single summary measure of a classifier's performance, the F1 score, defined as the harmonic mean of precision and recall, is widely used in information retrieval and information extraction evaluation because it has favorable characteristics, especially when the prevalence is low. Some statistical methods for inference have been developed for the F1 score in binary classification problems; however, they have not been extended to multi-class classification. There are three types of F1 scores, and the statistical properties of these F1 scores have hardly ever been discussed. We propose methods based on the large-sample multivariate central limit theorem for estimating F1 scores with confidence intervals.
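
As an illustration of the point estimates only (not the confidence-interval methods the authors propose), the sketch below computes the micro-averaged and macro-averaged F1 scores named in the title for a single-label multi-class problem. It uses the identity F1 = 2TP / (2TP + FP + FN), which is the harmonic mean of precision and recall. The function names and the toy labels are assumptions made for this example.

```python
from collections import Counter

def per_class_counts(y_true, y_pred):
    """Count true positives, false positives and false negatives per class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    return tp, fp, fn

def f1_from_counts(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never occurs."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def micro_f1(y_true, y_pred):
    """Pool the counts over all classes, then compute one F1 score."""
    tp, fp, fn = per_class_counts(y_true, y_pred)
    return f1_from_counts(sum(tp.values()), sum(fp.values()), sum(fn.values()))

def macro_f1(y_true, y_pred):
    """Compute F1 per class, then take the unweighted mean."""
    tp, fp, fn = per_class_counts(y_true, y_pred)
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_from_counts(tp[c], fp[c], fn[c]) for c in classes) / len(classes)

# Toy three-class example (made-up labels).
y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "b", "b", "c", "c"]
micro = micro_f1(y_true, y_pred)
macro = macro_f1(y_true, y_pred)
```

In this single-label setting the micro-averaged F1 equals plain accuracy (4 of 6 correct here), while the macro average weights each class equally regardless of its frequency.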

Stock Market Index Data and indicators for Day Trading as a Binary Classification problem

Data in Brief, 2017, Vol 10, pp. 569-575, Cited By ~ 5
DOI: 10.1016/j.dib.2016.12.044

Author(s): Renato Bruni

Keyword(s): Stock Market, Binary Classification, Classification Problem, Stock Market Index, Day Trading, Index Data, Market Index, Binary Classification Problem

On the binary classification problem in discriminant analysis using linear programming methods

Operations Research and Decisions, 2020, Vol 30 (1)
DOI: 10.37190/ord200107

Author(s): Michael O. Olusola, Sydney I. Onyeagu

Keyword(s): Linear Programming, Discriminant Analysis, Binary Classification, Classification Problem, Solution Technique, Phase Method, Bound Constraints, Two Phase, Linear Discriminant, Binary Classification Problem

This paper is centred on a binary classification problem in which a new object with multivariate features is to be assigned to one of two distinct populations, based on historical samples from the two populations. A linear discriminant analysis framework, called the minimised sum of deviations by proportion (MSDP), is proposed to model the binary classification problem. In the MSDP formulation, the sum of the proportions of exterior deviations is minimised subject to the group separation constraints, the normalisation constraint, the upper bound constraints on the proportions of exterior deviations, and the sign unrestriction vis-à-vis the non-negativity constraints. The two-phase method in linear programming is adopted as the solution technique to generate the discriminant function. The decision rule for group-membership prediction is constructed using the apparent error rate. The performance of the MSDP is compared with some existing linear discriminant models using a previously published dataset on road casualties. The MSDP model was more promising and well suited to the imbalanced dataset on road casualties.

Detection and Tracking Cows by Computer Vision and Image Classification Methods

International Journal of Security and Privacy in Pervasive Computing, 2021, Vol 13 (1), pp. 1-45
DOI: 10.4018/ijsppc.2021010101

Author(s): Terry Gao

Keyword(s): Feature Fusion, Binary Classification, Classification Problem, Detection Algorithm, Video Sequences, Image Block, Body Contour, Adaboost Algorithm, Target Model, Detection And Tracking

This paper studies cow recognition and tracking in video sequences. In the recognition phase, different classification and feature extraction algorithms are discussed and analysed, and cow detection is cast as a binary classification problem. The detection method extracts cow features by fusing multiple features: edge features that reflect the cow body contour, grey values, and spatial position relationships. The cow body is then detected by a classifier trained with the Gentle AdaBoost algorithm. Experiments show that the method performs well when the target is deformed or when the contrast between target and background is low. Compared with a general target detection algorithm, the method reduces the miss rate and improves detection precision; the detection rate can reach 97.3%. In the tracking phase, an improved version of the popular compressive tracking (CT) algorithm is proposed. The learning rate is adapted by calculating the pap distance of the image block. Moreover, the target model update is stopped when the classification response values are negative, to avoid introducing error and noise. The experimental results show that the improved tracking algorithm effectively avoids erroneous target model updates under large occlusions or frequent changes of pose. Finally, the detector is combined with the tracker into a single detection and tracking framework for cow images. Tests on video sequences recorded in complex environments indicate that the algorithm based on improved compressed perception tracks well against changing and complicated backgrounds.

Similarity Learning for Motion Estimation

Semantic Mining Technologies for Multimedia Databases, 2011, pp. 130-151
DOI: 10.4018/978-1-60566-188-9.ch005

Author(s): Shaohua Kevin Zhou, Jie Shao, Bogdan Georgescu, Dorin Comaniciu

Keyword(s): Motion Estimation, Binary Classification, Classification Problem, Similarity Function, Training Procedure, Image Pair, Similarity Learning, Model Complex, Binary Classification Problem, Test Errors

Motion estimation necessitates an appropriate choice of similarity function. Because generic similarity functions derived from simple assumptions are insufficient to model complex yet structured appearance variations in motion estimation, the authors propose to learn a discriminative similarity function to match images under varying appearances by casting image matching into a binary classification problem. They use the LogitBoost algorithm to learn the classifier based on an annotated database that exemplifies the structured appearance variations: An image pair in correspondence is positive and an image pair out of correspondence is negative. To leverage the additional distance structure of negatives, they present a location-sensitive cascade training procedure that bootstraps negatives for later stages of the cascade from the regions closer to the positives, which enables viewing a large number of negatives and steering the training process to yield lower training and test errors. The authors apply the learned similarity function to estimating the motion for the endocardial wall of left ventricle in echocardiography and to performing visual tracking. They obtain improved performances when comparing the learned similarity function with conventional ones.

A Machine Learning Approach to the Detection of Fetal Hypoxia during Labor and Delivery

AI Magazine, 2012, Vol 33 (2), pp. 79, Cited By ~ 5
DOI: 10.1609/aimag.v33i2.2412

Author(s): Philip A. Warrick, Emily F. Hamilton, Robert E. Kearney, Doina Precup

Keyword(s): Binary Classification, Classification Problem, Fetal Hypoxia, Novel Approach, Modern Health Care, Machine Learning Approach, Binary Classification Problem, Fetal Response, Monitoring Devices, Labor Monitoring

Labor monitoring is crucial in modern health care, as it can be used to detect (and help avoid) significant problems with the fetus. In this article we focus on detecting hypoxia (or oxygen deprivation), a very serious condition that can arise from different pathologies and can lead to life-long disability and death. We present a novel approach to hypoxia detection based on recordings of the uterine pressure and fetal heart rate, which are obtained using standard labor monitoring devices. The key idea is to learn models of the fetal response to signals from its environment. Then, we use the parameters of these models as attributes in a binary classification problem. A running count of pathological classifications over several time periods is taken to provide the current label for the fetus. We use a unique database of real clinical recordings, both from normal and pathological cases. Our approach classifies correctly more than half the pathological cases, 1.5 hours before delivery. These are cases that were missed by clinicians; early detection of this type would have allowed the physician to perform a Caesarean section, possibly avoiding the negative outcome.
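
The abstract does not give the window length or the decision threshold for the running count, so the sketch below only illustrates the general idea of turning per-period binary classifications into a current label. The function name `running_labels` and the parameters `window` and `min_count` are hypothetical choices made for this example.

```python
from collections import deque

def running_labels(period_is_pathological, window=3, min_count=2):
    """Label each time period from a running count of recent classifications.

    period_is_pathological: per-period classifier outputs (truthy = pathological).
    window, min_count: hypothetical values; the source does not specify them.
    Returns one boolean label per period (True = currently labelled pathological).
    """
    recent = deque(maxlen=window)  # keeps only the last `window` outputs
    labels = []
    for flag in period_is_pathological:
        recent.append(bool(flag))
        labels.append(sum(recent) >= min_count)
    return labels

# Made-up classifier outputs for seven consecutive periods.
flags = [0, 1, 0, 1, 1, 0, 0]
current = running_labels(flags)
```

With these assumed settings a single isolated pathological classification does not change the label; at least two within the last three periods are required.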
