Assisting in Auditing of Buffer Overflow Vulnerabilities via Machine Learning

2017 ◽  
Vol 2017 ◽  
pp. 1-13 ◽  
Author(s):  
Qingkun Meng ◽  
Chao Feng ◽  
Bin Zhang ◽  
Chaojing Tang

A buffer overflow vulnerability is a consequence of programmers’ intentions not being implemented correctly in code. In this paper, a static analysis method based on machine learning is proposed to assist in auditing buffer overflow vulnerabilities. First, an extended code property graph is constructed from the source code to extract seven kinds of static attributes that describe buffer properties. After these attributes are embedded into a vector space, five frequently used machine learning algorithms are employed to classify functions as suspicious (potentially vulnerable) or secure. The five classifiers reached an average recall of 83.5%, an average true negative rate of 85.9%, a best recall of 96.6%, and a best true negative rate of 91.4%. Due to the imbalance of the training samples, the average precision of the classifiers is 68.9% and the average F1 score is 75.2%. When applied to a new program, our method reduced false positives to 1/12 of those reported by Flawfinder.
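The reported figures follow directly from confusion-matrix counts. A minimal sketch of the metric definitions (the counts below are hypothetical, chosen only to illustrate how class imbalance can depress precision and F1 while recall and true negative rate stay high):

```python
def classification_metrics(tp, fp, tn, fn):
    """Recall (TPR), true negative rate, precision, and F1 from confusion counts."""
    recall = tp / (tp + fn)      # fraction of vulnerable functions found
    tnr = tn / (tn + fp)         # fraction of secure functions correctly cleared
    precision = tp / (tp + fp)   # fraction of flagged functions truly vulnerable
    f1 = 2 * precision * recall / (precision + recall)
    return recall, tnr, precision, f1

# hypothetical counts: imbalance (many secure functions) inflates fp relative to tp
recall, tnr, precision, f1 = classification_metrics(tp=83, fp=38, tn=86, fn=17)
```

With these counts, recall stays at 0.83 while precision drops below 0.70, mirroring the imbalance effect the abstract describes.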

2011 ◽  
pp. 999-999
Author(s):  
William Uther ◽  
Dunja Mladenić ◽  
Massimiliano Ciaramita ◽  
Bettina Berendt ◽  
Aleksander Kołcz ◽  
...  

2017 ◽  
Vol 10 (7) ◽  
pp. 657-662 ◽  
Author(s):  
Shlomi Peretz ◽  
David Orion ◽  
David Last ◽  
Yael Mardor ◽  
Yotam Kimmel ◽  
...  

Purpose The region defined as ‘at risk’ penumbra by current CT perfusion (CTP) maps is largely overestimated. We aimed to quantify the portion of true ‘at risk’ tissue within the CTP penumbra and to determine the parameter and threshold that would optimally distinguish it from false ‘at risk’ tissue, that is, benign oligaemia. Methods Among acute stroke patients evaluated by multimodal CT (NCCT/CTA/CTP), we identified those who had not undergone endovascular/thrombolytic treatment and had follow-up NCCT. Maps of absolute and relative CBF, CBV, MTT, TTP and Tmax, as well as summary maps depicting infarcted and penumbral regions, were generated using the Intellispace Portal (Philips Healthcare, Best, The Netherlands). The follow-up CT was automatically co-registered to the CTP scan and the final infarct region was manually outlined. Perfusion parameters were systematically analysed; the parameter that yielded the highest true negative rate (ie, proportion of benign oligaemia correctly identified) at a fixed, clinically relevant false negative rate (ie, proportion of ‘missed’ infarct) of 15% was chosen as optimal. It was then re-applied to the CTP data to produce corrected perfusion maps. Results Forty-seven acute stroke patients met the selection criteria. The average portion of infarcted tissue within the CTP penumbra was 15%±2.2%. Relative CBF at a threshold of 0.65 yielded the highest average true negative rate (48%), enabling reduction of the false ‘at risk’ penumbral region by approximately half. Conclusions Applying a relative CBF threshold to relative MTT-based CTP maps can significantly reduce false ‘at risk’ penumbra. This step may help avoid unnecessary endovascular interventions.
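The correction step amounts to thresholding a relative CBF map inside the CTP-defined penumbra. A minimal sketch with a hypothetical 3×3 rCBF patch (only the 0.65 cut-off comes from the study; the values and mask are illustrative):

```python
import numpy as np

# hypothetical relative-CBF patch (lesion-side CBF divided by contralateral mean)
rcbf = np.array([[0.40, 0.60, 0.70],
                 [0.55, 0.64, 0.90],
                 [0.30, 0.66, 1.10]])
penumbra = np.ones_like(rcbf, dtype=bool)  # stand-in for the CTP penumbra mask
THRESHOLD = 0.65  # the study's optimal rCBF cut-off

true_at_risk = penumbra & (rcbf < THRESHOLD)        # retained as 'at risk'
benign_oligaemia = penumbra & (rcbf >= THRESHOLD)   # reclassified as benign
```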


Author(s):  
D. Wang ◽  
M. Hollaus ◽  
N. Pfeifer

Classification of wood and leaf components of trees is an essential prerequisite for deriving vital tree attributes, such as wood mass, leaf area index (LAI) and woody-to-total area. Laser scanning has emerged as a promising solution for this task. Intensity-based approaches are widely proposed, as different components of a tree can feature discriminatory optical properties at the operating wavelengths of a sensor system. For geometry-based methods, machine learning algorithms are often used to separate wood and leaf points, given proper training samples. However, it remains unclear how the chosen machine learning classifier and the features used influence classification results. To this end, we compare four popular machine learning classifiers, namely Support Vector Machine (SVM), Naïve Bayes (NB), Random Forest (RF), and Gaussian Mixture Model (GMM), for separating wood and leaf points from terrestrial laser scanning (TLS) data. Two trees, an <i>Erythrophleum fordii</i> and a <i>Betula pendula</i> (silver birch), are used to test the impacts of classifier, feature set, and training samples. Our results showed that RF is the most accurate model and that local-density-related features are important. Experimental results confirmed the feasibility of machine learning algorithms for the reliable classification of wood and leaf points. Note, however, that our studies are based on isolated trees; further tests should be performed on more tree species and data from more complex environments.
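A comparison of this kind can be sketched with scikit-learn stand-ins for the four classifiers, using synthetic 3-D point features in place of real TLS geometric features (all data below are illustrative; GMM is used generatively, with one mixture fitted per class):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# synthetic stand-ins for per-point geometric features (e.g. local density)
wood = rng.normal(loc=0.0, scale=1.0, size=(300, 3))
leaf = rng.normal(loc=2.0, scale=1.0, size=(300, 3))
X = np.vstack([wood, leaf])
y = np.array([0] * 300 + [1] * 300)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

scores = {}
for name, clf in [("SVM", SVC()), ("NB", GaussianNB()),
                  ("RF", RandomForestClassifier(random_state=0))]:
    scores[name] = clf.fit(Xtr, ytr).score(Xte, yte)

# GMM as a generative classifier: one mixture per class, predict by likelihood
gmms = [GaussianMixture(n_components=2, random_state=0).fit(Xtr[ytr == c])
        for c in (0, 1)]
ll = np.column_stack([g.score_samples(Xte) for g in gmms])
scores["GMM"] = float((ll.argmax(axis=1) == yte).mean())
```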


2020 ◽  
Vol 60 (2) ◽  
pp. 102-111
Author(s):  
Henrique Rodrigues ◽  
Rosa Ramos ◽  
Leoni Fagundes ◽  
Orlando Galego ◽  
David Navega ◽  
...  

Objective We aimed to evaluate whether the internal structures of the human ear have anatomical characteristics that are sufficiently distinctive to contribute to human identification and use in a forensic context. Materials and methods After data anonymisation, a dataset containing temporal bone CT scans of 100 subjects was processed by a radiologist who was not involved in the study. Four reference images were selected for each subject. Of the original sample, 10 examinations were used for visual comparison, case by case, against the dataset of 100 patients. This visual assessment was performed independently by four observers, who evaluated the anatomical agreement using a Likert scale (1–5). Inter-observer agreement, true positive rate, positive predictive value, true negative rate, negative predictive value, false positive rate, false negative rate and positive likelihood ratio (LR+) were evaluated. Results Inter-observer agreement obtained an overall Cohen’s Kappa = 99.59%. True positive rate, positive predictive value, true negative rate and negative predictive value were all 100%. Conclusion Visual assessment of the mastoid examinations was shown to be a robust and reliable approach to identify unique osseous features and contribute to human identification. The statistical analysis indicates that regardless of the examiner’s background and training, the approach has a high degree of accuracy.
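The inter-observer agreement reported here is Cohen’s kappa, which corrects raw agreement for chance. A minimal pure-Python sketch with hypothetical match/no-match decisions from two observers:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters over the same cases (assumes the raters are
    not in trivially perfect chance agreement, so the denominator is nonzero)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

# hypothetical identification decisions (1 = match, 0 = no match)
kappa = cohens_kappa([1, 1, 1, 0, 0], [1, 1, 0, 0, 0])
```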


Author(s):  
Shantipriya Parida ◽  
Satchidananda Dehuri

Classification of brain states obtained through functional magnetic resonance imaging (fMRI) poses a serious challenge for the neuroimaging community: uncovering discriminating patterns of brain state activity that define independent thought processes. The challenge arises from the large number of voxels in a typical fMRI scan, which presents the classifier with a massive feature set coupled with relatively few training samples. One of the most popular research topics in recent years is the application of machine learning algorithms to mental state classification, decoding brain activation, and finding the variable of interest from fMRI data. In a classification scenario, different algorithms have different biases; consequently, performance differs across datasets, and for a particular dataset the accuracy varies from classifier to classifier. To overcome the limitations of individual techniques, hybridization or fusion of these machine learning techniques has emerged in recent years, showing promising results and opening up new directions of research. This paper reviews the machine learning techniques used in cognitive classification, ranging from individual classifiers to ensemble and hybrid techniques, with a well-balanced treatment of their applications, performance, and limitations. It also discusses many open challenges for further research.


2021 ◽  
Author(s):  
Vladimir Fonov ◽  
Mahsa Dadar ◽  
D. Louis Collins ◽  

Linear registration to stereotaxic space is a common first step in many automated image-processing tools for the analysis of human brain MRI scans, and it is crucial for the success of subsequent image-processing steps. Several well-established algorithms are commonly used in the field of neuroimaging for this task, but none has a 100% success rate, so manual assessment of the registration is commonly used as part of quality control. To reduce the burden of this time-consuming step, we propose Deep Automated Registration QC (DARQ), a fully automatic quality control method based on deep learning that can replace the human rater and accurately perform quality control assessment for stereotaxic registration of T1w brain scans. In a recently published study from our group comparing linear registration methods, we used a database of 9325 MRI scans from several publicly available datasets and applied seven linear registration tools to them. In this study, the resulting images that were assessed and labeled by a human rater are used to train a deep neural network to detect cases where registration failed. We further validated the results on an independent dataset of patients with multiple sclerosis with manual QC labels available (n=1200). In terms of agreement with a manual rater, our automated QC method achieved 89% accuracy and an 85% true negative rate (equivalently, a 15% false positive rate) in detecting scans that should pass quality control in a balanced cross-validation experiment, and 96.1% accuracy and a 95.5% true negative rate (4.5% FPR) when evaluated on a balanced independent sample, similar to a manual QC rater (test-retest accuracy of 93%). The results show that DARQ is robust, fast, accurate, and generalizable in detecting failures in linear stereotaxic registration and can substantially reduce QC time (by a factor of 20 or more) when processing large datasets.


Author(s):  
Daniel Campbell ◽  
Corey Ray-Subramanian ◽  
Winifred Schultz-Krohn ◽  
Kristen M. Powers ◽  
Renee Watling ◽  
...  

2020 ◽  
Vol 38 (15_suppl) ◽  
pp. e13040-e13040
Author(s):  
Jose Manuel Jerez ◽  
Nuria Ribelles ◽  
Pablo Rodriguez-Brazzarola ◽  
Tamara Diaz Redondo ◽  
Begoña Jimenez Rodriguez ◽  
...  

e13040 Background: The treatment of luminal MBC has undergone a substantial change with the use of cyclin-dependent kinase 4/6 inhibitors (CDKIs). Nevertheless, there is no clearly defined subgroup of patients who do not initially respond to CDKIs and show EP. Methods: MBC ER+/HER2- patients who had received at least one line of treatment were eligible. The event of interest was disease progression within 6 months of first-line treatment according to the type of therapy administered. First-line treatments were categorized as chemotherapy (CT), hormonal therapy (HT), CT plus maintenance HT, and HT plus CDKIs. Free-text data from clinical visits registered in our Electronic Health Record were obtained up to the date of first treatment in order to generate a feature vector composed of the word frequencies for each visit of every patient. Six different machine learning algorithms were evaluated to predict the event of interest and to obtain the risk of EP for every type of therapy. Area under the ROC curve (AUC), true positive rate (TPR) and true negative rate (TNR) were assessed using 10-fold cross-validation. Results: 610 ER+/HER2- MBC patients treated between November 1991 and August 2019 were included. Median follow-up for metastatic disease was 28 months. 17426 clinical visits were analyzed (per patient: range 1-173; median 30). 119 patients received CT as first-line treatment, 311 HT, 117 CT plus maintenance HT and 63 HT plus CDKIs. There were 379 patients with disease progression, of whom 126 progressed within 6 months of first-line treatment (54 events with CT, 57 with HT, 4 with CT plus maintenance HT and 11 with HT plus CDKIs). The model that yielded the best results was the GLMBoost algorithm: AUC 0.72 (95%CI 0.67-0.77), TPR 70.85% (95%CI 70.63%-71.06%), TNR 66.27% (95%CI 66.08%-66.46%). Conclusions: Our model, based on unstructured data from real-world patients, predicts EP and establishes the risk for each of the different types of treatment for ER+/HER2- MBC. Additional validation is of course needed, but a tool with these characteristics could help select the best available treatment when that decision has to be made, avoiding therapies that are unlikely to be effective.
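The pipeline described (word-frequency vectors from visit text, then a boosted classifier scored with 10-fold cross-validated AUC) can be sketched with scikit-learn; GradientBoostingClassifier stands in for GLMBoost (an R/caret model), and the visit notes below are invented stand-ins:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score

# invented stand-in visit notes; the real input was free text from EHR visits
progression = ["rapid progression with new lesions and worsening pain",
               "new metastatic lesions, progression on imaging"] * 10
stable = ["stable disease, good response, no complaints",
          "no new lesions, treatment well tolerated"] * 10
y = np.array([1] * len(progression) + [0] * len(stable))

# word-frequency feature vector per visit, as described above
X = CountVectorizer().fit_transform(progression + stable).toarray()

# GradientBoostingClassifier as a stand-in for the study's GLMBoost model
clf = GradientBoostingClassifier(random_state=0)
auc = cross_val_score(clf, X, y, cv=10, scoring="roc_auc").mean()
```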


Author(s):  
RONG-CHANG CHEN ◽  
TUNG-SHOU CHEN ◽  
CHIH-CHIANG LIN

Recently, a new personalized model has been developed to prevent credit card fraud. This model is promising; however, some problems remain. Existing approaches cannot reliably identify credit card fraud from small datasets with skewed distributions. This paper proposes to address the problem using a binary support vector system (BSVS). The proposed BSVS is based on the support vectors in support vector machines (SVM), and a genetic algorithm (GA) is employed to select the support vectors. To obtain a high true negative rate, self-organizing mapping (SOM) is first employed to estimate the distribution of the input data. BSVS is then trained according to the input data distribution to obtain a high detection rate. Experimental results show that the proposed BSVS is effective, especially at achieving a high true negative rate.
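BSVS itself (GA-selected support vectors plus SOM density estimation) is bespoke, but the skewed-distribution problem it targets can be illustrated with a class-weighted SVM on synthetic imbalanced transaction data (everything below is an illustrative stand-in, not the authors' method):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# synthetic skewed transactions: 950 legitimate (label 0), 50 fraudulent (label 1)
legit = rng.normal(0.0, 1.0, size=(950, 4))
fraud = rng.normal(2.5, 1.0, size=(50, 4))
X, y = np.vstack([legit, fraud]), np.array([0] * 950 + [1] * 50)

# class_weight="balanced" is a simple stand-in for BSVS's handling of skew
clf = SVC(class_weight="balanced").fit(X, y)
pred = clf.predict(X)  # in-sample evaluation, for brevity only
tnr = float(((pred == 0) & (y == 0)).sum() / (y == 0).sum())
```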


2014 ◽  
Vol 11 (1) ◽  
pp. 175-188 ◽  
Author(s):  
Nemanja Macek ◽  
Milan Milosavljevic

The KDD Cup '99 dataset is commonly used for training and testing IDS machine learning algorithms. Some of its major downsides are the distribution and proportions of U2R and R2L instances, which represent the most dangerous attack types, as well as the existence of R2L attack instances identical to normal traffic. This increases the complexity of detecting these minority categories and causes problems when building a machine learning model capable of detecting these attacks with a sufficiently low false negative rate. This paper presents a new support vector machine based intrusion detection system that classifies unknown data instances according to both the feature values and weight factors that represent the importance of each feature to the classification. An increased detection rate and a significantly decreased false negative rate for the U2R and R2L categories, which have very few instances in the training set, have been empirically demonstrated.
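The idea of weighting features by their importance before an SVM can be sketched as follows (synthetic data; mutual information stands in for whatever weighting scheme the paper derives):

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# synthetic stand-in for KDD-style records: 5 features, only the first two informative
X = rng.normal(size=(400, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

w = mutual_info_classif(X, y, random_state=0)  # per-feature importance weights
Xw = X * w  # emphasize informative features before training the SVM
clf = SVC().fit(Xw[:300], y[:300])
acc = clf.score(Xw[300:], y[300:])
```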

