Estimating Neural Network’s Performance with Bootstrap: A Tutorial

2021 ◽  
Vol 3 (2) ◽  
pp. 357-373
Author(s):  
Umberto Michelucci ◽  
Francesca Venturini

Neural networks’ results depend strongly on the training data, the weight initialisation, and the chosen hyperparameters. Determining the distribution of a statistical estimator, such as the Mean Squared Error (MSE) or the accuracy, is fundamental to evaluating the performance of a neural network model (NNM). For many machine learning models, such as linear regression, quantities such as the variance or confidence intervals of the results can be obtained analytically. Neural networks, however, are not analytically tractable due to their complexity, so the distributions of statistical estimators cannot be derived easily. When the global performance of an NNM is assessed by estimating the MSE in a regression problem, for example, it is important to know the variance of the MSE. Bootstrap is one of the most important resampling techniques for estimating averages and variances, among other properties, of statistical estimators. In this tutorial, the application of resampling techniques (including bootstrap) to the evaluation of neural networks’ performance is explained from both a theoretical and a practical point of view. The pseudo-code of the algorithms is provided to facilitate their implementation. Computational aspects, such as the training time, are discussed, since resampling techniques always require simulations to be run many thousands of times and are therefore computationally intensive. A specific version of the bootstrap algorithm is presented that allows the distribution of a statistical estimator for an NNM to be estimated in a computationally effective way. Finally, the algorithms are compared on both synthetically generated and real data to demonstrate their performance.
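A minimal sketch of the bootstrap idea described above, not the authors’ specific algorithm: a small regressor is trained once on synthetic data, and the test set is then resampled with replacement to approximate the distribution, variance, and confidence interval of the MSE. The dataset, model size, and number of resamples are illustrative choices.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(1000, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.1, size=1000)   # synthetic regression data

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0).fit(X_tr, y_tr)
y_pred = model.predict(X_te)

B = 2000                                                 # number of bootstrap resamples
n = len(y_te)
mse_boot = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, size=n)                     # resample test points with replacement
    mse_boot[b] = mean_squared_error(y_te[idx], y_pred[idx])

print(f"MSE = {mse_boot.mean():.4f} +/- {mse_boot.std():.4f}")
print("95% CI:", np.percentile(mse_boot, [2.5, 97.5]))
```

This cheap variant bootstraps only the test set of a single trained model; resampling the training data and retraining, as a full bootstrap of an NNM requires, is far more expensive, which is exactly the computational issue the tutorial addresses.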

2020 ◽  
Author(s):  
Marta Pelizzola ◽  
Merle Behr ◽  
Housen Li ◽  
Axel Munk ◽  
Andreas Futschik

Abstract. Since haplotype information is of widespread interest in biomedical applications, considerable effort has been put into its reconstruction. Here, we propose a new, computationally efficient method, called haploSep, that is able to accurately infer major haplotypes and their frequencies just from multiple samples of allele frequency data. Our approach seems to be the first that is able to estimate more than one haplotype from such data. Even the accuracy of experimentally obtained allele frequencies can be improved by re-estimating them from our reconstructed haplotypes. From a methodological point of view, we model our problem as a multivariate regression problem where both the design matrix and the coefficient matrix are unknown. The design matrix, with 0/1 entries, models the haplotypes, and the columns of the coefficient matrix represent the haplotype frequencies, which are non-negative and sum up to one. We illustrate our method on simulated and real data, focusing on experimental evolution and microbial data.
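An illustration of the regression model behind this setting, not the haploSep algorithm itself: allele-frequency data Y (loci x samples) are modelled as Y ≈ S F, with S an unknown 0/1 haplotype matrix and the columns of F non-negative frequencies summing to one. The sketch below only shows the easier half of the problem, recovering F from a candidate S by non-negative least squares; all sizes are made up.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)
n_loci, n_hap, n_samples = 50, 3, 8
S_true = rng.integers(0, 2, size=(n_loci, n_hap))          # true 0/1 haplotypes
F_true = rng.dirichlet(np.ones(n_hap), size=n_samples).T   # columns sum to one
Y = S_true @ F_true + rng.normal(0, 0.01, size=(n_loci, n_samples))

def estimate_frequencies(S, Y):
    """Estimate haplotype frequencies for each sample given candidate haplotypes S."""
    F = np.column_stack([nnls(S, Y[:, j])[0] for j in range(Y.shape[1])])
    return F / F.sum(axis=0, keepdims=True)                 # project columns onto the simplex

F_hat = estimate_frequencies(S_true, Y)
print("max frequency error:", np.abs(F_hat - F_true).max())
```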


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Saad Naeem ◽  
Noreen Jamil ◽  
Habib Ullah Khan ◽  
Shah Nazir

Neural networks employ massive interconnections of simple computing units, called neurons, to solve problems that are highly nonlinear and cannot be hard-coded into a program. These networks are computation-intensive, training them requires a lot of training data, and each training example requires heavy computation. We look at different ways to reduce this heavy computation requirement and possibly make neural networks work on mobile devices. In this paper, we survey various techniques that can be matched and combined to improve the training time of neural networks, and we review some additional recommendations to make the process work for mobile devices as well. Finally, we survey the deep compression technique, which addresses the problem through network pruning, quantization, and encoding of the network weights. Deep compression reduces the time required for training the network by first pruning the irrelevant connections (the pruning stage), then quantizing the network weights by choosing centroids for each layer, and finally, in the third stage, applying the Huffman encoding algorithm to deal with the storage of the remaining weights.
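A rough sketch of the first two stages of deep compression applied to a single weight matrix: magnitude pruning followed by k-means weight sharing, in which surviving weights are replaced by their nearest centroid. The pruning threshold and the number of centroids are illustrative, and the Huffman coding stage is only indicated in a comment.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
W = rng.normal(0, 1, size=(256, 128))                 # one layer's weight matrix

# Stage 1: prune connections with small magnitude (keep ~30% of the weights).
threshold = np.quantile(np.abs(W), 0.70)
mask = np.abs(W) > threshold
print(f"kept {mask.mean():.0%} of connections")

# Stage 2: quantise the surviving weights by clustering them into k shared centroids.
k = 16
survivors = W[mask].reshape(-1, 1)
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(survivors)
W_q = np.zeros_like(W)
W_q[mask] = km.cluster_centers_[km.labels_].ravel()   # shared-weight layer

# Stage 3 (not shown): Huffman-encode the centroid indices and sparse positions,
# so frequently used centroids get shorter codes.
print("quantisation error:", np.abs(W - W_q)[mask].mean())
```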


Author(s):  
M. BOUAMAR ◽  
M. LADJAL

Water quality is one of the major concerns of countries around the world. Monitoring water quality is becoming increasingly important because of its effects on human life. Controlling risks in the plants that produce and distribute water ensures the quality of this vital resource. Many techniques have been developed to improve this process through rigorous monitoring of water quality. In this paper, we present a comparative study of the performance of three techniques from the field of artificial intelligence, namely Artificial Neural Networks (ANN), RBF Neural Networks (RBF-NN), and Support Vector Machines (SVM). Grounded in statistical learning theory, these methods display excellent training and generalization performance in many fields of application, among them pattern recognition. To evaluate their performance in terms of recognition rate, training time, and robustness, a simulation using generated and real data is carried out. To validate their functionality, an application on real data is presented. Applied as a classification tool within a multisensor monitoring system, the selected technique should ensure direct and quasi-permanent monitoring of water quality.
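A hedged sketch of this kind of comparison using scikit-learn stand-ins on synthetic data in place of real water-quality measurements: an MLP and an SVM are compared on recognition rate and training time. An RBF network has no off-the-shelf scikit-learn estimator and would need a custom implementation, so it is omitted here.

```python
import time
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Placeholder multi-class data standing in for multisensor water-quality readings.
X, y = make_classification(n_samples=2000, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "ANN (MLP)": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
    "SVM (RBF kernel)": SVC(kernel="rbf", gamma="scale"),
}
for name, model in models.items():
    t0 = time.perf_counter()
    model.fit(X_tr, y_tr)
    print(f"{name}: accuracy={model.score(X_te, y_te):.3f}, "
          f"training time={time.perf_counter() - t0:.2f}s")
```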


2013 ◽  
Vol 13 (3) ◽  
pp. 535-544 ◽  
Author(s):  
A. Alqudah ◽  
V. Chandrasekar ◽  
M. Le

Abstract. Rainfall observed on the ground depends on the four-dimensional structure of precipitation aloft, which scanning radars can observe. A neural network is a nonparametric method for representing the nonlinear relationship between radar measurements and rainfall rate; the relationship is derived directly from a dataset consisting of radar measurements and rain gauge measurements. The performance of neural network based rainfall estimation is subject to many factors, such as the representativeness and sufficiency of the training dataset, the generalization capability of the network to new data, seasonal changes, and regional changes. Improving the performance of the neural network for real-time applications is of great interest. The goal of this paper is to investigate the performance of rainfall estimation based on Radial Basis Function (RBF) neural networks using radar reflectivity as input and rain gauge measurements as the target. Data from the Melbourne, Florida NEXRAD (Next Generation Weather Radar) ground radar (KMLB) over different years, along with rain gauge measurements, are used to conduct various investigations related to this problem. A direct gauge comparison study demonstrates the improvement brought by the neural networks and shows the feasibility of this system. The principal component analysis (PCA) technique is also used to reduce the dimensionality of the training dataset; reducing the dimensionality of the input training data reduces the training time as well as the network complexity, which also helps avoid overfitting.
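A sketch of the preprocessing idea above with placeholder data: PCA reduces the dimensionality of reflectivity-style features before a radial-basis-function regressor (here a kernel ridge model standing in for an RBF network) maps them to a rain-rate target. Feature counts, the number of components, and kernel settings are all illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.kernel_ridge import KernelRidge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(0, 1, size=(500, 20))                      # stand-in reflectivity features
y = X[:, :3].sum(axis=1) ** 2 + rng.normal(0, 0.5, 500)   # synthetic rain-rate target

model = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),                                  # keep only the leading components
    KernelRidge(kernel="rbf", alpha=1.0, gamma=0.1),
)
model.fit(X[:400], y[:400])
print("R^2 on held-out data:", model.score(X[400:], y[400:]))
```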


Author(s):  
A. A. Meldo ◽  
L. V. Utkin ◽  
T. N. Trofimova ◽  
M. A. Ryabinin ◽  
V. M. Moiseenko ◽  
...  

The relevance of developing an intelligent automated diagnostic system (IADS) for lung cancer (LC) detection stems from the social significance of this disease and its leading position among all cancers. In principle, an IADS can be used both at the screening stage and at the stage of refining the diagnosis of LC. Recent approaches to training IADS do not take into account the clinical and radiological classification or the peculiarities of the clinical forms of LC used by the medical community, which creates difficulties and obstacles for using the available IADS. The authors are of the opinion that the closeness of a developed IADS to the "doctor's logic" contributes to better reproducibility and interpretability of its results. Most IADS described in the literature have been developed on the basis of neural networks, which have several disadvantages that affect reproducibility. This paper proposes a composite algorithm using machine learning methods such as Deep Forest and a Siamese neural network, which can be regarded as a more efficient approach for dealing with a small amount of training data and as optimal from the reproducibility point of view. The open datasets used for training IADS include annotated objects that in some cases are not confirmed morphologically. The paper describes the LIRA dataset, developed from the diagnostic results of the St. Petersburg Clinical Research Center of Specialized Types of Medical Care (Oncology), which includes only computed tomograms of patients with a verified diagnosis. The paper considers the stages of the machine learning process based on shape features and internal structure features, as well as a newly developed system for differential diagnosis of LC based on Siamese neural networks. A new approach to feature dimension reduction is also presented, which aims at more efficient and faster learning of the system.
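A minimal Siamese-network sketch in PyTorch, not the authors’ architecture: two feature vectors share one encoder, and the distance between their embeddings is trained with a contrastive loss so that same-class pairs end up close and different-class pairs far apart. Input size, layer widths, and the margin are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseNet(nn.Module):
    def __init__(self, in_dim=64, emb_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 32), nn.ReLU(),
            nn.Linear(32, emb_dim),
        )

    def forward(self, x1, x2):
        # The same encoder weights are applied to both inputs of the pair.
        return self.encoder(x1), self.encoder(x2)

def contrastive_loss(z1, z2, same, margin=1.0):
    """same = 1 for pairs from the same class, 0 otherwise."""
    d = F.pairwise_distance(z1, z2)
    return (same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2)).mean()

net = SiameseNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
x1, x2 = torch.randn(8, 64), torch.randn(8, 64)        # placeholder feature pairs
same = torch.randint(0, 2, (8,)).float()

opt.zero_grad()
loss = contrastive_loss(*net(x1, x2), same)
loss.backward()
opt.step()
print("contrastive loss:", loss.item())
```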


2017 ◽  
Vol 3 (1) ◽  
pp. 10
Author(s):  
Debby E. Sondakh

Classification has been considered an important tool for extracting useful information from healthcare datasets; it may be applied, for instance, to recognize a disease from its symptoms. This paper aims to compare and evaluate different neural network classification algorithms on healthcare datasets. The algorithms considered here are Multilayer Perceptron, Radial Basis Function, and Voted Perceptron, which are tested on the resulting classifiers' accuracy, precision, mean absolute error and root mean squared error rates, and training time. All the algorithms are applied to five multivariate healthcare datasets: Echocardiogram, SPECT Heart, Chronic Kidney Disease, Mammographic Mass, and EEG Eye State. Among the three algorithms, this study concludes that the best algorithm for the chosen datasets is the Multilayer Perceptron, which achieves the best results on all performance parameters tested. It can produce a high-accuracy classifier model with a low error rate but suffers in training time, especially on large datasets. The Voted Perceptron's performance is the lowest on all parameters tested. For further research, an investigation may be conducted to analyze whether the number of hidden layers in the Multilayer Perceptron's architecture has a significant impact on the training time.
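A sketch of this kind of evaluation harness with scikit-learn stand-ins and a synthetic dataset rather than the listed healthcare datasets: each model is scored on accuracy, MAE, RMSE, and training time. Voted Perceptron and RBF networks have no direct scikit-learn equivalent, so a plain Perceptron is used as a rough substitute here.

```python
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score, mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=15, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for name, clf in [("Multilayer Perceptron", MLPClassifier(max_iter=1000, random_state=0)),
                  ("Perceptron", Perceptron(random_state=0))]:
    t0 = time.perf_counter()
    clf.fit(X_tr, y_tr)
    elapsed = time.perf_counter() - t0
    pred = clf.predict(X_te)
    print(f"{name}: acc={accuracy_score(y_te, pred):.3f}, "
          f"MAE={mean_absolute_error(y_te, pred):.3f}, "
          f"RMSE={np.sqrt(mean_squared_error(y_te, pred)):.3f}, "
          f"time={elapsed:.2f}s")
```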


2021 ◽  
Vol 15 (1) ◽  
pp. 127-140
Author(s):  
Muhammad Adnan ◽  
Yassaman Ebrahimzadeh Maboud ◽  
Divya Mahajan ◽  
Prashant J. Nair

Recommender models are commonly used to suggest relevant items to a user in e-commerce and online advertisement applications. These models use massive embedding tables to store numerical representations of items' and users' categorical variables (memory intensive) and employ neural networks (compute intensive) to generate the final recommendations. Training these large-scale recommendation models requires ever more data and compute resources. The highly parallel neural network portion of these models can benefit from GPU acceleration; however, large embedding tables often cannot fit in the limited-capacity GPU device memory. Hence, this paper dives deep into the semantics of the training data and obtains insights about the feature access, transfer, and usage patterns of these models. We observe that, due to the popularity of certain inputs, the accesses to the embeddings are highly skewed, with a few embedding entries being accessed up to 10000X more often. This paper leverages this asymmetrical access pattern in a framework, called FAE, and proposes a hot-embedding-aware data layout for training recommender models. This layout uses the scarce GPU memory to store the most frequently accessed embeddings, thus reducing data transfers from CPU to GPU; at the same time, FAE engages the GPU to accelerate the execution of these hot embedding entries. Experiments on production-scale recommendation models with real datasets show that FAE reduces the overall training time by 2.3X and 1.52X in comparison to XDL CPU-only and XDL CPU-GPU execution, respectively, while maintaining baseline accuracy.
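A toy sketch of the idea behind a hot-embedding-aware layout, not the FAE implementation: profile the training inputs, rank embedding rows by access frequency, and pin the most popular rows into a small GPU-resident table while the long tail stays in CPU memory. The table size, trace length, and Zipf skew below are made-up assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
num_rows = 1_000_000                                   # embedding table rows
accesses = rng.zipf(1.2, size=500_000) % num_rows      # skewed access trace

counts = np.bincount(accesses, minlength=num_rows)
order = np.argsort(counts)[::-1]                       # most popular rows first

gpu_budget_rows = 10_000                               # rows that fit in spare GPU memory
hot = order[:gpu_budget_rows]                          # rows to pin on the GPU
hot_hits = counts[hot].sum()

print(f"{gpu_budget_rows / num_rows:.1%} of rows held on GPU "
      f"serve {hot_hits / counts.sum():.1%} of all accesses")
```

Because of the skew, a tiny fraction of pinned rows captures most accesses, which is what lets the GPU serve the hot entries while bulk transfers from CPU memory are kept to a minimum.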


2021 ◽  
Vol 40 (7) ◽  
pp. 534-542
Author(s):  
Ricard Durall ◽  
Valentin Tschannen ◽  
Norman Ettrich ◽  
Janis Keuper

Interpreting seismic data requires the characterization of a number of key elements such as the position of faults and main reflections, presence of structural bodies, and clustering of areas exhibiting a similar amplitude versus angle response. Manual interpretation of geophysical data is often a difficult and time-consuming task, complicated by lack of resolution and presence of noise. In recent years, approaches based on convolutional neural networks have shown remarkable results in automating certain interpretative tasks. However, these state-of-the-art systems usually need to be trained in a supervised manner, and they suffer from a generalization problem. Hence, it is highly challenging to train a model that can yield accurate results on new real data obtained with different acquisition, processing, and geology than the data used for training. In this work, we introduce a novel method that combines generative neural networks with a segmentation task in order to decrease the gap between annotated training data and uninterpreted target data. We validate our approach on two applications: the detection of diffraction events and the picking of faults. We show that when transitioning from synthetic training data to real validation data, our workflow yields superior results compared to its counterpart without the generative network.


2019 ◽  
Vol 18 (1) ◽  
pp. 53-66 ◽  
Author(s):  
Sherif Tarabishy ◽  
Stamatios Psarras ◽  
Marcin Kosicki ◽  
Martha Tsigkari

Spatial and visual connectivity are important metrics when developing workplace layouts. Calculating those metrics in real time can be difficult, depending on the size of the floor plan being analysed and the resolution of the analyses. This article investigates the possibility of considerably speeding up such computationally intensive simulations by using machine learning to create models capable of identifying the spatial and visual connectivity potential of a space. To that end, we present the entire process of investigating different machine learning models and a pipeline for training them on such a task, from the incorporation of a bespoke spatial and visual connectivity analysis engine through a distributed computation pipeline, to the process of synthesizing training data and evaluating the performance of different neural networks.


2020 ◽  
Vol 29 (05) ◽  
pp. 2050013
Author(s):  
Oualid Araar ◽  
Abdenour Amamra ◽  
Asma Abdeldaim ◽  
Ivan Vitanov

Traffic Sign Recognition (TSR) is a crucial component in many automotive applications, such as driver assistance, sign maintenance, and vehicle autonomy. In this paper, we present an efficient approach to training a machine learning-based TSR solution. In our choice of recognition method, we have opted for convolutional neural networks, which have demonstrated best-in-class performance in previous works on TSR. One of the challenges related to training deep neural networks is the requirement for a large amount of training data. To circumvent the tedious process of acquiring and manually labelling real data, we investigate the use of synthetically generated images. Our networks, trained on only synthetic data, are capable of recognising traffic signs in challenging real-world footage. The classification results achieved on the GTSRB benchmark are seen to outperform existing state-of-the-art solutions.
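A small convolutional classifier sketch in PyTorch for 32x32 traffic-sign crops (GTSRB has 43 classes). This is a generic baseline, not the architecture used in the paper, and the synthetic-image generation step itself is not shown.

```python
import torch
import torch.nn as nn

class SignNet(nn.Module):
    def __init__(self, num_classes=43):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 32 -> 16
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 16 -> 8
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(64 * 8 * 8, 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = SignNet()
dummy = torch.randn(4, 3, 32, 32)                      # a batch standing in for synthetic crops
print(model(dummy).shape)                              # torch.Size([4, 43])
```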

