Massive Data Classification of Neural Responses

When analyzing the neuronal code, neuroscientists usually perform extra-cellular recordings of neuronal responses (spikes). Since the size of the microelectrodes used to perform these recordings is much larger than the size of the cells, responses from multiple neurons are recorded by each micro-electrode. Thus, the obtained response must be classified and evaluated, in order to identify how many neurons were recorded, and to assess which neuron generated each spike. A platform for the mass-classification of neuronal responses is proposed in this chapter, employing data-parallelism for speeding up the classification of neuronal responses. The platform is built in a modular way, supporting multiple web-interfaces, different back-end environments for parallel computing or different algorithms for spike classification. Experimental results on the proposed platform show that even for an unbalanced data set of neuronal responses the execution time was reduced of about 45%. For balanced data sets, the platform may achieve a reduction in execution time equal to the inverse of the number of back-end computational elements.

Download Full-text

Classification of jujube defects in small data sets based on transfer learning

Neural Computing and Applications ◽

10.1007/s00521-021-05715-2 ◽

2021 ◽

Author(s):

Jianping Ju ◽

Hong Zheng ◽

Xiaohang Xu ◽

Zhongyuan Guo ◽

Zhaohui Zheng ◽

...

Keyword(s):

Transfer Learning ◽

Loss Function ◽

Training Model ◽

Parameter Distribution ◽

Test Accuracy ◽

Small Data ◽

Data Sets ◽

Data Set ◽

Small Data Sets

AbstractAlthough convolutional neural networks have achieved success in the field of image classification, there are still challenges in the field of agricultural product quality sorting such as machine vision-based jujube defects detection. The performance of jujube defect detection mainly depends on the feature extraction and the classifier used. Due to the diversity of the jujube materials and the variability of the testing environment, the traditional method of manually extracting the features often fails to meet the requirements of practical application. In this paper, a jujube sorting model in small data sets based on convolutional neural network and transfer learning is proposed to meet the actual demand of jujube defects detection. Firstly, the original images collected from the actual jujube sorting production line were pre-processed, and the data were augmented to establish a data set of five categories of jujube defects. The original CNN model is then improved by embedding the SE module and using the triplet loss function and the center loss function to replace the softmax loss function. Finally, the depth pre-training model on the ImageNet image data set was used to conduct training on the jujube defects data set, so that the parameters of the pre-training model could fit the parameter distribution of the jujube defects image, and the parameter distribution was transferred to the jujube defects data set to complete the transfer of the model and realize the detection and classification of the jujube defects. The classification results are visualized by heatmap through the analysis of classification accuracy and confusion matrix compared with the comparison models. The experimental results show that the SE-ResNet50-CL model optimizes the fine-grained classification problem of jujube defect recognition, and the test accuracy reaches 94.15%. The model has good stability and high recognition accuracy in complex environments.

Download Full-text

Sequential Sampling for Estimation and Classification of the Incidence of Hop Powdery Mildew II: Cone Sampling

Plant Disease ◽

10.1094/pdis-91-8-1013 ◽

2007 ◽

Vol 91 (8) ◽

pp. 1013-1020 ◽

Cited By ~ 8

Author(s):

David H. Gent ◽

William W. Turechek ◽

Walter F. Mahaffee

Keyword(s):

Powdery Mildew ◽

Binomial Distribution ◽

Disease Incidence ◽

Sequential Sampling ◽

Model Construction ◽

Data Sets ◽

Data Set ◽

Sampling Plans ◽

Simulated Sampling

Sequential sampling models for estimation and classification of the incidence of powdery mildew (caused by Podosphaera macularis) on hop (Humulus lupulus) cones were developed using parameter estimates of the binary power law derived from the analysis of 221 transect data sets (model construction data set) collected from 41 hop yards sampled in Oregon and Washington from 2000 to 2005. Stop lines, models that determine when sufficient information has been collected to estimate mean disease incidence and stop sampling, for sequential estimation were validated by bootstrap simulation using a subset of 21 model construction data sets and simulated sampling of an additional 13 model construction data sets. Achieved coefficient of variation (C) approached the prespecified C as the estimated disease incidence, [Formula: see text], increased, although achieving a C of 0.1 was not possible for data sets in which [Formula: see text] < 0.03 with the number of sampling units evaluated in this study. The 95% confidence interval of the median difference between [Formula: see text] of each yard (achieved by sequential sampling) and the true p of the original data set included 0 for all 21 data sets evaluated at levels of C of 0.1 and 0.2. For sequential classification, operating characteristic (OC) and average sample number (ASN) curves of the sequential sampling plans obtained by bootstrap analysis and simulated sampling were similar to the OC and ASN values determined by Monte Carlo simulation. Correct decisions of whether disease incidence was above or below prespecified thresholds (pt) were made for 84.6 or 100% of the data sets during simulated sampling when stop lines were determined assuming a binomial or beta-binomial distribution of disease incidence, respectively. However, the higher proportion of correct decisions obtained by assuming a beta-binomial distribution of disease incidence required, on average, sampling 3.9 more plants per sampling round to classify disease incidence compared with the binomial distribution. Use of these sequential sampling plans may aid growers in deciding the order in which to harvest hop yards to minimize the risk of a condition called “cone early maturity” caused by late-season infection of cones by P. macularis. Also, sequential sampling could aid in research efforts, such as efficacy trials, where many hop cones are assessed to determine disease incidence.

Download Full-text

Bagging Approach for Medical Plants Recognition Based on Their DNA Sequences

International Journal of Social Ecology and Sustainable Development ◽

10.4018/ijsesd.2018100103 ◽

2018 ◽

Vol 9 (4) ◽

pp. 45-60

Author(s):

Mohamed Elhadi Rahmani ◽

Abdelmalek Amine ◽

Reda Mohamed Hamou

Keyword(s):

Dna Sequences ◽

Majority Vote ◽

Data Sets ◽

Data Set ◽

Drug Production ◽

Medical Plants

Many drugs in modern medicines originate from plants and the first step in drug production, is the recognition of plants needed for this purpose. This article presents a bagging approach for medical plants recognition based on their DNA sequences. In this work, the authors have developed a system that recognize DNA sequences of 14 medical plants, first they divided the 14-class data set into bi class sub-data sets, then instead of using an algorithm to classify the 14-class data set, they used the same algorithm to classify the sub-data sets. By doing so, they have simplified the problem of classification of 14 plants into sub-problems of bi class classification. To construct the subsets, the authors extracted all possible pairs of the 14 classes, so they gave each class more chances to be well predicted. This approach allows the study of the similarity between DNA sequences of a plant with each other plants. In terms of results, the authors have obtained very good results in which the accuracy has been doubled (from 45% to almost 80%). Classification of a new sequence was completed according to majority vote.

Download Full-text

Modified Deep Neural Networks for Dog Breeds Identification

10.20944/preprints201812.0232.v1 ◽

2018 ◽

Cited By ~ 1

Author(s):

Aydin Ayanzadeh ◽

Sahand Vahidnia

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

The State ◽

Fine Tuning ◽

Test Accuracy ◽

Data Sets ◽

Data Set

In this paper, we leverage state of the art models on Imagenet data-sets. We use the pre-trained model and learned weighs to extract the feature from the Dog breeds identification data-set. Afterwards, we applied fine-tuning and dataaugmentation to increase the performance of our test accuracy in classification of dog breeds datasets. The performance of the proposed approaches are compared with the state of the art models of Image-Net datasets such as ResNet-50, DenseNet-121, DenseNet-169 and GoogleNet. we achieved 89.66% , 85.37% 84.01% and 82.08% test accuracy respectively which shows thesuperior performance of proposed method to the previous works on Stanford dog breeds datasets.

Download Full-text

Supernova Host Galaxy Association and Photometric Classification of over 10,000 Light Curves from the Zwicky Transient Facility

Research Notes of the AAS ◽

10.3847/2515-5172/ac416e ◽

2021 ◽

Vol 5 (12) ◽

pp. 283

Author(s):

Braden Garretson ◽

Dan Milisavljevic ◽

Jack Reynolds ◽

Kathryn E. Weil ◽

Bhagya Subrayan ◽

...

Keyword(s):

Value Added ◽

Light Curves ◽

Host Galaxy ◽

Massive Data ◽

Data Sets ◽

Data Set ◽

Scale Modeling ◽

Final Data ◽

Type Ia

Abstract Here we present a catalog of 12,993 photometrically-classified supernova-like light curves from the Zwicky Transient Facility, along with candidate host galaxy associations. By training a random forest classifier on spectroscopically classified supernovae from the Bright Transient Survey, we achieve an accuracy of 80% across four supernova classes resulting in a final data set of 8208 Type Ia, 2080 Type II, 1985 Type Ib/c, and 720 SLSN. Our work represents a pathfinder effort to supply massive data sets of supernova light curves with value-added information that can be used to enable population-scale modeling of explosion parameters and investigate host galaxy environments.

Download Full-text

Classification of Microchannel Flame Regimes Based on Convolutional Neural Networks

10.1115/power2021-64437 ◽

2021 ◽

Author(s):

Seyed Navid Roohani Isfahani ◽

Vinicius M. Sauer ◽

Ingmar Schoegl

Keyword(s):

Data Augmentation ◽

Dynamic Range ◽

High Dynamic Range ◽

Data Sets ◽

Data Set ◽

Testing Data ◽

Experimental Approaches ◽

Transition Points ◽

Combustion Regimes

Abstract Micro-combustion has shown significant potential to study and characterize the combustion behavior of hydrocarbon fuels. Among several experimental approaches based on this method, the most prominent one employs an externally heated micro-channel. Three distinct combustion regimes are reported for this device namely, weak flames, flames with repetitive extinction and ignition (FREI), and normal flames, which are formed at low, moderate, and high flow rate ranges, respectively. Within each flame regime, noticeable differences exist in both shape and luminosity where transition points can be used to obtain insights into fuel characteristics. In this study, flame images are obtained using a monochrome camera equipped with a 430 nm bandpass filter to capture the chemiluminescence signal emitted by the flame. Sequences of conventional flame photographs are taken during the experiment, which are computationally merged to generate high dynamic range (HDR) images. In a highly diluted fuel/oxidizer mixture, it is observed that FREI disappear and are replaced by a gradual and direct transition between weak and normal flames which makes it hard to identify different combustion regimes. To resolve the issue, a convolutional neural network (CNN) is introduced to classify the flame regime. The accuracy of the model is calculated to be 99.34, 99.66, and 99.83% for “training”, “validation”, and “testing” data-sets, respectively. This level of accuracy is achieved by conducting a grid search to acquire optimized parameters for CNN. Furthermore, a data augmentation technique based on different experimental scenarios is used to generate flame images to increase the size of the data-set.

Download Full-text

Analyzing performance of classifiers for medical datasets

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.15.11370 ◽

2018 ◽

Vol 7 (2.15) ◽

pp. 136 ◽

Cited By ~ 1

Author(s):

Rosaida Rosly ◽

Mokhairi Makhtar ◽

Mohd Khalid Awang ◽

Mohd Isa Awang ◽

Mohd Nordin Abdul Rahman

Keyword(s):

Breast Cancer ◽

Cross Validation ◽

Ensemble Methods ◽

Data Sets ◽

Ensemble Classifiers ◽

Classification Models ◽

Data Set ◽

Mining Tool ◽

Fold Cross Validation

This paper analyses the performance of classification models using single classification and combination of ensemble method, which are Breast Cancer Wisconsin and Hepatitis data sets as training datasets. This paper presents a comparison of different classifiers based on a 10-fold cross validation using a data mining tool. In this experiment, various classifiers are implemented including three popular ensemble methods which are boosting, bagging and stacking for the combination. The result shows that for the classification of the Breast Cancer Wisconsin data set, the single classification of Naïve Bayes (NB) and a combination of bagging+NB algorithm displayed the highest accuracy at the same percentage (97.51%) compared to other combinations of ensemble classifiers. For the classification of the Hepatitisdata set, the result showed that the combination of stacking+Multi-Layer Perception (MLP) algorithm achieved a higher accuracy at 86.25%. By using the ensemble classifiers, the result may be improved. In future, a multi-classifier approach will be proposed by introducing a fusion at the classification level between these classifiers to obtain classification with higher accuracies.

Download Full-text

Acute Lymphoblastic Leukemia Detection and Classification of Its Subtypes Using Pretrained Deep Convolutional Neural Networks

Technology in Cancer Research & Treatment ◽

10.1177/1533033818802789 ◽

2018 ◽

Vol 17 ◽

pp. 153303381880278 ◽

Cited By ~ 29

Author(s):

Sarmad Shafique ◽

Samabia Tehsin

Keyword(s):

Acute Lymphoblastic Leukemia ◽

Data Augmentation ◽

Lymphoblastic Leukemia ◽

White Blood Cells ◽

Automated Detection ◽

Data Sets ◽

Deep Convolutional Neural Networks ◽

Data Set ◽

Different Color

Leukemia is a fatal disease of white blood cells which affects the blood and bone marrow in human body. We deployed deep convolutional neural network for automated detection of acute lymphoblastic leukemia and classification of its subtypes into 4 classes, that is, L1, L2, L3, and Normal which were mostly neglected in previous literature. In contrary to the training from scratch, we deployed pretrained AlexNet which was fine-tuned on our data set. Last layers of the pretrained network were replaced with new layers which can classify the input images into 4 classes. To reduce overtraining, data augmentation technique was used. We also compared the data sets with different color models to check the performance over different color images. For acute lymphoblastic leukemia detection, we achieved a sensitivity of 100%, specificity of 98.11%, and accuracy of 99.50%; and for acute lymphoblastic leukemia subtype classification the sensitivity was 96.74%, specificity was 99.03%, and accuracy was 96.06%. Unlike the standard methods, our proposed method was able to achieve high accuracy without any need of microscopic image segmentation.

Download Full-text

Quantification and classification of neuronal responses in kernel-smoothed peristimulus time histograms

Journal of Neurophysiology ◽

10.1152/jn.00595.2014 ◽

2015 ◽

Vol 113 (4) ◽

pp. 1260-1274 ◽

Cited By ~ 4

Author(s):

Michael R. H. Hill ◽

Itzhak Fried ◽

Christof Koch

Keyword(s):

Random Noise ◽

Large Data ◽

Neuronal Responses ◽

Data Set ◽

Firing Rates ◽

Kernel Convolution ◽

Actual Response ◽

Response Envelope ◽

Convolution Method

Peristimulus time histograms are a widespread form of visualizing neuronal responses. Kernel convolution methods transform these histograms into a smooth, continuous probability density function. This provides an improved estimate of a neuron's actual response envelope. We here develop a classifier, called the h-coefficient, to determine whether time-locked fluctuations in the firing rate of a neuron should be classified as a response or as random noise. Unlike previous approaches, the h-coefficient takes advantage of the more precise response envelope estimation provided by the kernel convolution method. The h-coefficient quantizes the smoothed response envelope and calculates the probability of a response of a given shape to occur by chance. We tested the efficacy of the h-coefficient in a large data set of Monte Carlo simulated smoothed peristimulus time histograms with varying response amplitudes, response durations, trial numbers, and baseline firing rates. Across all these conditions, the h-coefficient significantly outperformed more classical classifiers, with a mean false alarm rate of 0.004 and a mean hit rate of 0.494. We also tested the h-coefficient's performance in a set of neuronal responses recorded in humans. The algorithm behind the h-coefficient provides various opportunities for further adaptation and the flexibility to target specific parameters in a given data set. Our findings confirm that the h-coefficient can provide a conservative and powerful tool for the analysis of peristimulus time histograms with great potential for future development.

Download Full-text

Automatic classification of canine thoracic radiographs using deep learning

Scientific Reports ◽

10.1038/s41598-021-83515-3 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Tommaso Banzato ◽

Marek Wodzinski ◽

Silvia Burti ◽

Valentina Longhin Osti ◽

Valentina Rossoni ◽

...

Keyword(s):

Unsolved Problem ◽

Data Sets ◽

Diagnostic Systems ◽

Data Set ◽

Radiographic Findings ◽

Generalization Ability ◽

Novel Method ◽

Overall Performance ◽

Operator Curve

AbstractThe interpretation of thoracic radiographs is a challenging and error-prone task for veterinarians. Despite recent advancements in machine learning and computer vision, the development of computer-aided diagnostic systems for radiographs remains a challenging and unsolved problem, particularly in the context of veterinary medicine. In this study, a novel method, based on multi-label deep convolutional neural network (CNN), for the classification of thoracic radiographs in dogs was developed. All the thoracic radiographs of dogs performed between 2010 and 2020 in the institution were retrospectively collected. Radiographs were taken with two different radiograph acquisition systems and were divided into two data sets accordingly. One data set (Data Set 1) was used for training and testing and another data set (Data Set 2) was used to test the generalization ability of the CNNs. Radiographic findings used as non mutually exclusive labels to train the CNNs were: unremarkable, cardiomegaly, alveolar pattern, bronchial pattern, interstitial pattern, mass, pleural effusion, pneumothorax, and megaesophagus. Two different CNNs, based on ResNet-50 and DenseNet-121 architectures respectively, were developed and tested. The CNN based on ResNet-50 had an Area Under the Receive-Operator Curve (AUC) above 0.8 for all the included radiographic findings except for bronchial and interstitial patterns both on Data Set 1 and Data Set 2. The CNN based on DenseNet-121 had a lower overall performance. Statistically significant differences in the generalization ability between the two CNNs were evident, with the CNN based on ResNet-50 showing better performance for alveolar pattern, interstitial pattern, megaesophagus, and pneumothorax.

Download Full-text