training sample Latest Research Papers

Classifying Forest Structure of Red-Cockaded Woodpecker Habitat Using Structure from Motion Elevation Data Derived from sUAS Imagery

Drones ◽

10.3390/drones6010026 ◽

2022 ◽

Vol 6 (1) ◽

pp. 26

Author(s):

Brett Lawrence

Keyword(s):

Structure From Motion ◽

Forest Structure ◽

Training Sample ◽

Support Vector ◽

Montgomery County ◽

Unmanned Aerial Systems ◽

Post Process ◽

Elevation Data ◽

Red Cockaded Woodpecker ◽

Major Factors

Small unmanned aerial systems (sUAS) and relatively new photogrammetry software solutions are creating opportunities for forest managers to perform spatial analysis more efficiently and cost-effectively. This study aims to identify a method for leveraging these technologies to analyze vertical forest structure of red-cockaded woodpecker habitat in Montgomery County, Texas. Traditional sampling methods would require numerous hours of ground surveying and data collection using various measuring techniques. Structure from Motion (SfM), a photogrammetric method for creating 3-D structure from 2-D images, provides an alternative to relatively expensive LIDAR sensing technologies and can accurately model the high level of complexity found within our study area’s vertical structure. DroneDeploy, a photogrammetry processing app service, was used to post-process and create a point cloud, which was later further processed into a Canopy Height Model (CHM). Using supervised, object-based classification and comparing multiple classifier algorithms, classification maps were generated with a best overall accuracy of 84.8% using Support Vector Machine in ArcGIS Pro software. Appropriately sized training sample datasets, correctly processed elevation data, and proper image segmentation were among the major factors impacting classification accuracy during the numerous classification iterations performed.
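
The abstract does not spell out how the point cloud was turned into a Canopy Height Model. A common definition is the surface elevation minus the ground elevation, cell by cell. The sketch below assumes two already-aligned rasters; the names `dsm`, `dtm`, and `canopy_height_model` are hypothetical, and this is not the ArcGIS Pro workflow used in the study.

```python
def canopy_height_model(dsm, dtm):
    """Cell-by-cell height above ground: surface model minus terrain model.

    dsm and dtm are equally sized lists of rows; None marks a no-data cell.
    Negative differences are clamped to zero (an assumption, not from the paper).
    """
    chm = []
    for dsm_row, dtm_row in zip(dsm, dtm):
        row = []
        for surface, ground in zip(dsm_row, dtm_row):
            if surface is None or ground is None:
                row.append(None)  # propagate no-data
            else:
                row.append(max(surface - ground, 0.0))
        chm.append(row)
    return chm


# Toy 2x2 rasters in metres (made-up values for illustration only).
dsm = [[112.0, 118.5], [101.2, 100.9]]
dtm = [[100.0, 100.5], [101.0, 101.0]]
chm = canopy_height_model(dsm, dtm)
```

The resulting height raster is the kind of elevation input that a supervised classifier could then use alongside image segments.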

Short term prediction of atmospheric temperature using neural networks

MAUSAM ◽

10.54302/mausam.v53i4.1662 ◽

2022 ◽

Vol 53 (4) ◽

pp. 471-480

Author(s):

S. PAL ◽

J. DAS ◽

P. SENGUPTA ◽

S. K. BANERJEE

Keyword(s):

Neural Network ◽

Minimum Temperature ◽

Ground Level ◽

Test Sample ◽

Atmospheric Temperature ◽

Training Sample ◽

Test Cases ◽

Backpropagation Method ◽

Hidden Layer ◽

Short Term Prediction

In this paper, a neural-network-based forecasting model for the maximum and the minimum temperature at ground level is proposed. A backpropagation method of gradient-descent learning in a multi-layer perceptron (MLP) type of neural network with only one hidden layer is considered. This network consists of 25 input nodes and two output nodes. The network is trained with a varying number of nodes in the hidden layer using a set of training samples, and each configuration is tested with a set of test samples. It accepts information from the previous two consecutive days (such as pressures, temperatures, relative humidities, etc.) as inputs for the estimation of the maximum and the minimum temperature as output. The network with 20 or fewer neurons in the hidden layer is found to be "optimum" and it produces an error within ±2 °C for 80% of test cases.
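
The network shape described above (25 inputs, one hidden layer, two outputs) and the "within ±2 °C" criterion can be sketched as follows. This is only an illustration: the tanh activation, the linear output layer, the random weights, and all function names are assumptions, and no training step is shown.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    # One weight row per output node; the last entry of each row is the bias.
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward(x, hidden_w, output_w):
    # Hidden layer with tanh activation (assumed), followed by linear outputs.
    hidden = [math.tanh(sum(w * v for w, v in zip(row[:-1], x)) + row[-1])
              for row in hidden_w]
    return [sum(w * h for w, h in zip(row[:-1], hidden)) + row[-1]
            for row in output_w]


def fraction_within(predicted, observed, tolerance=2.0):
    # Share of test cases whose absolute error does not exceed the tolerance.
    hits = sum(1 for p, o in zip(predicted, observed) if abs(p - o) <= tolerance)
    return hits / len(observed)


rng = random.Random(0)
hidden_w = init_layer(25, 20, rng)   # 25 inputs -> 20 hidden nodes
output_w = init_layer(20, 2, rng)    # 20 hidden nodes -> max and min temperature
features = [rng.random() for _ in range(25)]
outputs = forward(features, hidden_w, output_w)

# Toy predictions and observations in degrees Celsius (made-up values).
share = fraction_within([30.0, 31.5, 25.0, 28.0, 33.0],
                        [31.0, 30.0, 28.5, 27.0, 32.5])
```

With these toy values, four of the five errors fall inside the tolerance, which is how an "80% of test cases" figure would be computed.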

The Effect of Training Sample Size on the Prediction of White Matter Hyperintensity Volume in a Healthy Population Using BIANCA

Frontiers in Aging Neuroscience ◽

10.3389/fnagi.2021.720636 ◽

2022 ◽

Vol 13 ◽

Author(s):

Niklas Wulms ◽

Lea Redmann ◽

Christine Herpertz ◽

Nadine Bonberg ◽

Klaus Berger ◽

...

Keyword(s):

White Matter ◽

Sample Size ◽

Mean Absolute Error ◽

External Validation ◽

Similarity Index ◽

Absolute Error ◽

Training Sample ◽

White Matter Hyperintensity ◽

Training Sample Size ◽

The Difference

Introduction: White matter hyperintensities of presumed vascular origin (WMH) are an important magnetic resonance imaging marker of cerebral small vessel disease and are associated with cognitive decline, stroke, and mortality. Their relevance in healthy individuals, however, is less clear. This is partly due to the methodological challenge of accurately measuring rare and small WMH with automated segmentation programs. In this study, we tested whether WMH volumetry with FMRIB software library v6.0 (FSL; https://fsl.fmrib.ox.ac.uk/fsl/fslwiki) Brain Intensity AbNormality Classification Algorithm (BIANCA), a customizable and trainable algorithm that quantifies WMH volume based on individual data training sets, can be optimized for a normal aging population. Methods: We evaluated the effect of varying training sample sizes on the accuracy and the robustness of the predicted white matter hyperintensity volume in a population (n = 201) with a low prevalence of confluent WMH and a substantial proportion of participants without WMH. BIANCA was trained with seven different sample sizes between 10 and 40 with increments of 5. For each sample size, 100 random samples of T1w and FLAIR images were drawn and trained with manually delineated masks. For validation, we defined an internal and external validation set and compared the mean absolute error, resulting from the difference between manually delineated and predicted WMH volumes for each set. For spatial overlap, we calculated the Dice similarity index (SI) for the external validation cohort. Results: The study population had a median WMH volume of 0.34 ml (IQR of 1.6 ml) and included n = 28 (18%) participants without any WMH. The mean absolute error of the difference between BIANCA prediction and manually delineated masks was minimized and became more robust with an increasing number of training participants. The lowest mean absolute error of 0.05 ml (SD of 0.24 ml) was identified in the external validation set with a training sample size of 35. Compared to the volumetric overlap, the spatial overlap was poor with an average Dice similarity index of 0.14 (SD 0.16) in the external cohort, driven by subjects with very low lesion volumes. Discussion: We found that the performance of BIANCA, particularly the robustness of predictions, could be optimized for use in populations with a low WMH load by enlargement of the training sample size. Further work is needed to evaluate and potentially improve the prediction accuracy for low lesion volumes. These findings are important for current and future population-based studies with the majority of participants being normal aging people.
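
The two evaluation measures named in the abstract, mean absolute error of volumes and the Dice similarity index, can be sketched in a few lines. The sketch represents a lesion mask as a set of voxel coordinates, which is an assumption made for brevity; the function names and the empty-mask convention are likewise hypothetical and are not taken from BIANCA.

```python
def mean_absolute_error(predicted, reference):
    # Average absolute difference between paired volumes (e.g. in ml).
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


def dice_index(pred_mask, ref_mask):
    # Dice = 2 * |A intersect B| / (|A| + |B|), with masks given as voxel sets.
    pred, ref = set(pred_mask), set(ref_mask)
    if not pred and not ref:
        return 1.0  # convention chosen here for two empty masks (assumption)
    return 2 * len(pred & ref) / (len(pred) + len(ref))


# Toy volumes in ml (made-up values).
mae = mean_absolute_error([0.30, 0.10, 0.00], [0.34, 0.05, 0.00])

# A one-voxel lesion predicted one voxel away from the manual mask:
# the volumes agree exactly, yet the spatial overlap is zero.
tiny_dice = dice_index({(10, 12, 7)}, {(10, 12, 8)})

# A larger toy lesion with partial overlap.
partial_dice = dice_index({(0, 0, 0), (0, 0, 1), (0, 1, 1)},
                          {(0, 0, 1), (0, 1, 1), (1, 1, 1), (1, 1, 0)})
```

The one-voxel case illustrates how very small lesions can yield a low Dice index even when the volume error is negligible.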

A Novel Generative Method for Machine Fault Diagnosis

Journal of Sensors ◽

10.1155/2022/5420478 ◽

2022 ◽

Vol 2022 ◽

pp. 1-11

Author(s):

Zhipeng Dong ◽

Yucheng Liu ◽

Jianshe Kang ◽

Shaohui Zhang

Keyword(s):

Deep Learning ◽

Fault Diagnosis ◽

Test Sample ◽

Training Sample ◽

Generative Adversarial Networks ◽

Mechanical Equipment ◽

Learning Models ◽

Machine Fault Diagnosis ◽

Machine Fault ◽

Generative Method

Deep learning is widely used in fault diagnosis of mechanical equipment and has achieved good results. However, these deep learning models require a large number of labeled samples for training, and enough labeled samples are difficult to obtain in the actual production process. Unlabeled samples, by contrast, are easier to obtain in the industrial environment. To overcome this problem, this paper proposes a novel method to generate enough labeled samples for training deep learning models. Unlike generative adversarial networks, which require substantial computing time, the calculation of the proposed generative method is simple and effective. First, the Euclidean distance between the training sample and the test sample is calculated; then, the weight coefficient between the training sample and the test sample is set to generate pseudosamples; finally, the deep learning model is trained on the data combined with the pseudosamples for machine fault diagnosis. To verify the effectiveness of the proposed method, experiments are carried out on two datasets, from planetary gearboxes and wind gearboxes, with different activation functions. Experimental results show that the proposed method is effective for most activation function models.
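
The abstract gives only the outline of the generation step (a Euclidean distance, then a weight coefficient). One possible reading, offered purely as an illustration, is to pair each unlabeled test sample with its nearest labeled training sample and blend the two. The blending rule, the fixed `weight`, the label inheritance, and every name below are assumptions rather than the authors' formulation.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def make_pseudosamples(train, labels, test, weight=0.5):
    """Blend each test sample with its nearest training sample (assumed rule).

    The pseudosample inherits the label of that nearest training sample.
    """
    pseudo = []
    for t in test:
        nearest = min(range(len(train)), key=lambda i: euclidean(train[i], t))
        blended = [weight * a + (1 - weight) * b for a, b in zip(train[nearest], t)]
        pseudo.append((blended, labels[nearest]))
    return pseudo


# Toy two-feature samples (made-up values).
train = [[0.0, 0.0], [10.0, 10.0]]
labels = ["healthy", "fault"]
test = [[2.0, 0.0], [8.0, 10.0]]
pseudo = make_pseudosamples(train, labels, test)
```

The pseudosamples would then be appended to the labeled set before training the diagnosis model.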

Rice Mapping in Training Sample Shortage Regions Using a Deep Semantic Segmentation Model Trained on Pseudo-Labels

Remote Sensing ◽

10.3390/rs14020328 ◽

2022 ◽

Vol 14 (2) ◽

pp. 328

Author(s):

Pengliang Wei ◽

Ran Huang ◽

Tao Lin ◽

Jingfeng Huang

Keyword(s):

Large Scale ◽

Ground Truth ◽

Semantic Segmentation ◽

Training Sample ◽

Transfer Model ◽

Ground Truth Data ◽

Rice Distribution ◽

Multi Temporal ◽

Crop Mapping ◽

Model Training

A deep semantic segmentation model-based method can achieve state-of-the-art accuracy and high computational efficiency in large-scale crop mapping. However, the model cannot be widely used in actual large-scale crop mapping applications, mainly because the annotation of ground truth data for deep semantic segmentation model training is time-consuming. At the operational level, it is extremely difficult to obtain a large amount of ground reference data by photointerpretation for the model training. Consequently, in order to solve this problem, this study introduces a workflow that aims to extract rice distribution information in training sample shortage regions, using a deep semantic segmentation model (i.e., U-Net) trained on pseudo-labels. Based on the time series Sentinel-1 images, Cropland Data Layer (CDL) and U-Net model, the optimal multi-temporal datasets for rice mapping were summarized, using the global search method. Then, based on the optimal multi-temporal datasets, the proposed workflow (a combination of K-Means and random forest) was directly used to extract the rice-distribution information of Jiangsu (i.e., the K–RF pseudo-labels). For comparison, the optimal well-trained U-Net model acquired from Arkansas (i.e., the transfer model) was also transferred to Jiangsu to extract local rice-distribution information (i.e., the TF pseudo-labels). Finally, the pseudo-labels with high confidences generated from the two methods were further used to retrain the U-Net models, which were suitable for rice mapping in Jiangsu. For different rice planting pattern regions of Jiangsu, the final results showed that, compared with the U-Net model trained on the TF pseudo-labels, the rice area extraction errors of pseudo-labels could be further reduced by using the U-Net model trained on the K–RF pseudo-labels. 
In addition, compared with the existing rule-based rice mapping methods, the U-Net model trained on the K–RF pseudo-labels could robustly extract the spatial distribution information of rice. Generally, this study could provide new options for applying a deep semantic segmentation model to training sample shortage regions.
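
The step of keeping only high-confidence pseudo-labels before retraining can be sketched as a simple threshold on per-pixel rice probability. The abstract does not state how confidence was measured or which cut-off was used, so the threshold of 0.9, the `IGNORE` marker, and the function name are hypothetical.

```python
IGNORE = -1  # marker for pixels excluded from retraining (assumed convention)


def select_confident_labels(rice_probabilities, threshold=0.9):
    """Turn per-pixel rice probabilities into pseudo-labels.

    1 = confident rice, 0 = confident non-rice, IGNORE = too uncertain to use.
    """
    labels = []
    for p_rice in rice_probabilities:
        if p_rice >= threshold:
            labels.append(1)
        elif p_rice <= 1 - threshold:
            labels.append(0)
        else:
            labels.append(IGNORE)
    return labels


# Toy probabilities for four pixels (made-up values).
pseudo_labels = select_confident_labels([0.97, 0.50, 0.03, 0.91])
```

Pixels marked `IGNORE` would simply be masked out of the loss when the segmentation model is retrained.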

Identifying Potential miRNA Biomarkers for Gastric Cancer Diagnosis Using Machine Learning Variable Selection Approach

Frontiers in Genetics ◽

10.3389/fgene.2021.779455 ◽

2022 ◽

Vol 12 ◽

Author(s):

Neda Gilani ◽

Reza Arabi Belaghi ◽

Younes Aftabi ◽

Elnaz Faramarzi ◽

Tuba Edgünlü ◽

...

Keyword(s):

Machine Learning ◽

Gastric Cancer ◽

Variable Selection ◽

Prediction Models ◽

Strong Relationship ◽

Training Sample ◽

Machine Learning Algorithms ◽

Molecular Events ◽

Selection Approach ◽

Ontological Analysis

Aim: This study aimed to accurately identify potential miRNAs for gastric cancer (GC) diagnosis at the early stages of the disease. Methods: We used GSE106817 data with 2,566 miRNAs to train the machine learning models. We used the Boruta machine learning variable selection approach to identify the miRNAs strongly associated with GC in the training sample. We then validated the prediction models in the independent sample GSE113486 data. Finally, an ontological analysis was done on the identified miRNAs to elicit the relevant relationships. Results: Of the 2,874 patients used in training the model, 115 (4%) had GC. Boruta identified 30 miRNAs as potential biomarkers for GC diagnosis, and hsa-miR-1343-3p ranked highest. All of the machine learning algorithms showed that, using hsa-miR-1343-3p as a biomarker, GC can be predicted with very high precision (AUC: 100%, sensitivity: 100%, specificity: 100%, ROC: 100%, Kappa: 100) at a cut-off point of 8.2 for hsa-miR-1343-3p. Also, ontological analysis of the 30 identified miRNAs confirmed their strong relationship with cancer-associated genes and molecular events. Conclusion: The hsa-miR-1343-3p could be introduced as a valuable target for studies on GC diagnosis using reliable biomarkers.
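
How a single-biomarker cut-off translates into sensitivity and specificity can be sketched as below. The abstract does not say whether values above or below 8.2 indicate GC, so the direction is exposed as a parameter; the function name and the toy data are hypothetical and do not come from the GSE datasets.

```python
def sensitivity_specificity(values, has_gc, cutoff=8.2, positive_if_above=True):
    """Classify each sample by one expression value and score the result."""
    tp = fn = tn = fp = 0
    for value, truth in zip(values, has_gc):
        predicted = (value >= cutoff) if positive_if_above else (value < cutoff)
        if truth and predicted:
            tp += 1
        elif truth:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy expression values and diagnoses (made-up, for illustration only).
values = [9.1, 8.5, 7.9, 6.0, 8.3]
has_gc = [True, True, True, False, False]
sens, spec = sensitivity_specificity(values, has_gc)
```

A reported 100% on both measures would correspond to every GC sample falling on one side of the cut-off and every non-GC sample on the other.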

Partial Label Learning Based on Fully Connected Deep Neural Network

International Journal of Circuits, Systems and Signal Processing ◽

10.46300/9106.2022.16.35 ◽

2022 ◽

Vol 16 ◽

pp. 287-297

Author(s):

Houjie Li ◽

Lei Wu ◽

Jianjun He ◽

Ruirui Zheng ◽

Yu Zhou ◽

...

Keyword(s):

Neural Network ◽

Process Model ◽

Deep Neural Network ◽

Learning Algorithm ◽

Learning Algorithms ◽

Ground Truth ◽

Training Sample ◽

Support Vector ◽

Partial Label Learning ◽

Fully Connected

The ambiguity of training samples in the partial label learning framework makes it difficult to develop learning algorithms, and most of the existing algorithms are based on traditional shallow machine learning models, such as decision tree, support vector machine, and Gaussian process model. Deep neural networks have demonstrated excellent performance in many application fields, but currently they are rarely used in the partial label learning framework. This study proposes a new partial label learning algorithm based on a fully connected deep neural network, in which the relationship between the candidate labels and the ground-truth label of each training sample is established by defining three new loss functions, and a regularization term is added to prevent overfitting. The experimental results on the controlled UCI datasets and real-world partial label datasets reveal that the proposed algorithm can achieve higher classification accuracy than the state-of-the-art partial label learning algorithms.

Automated fire risk evaluation of electrical installations in the man-machine system

IOP Conference Series Materials Science and Engineering ◽

10.1088/1757-899x/1211/1/012017 ◽

2022 ◽

Vol 1211 (1) ◽

pp. 012017

Author(s):

M A Gabova ◽

O K Nikolsky ◽

Yu D Shlionskaya

Keyword(s):

Neural Network ◽

Correlation Method ◽

Principal Component ◽

Fire Risk ◽

Training Sample ◽

Average Error ◽

Industrial Complex ◽

Principal Component Method ◽

Fire Condition ◽

Expert Assessments

The article considers approaches to forming a system of criteria for assessing the fire condition of electrical installations in the agricultural and industrial complex. Based on an analysis of the literature, the use of expert assessments is concluded to be appropriate. To implement this decision, a group of experts was assembled, on the basis of whose knowledge a list of 42 parameters characterizing the fire condition of an electrical installation was determined. To identify the relationships and form a method for calculating the estimated value of fire risk, experts assessed the fire condition of 70 electrical installations of the agricultural and industrial complex of the region. A knowledge base was formed from the resulting values. Neural networks were chosen as the method of data analysis, but the available sample is not sufficient for high-quality training of a neural network. Therefore, the correlation method and the principal component method were considered, and based on the calculations, it was decided to use a training sample consisting of 6 principal components for training a neural network. A neural network was trained on these data, and the resulting average error values were sufficiently low, which may indicate sufficient accuracy of the generated model. The article also presents a conceptual scheme of a software package for automating calculations in accordance with the developed model.

THE USE OF ARTIFICIAL INTELLIGENCE METHODS FOR APPROXIMATION OF THE MECHANICAL BEHAVIOR OF RUBBER-LIKE MATERIALS

Bulletin of National Technical University KhPI Series System Analysis Control and Information Technologies ◽

10.20998/2079-0023.2021.02.15 ◽

2021 ◽

pp. 95-99

Author(s):

Oleksii Vodka ◽

Serhii Pohrebniak

Keyword(s):

Artificial Intelligence ◽

Neural Networks ◽

Hysteresis Loops ◽

Test Sample ◽

Training Sample ◽

The Neural Network ◽

Internal Structures ◽

Wide Range ◽

Software Product ◽

Direct Distribution

In the 21st century, neural networks are widely used in various fields, including computer simulation and mechanics. This popularity is due to the fact that they give high precision, work fast, and have a very wide range of settings. The purpose of this work is to create a software product that uses elements of artificial intelligence for interpolation and approximation of experimental data. The software should work correctly and yield results with minimal error. The disadvantage of using mathematical approaches to calculating and predicting hysteresis loops is that they describe unloading rather poorly; thus, we obtain incorrect data for calculating the stress-strain state of a structure. The proposed solution is to use elements of artificial intelligence, namely feedforward ("direct distribution") neural networks. A feedforward neural network has been built and trained in this work. It has been trained with supervised learning (using the backpropagation method) based on a training sample from a prior experiment. Several networks of different structures were built for testing; each received the same dataset, which was not used during training but was known from the experiment, so that the network error could be determined in terms of the amount of allocated energy and the mean square deviation. The article describes in detail the mathematical interpretation of neural networks, the method for training them, the previously conducted experiment, the structure and topology of the network that was used, the training method, and the preparation of the training sample and the test sample. As a result of the work carried out, software using an artificial neural network was tested, several types of neural networks with different input data and internal structures were built and tested, the error of their work was determined, and the advantages and disadvantages of the networks used were identified.

MODIFIED CORRELATION WEIGHT K-NEAREST NEIGHBOR CLASSIFIER USING TRAINING DATASET CLEANING METHOD

Indonesian Journal of Physics ◽

10.5614/itb.ijp.2021.32.2.5 ◽

2021 ◽

Vol 32 (2) ◽

pp. 20-25

Author(s):

Efraim Kurniawan Dairo Kette

Keyword(s):

Nearest Neighbor ◽

Training Sample ◽

Classification Performance ◽

Training Data ◽

Training Dataset ◽

Classification Problems ◽

K Nearest Neighbor ◽

Neighborhood Structure ◽

Data Set ◽

Sample Data

In pattern recognition, the k-Nearest Neighbor (kNN) algorithm is the simplest non-parametric algorithm. Because of its simplicity, kNN classification performance is usually influenced by the model cases and by the quality of the training data itself. Therefore, this article proposes a sparse correlation weight model combined with the Training Data Set Cleaning (TDC) method by Classification Ability Ranking (CAR), called the CAR classification method based on Coefficient-Weighted kNN (CAR-CWKNN), to improve kNN classifier performance. Correlation weight in Sparse Representation (SR) has been proven to increase classification accuracy. The SR can show the 'neighborhood' structure of the data, which is why it is well suited to classification based on the Nearest Neighbor. The Classification Ability (CA) function is applied to classify the best training sample data based on rank in the cleaning stage. The Leave One Out (LV1) concept in the CA works by cleaning data that is considered likely to have the wrong classification results from the original training data, thereby reducing the influence of the training sample data quality on the kNN classification performance. The results of experiments with four public UCI data sets related to classification problems show that the CAR-CWKNN method provides better performance in terms of accuracy.
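
The leave-one-out cleaning idea can be illustrated with a plain kNN: each training sample is classified by the remaining samples and dropped if the vote disagrees with its label. This is a simplified stand-in, not CAR-CWKNN itself; it uses unweighted majority voting rather than sparse correlation weights or a ranking function, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, labels, query, k=3):
    # Majority vote among the k training samples closest to the query.
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], query))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def clean_training_set(train, labels, k=3):
    """Keep only samples that the rest of the training set classifies correctly."""
    kept = []
    for i in range(len(train)):
        rest_x = train[:i] + train[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        if knn_predict(rest_x, rest_y, train[i], k) == labels[i]:
            kept.append(i)
    return [train[i] for i in kept], [labels[i] for i in kept]


# Two toy clusters plus one point labeled "b" inside the "a" cluster (made-up data).
train = [[0, 0], [0, 1], [1, 0], [1, 1],
         [5, 5], [5, 6], [6, 5], [6, 6],
         [0.5, 0.5]]
labels = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]
clean_x, clean_y = clean_training_set(train, labels)
```

Here only the suspicious sample at `[0.5, 0.5]` is removed, and the eight remaining samples form the cleaned training set.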
