Neural Network Model for Assessing the Physical and Mechanical Properties of a Metal Material Based on Deep Learning

2020 ◽  
pp. 18-28
Author(s):  
Andrei Kliuev ◽  
Roman Klestov ◽  
Valerii Stolbov

The paper investigates the algorithmic stability of training a deep neural network for problems of recognizing material microstructure. It is shown that at an 8% quantitative deviation in the basic test set, the trained network loses stability. This means that with such a quantitative or qualitative deviation in the training or test sets, the results obtained with the trained network can hardly be trusted. Although the results of this study apply to a particular case, i.e., recognition of microstructure using ResNet-152, the authors propose a cheaper method for studying stability based on analysis of the test set rather than the training set.
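As a rough illustration of this kind of stability check, the sketch below perturbs a test set by a given fraction and measures how much a trained classifier's accuracy fluctuates. It is an assumption about the general procedure, not the authors' code, and uses a small scikit-learn model in place of ResNet-152.

```python
# Minimal sketch (not the authors' code): estimate how sensitive a trained
# classifier's test accuracy is to quantitative deviations in the test set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

rng = np.random.default_rng(0)
for deviation in (0.02, 0.04, 0.08, 0.16):        # fraction of test samples removed
    accs = []
    for _ in range(20):                           # repeat to estimate the spread
        keep = rng.random(len(y_te)) > deviation  # perturb the test-set composition
        accs.append(model.score(X_te[keep], y_te[keep]))
    print(f"deviation={deviation:.2f}  accuracy={np.mean(accs):.3f} +/- {np.std(accs):.3f}")
```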

2020 ◽  
Author(s):  
Mustafa Umit Oner ◽  
Yi-Chih Cheng ◽  
Hwee Kuan Lee ◽  
Wing-Kin Sung

This article discusses the effect of segregating histopathology image data into three sets: a training set for training the machine learning model, a validation set for model selection, and a test set for testing model performance. We found that one must be cautious when segregating histological image data (slides) into training, validation, and test sets, because subtle mishandling of the data can introduce data leakage and give illusory good results on the test set. We performed this study on gene mutation prediction performance using the deep neural network in the paper of Coudray et al. [1]. Using the provided code and the same set of data, we discovered that the data segregation method of the paper suffered from a data leakage problem [2].

The paper pools all the slides from all patients and then segregates them exclusively into training, validation, and test sets. In this way, none of the slides is used in more than one set, which seems to be a clean separation of the data. However, the paper did not consider that some slides are strongly correlated. For example, if the tumor of a patient is cut and stained to produce multiple slides, these slides are strongly correlated. If one slide is used for training and another one is used for testing, the deep neural network can essentially memorize the pattern on the slide in the training set and apply this memory to the slide in the test set. Hence, by memorization, the deep neural network can predict very well on the slide in the test set. This mechanism of prediction is not useful in a practical clinical setting, since no two tumors are the same in the real world. In this real setting, we demand that the deep neural network generalize across patients and tumors. Hereafter, we call this way of data segregation slide-level segregation.

There is a better way to perform data segregation that is compatible with deployment of a deep learning model in practical clinical settings. First, the patients are segregated exclusively into training, validation, and test sets. All the slides belonging to the patients in the training set are used solely for training. Similarly, all the slides belonging to the patients in the test set are used for testing only. Segregating the data in this way forces the deep neural network to generalize across patients. We call this way of data segregation patient-level segregation.

In the slide-level segregation analysis, we obtained results similar to those presented in the paper by Coudray et al. [1]: overall performance on the test set was good. However, it was illusory due to data leakage. The model gave very good test results on slides that come from a patient who also has slides in the training set. On the other hand, the test results were quite poor on slides that come from a patient who does not have any slides in the training set. Hereafter, we call a slide in the test set seen-patient data if the corresponding patient also has some slides in the training set; otherwise, the slide in the test set is called unseen-patient data. Furthermore, we analyzed the performance of the model on data segregated by the patient-level segregation approach. Note that, in this approach, the test set mimics the real-world clinical workflow. We observed a significant drop in the performance of the model on the test set of the patient-level segregation approach compared to the performance on the test set of the slide-level segregation approach. Moreover, the performance of the model on the test set of the patient-level segregation approach was very similar to the performance on the unseen-patient data in the test set of the slide-level segregation approach. Hence, we conclude that the patient-level segregation approach is crucial and appropriate to simulate the real-world scenario, where each patient in the test set can be thought of as a patient walking into the clinic tomorrow.
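A minimal sketch of patient-level segregation, assuming slide file names and patient IDs are available and using scikit-learn's GroupShuffleSplit; this is an illustration of the principle, not the pipeline of Coudray et al. or of the authors. Grouping by patient ID guarantees that no patient contributes slides to more than one of the training, validation, or test sets.

```python
# Minimal sketch: slides are split by patient ID so that no patient appears
# in more than one of the training / validation / test sets.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

slide_paths = np.array([f"slide_{i}.svs" for i in range(100)])     # hypothetical slides
patient_ids = np.array([f"patient_{i // 4}" for i in range(100)])  # ~4 slides per patient

# First carve out the test patients, then split the rest into train/validation.
outer = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
trainval_idx, test_idx = next(outer.split(slide_paths, groups=patient_ids))

inner = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, val_idx = next(inner.split(slide_paths[trainval_idx],
                                      groups=patient_ids[trainval_idx]))
train_idx, val_idx = trainval_idx[train_idx], trainval_idx[val_idx]

# No patient leaks from training into the test set.
assert not set(patient_ids[train_idx]) & set(patient_ids[test_idx])
```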


2021 ◽  
Author(s):  
Daichi Kitaguchi ◽  
Toru Fujino ◽  
Nobuyoshi Takeshita ◽  
Hiro Hasegawa ◽  
Kensaku Mori ◽  
...  

Abstract Clarifying the scalability of deep-learning-based surgical instrument segmentation networks across diverse surgical environments is important for recognizing the challenges of overfitting in surgical device development. This study comprehensively evaluated deep neural network scalability for surgical instrument segmentation, using 5,238 images randomly extracted from 128 intraoperative videos. The video dataset contained 112 laparoscopic colorectal resection, 5 laparoscopic distal gastrectomy, 5 laparoscopic cholecystectomy, and 6 laparoscopic partial hepatectomy cases. Deep-learning-based surgical instrument segmentation was performed for test sets with (1) the same conditions as the training set; (2) the same recognition target surgical instrument and surgery type but a different laparoscopic recording system; (3) the same laparoscopic recording system and surgery type but slightly different recognition target laparoscopic surgical forceps; and (4) the same laparoscopic recording system and recognition target surgical instrument but different surgery types. The mean average precision and mean intersection over union for test sets 1, 2, 3, and 4 were 0.941 and 0.887, 0.866 and 0.671, 0.772 and 0.676, and 0.588 and 0.395, respectively. Therefore, recognition accuracy decreased even under slightly different conditions. To enhance the generalizability of deep neural networks in surgery, it is crucial to construct a training set that considers diverse surgical environments under real-world conditions. Trial Registration Number: 2020–315, date of registration: October 5, 2020
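For reference, a minimal sketch of the mean-intersection-over-union metric reported above, computed on toy label maps; this is a generic illustration, not the study's evaluation code.

```python
# Minimal sketch: per-class IoU and mean IoU for predicted vs. ground-truth
# segmentation masks encoded as integer label maps.
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                      # skip classes absent from both masks
            ious.append(inter / union)
    return float(np.mean(ious))

rng = np.random.default_rng(0)
pred = rng.integers(0, 3, size=(256, 256))     # toy prediction
target = rng.integers(0, 3, size=(256, 256))   # toy ground truth
print(mean_iou(pred, target, num_classes=3))
```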


Energies ◽  
2019 ◽  
Vol 13 (1) ◽  
pp. 142 ◽  
Author(s):  
Qiongfang Yu ◽  
Yaqian Hu ◽  
Yi Yang

The power supply quality and safety of low-voltage residential power distribution systems are seriously affected by the occurrence of series arc faults, which are difficult to detect and extinguish due to their small current, high stochasticity, and strong concealment. In order to improve the overall safety of residential distribution systems, a novel method based on the discrete wavelet transform (DWT) and a deep neural network (DNN) is proposed in this paper to detect series arc faults. An experimental test bed is built to obtain current signals under two states, normal and arcing. The collected signals are decomposed at different scales using the DWT, and the wavelet coefficient sequences are used to form the training and test sets. The deep neural network, trained on the training set under four different loads, adaptively learns the features of arc faults. The arc fault recognition accuracy, obtained by feeding the test set into the model, is about 97.75%. The experimental results show that this method has good accuracy and generality under different types of load.
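A minimal sketch of the DWT feature-extraction step, assuming the PyWavelets library and toy current waveforms; the wavelet family, decomposition level, and energy features are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch: decompose a current window with a multilevel DWT and use the
# per-scale coefficient energies as a compact feature vector for a DNN classifier.
import numpy as np
import pywt

def dwt_features(current_window: np.ndarray, wavelet: str = "db4", level: int = 5):
    coeffs = pywt.wavedec(current_window, wavelet, level=level)
    # Energy of the approximation band and each detail band.
    return np.array([np.sum(c ** 2) for c in coeffs])

rng = np.random.default_rng(0)
fs = 10_000                                       # hypothetical sampling rate (Hz)
t = np.arange(0, 0.1, 1 / fs)
normal = np.sin(2 * np.pi * 50 * t)               # toy "normal" 50 Hz current
arcing = normal + 0.3 * rng.standard_normal(t.size)  # toy "arcing" current with noise

X = np.vstack([dwt_features(normal), dwt_features(arcing)])
y = np.array([0, 1])                              # 0 = normal, 1 = arc fault
```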


2021 ◽  
Vol 11 ◽  
Author(s):  
Yiqing Hou ◽  
Chao Chen ◽  
Lu Zhang ◽  
Wei Zhou ◽  
Qinyang Lu ◽  
...  

Objective: The aim of this study is to develop a model using a Deep Neural Network (DNN) to diagnose thyroid nodules in patients with Hashimoto's Thyroiditis. Methods: In this retrospective study, we included 2,932 patients with thyroid nodules who underwent thyroid ultrasonography in our hospital from January 2017 to August 2019; 80% of them were included in the training set and 20% in the test set. Nodules suspected of malignancy underwent FNA or surgery to obtain pathological results. Two DNN models were trained to diagnose thyroid nodules, and we chose the one with the better performance. The features of the nodules as well as the parenchyma around the nodules were learned by the model to achieve better performance under diffuse parenchyma. Ten-fold cross-validation and an independent test set were used to evaluate the performance of the algorithm. The performance of the model was compared with that of three groups of radiologists with clinical experience of <5 years, 5–10 years, and >10 years, respectively. Results: In total, 9,127 images were collected from 2,932 patients, with 7,301 images in the training set and 1,806 in the test set. 56% of the patients enrolled had Hashimoto's Thyroiditis. The model achieved an AUC of 0.924 for distinguishing malignant from benign nodules in the test set. It showed similar performance under diffuse thyroid parenchyma and normal parenchyma, with a sensitivity of 0.881 versus 0.871 (p = 0.938) and a specificity of 0.846 versus 0.822 (p = 0.178). In patients with HT, the model achieved an AUC of 0.924 for differentiating malignant from benign nodules, which was significantly higher than that of the three groups of radiologists (AUC = 0.824, 0.857, and 0.863, respectively, p < 0.05). Conclusion: The model showed high performance in diagnosing thyroid nodules under both normal and diffuse parenchyma. In patients with Hashimoto's Thyroiditis, the model performed better than radiologists with various years of experience.
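For reference, a minimal sketch of the reported metrics (ROC AUC, sensitivity, specificity) computed with scikit-learn on toy labels and scores; this is a generic illustration, not the study's evaluation code.

```python
# Minimal sketch: AUC plus sensitivity/specificity at a 0.5 threshold.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)                             # 1 = malignant, 0 = benign (toy)
y_score = np.clip(0.35 * y_true + 0.65 * rng.random(200), 0, 1)   # toy model scores

auc = roc_auc_score(y_true, y_score)
tn, fp, fn, tp = confusion_matrix(y_true, (y_score >= 0.5).astype(int)).ravel()
sensitivity, specificity = tp / (tp + fn), tn / (tn + fp)
print(f"AUC={auc:.3f}  sensitivity={sensitivity:.3f}  specificity={specificity:.3f}")
```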


2021 ◽  
Vol 11 (5) ◽  
pp. 2039
Author(s):  
Hyunseok Shin ◽  
Sejong Oh

In machine learning applications, classification schemes have been widely used for prediction tasks. Typically, to develop a prediction model, the given dataset is divided into training and test sets; the training set is used to build the model and the test set is used to evaluate it. Furthermore, random sampling is traditionally used to divide datasets. The problem, however, is that the performance of the model is evaluated differently depending on how the training and test sets are divided. Therefore, in this study, we proposed an improved sampling method for the accurate evaluation of a classification model. We first generated numerous candidate train/test splits using the R-value-based sampling method. We evaluated how similar the distribution of each candidate split was to that of the whole dataset, and the split with the smallest distribution difference was selected as the final train/test set. Histograms and feature importance were used to evaluate the similarity of distributions. The proposed method produces more appropriate training and test sets than previous sampling methods, including random and non-random sampling.
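A minimal sketch of the split-selection idea, assuming scikit-learn and using a simple per-feature histogram L1 distance in place of the paper's R-value and feature-importance machinery: many candidate splits are generated, and the one whose test-set distribution is closest to the whole dataset is kept.

```python
# Minimal sketch: pick the train/test split whose test set best matches the
# feature distributions of the full dataset.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

def histogram_distance(subset: np.ndarray, full: np.ndarray, bins: int = 10) -> float:
    """Sum of per-feature L1 distances between normalized histograms."""
    dist = 0.0
    for j in range(full.shape[1]):
        lo, hi = full[:, j].min(), full[:, j].max()
        h_sub, _ = np.histogram(subset[:, j], bins=bins, range=(lo, hi), density=True)
        h_full, _ = np.histogram(full[:, j], bins=bins, range=(lo, hi), density=True)
        dist += np.abs(h_sub - h_full).sum()
    return dist

# Generate candidate splits and keep the most representative one.
candidates = [train_test_split(X, y, test_size=0.3, random_state=seed) for seed in range(50)]
X_train, X_test, y_train, y_test = min(candidates, key=lambda s: histogram_distance(s[1], X))
```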


Author(s):  
André Maletzke ◽  
Waqar Hassan ◽  
Denis dos Reis ◽  
Gustavo Batista

Quantification is a task similar to classification in the sense that it learns from a labeled training set. However, quantification is not interested in predicting the class of each observation, but rather in measuring the class distribution in the test set. The community has developed performance measures and experimental setups tailored to quantification tasks. Nonetheless, we argue that a critical variable, the size of the test set, remains ignored. Such disregard has three main detrimental effects. First, it implicitly assumes that quantifiers will perform equally well for different test set sizes. Second, it increases the risk of cherry-picking by selecting a test set size for which a particular proposal performs best. Finally, it disregards the importance of designing methods that are suitable for different test set sizes. We discuss these issues with the support of one of the broadest experimental evaluations ever performed, with three main outcomes. (i) We empirically demonstrate the importance of the test set size in assessing quantifiers. (ii) We show that current quantifiers generally have mediocre performance on the smallest test sets. (iii) We propose a meta-learning scheme to select the best quantifier based on the test set size, which can outperform the best single quantification method.
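A minimal sketch of the effect described above, assuming scikit-learn: a classify-and-count quantifier is evaluated at several test-set sizes, and its absolute error in estimated class prevalence grows as the test set shrinks. This illustrates the phenomenon; it is not the authors' benchmark.

```python
# Minimal sketch: classify-and-count quantification error vs. test-set size.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, weights=[0.7, 0.3], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

rng = np.random.default_rng(0)
for test_size in (10, 50, 100, 500, 2000):
    errs = []
    for _ in range(100):
        idx = rng.choice(len(y_te), size=test_size, replace=False)
        estimated = clf.predict(X_te[idx]).mean()       # classify and count
        errs.append(abs(estimated - y_te[idx].mean()))  # absolute error in prevalence
    print(f"test size {test_size:5d}: mean AE = {np.mean(errs):.3f}")
```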


Materials ◽  
2020 ◽  
Vol 13 (23) ◽  
pp. 5316
Author(s):  
Zhenlong Zhu ◽  
Yilong Liang ◽  
Jianghe Zou

Accurately improving the mechanical properties of low-alloy steel by changing the alloying elements and heat treatment processes is of great interest. There is a mutual relationship between the mechanical properties and the process components, and the mechanism underlying this relationship is complicated. The forward selection-deep neural network and genetic algorithm (FS-DNN&GA) composition design model constructed in this paper is a combination of a neural network and a genetic algorithm, where the model trained by the neural network is transferred to the genetic algorithm. The FS-DNN&GA model is trained with the American Society for Metals (ASM) Alloy Center Database to design the composition and heat treatment process of alloy steel. First, with the forward selection (FS) method, the influencing factors (C, Si, Mn, Cr, quenching temperature, and tempering temperature) are screened and recombined as the inputs of different mechanical performance prediction models. Second, the forward selection-deep neural network (FS-DNN) mechanical prediction model is constructed and analyzed on experimental data to best predict the mechanical performance. Finally, the trained FS-DNN model is brought into the genetic algorithm to construct the FS-DNN&GA model, which outputs the corresponding chemical composition and process when the mechanical performance is increased or decreased. The experimental results show that the FS-DNN model has high accuracy in predicting the mechanical properties of 50 furnaces of low-alloy steel: the mean absolute error (MAE) of the tensile strength is 11.7 MPa, and the MAE of the yield strength is 13.46 MPa. According to the chemical composition and heat treatment process designed by the FS-DNN&GA model, five furnaces of low-alloy steel (Alloy1–Alloy5) were smelted, and tensile tests were performed on these five low-alloy steels. The results show that the mechanical properties of the designed alloy steel fall completely within the design range, providing useful guidance for the future development of new alloy steels.
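A minimal sketch of the forward-selection step only, assuming scikit-learn's SequentialFeatureSelector wrapped around a small neural-network regressor and toy composition/process data; the feature names and target are illustrative assumptions, and the genetic-algorithm stage is omitted.

```python
# Minimal sketch: forward feature selection around a small neural-network
# regressor predicting tensile strength from composition/heat-treatment inputs.
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
feature_names = ["C", "Si", "Mn", "Cr", "Ni", "Mo", "T_quench", "T_temper"]  # assumed inputs
X = rng.random((200, len(feature_names)))                      # toy composition/process data
tensile = 400 + 800 * X[:, 0] + 150 * X[:, 2] - 100 * X[:, 7] + rng.normal(0, 10, 200)

selector = SequentialFeatureSelector(
    MLPRegressor(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
    n_features_to_select=6, direction="forward", cv=3,
)
selector.fit(X, tensile)
print([feature_names[i] for i in np.flatnonzero(selector.get_support())])
```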


Materials ◽  
2020 ◽  
Vol 13 (8) ◽  
pp. 1963 ◽  
Author(s):  
Zheng Fang ◽  
Renbin Wang ◽  
Mengyi Wang ◽  
Shuo Zhong ◽  
Liquan Ding ◽  
...  

Hyperspectral X-ray CT (HXCT) technology provides not only structural imaging but also information about the material components therein. The main purpose of this study is to investigate the effect of various reconstruction algorithms on the reconstructed X-ray absorption spectra (XAS) of components shown in the CT image obtained by HXCT. In this paper, taking 3D-printing polymers as an example, seven kinds of commonly used polymers, namely thermoplastic elastomer (TPE), carbon fiber reinforced polyamide (PA-CF), acrylonitrile butadiene styrene (ABS), polylactic acid (PLA), ultraviolet photosensitive resin (UV9400), polyethylene terephthalate glycol (PETG), and polyvinyl alcohol (PVA), were selected as samples for hyperspectral CT reconstruction experiments. The seven 3D-printing polymers and two interfering samples were divided into a training set and test sets. First, structural images of the specimens were reconstructed by Filtered Back-Projection (FBP), the Algebraic Reconstruction Technique (ART), and Maximum-Likelihood Expectation-Maximization (ML-EM). Second, reconstructed XAS were extracted from the pixels of regions of interest (ROIs) compartmentalized in the images. Third, principal component analysis (PCA) demonstrated that the first four principal components contain the main features of the reconstructed XAS, so an Artificial Neural Network (ANN) was trained on the reconstructed XAS in the training set, expressed by the first four principal components, to identify the XAS of the corresponding polymers in both test sets. The ANN results show that FBP has the best classification performance, with a ten-fold cross-validation accuracy of 99%. This suggests that hyperspectral CT reconstruction is a promising way of obtaining image features and material features at the same time, which can be used in medical imaging and nondestructive testing.
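A minimal sketch of the PCA-plus-ANN classification stage, assuming scikit-learn and toy spectra in place of reconstructed XAS; the number of spectral channels and samples are illustrative assumptions, not the study's data.

```python
# Minimal sketch: reduce spectra to four principal components and classify the
# polymer with a small feed-forward network, scored by 10-fold cross-validation.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_channels, n_classes = 128, 7                    # hypothetical spectral bins, 7 polymers
y = np.repeat(np.arange(n_classes), 100)
centers = rng.random((n_classes, n_channels))     # toy class-specific mean spectra
X = centers[y] + 0.1 * rng.standard_normal((len(y), n_channels))

pipe = make_pipeline(
    PCA(n_components=4),                          # first four principal components
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
)
scores = cross_val_score(pipe, X, y, cv=10)
print(f"10-fold CV accuracy: {scores.mean():.3f}")
```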


2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Christos Fragopoulos ◽  
Abraham Pouliakis ◽  
Christos Meristoudis ◽  
Emmanouil Mastorakis ◽  
Niki Margari ◽  
...  

Objective. This study investigates the potential of an artificial intelligence (AI) methodology, the radial basis function (RBF) artificial neural network (ANN), in the evaluation of thyroid lesions. Study Design. The study was performed on 447 patients for whom the cytological and histological evaluations were in agreement. Cytological specimens were prepared using liquid-based cytology, and the histological result was based on subsequent surgical samples. Each specimen was digitized, and on these images nuclear morphology features were measured using an image analysis system. The extracted measurements (41,324 nuclei) were separated into two sets: a training set used to create the RBF ANN and a test set used to evaluate the RBF performance. The system aimed to predict the histological status as benign or malignant. Results. The RBF ANN achieved a sensitivity of 82.5%, a specificity of 94.6%, and an overall accuracy of 90.3% on the training set, while on the test set these indices were 81.4%, 90.0%, and 86.9%, respectively. When the algorithm was used to classify patients on the basis of the RBF ANN, the overall sensitivity was 95.0% and the specificity was 95.5%, and no statistically significant difference was observed. Conclusion. AI techniques, and especially ANNs, have been studied extensively only in recent years. The proposed approach is promising for avoiding misdiagnoses and assisting the everyday practice of cytopathology. The major drawback of this approach is the automation of a procedure to accurately detect and measure cell nuclei from the digitized images.
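A minimal sketch of a radial basis function classifier, assuming scikit-learn and toy features in place of the measured nuclear morphology: k-means picks the RBF centers, Gaussian activations form the hidden layer, and a logistic-regression output layer predicts benign versus malignant. This illustrates the general RBF ANN idea, not the study's implementation.

```python
# Minimal sketch of an RBF network: k-means centers -> Gaussian hidden layer
# -> linear (logistic) output layer.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)  # toy nuclear features
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

centers = KMeans(n_clusters=30, n_init=10, random_state=0).fit(X_tr).cluster_centers_
gamma = 1.0 / X_tr.shape[1]

def rbf_layer(X: np.ndarray) -> np.ndarray:
    # Gaussian activation of each sample with respect to each center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

out = LogisticRegression(max_iter=1000).fit(rbf_layer(X_tr), y_tr)
print("test accuracy:", out.score(rbf_layer(X_te), y_te))
```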


Blood ◽  
2012 ◽  
Vol 120 (21) ◽  
pp. 197-197
Author(s):  
Ricky D Edmondson ◽  
Shweta S. Chavan ◽  
Christoph Heuck ◽  
Bart Barlogie

Abstract 197 We and others have used gene expression profiling (GEP) to classify multiple myeloma into high- and low-risk groups; here, we report the first combined GEP and proteomics study of a large number of baseline samples (n=85) of highly enriched tumor cells from patients with newly diagnosed myeloma. Peptide expression levels from MS data on CD138-selected plasma cells from a discovery set of 85 patients with newly diagnosed myeloma were used to identify proteins linked to short survival (OS < 3 years vs OS ≥ 3 years). The proteomics dataset consisted of intensity values for 11,006 peptides (representing 2,155 proteins), where intensity is the quantitative measure of peptide abundance. Peptide intensities were normalized by Z-score transformation, and significance analysis of microarrays (SAM) was applied, resulting in the identification of 24 peptides as differentially expressed between the two groups (OS < 3 years vs OS ≥ 3 years), with fold change ≥1.5 and FDR <5%. The 24 peptides mapped to 19 unique proteins, and all were present at higher levels in the group with shorter overall survival than in the group with longer overall survival. An independent SAM analysis with parameters identical to the proteomics analysis (fold change ≥1.5; FDR <5%) was performed with the Affymetrix U133Plus2 microarray chip-based expression data. This analysis identified 151 probe sets that were differentially expressed between the two groups; 144 probe sets were present at higher levels and seven at lower levels in the group with shorter overall survival. Comparing the SAM analyses of the proteomics and GEP data, we identified nine probe sets, corresponding to seven genes, with increased levels of both protein and mRNA in the short-lived group. In order to validate these findings from the discovery experiment, we used GEP data from a randomized subset of the TT3 patient population as a training set for determining the optimal cut-points for each of the nine probe sets. Thus, the TT3 population was randomized into two sub-populations: a training set (two-thirds of the population; n=294) and a test set (one-third of the population; n=147); the Total Therapy 2 (TT2) patient population was used as an additional test set (n=441). A running log-rank test was performed on the training set for each of the nine probe sets to determine its optimal gene expression cut-point. The cut-points derived from the training set were then applied to the TT3 and TT2 test sets to investigate survival differences for the groups separated by the optimal cut-point for each probe set. The overall survival of the groups was visualized using the method of Kaplan and Meier, and a P-value was calculated (based on the log-rank test) to determine whether there was a statistically significant difference in survival between the two groups (P ≤ 0.05). We performed univariate regression analysis using a Cox proportional hazards model with the nine probe sets as variables on the TT3 test set. To identify which of the genes corresponding to these nine probes had independent prognostic value, we performed a multivariate stepwise Cox regression analysis, wherein CACYBP, FABP5, and IQGAP2 retained significance after competing with the remaining probe sets in the analysis. CACYBP had the highest hazard ratio (HR 2.70, P-value 0.01). We then performed the univariate and multivariate analyses on the TT2 test set, where CACYBP, CORO1A, ENO1, and STMN1 were selected by the multivariate analysis, and CACYBP had the highest hazard ratio (HR 1.93, P-value 0.004).
CACYBP was the only gene selected by multivariate analyses of both test sets. Disclosures: No relevant conflicts of interest to declare.
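A minimal sketch of the running log-rank cut-point search for a single probe set, assuming the lifelines library is available and using toy expression and survival data; the cut-point grid and data are illustrative assumptions, not the TT3/TT2 cohorts.

```python
# Minimal sketch: scan candidate expression cut-points for one probe set and
# keep the cut-point that best separates overall survival by the log-rank test.
import numpy as np
from lifelines.statistics import logrank_test

rng = np.random.default_rng(0)
expr = rng.random(294)                              # toy expression values (training set)
time = rng.exponential(3, 294) * (1.5 - expr)       # toy survival times (years)
event = rng.random(294) < 0.7                       # True = death observed, False = censored

best_cut, best_p = None, 1.0
for cut in np.quantile(expr, np.linspace(0.1, 0.9, 50)):   # candidate cut-points
    hi, lo = expr > cut, expr <= cut
    res = logrank_test(time[hi], time[lo],
                       event_observed_A=event[hi], event_observed_B=event[lo])
    if res.p_value < best_p:
        best_cut, best_p = cut, res.p_value
print(f"optimal cut-point {best_cut:.3f} (log-rank p = {best_p:.4f})")
```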

