A Simple Method to Train the AI Diagnosis Model of Pulmonary Nodules

Background. The differential diagnosis of subcentimetre lung nodules (diameter less than 1 cm) has long been a challenge for radiologists and thoracic surgeons. We aimed to build a deep learning model for the diagnosis of pulmonary nodules using a simple method. Methods. Image data and pathological diagnoses came from patients at the First Affiliated Hospital of Zhejiang University School of Medicine between October 1, 2016, and October 1, 2019. After data preprocessing and data augmentation, the training set was used to train the model and the test set to evaluate it. Clinicians also diagnosed the test set. Results. A total of 2,295 images of 496 lung nodules, with their corresponding pathological diagnoses, were selected as the training and test sets. After data augmentation, the training set contained 12,510 images: 6,648 of malignant nodules and 5,862 of benign nodules. In classifying malignant versus benign nodules, the trained model achieved an area under the P-R curve of 0.836 and an area under the ROC curve of 0.896 (95% CI: 78.96%~100.18%), which is higher than that of three doctors, although the difference was not statistically significant (P value not less than 0.05). Conclusion. With the help of an automatic machine learning system, clinicians can create a deep learning pulmonary nodule pathology classification model without the help of deep learning experts. The diagnostic efficiency of this model is not inferior to that of the clinician.
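
The area under the ROC curve can be read as the probability that a randomly chosen malignant nodule receives a higher score than a randomly chosen benign one. The following is a minimal sketch of that rank-based computation, not the study's evaluation code; the function name, the 1/0 label encoding, and the example scores are all assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 for malignant, 0 for benign (hypothetical encoding).
    scores: model outputs, higher meaning "more likely malignant".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Illustrative values only, not data from the paper.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

The pairwise form is quadratic in the number of cases, which is acceptable for a test set of a few hundred nodules; a sort-based version would be preferable for larger sets.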


Deep Learning-based Sentiment Analysis and Topic Modeling on Tourism During Covid-19 Pandemic

Frontiers in Computer Science ◽

10.3389/fcomp.2021.775368 ◽

2021 ◽

Vol 3 ◽

Author(s):

Ram Krishn Mishra ◽

Siddhaling Urolagin ◽

J. Angel Arul Jothi ◽

Ashwin Sanjay Neogi ◽

Nishad Nawaz

Keyword(s):

Social Media ◽

Deep Learning ◽

Topic Modeling ◽

Tourism Industry ◽

Classification Model ◽

Training Set ◽

Test Set ◽

Multiple Parameters ◽

Social Media Platforms ◽

Flow Of Information

The Covid-19 pandemic has disrupted the world economy and significantly influenced the tourism industry. Millions of people have shared their emotions, views, facts, and circumstances on numerous social media platforms, which has resulted in a massive flow of information. This high-density social media data has drawn many researchers to extract valuable information and understand users' emotions during the pandemic. This research examines data collected from the micro-blogging site Twitter for the tourism sector, emphasizing the sub-domains of hospitality and healthcare. The sentiment of approximately 20,000 tweets was calculated using the Valence Aware Dictionary for Sentiment Reasoning (VADER) model. Furthermore, topic modeling was used to reveal hidden themes and determine the narrative and direction of the topics related to tourism, healthcare, and hospitality. Topic modeling also helped us identify similar terms across clusters and analyze the flow of information from groups with similar opinions. Finally, a deep learning classification model was used with different epoch sizes of the dataset to anticipate and classify people's feelings. The deep learning model was tested with multiple parameters, such as training set accuracy, test set accuracy, validation loss, and validation accuracy. It achieved more than 90% training set accuracy, and the tourism hospitality and healthcare sub-domains reported test set accuracies of 80.9% and 78.7%, respectively.
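
VADER produces a compound score in the range -1 to 1 for each text, and a label is usually derived by thresholding that score. The sketch below assumes the commonly used cutoffs of +0.05 and -0.05; the abstract does not say which cutoffs the authors applied, and the function name and example scores are hypothetical.

```python
def label_from_compound(compound, cutoff=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment label.

    The +/-0.05 cutoff is the commonly used convention and is an
    assumption here, not a value taken from the study.
    """
    if not -1.0 <= compound <= 1.0:
        raise ValueError("compound score must lie in [-1, 1]")
    if compound >= cutoff:
        return "positive"
    if compound <= -cutoff:
        return "negative"
    return "neutral"


# Hypothetical compound scores, not values from the study.
scores = [0.62, -0.41, 0.0, 0.04]
labels = [label_from_compound(s) for s in scores]
```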


Differential Biases and Variabilities of Deep Learning–Based Artificial Intelligence and Human Experts in Clinical Diagnosis: Retrospective Cohort and Survey Study (Preprint)

10.2196/preprints.33049 ◽

2021 ◽

Author(s):

Dongchul Cha ◽

Chongwon Pae ◽

Se A Lee ◽

Gina Na ◽

Young Kyun Hur ◽

...

Keyword(s):

Artificial Intelligence ◽

Deep Learning ◽

Data Augmentation ◽

Class Imbalance ◽

Classification Model ◽

Kappa Statistics ◽

Test Set ◽

Diagnostic Characteristics ◽

Test Sets ◽

The Given

BACKGROUND Deep learning (DL)–based artificial intelligence may have different diagnostic characteristics than human experts in medical diagnosis. As a data-driven knowledge system, heterogeneous population incidence in the clinical world is considered to cause more bias to DL than clinicians. Conversely, by experiencing limited numbers of cases, human experts may exhibit large interindividual variability. Thus, understanding how the 2 groups classify given data differently is an essential step for the cooperative usage of DL in clinical application. OBJECTIVE This study aimed to evaluate and compare the differential effects of clinical experience in otoendoscopic image diagnosis in both computers and physicians exemplified by the class imbalance problem and guide clinicians when utilizing decision support systems. METHODS We used digital otoendoscopic images of patients who visited the outpatient clinic in the Department of Otorhinolaryngology at Severance Hospital, Seoul, South Korea, from January 2013 to June 2019, for a total of 22,707 otoendoscopic images. We excluded similar images, and 7500 otoendoscopic images were selected for labeling. We built a DL-based image classification model to classify the given image into 6 disease categories. Two test sets of 300 images were populated: balanced and imbalanced test sets. We included 14 clinicians (otolaryngologists and nonotolaryngology specialists including general practitioners) and 13 DL-based models. We used accuracy (overall and per-class) and kappa statistics to compare the results of individual physicians and the ML models. RESULTS Our ML models had consistently high accuracies (balanced test set: mean 77.14%, SD 1.83%; imbalanced test set: mean 82.03%, SD 3.06%), equivalent to those of otolaryngologists (balanced: mean 71.17%, SD 3.37%; imbalanced: mean 72.84%, SD 6.41%) and far better than those of nonotolaryngologists (balanced: mean 45.63%, SD 7.89%; imbalanced: mean 44.08%, SD 15.83%). 
However, ML models suffered from class imbalance problems (balanced test set: mean 77.14%, SD 1.83%; imbalanced test set: mean 82.03%, SD 3.06%). This was mitigated by data augmentation, particularly for low-incidence classes, but rare disease classes still had low per-class accuracies. Human physicians, despite being less affected by prevalence, showed high interphysician variability (ML models: kappa=0.83, SD 0.02; otolaryngologists: kappa=0.60, SD 0.07). CONCLUSIONS Even though ML models deliver excellent performance in classifying ear disease, physicians and ML models have their own strengths. ML models have consistent and high accuracy while considering only the given image and show bias toward prevalence, whereas human physicians have varying performance but do not show bias toward prevalence and may also consider extra information beyond the images. To deliver the best patient care given the shortage of otolaryngologists, our ML model can serve a cooperative role for clinicians with diverse expertise, as long as it is kept in mind that models consider only images and could be biased toward prevalent diseases even after data augmentation.
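
Kappa statistics measure agreement between raters after correcting for the agreement expected by chance. The abstract does not state which kappa variant was used, so the sketch below shows Cohen's kappa for two raters as one plausible instance; the function name and the class labels "A" and "B" are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is
        # undefined, and returning 1.0 is a convention chosen here.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four images.
physician = ["A", "B", "B", "A"]
model = ["A", "B", "A", "A"]
kappa = cohen_kappa(physician, model)
```

In this toy case the raters agree on three of four images (0.75) while chance agreement is 0.5, so kappa is 0.5.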


NIR Reflection Augmentation for DeepLearning-Based NIR Face Recognition

Symmetry ◽

10.3390/sym11101234 ◽

2019 ◽

Vol 11 (10) ◽

pp. 1234 ◽

Cited By ~ 1

Author(s):

Jo ◽

Kim

Keyword(s):

Deep Learning ◽

Face Recognition ◽

Near Infrared ◽

Data Augmentation ◽

Recognition Rate ◽

Learning Approaches ◽

Training Set ◽

Simple Method ◽

Practical Applications ◽

Face Images

Face recognition using a near-infrared (NIR) sensor is widely applied to practical applications such as mobile unlocking or access control. However, unlike for RGB sensors, few deep learning approaches have been studied for NIR face recognition. We conducted comparative experiments for the application of deep learning to NIR face recognition. To accomplish this, we gathered five public databases and trained two deep learning architectures. In our experiments, we found that a simple architecture could have a competitive performance on the NIR face databases that are mostly composed of frontal face images. Furthermore, we propose a data augmentation method to train the architectures to improve recognition of users who wear glasses. With this augmented training set, the recognition rate for users who wear glasses increased by up to 16%. This result implies that the difficulty of recognizing those who wear glasses can be overcome using this simple method without constructing an additional training set. Furthermore, the model that uses augmented data has symmetry with those trained with real glasses-wearing data regarding the recognition of people who wear glasses.


Feature-Weighted Sampling for Proper Evaluation of Classification Models

Applied Sciences ◽

10.3390/app11052039 ◽

2021 ◽

Vol 11 (5) ◽

pp. 2039

Author(s):

Hyunseok Shin ◽

Sejong Oh

Keyword(s):

Random Sampling ◽

Sampling Method ◽

Classification Model ◽

Training Set ◽

Test Set ◽

Feature Importance ◽

Proper Training ◽

Machine Learning Applications ◽

Test Sets ◽

The Given

In machine learning applications, classification schemes have been widely used for prediction tasks. Typically, to develop a prediction model, the given dataset is divided into training and test sets; the training set is used to build the model and the test set is used to evaluate the model. Furthermore, random sampling is traditionally used to divide datasets. The problem, however, is that the measured performance of the model depends on how the training and test sets are divided. Therefore, in this study, we proposed an improved sampling method for the accurate evaluation of a classification model. We first generated numerous candidate cases of train/test sets using the R-value-based sampling method. We evaluated the similarity of distributions of the candidate cases with the whole dataset, and the case with the smallest distribution difference was selected as the final train/test set. Histograms and feature importance were used to evaluate the similarity of distributions. The proposed method produces more representative training and test sets than previous sampling methods, including random and non-random sampling.
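
The selection step described above (generate many candidate splits, keep the one whose distributions are closest to the whole dataset) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' method: candidates come from plain random shuffles rather than R-value-based sampling, the distance is an unweighted sum of per-feature histogram differences with no feature-importance weighting, and all function names and parameters are hypothetical.

```python
import random


def histogram(values, lo, hi, bins):
    """Normalised histogram of values over [lo, hi] with equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # guard against a constant feature
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def distribution_gap(subset, whole, bins=5):
    """Sum over features of the L1 distance between subset and whole histograms."""
    gap = 0.0
    for j in range(len(whole[0])):
        col_all = [row[j] for row in whole]
        lo, hi = min(col_all), max(col_all)
        h_all = histogram(col_all, lo, hi, bins)
        h_sub = histogram([row[j] for row in subset], lo, hi, bins)
        gap += sum(abs(a - b) for a, b in zip(h_all, h_sub))
    return gap


def pick_split(rows, test_fraction=0.3, candidates=50, seed=0):
    """Return (gap, train, test) for the candidate closest to the whole dataset."""
    rng = random.Random(seed)
    n_test = max(1, int(round(len(rows) * test_fraction)))
    best = None
    for _ in range(candidates):
        shuffled = rows[:]
        rng.shuffle(shuffled)  # random candidate (stand-in for R-value sampling)
        test, train = shuffled[:n_test], shuffled[n_test:]
        gap = distribution_gap(train, rows) + distribution_gap(test, rows)
        if best is None or gap < best[0]:
            best = (gap, train, test)
    return best


# Synthetic two-feature dataset for illustration only.
rows = [(float(i), float(i % 7)) for i in range(40)]
gap, train, test = pick_split(rows)
```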


Weakly supervised deep learning for determining the prognostic value of 18F-FDG PET/CT in extranodal natural killer/T cell lymphoma, nasal type

European Journal of Nuclear Medicine and Molecular Imaging ◽

10.1007/s00259-021-05232-3 ◽

2021 ◽

Author(s):

Rui Guo ◽

Xiaobin Hu ◽

Haoming Song ◽

Pengpeng Xu ◽

Haoping Xu ◽

...

Keyword(s):

Deep Learning ◽

Fdg Pet ◽

Cell Lymphoma ◽

Training Set ◽

Test Set ◽

Natural Killer T Cell ◽

Pet Ct ◽

Weakly Supervised ◽

Fdg Pet Ct ◽

Killer T Cell

Purpose: To develop a weakly supervised deep learning (WSDL) method that could utilize incomplete/missing survival data to predict the prognosis of extranodal natural killer/T cell lymphoma, nasal type (ENKTL) based on pretreatment 18F-FDG PET/CT results. Methods: One hundred and sixty-seven patients with ENKTL who underwent pretreatment 18F-FDG PET/CT were retrospectively collected. Eighty-four patients were followed up for at least 2 years (training set = 64, test set = 20). A WSDL method was developed to enable the integration of the remaining 83 patients with incomplete/missing follow-up information in the training set. To test generalization, these data were derived from three types of scanners. Prediction similarity index (PSI) was derived from deep learning features of images. Its discriminative ability was calculated and compared with that of a conventional deep learning (CDL) method. Univariate and multivariate analyses helped explore the significance of PSI and clinical features. Results: PSI achieved area under the curve scores of 0.9858 and 0.9946 (training set) and 0.8750 and 0.7344 (test set) in the prediction of progression-free survival (PFS) with the WSDL and CDL methods, respectively. A PSI threshold of 1.0 could significantly differentiate the prognosis. In the test set, WSDL and CDL achieved prediction sensitivity, specificity, and accuracy of 87.50% and 62.50%, 83.33% and 83.33%, and 85.00% and 75.00%, respectively. Multivariate analysis confirmed PSI to be an independent significant predictor of PFS in both methods. Conclusion: The WSDL-based framework was more effective for extracting 18F-FDG PET/CT features and predicting the prognosis of ENKTL than the CDL method.
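
The test-set percentages are consistent with simple fractions over 20 patients. As a worked check, the sketch below assumes the test set held 8 patients with the event and 12 without; these counts are inferred from the reported percentages and are not stated in the abstract, and the function name is hypothetical.

```python
def metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Inferred counts (assumption): 8 positive and 12 negative test patients.
wsdl = metrics(tp=7, fn=1, tn=10, fp=2)  # 7/8, 10/12, 17/20
cdl = metrics(tp=5, fn=3, tn=10, fp=2)   # 5/8, 10/12, 15/20
```

Under this assumption the two methods differ only in sensitivity: WSDL identifies two more of the eight positive patients than CDL, which accounts for the full 10-point gap in accuracy.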


Oversampling Based on Data Augmentation in Convolutional Neural Network for Silicon Wafer Defect Classification

Knowledge Innovation Through Intelligent Software Methodologies, Tools and Techniques - Frontiers in Artificial Intelligence and Applications ◽

10.3233/faia200547 ◽

2020 ◽

Author(s):

Uzma Batool ◽

Mohd Ibrahim Shapiai ◽

Nordinah Ismail ◽

Hilman Fauzi ◽

Syahrizal Salleh

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Silicon Wafer ◽

Data Augmentation ◽

Imbalanced Data ◽

Training Data ◽

Defect Classification ◽

Learning Method ◽

Test Set

Silicon wafer defect data collected from fabrication facilities is intrinsically imbalanced because of the variable frequencies of defect types. Frequently occurring types will have more influence on the classification predictions if a model is trained on such skewed data. A fair classifier for such imbalanced data requires a mechanism to deal with type imbalance in order to avoid biased results. This study proposes a convolutional neural network for wafer map defect classification, employing oversampling to address the imbalance. To give all classes equal participation in the classifier's training, data augmentation is employed to generate more samples in the minority classes. The proposed deep learning method was evaluated on a real wafer map defect dataset, and its classification results on the test set returned a 97.91% accuracy. The results were compared with another deep learning based auto-encoder model, demonstrating that the proposed method is a potential approach for silicon wafer defect classification that needs to be investigated further for its robustness.
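
Oversampling by augmentation means topping up each minority class with transformed copies of its own samples until every class matches the largest one. The sketch below illustrates that idea on small 2D grids; the choice of flips and a 90-degree rotation as transforms, the class names, and the function names are assumptions, since the abstract does not list the augmentations used.

```python
import random


def flip_horizontal(grid):
    """Mirror each row of a 2D grid."""
    return [row[::-1] for row in grid]


def flip_vertical(grid):
    """Reverse the row order of a 2D grid."""
    return [row[:] for row in grid[::-1]]


def rotate_90(grid):
    """Rotate a 2D grid 90 degrees clockwise."""
    return [list(col) for col in zip(*grid[::-1])]


def oversample(samples_by_class, seed=0):
    """Augment minority classes until every class matches the largest one.

    Assumes every class has at least one sample.
    """
    rng = random.Random(seed)
    transforms = [flip_horizontal, flip_vertical, rotate_90]
    target = max(len(v) for v in samples_by_class.values())
    balanced = {}
    for label, samples in samples_by_class.items():
        out = list(samples)  # keep all original samples
        while len(out) < target:
            source = rng.choice(samples)
            out.append(rng.choice(transforms)(source))
        balanced[label] = out
    return balanced


# Hypothetical wafer maps: 1 marks a defective die, 0 a good die.
data = {
    "scratch": [[[1, 0], [0, 0]]],
    "ring": [[[1, 1], [1, 1]], [[0, 1], [1, 0]], [[1, 0], [0, 1]]],
}
balanced = oversample(data)
```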


Multiclass Classifier for P-Glycoprotein Substrates, Inhibitors, and Non-Active Compounds

Molecules ◽

10.3390/molecules24102006 ◽

2019 ◽

Vol 24 (10) ◽

pp. 2006 ◽

Cited By ~ 1

Author(s):

Liadys Mora Lagares ◽

Nikola Minovski ◽

Marjana Novič

Keyword(s):

In Silico ◽

Transmembrane Protein ◽

External Validation ◽

Assessment Process ◽

Classification Model ◽

Training Set ◽

Test Set ◽

Active Compounds ◽

P Glycoprotein ◽

Validation Set

P-glycoprotein (P-gp) is a transmembrane protein that actively transports a wide variety of chemically diverse compounds out of the cell. It is highly associated with the ADMET (absorption, distribution, metabolism, excretion and toxicity) properties of drugs/drug candidates and contributes to decreasing toxicity by eliminating compounds from cells, thereby preventing intracellular accumulation. Therefore, in the drug discovery and toxicological assessment process it is advisable to check whether a compound under development could be transported by P-gp. In this study, an in silico multiclass classification model capable of predicting the probability that a compound interacts with P-gp was developed using a counter-propagation artificial neural network (CP ANN) based on a set of 2D molecular descriptors and an extensive dataset of 2512 compounds (1178 P-gp inhibitors, 477 P-gp substrates and 857 P-gp non-active compounds). The model provided a good classification performance, producing non-error rate (NER) values of 0.93 for the training set and 0.85 for the test set, while the average precision (AvPr) was 0.93 for the training set and 0.87 for the test set. An external validation set of 385 compounds was used to challenge the model's performance. On the external validation set the NER and AvPr values were both 0.70. We believe that this in silico classifier could be effectively used as a reliable virtual screening tool for identifying potential P-gp ligands.
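
For a multiclass model, both indices can be computed from a confusion matrix. The sketch below assumes the usual chemometrics definitions, NER as the mean of per-class sensitivities and AvPr as the mean of per-class precisions; the abstract does not spell these out, and the function name and the example matrix are hypothetical.

```python
def ner_and_avpr(confusion):
    """Return (NER, AvPr) for a square confusion matrix.

    confusion[i][j] is the number of compounds of true class i
    that were predicted as class j.
    """
    k = len(confusion)
    sensitivities, precisions = [], []
    for i in range(k):
        row_total = sum(confusion[i])                      # true class i
        col_total = sum(confusion[r][i] for r in range(k)) # predicted class i
        correct = confusion[i][i]
        sensitivities.append(correct / row_total if row_total else 0.0)
        precisions.append(correct / col_total if col_total else 0.0)
    return sum(sensitivities) / k, sum(precisions) / k


# Made-up 3-class matrix (inhibitor, substrate, non-active), not study data.
confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 0, 10],
]
ner, avpr = ner_and_avpr(confusion)
```

Because NER averages over classes rather than over compounds, it is not inflated by the largest class, which matters for a dataset where inhibitors outnumber substrates by more than two to one.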


A personalized, web-based prognostic tool for resectable gastric cancer.

Journal of Clinical Oncology ◽

10.1200/jco.2017.35.15_suppl.e15575 ◽

2017 ◽

Vol 35 (15_suppl) ◽

pp. e15575-e15575

Author(s):

Brice Jabo ◽

John W. Morgan ◽

Mayada A. Aljehani ◽

Matthew J Selleck ◽

Albert Y. Lin

Keyword(s):

Gastric Cancer ◽

Health Care Providers ◽

Classification Model ◽

Adjuvant Chemoradiotherapy ◽

Care Providers ◽

Training Set ◽

Test Set ◽

Web Based ◽

Prognostic Tool ◽

Sensitivity Specificity

e15575 Background: Gastric cancer (GC) mortality remains high, with a 5-year survival of 30 percent. For patients with resectable GC, mortality varies depending on both patient and tumor characteristics. The current study sought to develop a web-based prognostic model to assist patients and health care providers in decision making regarding either surgery-only or adjuvant chemoradiotherapy (CRT). Methods: California SEER data were used, and records, including demographic, pathologic, and treatment information, were retrieved for 2,583 patients diagnosed with stage IB to III GC and treated with either surgery only or adjuvant CRT from 2006 to 2013. Purposeful selection using a Cox regression model was used to identify important mortality predictors. Additionally, with simple random sampling, 70% of the data were assigned to the training set and the remaining 30% were assigned to the test set. Furthermore, a generalized boosted classification model was trained using the training set and validated using the test set. Area under the curve (AUC) of the receiver operating characteristic (ROC), sensitivity, specificity and accuracy were determined for 5- and 10-year mortality. Results: The median survival was 33 months for patients in the training set, and 32 for the test set. Predictors included in the model were age, ethnicity (Asian/other, Hispanic, non-Hispanic black and non-Hispanic white), T-stage, histology (intestinal, diffuse and other), presence of signet ring (yes/no), proximal location (yes/no), lymph node ratio, and CRT following surgery (yes/no). Validation of the model on the test set showed AUC, sensitivity, specificity and accuracy of 0.78 (95% CI = 0.75, 0.82), 0.75, 0.65 and 0.70 for 5-year survival and 0.77 (95% CI = 0.74, 0.80), 0.79, 0.55 and 0.70 for 10-year survival.
Conclusions: The proposed web-based prognostic tool, using readily available patient and tumor characteristics, provides validated and personalized prognostic information to aid clinicians and patients in the GC adjuvant treatment decision process. [Table: see text]


Segmentation of Cerebral Small Vessel Diseases-White Matter Hyperintensities Based on a Deep Learning System

Frontiers in Medicine ◽

10.3389/fmed.2021.681183 ◽

2021 ◽

Vol 8 ◽

Author(s):

Wei Shan ◽

Yunyun Duan ◽

Yu Zheng ◽

Zhenzhou Wu ◽

Shang Wei Chan ◽

...

Keyword(s):

Deep Learning ◽

White Matter ◽

White Matter Hyperintensities ◽

Automatic Segmentation ◽

Learning System ◽

Test Set ◽

Minor Revision ◽

Lesion Level ◽

External Test ◽

Cerebral Small Vessel Diseases

Objective: Reliable quantification of white matter hyperintensities (WMHs) resulting from cerebral small vessel diseases (CSVD) is essential for understanding their clinical impact. We aim to develop and clinically validate a deep learning system for automatic segmentation of CSVD-WMH from fluid-attenuated inversion recovery (FLAIR) imaging using large multicenter data. Method: A FLAIR imaging dataset of 1,156 patients diagnosed with CSVD associated WMH (median age, 54 years; 653 males) obtained between September 2018 and September 2019 from Beijing Tiantan Hospital was retrospectively analyzed in this study. Locations of CSVD-WMH on the FLAIR scans were manually marked by two experienced neurologists. Using the manually labeled data of 996 patients (development set), a U-shaped novel 2D convolutional neural network (CNN) architecture was trained for automatic segmentation of CSVD-WMH. The segmentation performance of the network was evaluated with per pixel and lesion level Dice scores using an independent internal test set (n = 160) and a multi-center external test set (n = 90, three medical centers). The clinical suitability of the segmentation results, classified as acceptable, acceptable with minor revision, acceptable with major revision, and not acceptable, was analyzed by three independent neuroradiologists. The inter-neuroradiologist agreement rate was assessed by the Kendall-W test. Results: On the internal and external test sets, the proposed CNN architecture achieved per pixel and lesion level Dice scores of 0.72 (external test set), and they were significantly better than the state-of-the-art deep learning architectures proposed for WMH segmentation. In the clinical evaluation, neuroradiologists observed the segmentation results for 95% of the patients were acceptable or acceptable with a minor revision. Conclusions: A deep learning system can be used for automated, objective, and clinically meaningful segmentation of CSVD-WMH with high accuracy.
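
The per pixel Dice score compares a predicted mask with a manually labeled mask as twice the overlap divided by the total number of marked pixels. A minimal sketch follows, assuming binary masks stored as nested lists; the function name, the tiny example masks, and the convention of returning 1.0 when both masks are empty are assumptions rather than details from the study.

```python
def dice_score(pred, truth):
    """Per-pixel Dice between two binary masks given as nested lists of 0/1."""
    flat_pred = [p for row in pred for p in row]
    flat_true = [t for row in truth for t in row]
    if len(flat_pred) != len(flat_true):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(flat_pred, flat_true) if p and t)
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_true if t)
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement here
    return 2.0 * intersection / total


# Tiny illustrative masks, not study data.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
score = dice_score(pred, truth)  # 2 * 2 / (3 + 3)
```

A lesion level variant would apply the same ratio to counts of matched lesions instead of pixels; the matching rule is study-specific and is not shown here.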


Research on Classification of Fine-Grained Rock Images Based on Deep Learning

Computational Intelligence and Neuroscience ◽

10.1155/2021/5779740 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Yong Liang ◽

Qi Cui ◽

Xing Luo ◽

Zhisong Xie

Keyword(s):

Deep Learning ◽

Image Classification ◽

Data Augmentation ◽

Classification Performance ◽

Classification Model ◽

Image Block ◽

Rock Classification ◽

Original Algorithm ◽

Fine Grained ◽

Low Efficiency

Rock classification is a significant branch of geology which can help understand the formation and evolution of the planet, search for mineral resources, and so on. In traditional methods, rock classification is usually done based on the experience of a professional. However, this approach has problems such as low efficiency and susceptibility to subjective factors. Therefore, it is of great significance to establish a simple, fast, and accurate rock classification model. This paper proposes a fine-grained image classification network combining an image cutting method and the SBV algorithm to improve the classification performance on a small number of fine-grained rock samples. The method uses image cutting to achieve data augmentation without adding additional datasets and uses image block voting scoring to obtain richer complementary information, thereby improving the accuracy of image classification. The classification accuracy of 32 images is 75%, 68.75%, and 75%. The results show that the method proposed in this paper significantly improves the accuracy of image classification, which is 34.375%, 18.75%, and 43.75% higher than that of the original algorithm. This verifies the effectiveness of the algorithm in this paper and at the same time shows that deep learning has great application value in the field of geology.
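
The combination of image cutting and block voting can be illustrated with a small sketch: cut the image into blocks, classify each block, and let the blocks vote on the image label. The abstract does not describe the SBV scoring rule, so a plain majority vote is substituted here; the function names, the non-overlapping block layout, and the toy brightness classifier are all hypothetical.

```python
from collections import Counter


def cut_into_blocks(image, block_size):
    """Split a 2D image (nested lists) into non-overlapping square blocks."""
    blocks = []
    height, width = len(image), len(image[0])
    for top in range(0, height - block_size + 1, block_size):
        for left in range(0, width - block_size + 1, block_size):
            blocks.append([row[left:left + block_size]
                           for row in image[top:top + block_size]])
    return blocks


def vote(image, block_size, classify_block):
    """Label an image by majority vote over its blocks.

    Assumes at least one block fits inside the image.
    """
    votes = Counter(classify_block(b) for b in cut_into_blocks(image, block_size))
    return votes.most_common(1)[0][0]


def toy_classifier(block):
    """Stand-in for a trained network: thresholds mean pixel intensity."""
    pixels = [p for row in block for p in row]
    return "bright" if sum(pixels) / len(pixels) > 5 else "dark"


# 4x4 toy image: three bright blocks and one dark block.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 9, 9],
    [9, 9, 9, 9],
]
label = vote(image, 2, toy_classifier)
```

Each block also serves as an extra training sample, which is how cutting doubles as data augmentation without adding new images.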
