Open set task augmentation facilitates generalization of deep neural networks trained on small data sets

Author(s):  
Wadhah Zai El Amri ◽  
Felix Reinhart ◽  
Wolfram Schenck

Many application scenarios for image recognition require learning deep networks from small sample sizes on the order of a few hundred samples per class. In such settings, avoiding overfitting is critical. Common techniques to address overfitting are transfer learning, reduction of model complexity and artificial enrichment of the available data by, e.g., data augmentation. A key idea proposed in this paper is to incorporate additional samples into the training that do not belong to the classes of the target task. This can be accomplished by formulating the original classification task as an open set classification task. While the original closed set classification task is not altered at inference time, the recast as an open set classification task enables the inclusion of additional data during training. Hence, the original closed set classification task is augmented with an open set task during training. We therefore call the proposed approach open set task augmentation. In order to integrate additional task-unrelated samples into the training, we employ the entropic open set loss originally proposed for open set classification tasks and also show that similar results can be obtained with a modified sum of squared errors loss function. Learning with the proposed approach benefits from the integration of additional "unknown" samples, which are often available, e.g., from open data sets, and can then be easily integrated into the learning process. We show that this open set task augmentation can improve model performance even when these additional samples are rather few or far from the domain of the target task. The proposed approach is demonstrated on two exemplary scenarios based on subsets of the ImageNet and Food-101 data sets as well as with several network architectures and two loss functions. We further shed light on the impact of the entropic open set loss on the internal representations formed by the networks. Open set task augmentation is particularly valuable when no additional data from the target classes are available, a scenario often faced in practice.
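As an illustration of the loss named above, here is a minimal PyTorch sketch of the entropic open set loss, under the assumption that known-class samples carry labels in [0, C-1] and samples from the augmenting open set are marked with the label -1; the networks and data pipeline from the paper are not reproduced.

```python
# Minimal sketch of the entropic open set loss (assumed label convention:
# -1 marks "unknown" open-set samples; 0..C-1 are the target classes).
import torch
import torch.nn.functional as F

def entropic_open_set_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Cross-entropy on known classes; for unknown samples, push the
    softmax output towards the uniform distribution over all C classes."""
    known = targets >= 0
    log_probs = F.log_softmax(logits, dim=1)
    loss = logits.new_zeros(())
    if known.any():
        # Standard cross-entropy term for samples of the target task.
        loss = loss + F.nll_loss(log_probs[known], targets[known], reduction="sum")
    if (~known).any():
        # Unknown samples: minimise -(1/C) * sum_c log p_c(x), which is
        # smallest when the softmax output is uniform (maximum entropy).
        loss = loss - log_probs[~known].mean(dim=1).sum()
    return loss / targets.numel()
```

At inference time the classifier is used exactly as in the closed set task; the unknown-sample term only shapes training.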

2020 ◽  
Author(s):  
RUIMIN MA ◽  
Yamil J. Colon ◽  
Tengfei Luo

Metal-organic frameworks (MOFs) are a class of materials promising for gas adsorption due to their highly tunable nano-porous structures and host-guest interactions. While machine learning (ML) has been leveraged to aid the design or screening of MOFs for different purposes, the need for big data is not always met, limiting the applicability of ML models trained on small data sets. In this work, we introduce a transfer learning technique to improve the accuracy and applicability of ML models trained with small amounts of MOF adsorption data. This technique leverages potentially shareable knowledge from a source task to improve the models on target tasks. As a demonstration, a deep neural network (DNN) trained on H₂ adsorption data for 13,506 MOF structures at 100 bar and 243 K is used as the source task. When transferring knowledge from the source task to H₂ adsorption at 100 bar and 130 K (one target task), the predictive accuracy on the target task improved from 0.960 (direct training) to 0.991 (transfer learning). We also tested transfer learning across different gas species (i.e., from H₂ to CH₄), with the predictive accuracy of CH₄ adsorption improving from 0.935 (direct training) to 0.980 (transfer learning). Based on further analysis, transfer learning consistently helps on target tasks with low generalizability. However, when transferring knowledge from the source task to Xe/Kr adsorption, transfer learning does not improve the predictive accuracy, which is attributed to the lack of the common descriptors that are key to the underlying knowledge.
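A hedged PyTorch sketch of the weight-transfer setup this abstract describes: a DNN is fit on the large source task, then its weights initialise a model for the small target task, with early layers frozen. The layer sizes, learning rate, the helper `make_dnn`, and the descriptor dimension (7, a stand-in for MOF structural descriptors) are all illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

def make_dnn(n_features: int) -> nn.Sequential:
    # Small fully connected regressor; sizes are placeholder choices.
    return nn.Sequential(
        nn.Linear(n_features, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, 1),            # predicted adsorption uptake
    )

source_model = make_dnn(7)
# ... train source_model on the ~13,500-structure source data set ...

target_model = make_dnn(7)
target_model.load_state_dict(source_model.state_dict())   # transfer weights
for layer in list(target_model.children())[:2]:
    for p in layer.parameters():
        p.requires_grad = False      # freeze the first hidden layer

# Fine-tune only the remaining layers on the small target data set.
optimizer = torch.optim.Adam(
    (p for p in target_model.parameters() if p.requires_grad), lr=1e-4)
```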


2008 ◽  
Vol 08 (04) ◽  
pp. 495-512 ◽  
Author(s):  
PIETRO COLI ◽  
GIAN LUCA MARCIALIS ◽  
FABIO ROLI

The automatic vitality detection of a fingerprint has become an important issue in personal verification systems based on this biometric. It has been shown that fake fingerprints made of materials like gelatine or silicone can deceive commonly used sensors. Recently, the extraction of vitality features from fingerprint images has been proposed to address this problem. Among others, static and dynamic features have so far been studied separately, so their respective merits are not yet clear, especially because reported results were often obtained with different sensors and small data sets, which could have obscured relative merits due to potential small-sample-size issues. In this paper, we compare several static and dynamic features by experiments on a larger data set, using the same optical sensor for the extraction of both feature sets. We dealt with fingerprint stamps made of liquid silicone rubber. Reported results show the relative merits of static and dynamic features and the performance improvement achievable by using both feature sets together.
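A minimal scikit-learn sketch of the kind of comparison this abstract reports: static and dynamic vitality feature vectors are evaluated separately and combined with the same classifier. The feature extraction itself is sensor-specific and only stubbed here; the arrays, dimensionalities, and the SVM choice are placeholders.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200                                  # acquisitions (live + fake stamps)
X_static = rng.normal(size=(n, 12))      # e.g. ridge-width, texture measures
X_dynamic = rng.normal(size=(n, 8))      # e.g. skin distortion over time
y = rng.integers(0, 2, size=n)           # 1 = live, 0 = fake

for name, X in [("static", X_static),
                ("dynamic", X_dynamic),
                ("combined", np.hstack([X_static, X_dynamic]))]:
    acc = cross_val_score(SVC(kernel="rbf"), X, y, cv=5).mean()
    print(f"{name:8s} features: CV accuracy = {acc:.3f}")
```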


BJPsych Open ◽  
2021 ◽  
Vol 7 (S1) ◽  
pp. S80-S81
Author(s):  
Sarah Harvey ◽  
Joanna Bromley ◽  
Miles Edwards ◽  
Megan Hooper ◽  
Hannah McAndrew ◽  
...  

Aims An audit to assess the impact of an Integrated Psychological Medicine Service (IPMS) on healthcare utilisation pre- and post-intervention. We hypothesised that an IPMS approach would reduce healthcare utilisation. Background The IPMS focuses on integrating biopsychosocial assessments into physical healthcare pathways. It has developed in stages as opportunities presented in different specialities, leading to a heterogeneous, non-standardised service. The key aim is the involvement of mental health practitioners, psychologists and psychiatrists, in combination with the specialty MDT, in the care of complex patients with comorbidity or functional presentations. This audit is the first attempt to gather data across all involved specialities and complete a randomised deep dive into cases. Method A search of referrals into the IPMS from July 2019 to June 2020 returned 129 referrals, from which a 10% randomised sample of 13 patients was selected for analysis. Five patients had one year of data on either side of the IPMS intervention (8 patients with incomplete data sets were excluded). We analysed the duration and nature of the IPMS intervention; the number, duration and speciality of inpatient admissions and short stays; and outpatient attendances, non-attendances and patient cancellations. Psychosocial information was also gathered. One non-randomised patient was analysed as a comparative case illustration. Result Among the randomised patients: patient 78's utilisation remained static; patient 71 engaged with health psychology post-referral and reduced healthcare utilisation; patient 7 increased healthcare utilisation post-referral secondary to health complications; patient 54 did not attend and increased healthcare utilisation post-referral; patient 106 had increased healthcare utilisation post-referral owing to a new health condition. The randomised sample identified limitations of using healthcare utilisation as an outcome measure when contrasted with the non-randomised case (which significantly reduced healthcare utilisation post-referral). Conclusion Only correlation can be inferred from the data, owing to the sample size, limitations and confounding factors, e.g., psychosocial life events and acquired illness. Alternative outcome measures documented elsewhere (e.g., PHQ-9/GAD-7) were not reliably recorded across pathways. The results showed that single cases can demonstrate highly desirable effects of a biopsychosocial approach, but they can also skew data sets if results are pooled, owing to the small sample size and heterogeneous interventions. For some patients, an increase in healthcare utilisation was appropriate for an improved clinical outcome. This audit identified that healthcare utilisation as an outcome measure is a crude tool with significant limitations, and highlighted the need to agree tailored outcome measures based on the type of intervention to assess the impact of the IPMS.


2017 ◽  
Vol 156 (5) ◽  
pp. 783-793 ◽  
Author(s):  
Zachary Farhood ◽  
Shaun A. Nguyen ◽  
Stephen C. Miller ◽  
Meredith A. Holcomb ◽  
Ted A. Meyer ◽  
...  

Objective (1) To analyze reported speech perception outcomes in patients with inner ear malformations who undergo cochlear implantation, (2) to review the surgical complications and findings, and (3) to compare the 2 classification systems of Jackler and Sennaroglu. Data Sources PubMed, Scopus (including Embase), Medline, and CINAHL Plus. Review Methods Fifty-nine articles were included that contained speech perception and/or intraoperative data. Cases were differentiated depending on whether the Jackler or Sennaroglu malformation classification was used. A meta-analysis of proportions examined incidences of complete insertion, gusher, and facial nerve aberrancy. For speech perception data, weighted means and standard deviations were calculated for all malformations for short-, medium-, and long-term follow-up. Speech tests were grouped into 3 categories—closed-set words, open-set words, and open-set sentences—and then compared through a comparison-of-means t test. Results Complete insertion was seen in 81.8% of all inner ear malformations (95% CI: 72.6-89.5); gusher was reported in 39.1% of cases (95% CI: 30.3-48.2); and facial nerve anomalies were encountered in 34.4% (95% CI: 20.1-50.3). Significant improvements in average performance were seen for closed- and open-set tests across all malformation types at 12 months postoperatively. Conclusions Cochlear implantation outcomes are favorable for those with inner ear malformations from a surgical and speech outcome standpoint. Accurate classification of anatomic malformations, as well as standardization of postimplantation speech outcomes, is necessary to improve understanding of the impact of implantation in this difficult patient population.
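The comparison-of-means t test mentioned in the review methods can be run directly on pooled summary statistics. A minimal scipy sketch follows; the means, standard deviations, and sample sizes are hypothetical stand-ins, not the pooled values from the 59 included studies.

```python
from scipy.stats import ttest_ind_from_stats

# Hypothetical pooled open-set word scores (%) at baseline vs 12 months
# post-implantation; real values would come from the weighted meta-analysis.
t, p = ttest_ind_from_stats(mean1=18.0, std1=12.0, nobs1=140,
                            mean2=41.0, std2=16.0, nobs2=140)
print(f"t = {t:.2f}, p = {p:.4g}")
```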


2020 ◽  
Vol 67 (1) ◽  
pp. 83-100 ◽  
Author(s):  
Renze Zhou ◽  
Zhiguo Xing ◽  
Haidou Wang ◽  
Zhongyu Piao ◽  
Yanfei Huang ◽  
...  

Purpose With the development of deep learning-based analytical techniques, increased research has focused on fatigue data analysis methods based on deep learning, which are gaining in popularity. However, the application of deep neural networks in the materials science domain is mainly inhibited by data availability. This paper aims to overcome the difficulty of multifactor fatigue life prediction with small data sets. Design/methodology/approach A multiple neural network ensemble (MNNE) is used, and an MNNE with a general and flexible explicit function is developed to accurately quantify the complicated relationships hidden in multivariable data sets. Moreover, a variational autoencoder-based data generator is trained with small sample sets to expand the size of the training data set. In addition, a filtering rule based on the R2 score is proposed and applied in the training process of the MNNE; this approach has a beneficial effect on the prediction accuracy and generalization ability. Findings A comparative study involving the proposed method and traditional models was performed. The comparative experiment confirms that the use of hybrid data can improve the accuracy and generalization ability of the deep neural network and that the MNNE outperforms support vector machine, multilayer perceptron and deep neural network models in terms of goodness of fit and robustness in the small sample case. Practical implications The experimental results imply that the proposed algorithm is a sophisticated and promising multivariate method for predicting the contact fatigue life of a coating when data availability is limited. Originality/value A data generation model based on a variational autoencoder is used to compensate for the lack of data, and an MNNE method is proposed for the small-data case of fatigue life prediction.
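A hedged scikit-learn sketch of the R2-filtered ensemble idea: several small networks are trained on (possibly VAE-augmented) fatigue data, members scoring below an R2 threshold on held-out data are discarded, and the survivors' predictions are averaged. The VAE data generator is omitted; the helper names, member count, architecture, and threshold are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np
from sklearn.metrics import r2_score
from sklearn.neural_network import MLPRegressor

def train_mnne(X_tr, y_tr, X_val, y_val, n_members=10, r2_min=0.5):
    """Train an ensemble and keep only members passing the R2 filter."""
    members = []
    for seed in range(n_members):
        net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000,
                           random_state=seed)
        net.fit(X_tr, y_tr)
        if r2_score(y_val, net.predict(X_val)) >= r2_min:
            members.append(net)       # discard poorly fitting members
    return members

def predict_mnne(members, X):
    # Ensemble prediction: average over the surviving members.
    return np.mean([m.predict(X) for m in members], axis=0)
```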


2019 ◽  
Vol 52 (3) ◽  
pp. 397-423
Author(s):  
Luc Steinbuch ◽  
Thomas G. Orton ◽  
Dick J. Brus

Area-to-point kriging (ATPK) is a geostatistical method for creating high-resolution raster maps using data of the variable of interest with a much lower resolution. The data set of areal means is often considerably smaller (<50 observations) than data sets conventionally dealt with in geostatistical analyses. In contemporary ATPK methods, uncertainty in the variogram parameters is not accounted for in the prediction; this issue can be overcome by applying ATPK in a Bayesian framework. Commonly in Bayesian statistics, posterior distributions of model parameters and posterior predictive distributions are approximated by Markov chain Monte Carlo sampling from the posterior, which can be computationally expensive. Therefore, a partly analytical solution is implemented in this paper, in order to (i) explore the impact of the prior distribution on predictions and prediction variances, (ii) investigate whether certain aspects of uncertainty can be disregarded, simplifying the necessary computations, and (iii) test the impact of various model misspecifications. Several approaches using simulated data, aggregated real-world point data, and a case study on aggregated crop yields in Burkina Faso are compared. The prior distribution is found to have minimal impact on the disaggregated predictions. In most cases with known short-range behaviour, an approach that disregards uncertainty in the variogram distance parameter gives a reasonable assessment of prediction uncertainty. However, some severe effects of model misspecification in terms of overly conservative or optimistic prediction uncertainties are found, highlighting the importance of model choice or integration into ATPK.
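A minimal numpy sketch of the area-to-point prediction step underlying the method above, in simple kriging form with a known zero mean and an exponential covariance model. The Bayesian treatment of variogram-parameter uncertainty, which is the paper's contribution, is not modelled here; the sill and range values are assumptions.

```python
import numpy as np

def cov(h, sill=1.0, rng=50.0):
    # Exponential covariance model; sill and range are placeholder values.
    return sill * np.exp(-h / rng)

def atp_krige(block_pts, block_means, x0):
    """block_pts: list of (m_i, 2) arrays discretising each areal unit;
    block_means: observed areal means (length n); x0: prediction point (2,)."""
    n = len(block_pts)
    C = np.empty((n, n))                  # block-to-block covariances
    c0 = np.empty(n)                      # block-to-point covariances
    for i, pi in enumerate(block_pts):
        c0[i] = cov(np.linalg.norm(pi - x0, axis=1)).mean()
        for j, pj in enumerate(block_pts):
            d = np.linalg.norm(pi[:, None, :] - pj[None, :, :], axis=2)
            C[i, j] = cov(d).mean()       # average point-pair covariance
    lam = np.linalg.solve(C, c0)          # simple kriging weights
    pred = lam @ block_means              # disaggregated point prediction
    var = cov(0.0) - lam @ c0             # simple kriging prediction variance
    return pred, var
```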


2003 ◽  
Vol 96 (6) ◽  
pp. 1617-1625 ◽  
Author(s):  
Christian Nansen ◽  
James F. Campbell ◽  
Thomas W. Phillips ◽  
Michael A. Mullen

Aerospace ◽  
2021 ◽  
Vol 8 (2) ◽  
pp. 30
Author(s):  
Jonas Aust ◽  
Sam Shankland ◽  
Dirk Pons ◽  
Ramakrishnan Mukundan ◽  
Antonija Mitrovic

Background—In the field of aviation, maintenance and inspection of engines are vitally important in ensuring the safe functionality of fault-free aircraft. There is value in exploring automated defect detection systems that can assist in this process. Existing effort has mostly been directed at artificial intelligence, specifically neural networks. However, that approach is critically dependent on large datasets, which can be problematic to obtain. For more specialised cases where data are sparse, image processing techniques have potential, but this is poorly represented in the literature. Aim—This research sought to develop methods (a) to automatically detect defects on the edges of engine blades (nicks, dents and tears) and (b) to support the decision-making of the inspector by providing a recommended maintenance action based on the engine manual. Findings—For a small test sample of 60 blades, the combined system was able to detect and locate the defects with an accuracy of 83%. It quantified morphological features of defect size and location. False positive and false negative rates were 46% and 17%, respectively, based on ground truth. Originality—The work shows that image processing approaches have potential value as a method for detecting defects in small data sets. The work also identifies which viewing perspectives are more favourable for automated detection, namely those that are perpendicular to the blade surface.
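A hedged OpenCV sketch of one way the edge-based detection described above could look: the blade edge is extracted with Canny, a smooth reference curve is fitted to it, and local deviations beyond a tolerance are flagged as candidate nicks, dents, or tears. The Canny thresholds, the polynomial reference, and the tolerance are illustrative assumptions, not the authors' exact pipeline.

```python
import cv2
import numpy as np

def detect_edge_defects(gray: np.ndarray, tol_px: float = 3.0):
    edges = cv2.Canny(gray, 50, 150)          # binary edge map
    ys, xs = np.nonzero(edges)
    # One edge sample per column: take the topmost edge pixel.
    cols = np.unique(xs)
    profile = np.array([ys[xs == c].min() for c in cols], dtype=float)
    # Smooth reference curve for an undamaged edge (low-order polynomial).
    ref = np.polyval(np.polyfit(cols, profile, deg=3), cols)
    deviation = np.abs(profile - ref)
    return cols[deviation > tol_px]           # x-positions of candidate defects

# Usage sketch:
# defects = detect_edge_defects(cv2.imread("blade.png", cv2.IMREAD_GRAYSCALE))
```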


JMIR Diabetes ◽  
10.2196/10324 ◽  
2019 ◽  
Vol 4 (3) ◽  
pp. e10324 ◽  
Author(s):  
Sally Jane Burford ◽  
Sora Park ◽  
Paresh Dawda

Background As digital healthcare expands to include the use of mobile devices, there are opportunities to integrate these technologies into the self-management of chronic disease. Purpose-built apps for diabetes self-management are plentiful and vary in functionality; they offer the capability for individuals to record, manage, display, and interpret their own data. The optimal incorporation of mobile tablets into diabetes self-care is little explored in research, and guidelines for use are scant. Objective The purpose of this study was to examine individuals' use of mobile devices and apps in the self-management of type 2 diabetes to establish the potential and value of this ubiquitous technology for chronic healthcare. Methods In a 9-month intervention, 28 patients at a large multidisciplinary healthcare center were gifted internet-connected Apple iPads with preinstalled apps and given digital support to use them. They were invited to take up predefined activities, which included recording their own biometrics, monitoring their diet, and traditional online information seeking. Four online surveys captured the participants' perceptions and health outcomes throughout the study. This article reports on the qualitative analysis of the open-ended responses in all four surveys. Results Using apps, participants self-curated small data sets that included their blood glucose level, blood pressure, weight, and dietary intake. Dynamic visualizations of the data in the form of charts and diagrams were created using the apps, and participants were able to interpret the impact of their choices and behaviors from the diagrammatic form of their small personal data sets. Findings are presented in four themes: (1) recording personal data; (2) modelling and visualizing the data; (3) interpreting the data; and (4) empowering and improving health. Conclusions The modelling capability of apps using small personal data sets, collected and curated by individuals, and the resultant graphical information that can be displayed on tablet screens prove a valuable asset for diabetes self-care. Informed by their own data, individuals are well positioned to make changes in their daily lives that will improve their health.
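As a small illustration of the kind of self-curated chart the participants produced, here is a minimal pandas/matplotlib sketch of a personal blood-glucose log plotted over time. The values, column names, and target band are placeholders, not study data.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical two-week fasting-glucose log (mmol/L), self-recorded daily.
log = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=14, freq="D"),
    "fasting_glucose_mmol_L": [7.9, 7.6, 8.1, 7.4, 7.2, 7.5, 7.0,
                               6.9, 7.1, 6.8, 6.6, 6.9, 6.5, 6.4],
})
ax = log.plot(x="date", y="fasting_glucose_mmol_L", marker="o", legend=False)
ax.axhspan(4.0, 7.0, alpha=0.15)   # illustrative fasting target band
ax.set_ylabel("Fasting glucose (mmol/L)")
plt.tight_layout()
plt.show()
```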


2016 ◽  
Vol 35 (2) ◽  
pp. 173-190 ◽  
Author(s):  
S. Shahid Shaukat ◽  
Toqeer Ahmed Rao ◽  
Moazzam A. Khan

In this study, we used bootstrap simulation of a real data set to investigate the impact of sample size (N = 20, 30, 40 and 50) on the eigenvalues and eigenvectors resulting from principal component analysis (PCA). For each sample size, 100 bootstrap samples were drawn from an environmental data matrix of water quality variables (p = 22) from a small data set comprising 55 samples (stations from which water samples were collected). Because data sets in ecology and environmental sciences are invariably small, owing to the high cost of collecting and analysing samples, we restricted our study to relatively small sample sizes. We focused on comparing the first 6 eigenvectors and the first 10 eigenvalues. Data sets were compared using agglomerative cluster analysis with Ward's method, which does not require any stringent distributional assumptions.
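A minimal sketch of the bootstrap procedure described above: for each sample size N, bootstrap samples are drawn from the 55 x 22 data matrix and the leading PCA eigenvalues are recorded. The data matrix here is a random stand-in for the real water-quality data.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
X = rng.normal(size=(55, 22))            # placeholder for the 55 x 22 matrix

for N in (20, 30, 40, 50):
    eigs = []
    for _ in range(100):                 # 100 bootstrap samples per N
        Xb = X[rng.integers(0, len(X), size=N)]   # resample with replacement
        eigs.append(PCA().fit(Xb).explained_variance_[:10])
    eigs = np.array(eigs)
    print(f"N={N}: first eigenvalue {eigs[:, 0].mean():.2f} "
          f"+/- {eigs[:, 0].std():.2f} (bootstrap SD)")
```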

