WMW-A: Rank-based two-sample independent test for small sample sizes through an auxiliary sample

2021 ◽  
Author(s):  
Yin Guo ◽  
Limin Li

Two-sample independent test methods are widely used in case-control studies to identify significant changes or differences, for example, to identify key pathogenic genes by comparing gene expression levels in normal and disease cells. However, because data collection and labelling are costly, many studies face the small sample problem, for which traditional two-sample test methods often lose power. We propose WMW-A, a novel rank-based nonparametric test for the small sample problem that introduces a three-sample statistic through an additional auxiliary sample. By combining the case, control and auxiliary samples, we construct a three-sample WMW-A statistic based on the gap between the average ranks of the case and control samples in the combined sample. Assuming that the auxiliary sample follows a mixture distribution of the case and control populations, we analyze the theoretical properties of the WMW-A statistic and approximate its theoretical power. Extensive simulation experiments and real applications on microarray gene expression data sets show that the WMW-A test can significantly improve test power for two-sample problems with small sample sizes, using either available unlabelled auxiliary data or generated auxiliary data.
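The rank-gap idea described in this abstract can be illustrated with a minimal sketch. It only pools the three samples, ranks them, and returns the difference between the mean ranks of the case and control samples. The function names `average_ranks` and `rank_gap` are hypothetical, and the paper's exact normalization, null distribution, and power approximation are not given in the abstract and are not reproduced here.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mid = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mid
        i = j + 1
    return ranks


def rank_gap(case: Sequence[float],
             control: Sequence[float],
             auxiliary: Sequence[float]) -> float:
    """Mean rank of case minus mean rank of control in the pooled sample.

    Assumption: this unnormalized gap stands in for the WMW-A statistic;
    the published statistic may be scaled or centred differently.
    """
    pooled = list(case) + list(control) + list(auxiliary)
    ranks = average_ranks(pooled)
    n_case, n_ctrl = len(case), len(control)
    mean_case = sum(ranks[:n_case]) / n_case
    mean_ctrl = sum(ranks[n_case:n_case + n_ctrl]) / n_ctrl
    return mean_case - mean_ctrl
```

The auxiliary observations never enter either mean directly; they only change the ranks that the case and control observations receive in the pooled sample.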

Author(s):  
Guy M. Goodwin ◽  
Michael Browning

Neuroimaging techniques have been used extensively to compare brain structure and function between patients with, or at risk of, depression and control subjects. The goal of this work has largely been to identify pathophysiological processes in depression. However, progress in this field has been limited by the heterogeneity of patient populations, the use of small sample sizes, and an overreliance on case-control studies. These limitations have increasingly been acknowledged with recent work collecting much larger samples and employing a variety of study designs, including those able to stratify patient populations. This chapter reviews imaging studies in depression, highlighting both outstanding questions and promising recent findings.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
L. Mason ◽  
F. Shic ◽  
T. Falck-Ytter ◽  
B. Chakrabarti ◽  
T. Charman ◽  
...  

Abstract

Background: The neurocognitive mechanisms underlying autism spectrum disorder (ASD) remain unclear. Progress has been largely hampered by small sample sizes, variable age ranges and resulting inconsistent findings. There is a pressing need for large definitive studies to delineate the nature and extent of key case/control differences to direct research towards fruitful areas for future investigation. Here we focus on perception of biological motion, a promising index of social brain function which may be altered in ASD. In a large sample ranging from childhood to adulthood, we assess whether biological motion preference differs in ASD compared to neurotypical participants (NT), how differences are modulated by age and sex and whether they are associated with dimensional variation in concurrent or later symptomatology.

Methods: Eye-tracking data were collected from 486 6-to-30-year-old autistic (N = 282) and non-autistic control (N = 204) participants whilst they viewed 28 trials pairing biological (BM) and control (non-biological, CTRL) motion. Preference for the biological motion stimulus was calculated as (1) proportion looking time difference (BM-CTRL) and (2) peak look duration difference (BM-CTRL).

Results: The ASD group showed a present but weaker preference for biological motion than the NT group. The nature of the control stimulus modulated preference for biological motion in both groups. Biological motion preference did not vary with age, gender, or concurrent or prospective social communicative skill within the ASD group, although a lack of clear preference for either stimulus was associated with higher social-communicative symptoms at baseline.

Limitations: The paired visual preference we used may underestimate preference for a stimulus in younger and lower IQ individuals. Our ASD group had a lower average IQ by approximately seven points. 18% of our sample was not analysed for various technical and behavioural reasons.

Conclusions: Biological motion preference elicits small-to-medium-sized case–control effects, but individual differences do not strongly relate to core social autism associated symptomatology. We interpret this as an autistic difference (as opposed to a deficit) likely manifest in social brain regions. The extent to which this is an innate difference present from birth and central to the autistic phenotype, or the consequence of a life lived with ASD, is unclear.
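The two preference measures named in the Methods (proportion looking time difference and peak look duration difference, both BM minus CTRL) could be computed per trial roughly as follows. This is a sketch under stated assumptions: the abstract does not specify the denominator of the proportion, so total looking time to both stimuli is assumed here, and both function names are hypothetical.

```python
from typing import Optional, Sequence


def proportion_looking_difference(bm_ms: float, ctrl_ms: float) -> Optional[float]:
    """(BM - CTRL) looking time as a proportion of total looking time.

    Assumption: the denominator is time spent on either stimulus; the study
    may instead have used total trial duration.
    """
    total = bm_ms + ctrl_ms
    if total <= 0:
        return None  # no valid looking data on this trial
    return (bm_ms - ctrl_ms) / total


def peak_look_difference(bm_looks_ms: Sequence[float],
                         ctrl_looks_ms: Sequence[float]) -> float:
    """Longest single look to BM minus longest single look to CTRL (in ms)."""
    peak_bm = max(bm_looks_ms, default=0.0)
    peak_ctrl = max(ctrl_looks_ms, default=0.0)
    return peak_bm - peak_ctrl
```

Under this convention a positive value on either measure indicates a preference for the biological motion stimulus, and a value near zero indicates no clear preference.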


2021 ◽  
Author(s):  
Metin Bulus

A recent systematic review of experimental studies conducted in Turkey between 2010 and 2020 reported that small sample sizes had been a significant drawback (Bulus and Koyuncu, 2021). A small portion of the studies were small-scale true experiments (subjects randomized into the treatment and control groups). The remaining studies consisted of quasi-experiments (subjects in treatment and control groups were matched on pretest or other covariates) and weak experiments (neither randomized nor matched, but with a control group). They had an average sample size below 70 across different domains and outcomes. These small sample sizes imply a strong (and perhaps erroneous) assumption about the minimum relevant effect size (MRES) of the intervention before an experiment is conducted; that is, a standardized intervention effect of Cohen’s d < 0.50 is not relevant to education policy or practice. Thus, an introduction to sample size determination for pretest-posttest simple experimental designs is warranted. This study describes the nuts and bolts of sample size determination, derives expressions for optimal design under differential cost per treatment and control unit, provides convenient tables to guide sample size decisions for MRES values in the range 0.20 ≤ Cohen’s d ≤ 0.50, and describes the relevant software along with illustrations.
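To make the link between the MRES and sample size concrete, the following sketch uses the standard normal-approximation formula for a two-group comparison with equal allocation, n per group ≈ 2(z_{1-α/2} + z_{power})² (1 - r²) / d², where r is the pretest-posttest correlation. This textbook approximation is an assumption chosen for illustration; it is not taken from the paper, whose own expressions, tables, and software may differ. The function name `n_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d: float,
                alpha: float = 0.05,
                power: float = 0.80,
                pretest_r: float = 0.0) -> int:
    """Approximate subjects per group for a two-sided, two-group test.

    d         : minimum relevant effect size (Cohen's d)
    pretest_r : assumed pretest-posttest correlation; adjusting for the
                pretest shrinks the required n by the factor (1 - r^2)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_power) ** 2 * (1 - pretest_r ** 2) / d ** 2
    return ceil(n)


print(n_per_group(0.50))                 # 63
print(n_per_group(0.20))                 # 393
print(n_per_group(0.50, pretest_r=0.5))  # 48
```

Under this approximation, a total sample below 70 corresponds to roughly 35 subjects per group, which is short of the 63 per group needed to detect d = 0.50 with 80% power without a covariate.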


Author(s):  
Wei Ji ◽  
Xi Li ◽  
Yueting Zhuang ◽  
Omar El Farouk Bourahla ◽  
Yixin Ji ◽  
...  

Clothing segmentation is a challenging vision problem typically implemented within a fine-grained semantic segmentation framework. Unlike conventional segmentation, clothing segmentation has some domain-specific properties such as texture richness, diverse appearance variations, non-rigid geometry deformations, and small sample learning. To deal with these points, we propose a semantic locality-aware segmentation model, which adaptively pairs an original clothing image with a semantically similar (e.g., in appearance or pose) auxiliary exemplar found by search. By considering the interactions between the clothing image and its exemplar, more intrinsic knowledge about the locality manifold structures of clothing images is discovered, which makes the learning process for the small sample problem more stable and tractable. Furthermore, we present a CNN model based on deformable convolutions to extract non-rigid geometry-aware features for clothing images. Experimental results demonstrate the effectiveness of the proposed model against state-of-the-art approaches.


2020 ◽  
Author(s):  
Lin Cao ◽  
Xibao Huo ◽  
Yanan Guo ◽  
Yuying Shao ◽  
Kangning Du

Abstract: Face photo-sketch recognition refers to the process of matching sketches to photos. Recently, there has been growing interest in using convolutional neural networks to learn discriminative deep features. However, due to the large domain discrepancy and the high cost of acquiring sketches, the discriminative power of the deeply learned features is inevitably reduced. In this paper, we propose a discriminative center loss to learn domain-invariant features for face photo-sketch recognition. Specifically, two Mahalanobis distance matrices are proposed to enhance the intra-class compactness during inter-class separability. Moreover, a regularization technique is applied to the Mahalanobis matrices to alleviate the small sample problem. Extensive experimental results on the e-PRIP dataset verify the effectiveness of the proposed discriminative center loss.


2021 ◽  
Vol 13 (12) ◽  
pp. 2268
Author(s):  
Hang Gong ◽  
Qiuxia Li ◽  
Chunlai Li ◽  
Haishan Dai ◽  
Zhiping He ◽  
...  

Hyperspectral images are widely used for classification because of their rich spectral and spatial information. To handle the high dimensionality and strong nonlinearity of hyperspectral images, deep learning methods based on convolutional neural networks (CNNs) are widely used in hyperspectral classification applications. However, most CNN structures are stacked vertically and use a single size of convolutional kernel or pooling layer, so they cannot fully mine the multiscale information in hyperspectral images. When such networks meet the practical challenge of a limited labeled hyperspectral image dataset (the “small sample problem”), their classification accuracy and generalization ability are limited. In this paper, to tackle the small sample problem, we apply semantic segmentation to pixel-level hyperspectral classification, because the two tasks are comparable. A lightweight, multiscale squeeze-and-excitation pyramid pooling network (MSPN) is proposed. It consists of a multiscale 3D CNN module, a squeeze-and-excitation module, and a pyramid pooling module with a 2D CNN. Such a hybrid 2D-3D-CNN MSPN framework can learn and fuse deeper hierarchical spatial–spectral features with fewer training samples. The proposed MSPN was tested on three publicly available hyperspectral classification datasets: Indian Pines, Salinas, and Pavia University. Using 5%, 0.5%, and 0.5% of the three datasets as training samples, the classification accuracies of the MSPN were 96.09%, 97%, and 96.56%, respectively. In addition, we selected a recent dataset with higher spatial resolution, WHU-Hi-LongKou, as a more challenging test case. Using only 0.1% of the samples for training, we achieved a classification accuracy of 97.31%, which is far superior to state-of-the-art hyperspectral classification methods.

