Predicting Classifiers Efficacy in Relation with Data Complexity Metric Using Under-Sampling Techniques

2021 ◽  
pp. 85-92
Author(s):  
Deepika Singh ◽  
Anju Saha ◽  
Anjana Gosain
Author(s):  
R.C. Newell ◽  
L.J. Seiderer ◽  
J.E. Robinson

The purpose of the present study was to assess the relationship between sediment composition and biological community structure in mixed sand and gravel deposits of the eastern English Channel. Although some species are clearly associated with particular sediment types, the results confirm the lack of correspondence between community composition of the benthos and particle size distribution in unconsolidated sand and gravel deposits. The results also suggest that the sample-to-sample variability commonly recorded in the species composition of macrofauna may reflect significant under-sampling by conventional grab sampling techniques. The implications of this for environmental monitoring and impact studies are discussed.


Author(s):  
Sebastian Kozerke ◽  
Redha Boubertakh ◽  
Marc Miquel

In cardiovascular magnetic resonance imaging, scan time is of critical importance, as many applications require breath-holding to suppress respiratory-related image artefacts. In this chapter, approaches to reduce scan time, while maintaining resolution, are described. Besides partial sampling of k-space, non-Cartesian k-space trajectories are introduced, followed by an overview of data under-sampling techniques as they are implemented on clinical magnetic resonance systems. Advantages and limitations of each of these methods are briefly described.


2020 ◽  
Vol 8 (1) ◽  
Author(s):  
Akhmad Rezki Purnajaya ◽  
Wisnu Ananta Kusuma ◽  
Medria Kusuma Dewi Hardhienata

The prediction of Compound-Protein Interactions (CPI) is an essential step in drug-target analysis, both for developing new drugs and for drug repositioning. One challenging issue in this field is that non-interacting compound-protein pairs commonly outnumber interacting pairs. This class imbalance causes bias, which may degrade CPI prediction. In addition, few studies on CPI prediction compare data sampling techniques for handling the class imbalance problem. To address this issue, we compare four data sampling techniques: Random Under-sampling (RUS), Combination of Over-Under-sampling (COUS), Synthetic Minority Over-sampling Technique (SMOTE), and Tomek Link (T-Link). Two benchmark CPI datasets, Nuclear Receptor and G-Protein Coupled Receptor (GPCR), are used to test these techniques. Area Under Curve (AUC) is applied to evaluate the CPI prediction performance of each technique. Results show that the AUC values for RUS, COUS, SMOTE, and T-Link are 0.75, 0.77, 0.85, and 0.79, respectively, on the Nuclear Receptor data and 0.70, 0.85, 0.91, and 0.72, respectively, on the GPCR data. SMOTE thus achieves the highest AUC on both datasets, indicating that it handles class imbalance in CPI prediction better than the other three techniques.
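As a rough illustration of two ideas named in this abstract, the sketch below shows Random Under-sampling (RUS) and a pairwise AUC computation in plain Python. It is not the authors' implementation: the function names `random_under_sample` and `auc`, the fixed seed, and the toy data are all assumptions made for illustration, and the code uses no CPI data.

```python
import random


def random_under_sample(X, y, seed=0):
    """Randomly drop rows from larger classes until every class has as
    many rows as the smallest class (hypothetical helper, not from the paper)."""
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    keep = []
    for idx in by_class.values():
        keep.extend(rng.sample(idx, n_min))
    keep.sort()  # preserve the original row order
    return [X[i] for i in keep], [y[i] for i in keep]


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive has the higher score; ties count as half a win."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy imbalanced data (assumed): 2 positive rows and 8 negative rows.
X = [[i] for i in range(10)]
y = [1, 1] + [0] * 8
X_bal, y_bal = random_under_sample(X, y)
```

After under-sampling, `y_bal` holds two rows of each class. In practice a classifier would be trained on the balanced data and `auc` applied to its scores on a held-out set; the pairwise form used here is quadratic in the number of samples and is meant only to make the definition concrete.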

