data level
Recently Published Documents


TOTAL DOCUMENTS: 194 (FIVE YEARS 67)

H-INDEX: 14 (FIVE YEARS 4)

Author(s):  
Tiago Knorst ◽  
Julio Vicenzi ◽  
Michael G. Jordan ◽  
Jonathan H. de Almeida ◽  
Guilherme Korol ◽  
...  

2021 ◽  
Vol 10 (4) ◽  
pp. 235
Author(s):  
NI PUTU YULIKA TRISNA WIJAYANTI ◽  
EKA N. KENCANA ◽  
I WAYAN SUMARJAYA

Imbalanced data is a problem often encountered in real-world classification. With imbalanced data, misclassification tends to occur in the minority class. This can lead to errors in decision-making if the minority class carries important information and is the focus of the research. Generally, there are two approaches to dealing with imbalanced data: the data-level approach and the algorithm-level approach. The data-level approach has proven to be very effective in dealing with imbalanced data and is more flexible. Oversampling is one of the data-level approaches and generally gives better results than undersampling. SMOTE is the most popular oversampling method and is widely used across applications. In this study, we discuss the SMOTE method in more detail, along with its potential and its disadvantages. In general, the method is intended to avoid overfitting and improve classification performance in the minority class. However, it also causes overgeneralization, which tends to produce overlap between classes.
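
The interpolation step at the core of SMOTE can be illustrated with a minimal sketch. This follows the method as it is commonly described (pick a minority sample, pick one of its k nearest minority neighbors, and place a synthetic point at a random position on the segment between them); it is not the implementation used in the study. The function name `smote_sketch` and its parameters are hypothetical, and only NumPy is assumed.

```python
import numpy as np

def smote_sketch(X_min, n_synthetic, k=5, seed=0):
    """Generate synthetic minority samples by SMOTE-style interpolation.

    X_min: array of shape (n, d) holding minority-class samples only.
    n_synthetic: number of synthetic samples to generate.
    k: number of nearest minority neighbors considered per sample.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)  # cannot have more neighbors than other samples

    # Pairwise Euclidean distances between minority samples.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor

    # Indices of the k nearest minority neighbors of every sample.
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # Pick a random base sample and one of its neighbors for each new point.
    base = rng.integers(0, n, size=n_synthetic)
    chosen = neighbors[base, rng.integers(0, k, size=n_synthetic)]

    # Place each synthetic point at a random position on the segment.
    gap = rng.random((n_synthetic, 1))
    return X_min[base] + gap * (X_min[chosen] - X_min[base])
```

Because every synthetic point lies on a line segment between two existing minority samples, the new points are not exact duplicates, which is how the method aims to reduce overfitting. The same property also suggests where overgeneralization comes from: a segment can pass through a region occupied by the majority class, so the synthetic points may overlap with it.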


Author(s):  
Pengfei Zhang ◽  
Tianrui Li ◽  
Zhong Yuan ◽  
Chuan Luo ◽  
Guoqiang Wang ◽  
...  

2021 ◽  
Author(s):  
Yasamin Salimi ◽  
Daniel Domingo-Fernandez ◽  
Carlos Bobis-Alvarez ◽  
Martin Hofmann-Apitius ◽  
Colin Birkenbihl ◽  
...  

INTRODUCTION: Currently, AD cohort datasets are difficult to find and lack across-cohort interoperability, and the content of the shared datasets often only becomes clear to third-party researchers once data access has been granted. METHODS: We accessed and systematically investigated the content of 20 major AD cohort datasets at the data level. A medical professional and a data specialist manually curated and semantically harmonized the acquired datasets. We developed a platform that facilitates data exploration. RESULTS: We present ADataViewer, an interactive platform that facilitates the exploration of 20 cohort datasets with respect to longitudinal follow-up, demographics, ethnoracial diversity, measured modalities, and statistical properties of individual variables. Additionally, we publish a variable mapping catalog harmonizing 1,196 variables across the 20 cohorts. The platform is available at https://adata.scai.fraunhofer.de/. DISCUSSION: ADataViewer supports robust data-driven research by transparently displaying cohort dataset content and suggesting datasets suited for discovery and validation studies based on selected variables of interest.

