Hierarchical confounder discovery in the experiment–machine learning cycle

2021
Author(s):
Alex Rogozhnikov
Pavan Ramkumar
Saul Kato
Sean Escola

The promise of using machine learning (ML) to extract scientific insights from high-dimensional datasets is tempered by the frequent presence of confounding variables, and scientists must determine whether a model has extracted the desired information or has instead fallen prey to bias. Due both to features of many natural phenomena and to practical constraints of experimental design, complex bioscience datasets tend to be organized in nested hierarchies, which can obscure the origin of a confounding effect and defeat traditional methods of confounder amelioration. We propose a simple non-parametric statistical method called the Rank-to-Group (RTG) score that can identify hierarchical confounder effects in raw data and ML-derived data embeddings. We show that RTG scores correctly assign the effects of hierarchical confounders in cases where linear methods such as regression fail. In a large public biomedical image dataset, we discover unreported effects of experimental design. We then use RTG scores to discover cross-modal correlated variability in a complex multi-phenotypic biological dataset. This approach should be of general use in experiment–analysis cycles and for ensuring confounder robustness in ML models.
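The abstract does not say how the RTG score is computed. Purely as an illustration of the general idea of a non-parametric, rank-based grouping statistic, the Python sketch below ranks every other sample by Euclidean distance from each sample and measures how consistently same-group samples are ranked closer than different-group samples (a per-sample ROC AUC, averaged). The function name `rank_grouping_score`, the choice of Euclidean distance, and the AUC formulation are assumptions made for this sketch and should not be read as the authors' definition.

```python
import numpy as np


def rank_grouping_score(X, groups):
    """Hypothetical rank-based grouping statistic (NOT the published RTG score).

    For each sample, all other samples are ranked by Euclidean distance and
    the ROC AUC of "belongs to the same group" is computed from those ranks.
    Returns the mean AUC: about 0.5 means no grouping effect, 1.0 means
    same-group samples are always the nearest ones.
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    n = len(X)

    # Pairwise Euclidean distances between all samples.
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))

    aucs = []
    for i in range(n):
        others = np.delete(np.arange(n), i)
        same = groups[others] == groups[i]
        n_pos, n_neg = int(same.sum()), int((~same).sum())
        if n_pos == 0 or n_neg == 0:
            continue  # AUC is undefined without both classes

        # Rank 1 is the closest neighbour (ties broken by stable sort order).
        order = np.argsort(dist[i, others], kind="stable")
        ranks = np.empty(len(others))
        ranks[order] = np.arange(1, len(others) + 1)

        # Mann-Whitney U counts (same, different) pairs in which the
        # different-group sample is ranked closer; AUC is the complement.
        u = ranks[same].sum() - n_pos * (n_pos + 1) / 2
        aucs.append(1.0 - u / (n_pos * n_neg))

    return float(np.mean(aucs))
```

Under these assumptions, calling the function once per candidate variable (for example, hypothetical `batch` or `plate` labels) on the same embedding would give comparable scores, with higher values suggesting a stronger grouping effect for that variable.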

2020
Vol 17 (2)
pp. 531-545
Author(s):  
Cut Amalia Saffiera ◽  
Raini Hassan ◽  
Amelia Ritahani Ismail

A healthy lifestyle is a significant factor affecting the budget for medicine. According to psychological studies, the Big Five personality traits, especially neuroticism and conscientiousness, can predict healthy lifestyle profiles. Electrophysiological signals have been used to explore the nature of individual differences and personality that are related to perception. In this paper, we review studies that examine healthy lifestyle profiles (i.e., preventive and curative) using electroencephalography (EEG) and event-related potential (ERP) signals. Based on this literature review, we propose a general experimental model for building a suitable experimental design for implementing artificial intelligence techniques based on machine learning.


2017
Vol 81 (10)
pp. S361
Author(s):
Laura Hack
Tanja Jovanovic
Sierra Carter
Kerry Ressler
Alicia Smith

Science
2018
Vol 362 (6416)
pp. eaat8603
Author(s):
Kangway V. Chuang
Michael J. Keiser

Ahneman et al. (Reports, 13 April 2018) applied machine learning models to predict C–N cross-coupling reaction yields. The models use atomic, electronic, and vibrational descriptors as input features. However, the experimental design is insufficient to distinguish models trained on chemical features from those trained solely on random-valued features in retrospective and prospective test scenarios, thus failing classical controls in machine learning.
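The control described above can be sketched generically: fit the same model once on the real descriptors and once on random-valued features of the same shape, then compare held-out performance. The Python sketch below is a minimal illustration under stated assumptions; it uses ordinary least squares and a single random train/test split, and the names `random_feature_control`, `fit_predict_linear`, and `r2_score` are hypothetical helpers, not the models or splits used in either paper.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination on held-out data."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_predict_linear(X_train, y_train, X_test):
    """Ordinary least squares with an intercept (a stand-in for any model)."""
    A_train = np.column_stack([X_train, np.ones(len(X_train))])
    A_test = np.column_stack([X_test, np.ones(len(X_test))])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ coef


def random_feature_control(X, y, test_fraction=0.3, seed=0):
    """Compare held-out R^2 for real features vs. random-valued features.

    If the two scores are similar, the evaluation cannot distinguish a model
    that learned from the descriptors from one that did not.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    perm = rng.permutation(len(X))
    n_test = int(round(test_fraction * len(X)))
    test_idx, train_idx = perm[:n_test], perm[n_test:]

    # Control: features with the same shape but no relation to y.
    X_random = rng.normal(size=X.shape)

    scores = {}
    for name, feats in (("real", X), ("random", X_random)):
        pred = fit_predict_linear(feats[train_idx], y[train_idx], feats[test_idx])
        scores[name] = r2_score(y[test_idx], pred)
    return scores
```

On data where the target truly depends on the descriptors, `scores["real"]` should clearly exceed `scores["random"]`; comparable values would suggest that the design, rather than the features, is driving the apparent performance.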


2020
Author(s):
Renata Silva
Daniel Oliveira
Davi Pereira Santos
Lucio F.D. Santos
Rodrigo Erthal Wilson
...

Principal component analysis (PCA) is an efficient model for the optimization problem of finding d' axes of a subspace R^{d'} ⊆ R^d so that the mean squared distances from a given set R of points to the axes are minimal. Despite being steadily employed since 1901 in different scenarios, e.g., mechanics, PCA has become an important link in machine learning chained tasks, such as feature learning and AutoML designs. A frequent yet open issue that arises in supervised problems is how many PCA axes are required for the performance of machine learning constructs to be tuned. Accordingly, we investigate the behavior of six independent and uncoupled criteria for estimating the number of PCA axes, namely Scree-Plot %, Scree-Plot Gap, Kaiser-Guttman, Broken-Stick, p-Score, and 2D. In total, we evaluate the performance of those approaches on 20 high-dimensional datasets by using (i) four different classifiers, and (ii) a hypothesis test upon the reported F-Measures. Results indicate that the Broken-Stick and Scree-Plot % criteria consistently outperformed the competitors in supervised tasks, whereas the Kaiser-Guttman and Scree-Plot Gap estimators delivered poor performance in the same scenarios.
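Two of the named criteria have widely used textbook forms, sketched below in Python with NumPy. The sketch assumes the common definitions: Kaiser-Guttman keeps components whose eigenvalue exceeds the mean eigenvalue, and Broken-Stick keeps leading components while their share of variance exceeds the expected share b_k = (1/p) Σ_{i=k..p} 1/i. The paper may use different variants, and the function names here are hypothetical.

```python
import numpy as np


def pca_eigenvalues(X):
    """Eigenvalues of the sample covariance matrix, largest first."""
    X = np.asarray(X, dtype=float)
    cov = np.cov(X - X.mean(axis=0), rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)[::-1]  # eigvalsh returns ascending order
    return np.clip(eigvals, 0.0, None)       # guard against tiny negative values


def kaiser_guttman(eigvals):
    """Number of components whose eigenvalue exceeds the mean eigenvalue."""
    eigvals = np.asarray(eigvals, dtype=float)
    return int((eigvals > eigvals.mean()).sum())


def broken_stick(eigvals):
    """Number of leading components whose variance share beats broken-stick."""
    eigvals = np.asarray(eigvals, dtype=float)
    p = len(eigvals)
    observed = eigvals / eigvals.sum()
    # Expected share of the k-th largest piece of a stick broken at random.
    expected = np.array(
        [np.sum(1.0 / np.arange(k, p + 1)) / p for k in range(1, p + 1)]
    )
    count = 0
    for obs, exp in zip(observed, expected):
        if obs <= exp:
            break  # stop at the first component that does not beat expectation
        count += 1
    return count
```

For example, with eigenvalues `[5.5, 3.5, 0.5, 0.5]` both criteria under these definitions retain two axes; on real data the two estimates can differ, which is the kind of disagreement the study evaluates.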


Author(s):  
Eduardo Romero ◽  
Fabio González

This chapter introduces the reader to the main topics covered by the book: biomedical images, biomedical image analysis and machine learning. The general concepts of each topic are presented and the most representative techniques are briefly discussed. In particular, the chapter focuses on the problem of image understanding (i.e., the problem of mapping the low-level visual content of an image to its high-level semantic meaning). The chapter discusses several important biomedical problems, such as computer-assisted diagnosis, biomedical image retrieval, image-user interaction and medical image navigation, which require solutions involving image understanding. Image understanding, thought of as the strategy for associating semantic meaning with the visual content of an image, is a difficult problem that opens up many research challenges. In the context of actual biomedical problems, it is probably an invaluable tool for increasing the knowledge that medical doctors currently extract from their day-to-day work. Finally, the chapter explores some general ideas that may guide future research in the field.


Author(s):  
Junbang Liang ◽  
Ming C. Lin

Digital try-on systems for e-commerce have the potential to change people's lives and provide notable economic benefits. However, their development is limited by practical constraints, such as accurate sizing of the body and realism of demonstrations. We enumerate three open challenges remaining for a complete and easy-to-use try-on system that recent advances in machine learning make increasingly tractable. For each, we describe the problem, introduce state-of-the-art approaches, and provide future directions.

