Dimensionality reduction by UMAP reinforces sample heterogeneity analysis in bulk transcriptomic data

Cell Reports ◽  
2021 ◽  
Vol 36 (4) ◽  
pp. 109442
Author(s):  
Yang Yang ◽  
Hongjian Sun ◽  
Yu Zhang ◽  
Tiefu Zhang ◽  
Jialei Gong ◽  
...  

Abstract: Transcriptome profiling and differential gene expression analysis constitute a ubiquitous tool in biomedical research and clinical application. Linear dimensionality reduction methods, especially principal component analysis (PCA), are widely used to detect sample-to-sample heterogeneity in bulk transcriptomic datasets so that appropriate analytic methods can be used to correct batch effects, remove outliers and distinguish subgroups. Non-linear dimensionality reduction methods were developed in response to the challenge of analysing transcriptomic datasets with large sample sizes, such as single-cell RNA-sequencing (scRNA-seq) data. t-distributed stochastic neighbour embedding (t-SNE) and uniform manifold approximation and projection (UMAP) have the advantage of preserving local information among samples, and they enable effective identification of heterogeneity and efficient organisation of clusters in scRNA-seq analysis. However, the utility of t-SNE and UMAP in bulk transcriptomic analysis has not been carefully examined. Therefore, we compared major dimensionality reduction methods (linear: PCA; nonlinear: multidimensional scaling (MDS), t-SNE, and UMAP) in analysing 71 bulk transcriptomic datasets with large sample sizes. UMAP was found to be superior in preserving sample-level neighbourhood information and maintaining clustering accuracy, thus conspicuously differentiating batch effects, identifying pre-defined biological groups and revealing in-depth clustering structures. We further verified that new clustering structures visualised by UMAP were associated with biological features and clinical meaning. Therefore, we recommend the adoption of UMAP for visualising and analysing sizable bulk transcriptomic datasets.
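The abstract's criterion of "preserving sample-level neighbourhood information" can be illustrated with a small sketch. The code below is an assumption-laden example, not the authors' pipeline: it uses synthetic data, a PCA computed by SVD, and a k-nearest-neighbour overlap score as one common way to quantify neighbourhood preservation (the paper's exact metric is not given here). UMAP itself is not run; any two-dimensional embedding, including one produced by a UMAP implementation, could be passed to the same hypothetical `neighbourhood_preservation` function.

```python
import numpy as np

def pca_embed(X, n_components=2):
    """Project samples (rows) onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def knn_indices(X, k):
    """Indices of the k nearest rows to each row (Euclidean, self excluded)."""
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbour
    return np.argsort(d2, axis=1)[:, :k]

def neighbourhood_preservation(X_high, X_low, k=10):
    """Mean fraction of each sample's k nearest neighbours kept in the embedding."""
    hi = knn_indices(X_high, k)
    lo = knn_indices(X_low, k)
    overlap = [len(set(a) & set(b)) / k for a, b in zip(hi, lo)]
    return float(np.mean(overlap))

# Hypothetical data: 3 groups of 40 samples with 200 "genes" each.
rng = np.random.default_rng(0)
centres = rng.normal(scale=5.0, size=(3, 200))
X = np.vstack([c + rng.normal(size=(40, 200)) for c in centres])

emb = pca_embed(X, n_components=2)
score = neighbourhood_preservation(X, emb, k=10)
print(f"kNN preservation of the 2-D PCA embedding: {score:.2f}")
```

A score of 1 means every sample keeps all of its k nearest neighbours after embedding; comparing the score across methods on the same dataset is the kind of comparison the abstract describes.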


Author(s):  
Htay Htay Win ◽  
Aye Thida Myint ◽  
Mi Cho Cho

For years, researchers have made their achievements and discoveries known through research papers published in appropriate journals or conferences. Established researchers, and new ones in particular, often face the predicament of choosing an appropriate conference for their work. Every scientific conference and journal is inclined towards a particular field of research, and there is an extensive group of them for any particular field. Choosing an appropriate venue is important, as it helps in reaching the right audience and improves one's chance of getting a paper published. In this work, we address the problem of recommending appropriate conferences to authors to increase their chances of acceptance. We present three different approaches to this problem, using the authors' social network and the content of the paper in the settings of dimensionality reduction and topic modelling. In all these approaches, we apply Correspondence Analysis (CA) to obtain appropriate relationships between the entities in question, such as conferences and papers. Our models show promising results when compared with existing methods such as content-based filtering, collaborative filtering and hybrid filtering.
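As a rough illustration of how Correspondence Analysis relates two sets of entities, the sketch below applies the standard CA computation (SVD of the standardised residuals of a contingency table) to a small hypothetical conference-by-topic count table. The table, its labels and the function name `correspondence_analysis` are assumptions made for the example; the abstract does not describe the authors' actual data or implementation.

```python
import numpy as np

def correspondence_analysis(N, n_components=2):
    """Standard CA: returns row and column principal coordinates and inertias."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                  # correspondence matrix
    r = P.sum(axis=1)                # row masses
    c = P.sum(axis=0)                # column masses
    # Standardised residuals from the independence model r c^T.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    rows = (U * sv) / np.sqrt(r)[:, None]      # row principal coordinates
    cols = (Vt.T * sv) / np.sqrt(c)[:, None]   # column principal coordinates
    return rows[:, :n_components], cols[:, :n_components], sv ** 2

# Hypothetical counts: rows are 4 conferences, columns are 3 topic terms.
table = np.array([
    [20,  5,  2],
    [18,  7,  3],
    [ 3, 15, 10],
    [ 2,  6, 22],
])
rows, cols, inertia = correspondence_analysis(table)
print("conference coordinates:\n", np.round(rows, 3))
print("topic coordinates:\n", np.round(cols, 3))
```

Conferences and topics that land close together in the shared low-dimensional space co-occur more often than independence would predict, which is the kind of relationship a recommender could exploit.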


2013 ◽  
Vol 38 (4) ◽  
pp. 465-470 ◽  
Author(s):  
Jingjie Yan ◽  
Xiaolan Wang ◽  
Weiyi Gu ◽  
LiLi Ma

Abstract: Speech emotion recognition is regarded as a meaningful but difficult problem across a number of domains, including sentiment analysis, computer science and pedagogy. In this study, we investigate in depth speech emotion recognition based on the sparse partial least squares regression (SPLSR) approach. We use SPLSR to perform feature selection and dimensionality reduction on the full set of acquired speech emotion features. With the SPLSR method, the components of redundant and uninformative speech emotion features are shrunk to zero, while useful and informative features are retained and passed to the subsequent classification step. A number of tests on the Berlin database show that the recognition rate of the SPLSR method can reach 79.23% and is superior to that of the other dimensionality reduction methods compared.
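The shrink-to-zero behaviour described above can be sketched with soft-thresholding, which is one common way to obtain a sparse PLS weight vector for a single response. This is a minimal sketch under assumptions: the data are synthetic, the parameter `eta` and the function names are hypothetical, and the abstract does not specify which SPLSR formulation the authors used.

```python
import numpy as np

def soft_threshold(v, lam):
    """Shrink each entry towards zero by lam; entries below lam become 0."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def sparse_pls_direction(X, y, eta=0.5):
    """First sparse PLS weight vector for a single response.

    The covariance vector X^T y is soft-thresholded at eta times its
    largest absolute entry (0 <= eta < 1), then scaled to unit length.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    z = Xc.T @ yc
    w = soft_threshold(z, eta * np.abs(z).max())
    norm = np.linalg.norm(w)
    return w / norm if norm > 0 else w

# Hypothetical data: 20 features, only the first two carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=500)

w = sparse_pls_direction(X, y, eta=0.4)
selected = np.flatnonzero(w)
component = (X - X.mean(axis=0)) @ w   # one-dimensional reduced feature
print("selected feature indices:", selected)
```

Features with a zero weight drop out of the component entirely, so a single step performs both feature selection and dimensionality reduction; the resulting component would then feed a classifier.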


2009 ◽  
Vol 19 (11) ◽  
pp. 2908-2920
Author(s):  
De-Yu MENG ◽  
Nan-Nan GU ◽  
Zong-Ben XU ◽  
Yee LEUNG

2018 ◽  
Vol 64 (1) ◽  
pp. 95-101
Author(s):  
Nazira Aldasheva ◽  
Vyacheslav Kipen ◽  
Zhaynagul Isakova ◽  
Sergey Melnov ◽  
Raisa Smolyakova ◽  
...  

Using the Multifactor Dimensionality Reduction method, we showed that the polymorphic variants p.Q399R (rs25487, XRCC1) and p.P72R (rs1042522, TP53) correlate with an increased risk of breast cancer in women from the Kyrgyz Republic and the Republic of Belarus. The study cohort included patients with clinically verified breast cancer: 117 women from the Kyrgyz Republic (nationality: Kyrgyz) and 169 from the Republic of Belarus (nationality: Belarusian). The comparison group (healthy individuals without a history of cancer pathology at the time of blood sampling) included 102 individuals from the Kyrgyz Republic and 185 from the Republic of Belarus. Genotyping of the polymorphic variants p.Q399R (rs25487, XRCC1) and p.P72R (rs1042522, TP53) was done by PCR-RFLP. Analysis of intergenic interactions was conducted with MDR 3.0.2 software. Both ethnic groups showed an increased breast cancer risk in the presence of the Gln allele of p.Q399R (XRCC1) in the heterozygous state: for the “Kyrgyz” group, OR=2,78 (95% CI=[1,60-4,82]), p=0,001; for the “Belarusians” group, OR=1,85 (95% CI=[1,21-2,82]), p=0,004. Carriers of the combination of the Gln (p.Q399R, XRCC1) and Pro (p.P72R, TP53) alleles showed a statistically significant increase in breast cancer risk both among patients from the Kyrgyz Republic (OR=2,89, 95% CI=[1,33-6,31]) and among patients from the Republic of Belarus (OR=3,01, 95% CI=[0,79-11,56]).
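For readers unfamiliar with how figures such as "OR (95% CI)" arise, the sketch below computes an odds ratio with a Woolf (log-based) confidence interval from a 2x2 case-control table. The counts are hypothetical and are not taken from the study, and the abstract does not state which interval method the authors used, so the Woolf interval is an assumption.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval for a 2x2 table.

    a: carriers among cases       b: non-carriers among cases
    c: carriers among controls    d: non-carriers among controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper

# Hypothetical counts, chosen only for illustration.
or_, lower, upper = odds_ratio_ci(a=40, b=60, c=25, d=75)
print(f"OR={or_:.2f} (95% CI=[{lower:.2f}-{upper:.2f}])")
```

Because the interval is symmetric on the log scale, the odds ratio is the geometric mean of the two bounds. An interval that contains 1 indicates that the data are also compatible with no association.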

