Multiple Datasets
Recently Published Documents





Cells ◽  
2022 ◽  
Vol 11 (2) ◽  
pp. 287
Khaled Bin Satter ◽  
Paul Minh Huy Tran ◽  
Lynn Kim Hoang Tran ◽  
Zach Ramsey ◽  
Katheine Pinkerton ◽  

Publicly available gene expression datasets were analyzed to develop a chromophobe- and oncocytoma-related gene signature (COGS) to distinguish chromophobe renal cell carcinoma (chRCC) from renal oncocytoma (RO). The datasets GSE11151, GSE19982, GSE2109, GSE8271 and GSE11024 were combined into a discovery dataset. Transcriptomic differences were identified in the discovery dataset by unsupervised learning with density-based UMAP (DBU), with 97.8% accuracy. The top 30 genes were identified by univariate gene expression analysis and ROC analysis to create a gene signature called COGS. COGS, combined with DBU, differentiated chRCC from RO in the discovery dataset with an accuracy of 97.8%. The classification accuracy of COGS was validated in an independent meta-dataset consisting of TCGA-KICH and GSE12090, where COGS differentiated chRCC from RO with 100% accuracy. The differentially expressed genes were involved in carbohydrate metabolism, transcriptomic regulation by TP53, beta-catenin-dependent Wnt signaling, and cytokine (IL-4 and IL-13) signaling highly active in cancer cells. Using multiple datasets and machine learning, we constructed and validated COGS as a tool that can differentiate chRCC from RO and complement histology in routine clinical practice.
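As an illustration of the ROC-based gene ranking mentioned above, the following minimal Python sketch scores each gene by its area under the ROC curve (AUC) for separating two classes and keeps the top-ranked genes. It is not the authors' pipeline: the function names `auc` and `rank_genes`, the input layout, and the toy expression values are all assumptions made for this example.

```python
def auc(pos, neg):
    """AUC as the probability that a random positive sample has a higher
    value than a random negative sample (ties count as half a win)."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_genes(expr, labels, top_k):
    """Rank genes by how well each one alone separates the two classes.

    expr   : dict mapping gene name -> list of expression values (one per sample)
    labels : list of 0/1 class labels, in the same sample order
    top_k  : number of genes to keep
    """
    scores = {}
    for gene, values in expr.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        a = auc(pos, neg)
        # A gene that is consistently lower in class 1 (AUC near 0) is as
        # informative as one that is consistently higher (AUC near 1).
        scores[gene] = max(a, 1.0 - a)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


if __name__ == "__main__":
    # Hypothetical toy data: three genes, six samples (three per class).
    toy_expr = {
        "G1": [1, 2, 3, 10, 11, 12],  # higher in class 1
        "G2": [5, 1, 6, 2, 7, 3],     # weakly informative
        "G3": [9, 8, 7, 1, 2, 3],     # lower in class 1
    }
    toy_labels = [0, 0, 0, 1, 1, 1]
    print(rank_genes(toy_expr, toy_labels, top_k=2))
```

In a real analysis the ranking would be computed on the discovery data only, so that the validation datasets remain independent of gene selection.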

2022 ◽  
Shijie Li ◽  
Guojie Wang ◽  
Chenxia Zhu ◽  
Jiao Lu ◽  
Waheed Ullah ◽  

Abstract. Actual evapotranspiration (ET) is an essential variable in the hydrological process, linking the carbon, water, and energy cycles. Global ET has changed significantly in the warming climate. Although increasing vapour pressure deficit (VPD) due to global warming enhances atmospheric water demand, it remains unclear how the dynamics of ET are affected. In this study, using multiple datasets, we disentangled the relative contributions of precipitation, net radiation, air temperature (T1), VPD, and wind speed to the linear trend in annual ET, using an advanced separation method that considers the Budyko framework. We find that precipitation variability dominantly controls global ET in dry climates, net radiation has substantial control over ET in tropical regions, and VPD affects ET trends in boreal mid-latitude climates. The critical role of VPD in controlling ET trends is particularly emphasized because of its influence on land-atmosphere interactions.

2022 ◽  
Caibin Sheng ◽  
Rui Lopes ◽  
Gang Li ◽  
Sven Schuierer ◽  
Annick Waldt ◽  

Droplet-based single-cell omics, including single-cell RNA sequencing (scRNAseq), single-cell CRISPR perturbations (e.g., CROP-seq) and single-cell protein and transcriptomic profiling (e.g., CITE-seq), hold great promise for comprehensive cell profiling and genetic screening at single-cell resolution. Yet these technologies suffer from substantial noise, of which ambient signals present in the cell suspension may be the predominant source. Current efforts to address this issue are highly specific to a particular technology, whereas a universal model of the noise across these technologies may reveal this common source and thereby improve denoising accuracy. To this end, we explicitly examined these unexpected signals and observed a predictable pattern in multiple datasets across different technologies. Based on this finding, we developed single-cell Ambient Remover (scAR), which uses probabilistic deep learning to deconvolute the observed signals into native and ambient components. scAR provides an efficient and universal solution to count denoising for multiple types of single-cell omics data, including single-cell CRISPR screens, CITE-seq and scRNAseq. It will facilitate the application of single-cell omics technologies.

2022 ◽  
Vol 8 (1) ◽  
Lulin Zhou ◽  
Zubiao Niu ◽  
Yuqi Wang ◽  
You Zheng ◽  
Yichao Zhu ◽  

Abstract. Senescence is believed to be a pivotal player in the onset and progression of tumors as well as in cancer therapy. However, the guiding roles of senescence in clinical outcomes and therapy selection for patients with cancer remain obscure, largely due to the absence of a feasible senescence signature. Here, by integrative analysis of single-cell and bulk transcriptome data from multiple datasets of gastric cancer patients, we uncovered senescence as a veiled tumor feature characterized by a senescence gene signature enriched, unexpectedly, in the noncancerous cells, and further identified two distinct senescence-associated subtypes by unsupervised clustering. Patients with the senescence subtype had higher tumor mutation loads and better prognosis than those with the aggressive subtype. Using machine learning, we constructed a scoring system termed senescore based on six signature genes: ADH1B, IL1A, SERPINE1, SPARC, EZH2, and TNFAIP2. A higher senescore robustly predicted longer overall and recurrence-free survival in 2290 gastric cancer samples, which was independently validated by multiplex staining analysis of gastric cancer samples on a tissue microarray. Remarkably, the senescore signature served as a reliable predictor of chemotherapeutic and immunotherapeutic efficacy, with high-senescore patients benefiting from immunotherapy and low-senescore patients responding to chemotherapy. Collectively, we report senescence as a heretofore unrecognized hallmark of gastric cancer that impacts patient outcomes and therapeutic efficacy.

Knowledge ◽  
2021 ◽  
Vol 2 (1) ◽  
pp. 1-25
Michalis Mountantonakis ◽  
Yannis Tzitzikas

There has been a sharp increase in approaches that take a text as input and perform named entity recognition (or extraction) to link the recognized entities of the given text to RDF Knowledge Bases (or datasets). In this way, it is feasible to retrieve more information for these entities, which can be of primary importance for several tasks, e.g., facilitating manual annotation, hyperlink creation, content enrichment, and improving data veracity. However, current approaches link the extracted entities to one or a few knowledge bases; it is therefore not feasible to retrieve the URIs and facts of each recognized entity from multiple datasets or to discover the most relevant datasets for one or more extracted entities. To enable this functionality, we introduce a research prototype, called LODsyndesisIE, which exploits three widely used Named Entity Recognition and Disambiguation tools (i.e., DBpedia Spotlight, WAT and Stanford CoreNLP) for recognizing the entities of a given text. Afterwards, it links these entities to the LODsyndesis knowledge base, which offers data enrichment and discovery services for millions of entities over hundreds of RDF datasets. We introduce all the steps of LODsyndesisIE, and we provide information on how to exploit its services through its online application and its REST API. For the evaluation, we use three collections of texts: (i) for comparing the effectiveness of combining different Named Entity Recognition tools, (ii) for measuring the gain in terms of enrichment by linking the extracted entities to LODsyndesis instead of using a single or a few RDF datasets and (iii) for evaluating the efficiency of LODsyndesisIE.

2021 ◽  
Megan E. B. Clowse ◽  
Amanda Eudy ◽  
Stephen Balevic ◽  
Gillian Sanders-Schmidler ◽  
Andrzej Kosinski ◽  

Objective: Multiple guidelines recommend continuing hydroxychloroquine (HCQ) for systemic lupus erythematosus (SLE, lupus) during pregnancy based on observational data. The goal of this individual patient data meta-analysis was to combine multiple datasets to compare pregnancy outcomes in women with lupus on and off HCQ. Methods: Eligible studies included prospectively collected pregnancies in women with lupus. After a manuscript search, seven datasets were obtained. Pregnancy outcomes and lupus activity were compared for pregnancies with a visit in the first trimester in women who did or did not take HCQ throughout pregnancy. Birth defects were not systematically collected. This analysis was conducted in each dataset and results were aggregated to provide a pooled odds ratio. Results: Seven cohorts provided 938 pregnancies in 804 women. After selecting one pregnancy per patient with a first-trimester visit, 668 pregnancies were included; 63% took HCQ throughout pregnancy. Compared to pregnancies without HCQ, those with HCQ had lower rates of highly active lupus, but did not have different rates of fetal loss, preterm birth, or preeclampsia. Among women with low lupus activity, HCQ reduced the risk for preterm delivery. Conclusion: This large study of prospectively collected lupus pregnancies demonstrates a decrease in SLE activity among women who continue HCQ through pregnancy and no harm to pregnancy outcomes. Like all studies of HCQ in lupus pregnancy, this study is confounded by indication and non-adherence. As this study confirms the safety of HCQ and diminished SLE activity with use, it is consistent with current recommendations to continue HCQ throughout pregnancy.
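The abstract above does not say which model was used to aggregate the per-dataset results. As one common possibility, the sketch below pools odds ratios with a fixed-effect, inverse-variance weighting of log odds ratios. The function names, the 2x2 table layout, and the counts in the usage example are hypothetical and are not taken from the study.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its standard error (Woolf method) for a 2x2 table.

    a: exposed, event       b: exposed, no event
    c: unexposed, event     d: unexposed, no event
    """
    # Continuity correction (add 0.5 to every cell) if any cell is zero,
    # so that the logarithm and the reciprocals below are defined.
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    return log_or, se


def pooled_odds_ratio(tables):
    """Fixed-effect inverse-variance pooled OR with a 95% confidence interval.

    tables: iterable of (a, b, c, d) tuples, one per dataset.
    Returns (pooled_or, ci_lower, ci_upper).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for table in tables:
        log_or, se = log_odds_ratio(*table)
        weight = 1.0 / se ** 2  # more precise datasets get more weight
        weighted_sum += weight * log_or
        weight_total += weight
    pooled_log_or = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    lower = math.exp(pooled_log_or - 1.96 * pooled_se)
    upper = math.exp(pooled_log_or + 1.96 * pooled_se)
    return math.exp(pooled_log_or), lower, upper


if __name__ == "__main__":
    # Hypothetical counts for three cohorts (not data from the study).
    cohorts = [(12, 88, 20, 80), (8, 52, 15, 45), (5, 45, 9, 41)]
    print(pooled_odds_ratio(cohorts))
```

A random-effects model would be a reasonable alternative when the cohorts differ substantially in design or population.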

2021 ◽  
Vol 33 (6) ◽  
pp. 1385-1397
Leyuan Sun ◽  
Rohan P. Singh ◽  
Fumio Kanehiro ◽  

Most simultaneous localization and mapping (SLAM) systems assume that SLAM is conducted in a static environment. When SLAM is used in dynamic environments, the accuracy of each part of the SLAM system is adversely affected. We term this problem dynamic SLAM. In this study, we propose solutions for three main problems in dynamic SLAM: camera tracking, three-dimensional map reconstruction, and loop closure detection. We propose to employ a geometry-based method, a deep learning-based method, and a combination of the two for object segmentation. Using the segmentation information to generate a mask, we filter out the keypoints that lead to errors in visual odometry, as well as the features extracted by the CNN from dynamic areas, to improve the performance of loop closure detection. We then validate our proposed loop closure detection method using the precision-recall curve and confirm the framework’s performance using multiple datasets. The absolute trajectory error and relative pose error are used as metrics to evaluate the accuracy of the proposed SLAM framework in comparison with state-of-the-art methods. The findings of this study can potentially improve the robustness of SLAM technology in situations where mobile robots work together with humans, while the object-based point cloud byproduct has potential for other robotics tasks.
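The two trajectory metrics named above can be illustrated with a simplified Python sketch. It makes two assumptions that the abstract does not state: the estimated trajectory is already expressed in the ground-truth frame (full evaluations usually align the trajectories first), and only positions are compared (the full relative pose error also accounts for orientation). The function names and sample trajectories are hypothetical.

```python
import math


def absolute_trajectory_error(est, gt):
    """RMSE of the position differences between matched trajectory points.

    est, gt: equal-length lists of (x, y, z) positions, assumed to be
    time-synchronised and expressed in the same coordinate frame.
    """
    squared = [sum((e - g) ** 2 for e, g in zip(p_est, p_gt))
               for p_est, p_gt in zip(est, gt)]
    return math.sqrt(sum(squared) / len(squared))


def relative_translation_error(est, gt, delta=1):
    """Translation-only simplification of the relative pose error.

    Compares the displacement over `delta` steps in the estimate with the
    displacement over the same steps in the ground truth, which measures
    local drift rather than global consistency.
    """
    squared = []
    for i in range(len(gt) - delta):
        d_est = [b - a for a, b in zip(est[i], est[i + delta])]
        d_gt = [b - a for a, b in zip(gt[i], gt[i + delta])]
        squared.append(sum((e - g) ** 2 for e, g in zip(d_est, d_gt)))
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical three-point trajectories.
    ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    estimate = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (2.0, 0.0, 0.0)]
    print(absolute_trajectory_error(estimate, ground_truth))
    print(relative_translation_error(estimate, ground_truth, delta=1))
```

The absolute error summarises global consistency of the whole trajectory, while the relative error is less sensitive to a single early mistake that shifts everything after it.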

2021 ◽  
Rogers F Silva ◽  
Eswar Damaraju ◽  
Xinhui Li ◽  
Peter Kochunov ◽  
Aysenil Belger ◽  

With the increasing availability of large-scale multimodal neuroimaging datasets, it is necessary to develop data fusion methods that can extract cross-modal features. A general framework, multidataset independent subspace analysis (MISA), has been developed to encompass multiple blind source separation approaches and identify linked cross-modal components in multiple datasets. In this work, we used the multimodal independent vector analysis model in MISA to directly identify meaningful linked features across three neuroimaging modalities (structural magnetic resonance imaging (MRI), resting-state functional MRI and diffusion MRI) in two large independent datasets, one comprising healthy subjects and the other including patients with schizophrenia. Results show several linked subject profiles (the sources/components) that capture age-associated reductions, schizophrenia-related biomarkers, sex effects, and cognitive performance.

PLoS ONE ◽  
2021 ◽  
Vol 16 (12) ◽  
pp. e0261375
Dana A. Glei ◽  
Maxine Weinstein

Using data from three national surveys of US adults (one cohort and two cross-sectional studies, covering the period from the mid-1990s to the mid-2010s), we quantify the degree to which disparities by socioeconomic status (SES) in self-reported pain and physical limitations widened and explore whether they widened more in midlife than in later life. Unlike most prior studies that use proxy measures of SES (e.g., education), we use a multidimensional measure of SES that enables us to evaluate changes over time in each outcome for fixed percentiles of the population, thereby avoiding the problem of lagged selection bias. Results across multiple datasets demonstrate that socioeconomic disparities in pain and physical limitations have consistently widened since the late 1990s and, if anything, widened even more in midlife than in late life (above 75). For those aged 50–74, the SES disparities in most outcomes widened by more than 50% and, in some cases, the SES gap more than doubled. In contrast, the magnitude of SES widening was much smaller above age 75 and, in the vast majority of cases, not significant. Pain prevalence increased at all levels of SES, but disadvantaged Americans suffered the largest increases. Physical function deteriorated for those with low SES, but there was little change and perhaps improvement among the most advantaged Americans. At the 10th percentile of SES, the predicted percentage with a physical limitation at age 50 increased by 6–10 points between the late 1990s and the 2010s, whereas at the 90th percentile of SES, there was no change in two surveys and, in the third survey, the corresponding percentage declined from 31% in 1996–99 to 22% in 2016–18. The worst-off Americans are being left behind in a sea of pain and physical infirmity, which may have dire consequences for their quality of life and for society as a whole (e.g., lost productivity, public costs).
