oposSOM-Browser: an interactive tool to explore omics data landscapes in health science

2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Henry Loeffler-Wirth ◽  
Jasmin Reikowski ◽  
Siras Hakobyan ◽  
Jonas Wagner ◽  
Hans Binder

Abstract Background: oposSOM is a comprehensive, machine-learning-based open-source data analysis software combining functionalities such as diversity analyses, biomarker selection, function mining, and visualization. Results: These functionalities are now available as an interactive web-browser application for a broader audience of users interested in extracting detailed information from high-throughput omics data sets pre-processed by oposSOM. The application enables interactive browsing of single-gene and gene set profiles, of molecular ‘portrait landscapes’, of associated phenotype diversity, and of signalling pathway activation patterns. Conclusion: The oposSOM-Browser makes available interactive data browsing for five transcriptome data sets of cancer (melanomas, B-cell lymphomas, gliomas) and of peripheral blood (sepsis and healthy individuals) at www.izbi.uni-leipzig.de/opossom-browser.

2016 ◽  
Vol 12 (2) ◽  
pp. 126-149 ◽  
Author(s):  
Masoud Mansoury ◽  
Mehdi Shajari

Purpose: This paper aims to improve recommendation performance for cold-start users and controversial items. Collaborative filtering (CF) generates recommendations on the basis of similarity between users: it uses the opinions of similar users to generate the recommendation for an active user. Because the similarity model, or neighbor selection function, is the key element for the effectiveness of CF, many variations of CF have been proposed. However, these methods are not very effective, especially for users who provide few ratings (i.e. cold-start users). Design/methodology/approach: A new user similarity model is proposed that focuses on improving recommendation performance for cold-start users and controversial items. To show the validity of their similarity model, the authors conducted experiments that showed its effectiveness in calculating similarity values between users even when only a few ratings are available. In addition, the authors applied their user similarity model to a recommender system and analyzed its results. Findings: Experiments on two real-world data sets are implemented and compared with some other CF techniques. The results show that the authors’ approach outperforms previous CF techniques on the coverage metric while preserving accuracy for cold-start users and controversial items. Originality/value: The proposed approach addresses the conditions in which CF is unable to generate accurate recommendations. These conditions affect CF performance adversely, especially in the cold-start users’ condition. The authors show that their similarity model overcomes CF weaknesses effectively and improves its performance even in the cold-start users’ condition.
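To make the baseline concrete, the following is a minimal sketch of plain user-based CF, assuming cosine similarity over co-rated items and a similarity-weighted average for prediction. It is not the similarity model proposed in the paper; the function names, the toy `ratings` dictionary, and the user names are all hypothetical. It illustrates the coverage problem the abstract describes: a cold-start user with no co-rated items receives no prediction.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity over the items both users rated (0.0 if none)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def predict(ratings, user, item):
    """Similarity-weighted average of the neighbours' ratings for `item`."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    # None means no usable neighbour was found (a coverage gap).
    return num / den if den else None

# Hypothetical toy data: "dave" is a cold-start user with no co-rated items.
ratings = {
    "alice": {"A": 5, "B": 3},
    "bob": {"A": 5, "B": 3, "C": 4},
    "carol": {"A": 1, "B": 5, "C": 2},
    "dave": {"D": 4},
}

alice_c = predict(ratings, "alice", "C")  # pulled towards bob's rating of 4
dave_c = predict(ratings, "dave", "C")    # no overlap with anyone: None
```

Here the prediction for "alice" lies between the two neighbours' ratings and closer to that of the more similar user, while "dave" gets no prediction at all, which is the kind of case a cold-start-aware similarity model is meant to handle.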


2021 ◽  
Author(s):  
Benbo Gao ◽  
Jing Zhu ◽  
Soumya Negi ◽  
Xinmin Zhang ◽  
Stefka Gyoneva ◽  
...  

Abstract Summary: We developed Quickomics, a feature-rich R Shiny-powered tool to enable biologists to fully explore complex omics data and perform advanced analysis in an easy-to-use interactive interface. It covers a broad range of secondary and tertiary analytical tasks after primary analysis of omics data is completed. Each functional module is equipped with customized configurations and generates both interactive and publication-ready high-resolution plots to uncover biological insights from data. The modular design makes the tool extensible with ease. Availability: Researchers can experience the functionalities with their own data or demo RNA-Seq and proteomics data sets by using the app hosted at http://quickomics.bxgenomics.com and following the tutorial, https://bit.ly/3rXIyhL. The source code under GPLv3 license is provided at https://github.com/interactivereport/[email protected], [email protected] Supplementary information: Supplementary materials are available at https://bit.ly/37HP17g.


2019 ◽  
Vol 488 (1) ◽  
pp. 1035-1065 ◽  
Author(s):  
Girish Kulkarni ◽  
Gábor Worseck ◽  
Joseph F Hennawi

ABSTRACT Determinations of the ultraviolet (UV) luminosity function of active galactic nuclei (AGN) at high redshifts are important for constraining the AGN contribution to reionization and understanding the growth of supermassive black holes. Recent inferences of the luminosity function suffer from inconsistencies arising from inhomogeneous selection and analysis of data. We address this problem by constructing a sample of more than 80 000 colour-selected AGN from redshift $z$ = 0 to 7.5 using multiple data sets homogenized to identical cosmologies, intrinsic AGN spectra, and magnitude systems. Using this sample, we derive the AGN UV luminosity function from redshift $z$ = 0 to 7.5. The luminosity function has a double power-law form at all redshifts. The break magnitude M* shows a steep brightening from M* ∼ −24 at $z$ = 0.7 to M* ∼ −29 at $z$ = 6. The faint-end slope β significantly steepens from −1.9 at $z$ < 2.2 to −2.4 at $z$ ≃ 6. In spite of this steepening, the contribution of AGN to the hydrogen photoionization rate at $z$ ∼ 6 is subdominant (<3 per cent), although it can be non-negligible (∼10 per cent) if these luminosity functions hold down to $M_{1450}$ = −18. Under reasonable assumptions, AGN can reionize He II by redshift $z$ = 2.9. At low redshifts ($z$ < 0.5), AGN can produce about half of the hydrogen photoionization rate inferred from the statistics of H I absorption lines in the intergalactic medium. Our analysis also reveals important systematic errors in the data, which need to be addressed and incorporated in the AGN selection function in future in order to improve our results. We make various fitting functions, codes, and data publicly available.
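As an illustration of the double power-law form mentioned above, the sketch below evaluates the parametrization commonly used for quasar luminosity functions in absolute magnitude, φ(M) = φ* / (10^{0.4(α+1)(M−M*)} + 10^{0.4(β+1)(M−M*)}). It is an assumption that this is the exact form fitted in the paper. The break magnitude and faint-end slope are the low-redshift values quoted in the abstract; the normalization `phi_star` and bright-end slope `alpha` are hypothetical placeholders, since the abstract does not give them.

```python
def double_power_law(M, phi_star, M_star, alpha, beta):
    """Number density per unit magnitude at absolute magnitude M.

    alpha is the bright-end slope and beta the faint-end slope;
    phi_star sets the normalization and M_star is the break magnitude.
    """
    dM = M - M_star
    term_alpha = 10 ** (0.4 * (alpha + 1) * dM)  # dominates for M brighter than M_star
    term_beta = 10 ** (0.4 * (beta + 1) * dM)    # dominates for M fainter than M_star
    return phi_star / (term_alpha + term_beta)

# M_star and beta follow the abstract's low-redshift values;
# phi_star and alpha are hypothetical, chosen only for illustration.
params = dict(phi_star=1e-7, M_star=-24.0, alpha=-4.0, beta=-1.9)

at_break = double_power_law(-24.0, **params)  # both terms equal 1 at the break
faint = double_power_law(-20.0, **params)
fainter_than_break = double_power_law(-22.0, **params)
```

At the break magnitude both terms in the denominator equal one, so the density is half of `phi_star`. A steeper (more negative) faint-end slope makes the density rise faster towards faint magnitudes, which is why the steepening of β matters for the integrated AGN contribution to photoionization.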


2020 ◽  
Vol 12 (2) ◽  
pp. 3906-3916 ◽  
Author(s):  
James F Fleming ◽  
Roberto Feuda ◽  
Nicholas W Roberts ◽  
Davide Pisani

Abstract Our ability to correctly reconstruct a phylogenetic tree is strongly affected by both systematic errors and the amount of phylogenetic signal in the data. Current approaches to tackle tree reconstruction artifacts, such as the use of parameter-rich models, do not translate readily to single-gene alignments. This, coupled with the limited amount of phylogenetic information contained in single-gene alignments, makes gene trees particularly difficult to reconstruct. Opsin phylogeny illustrates this problem clearly. Opsins are G-protein coupled receptors utilized in photoreceptive processes across Metazoa and their protein sequences are roughly 300 amino acids long. A number of incongruent opsin phylogenies have been published and opsin evolution remains poorly understood. Here, we present a novel approach, the canary sequence approach, to investigate and potentially circumvent errors in single-gene phylogenies. First, we demonstrate our approach using two well-understood cases of long-branch attraction in single-gene data sets, and simulations. After that, we apply our approach to a large collection of well-characterized opsins to clarify the relationships of the three main opsin subfamilies.


Metabolites ◽  
2019 ◽  
Vol 9 (4) ◽  
pp. 76 ◽  
Author(s):  
Farhana R. Pinu ◽  
David J. Beale ◽  
Amy M. Paten ◽  
Konstantinos Kouremenos ◽  
Sanjay Swarup ◽  
...  

The use of multiple omics techniques (i.e., genomics, transcriptomics, proteomics, and metabolomics) is becoming increasingly popular in all facets of life science. Omics techniques provide a more holistic molecular perspective of studied biological systems compared to traditional approaches. However, due to their inherent data differences, integrating multiple omics platforms remains an ongoing challenge for many researchers. As metabolites represent the downstream products of multiple interactions between genes, transcripts, and proteins, metabolomics, and the tools and approaches routinely used in this field, could assist with the integration of these complex multi-omics data sets. The question is, how? Here we provide some answers (in terms of methods, software tools and databases) along with a variety of recommendations and a list of continuing challenges as identified during a peer session on multi-omics integration that was held at the recent ‘Australian and New Zealand Metabolomics Conference’ (ANZMET 2018) in Auckland, New Zealand (Sept. 2018). We envisage that this document will serve as a guide to metabolomics researchers and other members of the community wishing to perform multi-omics studies. We also believe that these ideas may allow the full promise of integrated multi-omics research and, ultimately, of systems biology to be realized.


2006 ◽  
Vol 95 (4) ◽  
pp. 2199-2212 ◽  
Author(s):  
Matthew C. Tresch ◽  
Vincent C. K. Cheung ◽  
Andrea d'Avella

Several recent studies have used matrix factorization algorithms to assess the hypothesis that behaviors might be produced through the combination of a small number of muscle synergies. Although generally agreeing in their basic conclusions, these studies have used a range of different algorithms, making their interpretation and integration difficult. We therefore compared the performance of these different algorithms on both simulated and experimental data sets. We focused on the ability of these algorithms to identify the set of synergies underlying a data set. All data sets consisted of nonnegative values, reflecting the nonnegative data of muscle activation patterns. We found that the performance of principal component analysis (PCA) was generally lower than that of the other algorithms in identifying muscle synergies. Factor analysis (FA) with varimax rotation was better than PCA, and was generally at the same levels as independent component analysis (ICA) and nonnegative matrix factorization (NMF). ICA performed very well on data sets corrupted by constant variance Gaussian noise, but was impaired on data sets with signal-dependent noise and when synergy activation coefficients were correlated. NMF performed similarly to ICA and FA on data sets with signal-dependent noise and was generally robust across data sets. The best algorithms were ICA applied to the subspace defined by PCA (ICAPCA) and a version of probabilistic ICA with nonnegativity constraints (pICA). We also evaluated some commonly used criteria to identify the number of synergies underlying a data set, finding that only likelihood ratios based on factor analysis identified the correct number of synergies for data sets with signal-dependent noise in some cases. We then proposed an ad hoc procedure, finding that it was able to identify the correct number in a larger number of cases. Finally, we applied these methods to an experimentally obtained data set. The best performing algorithms (FA, ICA, NMF, ICAPCA, pICA) identified synergies very similar to one another. Based on these results, we discuss guidelines for using factorization algorithms to analyze muscle activation patterns. More generally, the ability of several algorithms to identify the correct muscle synergies and activation coefficients in simulated data, combined with their consistency when applied to physiological data sets, suggests that the muscle synergies found by a particular algorithm are not an artifact of that algorithm, but reflect basic aspects of the organization of muscle activation patterns underlying behaviors.
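For readers unfamiliar with NMF, the sketch below factorizes a small nonnegative "muscles × time" matrix into synergies `W` and activation coefficients `H`. It assumes the Lee and Seung multiplicative update rules for the squared-error objective, which may differ from the implementation used in the study. All names and the toy data are hypothetical, and the code uses pure Python lists rather than a numerical library so that it stays self-contained.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    """Sum of squared differences between V and the reconstruction W @ H."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh))

def nmf(V, rank, iterations=200, seed=0, eps=1e-9):
    """Multiplicative-update NMF: V (n x m) is approximated by W (n x rank) @ H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H

# Hypothetical toy data: 4 muscles driven by 2 synergies over 5 time samples.
W_true = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.2, 0.8]]
H_true = [[1.0, 0.0, 2.0, 1.0, 0.5], [0.0, 1.0, 1.0, 2.0, 0.5]]
V = matmul(W_true, H_true)

W, H = nmf(V, rank=2)
error_after = frobenius_error(V, W, H)
error_before = frobenius_error(V, *nmf(V, rank=2, iterations=0))
```

Because the updates only multiply by nonnegative ratios, `W` and `H` remain nonnegative throughout, which matches the nonnegativity of muscle activation data. The recovered synergies are unique only up to scaling and ordering of the columns of `W`, so they should be normalized and matched before being compared with `W_true`.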


Blood ◽  
2012 ◽  
Vol 120 (21) ◽  
pp. 419-419
Author(s):  
Jochen K Lennerz ◽  
Birgit Schif ◽  
Christian W Kohler ◽  
Stefan Bentink ◽  
Markus Kreuz ◽  
...  

Abstract 419 Suppressor of cytokine signaling 1 (SOCS1) is frequently mutated in Hodgkin, primary mediastinal and diffuse large B-cell lymphomas (DLBCL). In the primary mediastinal B-cell lymphoma line MedB-1, mutated SOCS1 abnormally stabilizes phospho-JAK2, thereby enhancing STAT signaling and leading to continuous proliferation. Here, we evaluated the prognostic value of SOCS1 mutations by full-length gene sequencing of SOCS1 in 154 comprehensively characterized DLBCL cases. By sequence analysis, we identified 90 SOCS1 mutations in 16% of lymphomas. We defined two distinct subtypes with respect to putative mutational consequences: those predicting the full-length (minor) and a truncated protein (major), respectively. Neither the SOCS1 mutation group nor the minor/major subgroups can be distinguished by clinical phenotype; however, assignment of four established expression-based classifiers revealed significant associations of SOCS1 major cases with germinal center- and specific pathway activation pattern signatures. Above all, SOCS1 major cases had an excellent overall survival, even better than the GCB-like subgroup (see Figure). SOCS1 minor cases had a dismal survival, even worse than the ABC gene signature group (see Figure). SOCS1 mutation subsets retain prognostic significance in uni- and multivariate analyses. Thus, if a SOCS1 mutation is present, the mutation type is an important single gene prognostic biomarker in DLBCL. Disclosures: No relevant conflicts of interest to declare.


PLoS ONE ◽  
2014 ◽  
Vol 9 (2) ◽  
pp. e89297 ◽  
Author(s):  
Alexander Kaever ◽  
Manuel Landesfeind ◽  
Kirstin Feussner ◽  
Burkhard Morgenstern ◽  
Ivo Feussner ◽  
...  

2006 ◽  
Vol 7 (3) ◽  
pp. 198-210 ◽  
Author(s):  
Andrew R. Joyce ◽  
Bernhard Ø. Palsson

F1000Research ◽  
2016 ◽  
Vol 5 ◽  
pp. 305 ◽  
Author(s):  
Alexandra K. Marr ◽  
Sabri Boughorbel ◽  
Scott Presnell ◽  
Charlie Quinn ◽  
Damien Chaussabel ◽  
...  

Compendia of large-scale datasets made available in public repositories provide a precious opportunity to discover new biomedical phenomena and to fill gaps in our current knowledge. In order to foster novel insights it is necessary to ensure that these data are made readily accessible to research investigators in an interpretable format. Here we make a curated, public collection of transcriptome datasets relevant to human placenta biology available for further analysis and interpretation via an interactive data browsing interface. We identified and retrieved a total of 24 datasets encompassing 759 transcriptome profiles associated with the development of the human placenta and associated pathologies from the NCBI Gene Expression Omnibus (GEO) and present them in a custom web-based application designed for interactive query and visualization of integrated large-scale datasets (http://placentalendocrinology.gxbsidra.org/dm3/landing.gsp). We also performed quality control checks using relevant biological markers. Multiple sample groupings and rank lists were subsequently created to facilitate data query and interpretation. Via this interface, users can create web-links to customized graphical views which may be inserted into manuscripts for further dissemination, or e-mailed to collaborators for discussion. The tool also enables users to browse a single gene across different projects, providing a mechanism for developing new perspectives on the role of a molecule of interest across multiple biological states. The dataset collection we created here is available at: http://placentalendocrinology.gxbsidra.org/dm3.

