Stereochemical nonrigidity of [Rh6(CO)15L] clusters in solution

Electronic supplementary information (ESI) available: the relationship between the rate of S-type exchange in [Rh6(CO)15(PR3)] and the pKa′ values of the phosphine ligand. See http://www.rsc.org/suppdata/dt/b1/b101962g/

Author(s):  
Elena V. Grachova ◽  
Brian T. Heaton ◽  
Jonathan A. Iggo ◽  
Ivan S. Podkorytov ◽  
Daniel J. Smawfield ◽  
...  
Author(s):  
Irzam Sarfraz ◽  
Muhammad Asif ◽  
Joshua D Campbell

Abstract
Motivation: R Experiment objects such as the SummarizedExperiment or SingleCellExperiment are data containers for storing one or more matrix-like assays along with associated row and column data. These objects have been used to facilitate the storage and analysis of high-throughput genomic data generated from technologies such as single-cell RNA sequencing. One common computational task in many genomics analysis workflows is to subset the data matrix before applying downstream analytical methods. For example, one may need to subset the columns of the assay matrix to exclude poor-quality samples or subset the rows of the matrix to select the most variable features. Traditionally, a second object is created that contains the desired subset of the assay from the original object. However, this approach is inefficient because it requires the creation of an additional object containing a copy of the original assay, and it leads to challenges with data provenance.
Results: To overcome these challenges, we developed an R package called ExperimentSubset, which is a data container that implements classes for efficient storage and streamlined retrieval of assays that have been subsetted by rows and/or columns. These classes inherently provide data provenance by maintaining the relationship between the subsetted and parent assays. We demonstrate the utility of this package on a single-cell RNA-seq dataset by storing and retrieving subsets at different stages of the analysis while maintaining a lower memory footprint. Overall, ExperimentSubset is a flexible container for the efficient management of subsets.
Availability and implementation: The ExperimentSubset package is available at Bioconductor: https://bioconductor.org/packages/ExperimentSubset/ and GitHub: https://github.com/campbio/ExperimentSubset.
Supplementary information: Supplementary data are available at Bioinformatics online.
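The idea of keeping one stored assay and recording each subset as indices into its parent can be sketched in a few lines. This is a minimal Python illustration of the concept only, not the ExperimentSubset API: the class names `Subset` and `SubsetContainer`, the method names, and the root name `"counts"` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Matrix = List[List[float]]
ROOT = "counts"  # hypothetical name for the single stored parent assay


@dataclass
class Subset:
    parent: str      # name of the assay or subset this one was derived from
    rows: List[int]  # row indices relative to the parent
    cols: List[int]  # column indices relative to the parent


@dataclass
class SubsetContainer:
    assay: Matrix  # the only copy of the data that is stored
    subsets: Dict[str, Subset] = field(default_factory=dict)

    def add_subset(self, name: str, rows: List[int], cols: List[int],
                   parent: str = ROOT) -> None:
        """Record a subset as indices into its parent; no data is copied."""
        if parent != ROOT and parent not in self.subsets:
            raise KeyError(f"unknown parent: {parent}")
        self.subsets[name] = Subset(parent, list(rows), list(cols))

    def _resolve(self, name: str):
        """Translate a subset's indices into indices of the root assay."""
        if name == ROOT:
            n_cols = len(self.assay[0]) if self.assay else 0
            return list(range(len(self.assay))), list(range(n_cols))
        s = self.subsets[name]
        parent_rows, parent_cols = self._resolve(s.parent)
        return ([parent_rows[i] for i in s.rows],
                [parent_cols[j] for j in s.cols])

    def get(self, name: str) -> Matrix:
        """Materialize the subset on demand from the root assay."""
        rows, cols = self._resolve(name)
        return [[self.assay[i][j] for j in cols] for i in rows]

    def provenance(self, name: str) -> List[str]:
        """Return the chain of names from this subset back to the root."""
        chain = [name]
        while name != ROOT:
            name = self.subsets[name].parent
            chain.append(name)
        return chain
```

In this sketch a subset of a subset stores only its own index lists, and `provenance` walks the parent links back to the root, which mirrors the parent-child relationship the abstract describes.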


2019 ◽  
Author(s):  
Yi Yang ◽  
Xingjie Shi ◽  
Yuling Jiao ◽  
Jian Huang ◽  
Min Chen ◽  
...  

Abstract
Motivation: Although genome-wide association studies (GWAS) have deepened our understanding of the genetic architecture of complex traits, the mechanistic links that underlie how genetic variants cause complex traits remain elusive. To advance our understanding of the underlying mechanistic links, various consortia have collected a vast volume of genomic data that enable us to investigate the role that genetic variants play in gene expression regulation. Recently, a collaborative mixed model (CoMM) [42] was proposed to jointly interrogate genome on complex traits by integrating both the GWAS dataset and the expression quantitative trait loci (eQTL) dataset. Although CoMM is a powerful approach that leverages regulatory information while accounting for the uncertainty in using an eQTL dataset, it requires individual-level GWAS data and cannot fully make use of widely available GWAS summary statistics. Therefore, statistically efficient methods that leverage transcriptome information using only summary statistics from GWAS data are required.
Results: In this study, we propose a novel probabilistic model, CoMM-S2, to examine the mechanistic role that genetic variants play, by using only GWAS summary statistics instead of individual-level GWAS data. Similar to CoMM, which uses individual-level GWAS data, CoMM-S2 combines two models: the first model examines the relationship between gene expression and genotype, while the second model examines the relationship between the phenotype and the predicted gene expression from the first model. Distinct from CoMM, CoMM-S2 requires only GWAS summary statistics. Using both simulation studies and real data analysis, we demonstrate that even though CoMM-S2 utilizes GWAS summary statistics, its performance is comparable to that of CoMM, which uses individual-level GWAS data.
Availability and implementation: The implementation of CoMM-S2 is included in the CoMM package that can be downloaded from https://github.com/gordonliu810822/CoMM.
Supplementary information: Supplementary data are available at Bioinformatics online.
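The two-model structure described in this abstract can be written schematically as below. The notation is an assumption chosen for illustration and is not taken from the paper: $y$ is the expression of one gene in the eQTL sample, $W_1$ and $W_2$ are genotype matrices for the eQTL and GWAS samples, $\gamma$ holds the variant effects on expression, $z$ is the phenotype, $\alpha$ is the effect of predicted expression on the phenotype, and $e_1$, $e_2$ are error terms.

```latex
\begin{aligned}
  y &= W_1 \gamma + e_1        && \text{(model 1: gene expression on genotype)} \\
  z &= \alpha\,(W_2 \gamma) + e_2 && \text{(model 2: phenotype on predicted expression)}
\end{aligned}
```

In this generic form, testing whether the gene is associated with the trait amounts to testing $\alpha = 0$. According to the abstract, CoMM-S2 works from GWAS summary statistics rather than individual-level $z$ and $W_2$; how the likelihood is rewritten to achieve that is not stated here.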


2018 ◽  
Author(s):  
Jiaan Dai ◽  
Fengchao Yu ◽  
Ning Li ◽  
Weichuan Yu

Abstract
Motivation: Analyzing tandem mass spectrometry data to recognize peptides in a sample is the fundamental task in computational proteomics. Traditional peptide identification algorithms perform well when identifying unmodified peptides. However, when peptides have post-translational modifications (PTMs), these methods cannot provide satisfactory results. Recently, Chick et al. (2015) and Yu et al. (2016) proposed the spectrum-based and tag-based open search methods, respectively, to identify peptides with PTMs. While the performance of these two methods is promising, the identification results vary greatly with the quality of the tandem mass spectra and the number of PTMs in the peptides. This motivates us to systematically study the relationship between the performance of open search methods and the quality parameters of tandem mass spectrum data, as well as the number of PTMs in peptides.
Results: Through large-scale simulations, we obtain the performance trend when simulated tandem mass spectra are of different quality. We propose an analytical model to describe the relationship between the probability of obtaining correct identifications and the spectrum quality as well as the number of PTMs. Based on the analytical model, we can quantitatively describe the necessary condition for applying open search methods effectively.
Availability: Source code for the simulation is available at http://bioinformatics.ust.hk/
Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
Author(s):  
Kamil Said Faisal

In this study, multi-temporal Landsat images obtained from the U.S. Geological Survey are used to monitor two landfill sites, the Trail Road landfill (Ottawa, Canada) and the Al-Jleeb landfill (Al-Farwanyah, Kuwait). The objectives are: 1) to study the land surface temperature (LST) of the two landfill sites; 2) to investigate the relationship between the LST and landfill gas in the Trail Road landfill; and 3) to detect suspicious dumping areas within the Al-Jleeb landfill. The LST of the landfill sites was found to be always higher than both the air temperature and the LST of the immediate surroundings. The correlation between the LST and the methane recorded in the Trail Road landfill is not clearly strong. Five suspicious locations were identified within the Al-Jleeb landfill by overlaying the highest LST contours. The study demonstrates the usefulness of remote sensing techniques, which can provide supplementary information for landfill monitoring.


Author(s):  
Grégoire Versmée ◽  
Laura Versmée ◽  
Mikaël Dusenne ◽  
Niloofar Jalali ◽  
Paul Avillach

Abstract
Summary: Based on the Genomic Data Sharing Policy issued in August 2007, the National Institutes of Health (NIH) has supported several repositories such as the database of Genotypes and Phenotypes (dbGaP). dbGaP is an online repository that provides access to large-scale genetic and phenotypic datasets from more than 1,000 studies. However, navigating the website and understanding the relationships between the studies are not easy tasks. Moreover, decrypting the files is a complex procedure. In this study we propose the dbgap2x R package, which covers a broad range of functions for searching dbGaP studies, exploring the characteristics of a study and easily decrypting the files from dbGaP.
Availability and implementation: dbgap2x is an R package with the code available at https://github.com/gversmee/dbgap2x. A containerized version including the package, a Jupyter server and an example notebook is available at https://hub.docker.com/r/gversmee/dbgap2x.
Supplementary information: Supplementary data are available at Bioinformatics online.


2018 ◽  
Author(s):  
Olivia L. Sabik ◽  
Charles R. Farber

Summary: Genome-wide association studies (GWASs) have identified thousands of loci associated with risk of various diseases; however, the genes responsible for the majority of loci have not been identified. One means of uncovering potential causal genes is the identification of expression quantitative trait loci (eQTL) that colocalize with disease loci. Statistical methods have been developed to assess the likelihood that two associations (e.g. disease locus and eQTL) share a common causal variant; however, visualization of the two loci is often a crucial step in determining if a locus is pleiotropic. While the current convention is to plot two associations side-by-side, it is difficult to compare across two x-axes, even if they are identical. Thus, we have developed the Regional Association ComparER (RACER) package, which creates “mirror plots”, in which the two associations are plotted on a shared x-axis. Mirror plots provide an effective tool for the visual exploration and presentation of the relationship between two genetic associations.
Availability and implementation: RACER is provided under the GNU General Public License version 3 (GPL-3.0). Source code is available at https://github.com/oliviasabik/
Supplementary information: Supplementary data are available online with the paper; see the Supplemental Data Manifest.
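One way to place two associations on a shared x-axis is to plot the first above the axis and the second reflected below it. The Python sketch below only computes the coordinates for such a layout and is not RACER's code or API: the function name `mirror_coordinates`, the input format of `(position, p_value)` pairs, and the choice to reflect the second association below the axis are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[int, float]


def mirror_coordinates(top: List[Point],
                       bottom: List[Point]) -> Tuple[List[Point], List[Point]]:
    """Map two association scans onto one shared genomic x-axis.

    Each input is a list of (position, p_value) pairs. The first scan is
    given y = -log10(p), so it rises above the axis; the second is given
    y = +log10(p), so it hangs below the axis as a mirror image.
    """
    for _, p in top + bottom:
        if not 0.0 < p <= 1.0:
            raise ValueError("p-values must lie in (0, 1]")
    upper = [(pos, -math.log10(p)) for pos, p in top]
    lower = [(pos, math.log10(p)) for pos, p in bottom]
    return upper, lower
```

Both returned lists share the same x coordinate system, so they can be passed to any scatter-plot routine and drawn in a single panel.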


2018 ◽  
Vol 16 (1) ◽  
pp. 53-68 ◽  
Author(s):  
Josh A. Hendrix ◽  
Travis A. Taniguchi ◽  
Kevin J. Strom ◽  
Kelle A. Barrick ◽  
Nicole J. Johnson

This study examines the relationship between police-community racial asymmetry and the use of surveillance technology by local law enforcement. The data come from a nationally representative survey of law enforcement agencies, with supplementary information provided by the Law Enforcement Management and Administrative Statistics Survey, the Census, and the Uniform Crime Reports. Results indicate that police departments that underrepresent African Americans in the community are more likely to use or plan to implement surveillance technology, controlling for a range of agency- and contextual-level factors. One potential explanation for these findings is that surveillance technology operates as a form of social control that is differentially applied to racial minorities to manage what is perceived to be a greater proclivity toward criminal behavior. The implications of these findings are discussed.


2020 ◽  
Vol 19 (1) ◽  
Author(s):  
Alexander Kremling ◽  
Jan Schildmann

Abstract
Background: Sedation in palliative care is frequently but controversially discussed. Heterogeneous definitions and conceptual confusion have been cited as contributing to two problems: 1) one relevant to empirical research, for example, inconsistent data about practice (the ‘data problem’), and 2) one relevant to an ethically legitimate characterisation of the practice (the ‘problem of ethical pre-emption’). However, little is known about how exactly definitions differ, how they cause confusion and how this can be overcome.
Method: Pre-explicative analyses: (A) systematic literature search for guidelines on sedation in palliative care and systematic decomposition of the definitions of the practice in these guidelines; (B) logical distinction of different ways through which the two problems reported might be caused by definitions; and (C) analysis of how the content of the definitions contributes to the problems reported in these different ways.
Results: 29 guidelines from 14 countries were identified. Definitions differ significantly in both structure and content. We identified three ways in which definitions can cause the ‘data problem’: 1) different definitions, 2) deviating implicit concepts, 3) disagreement about facts. We identified two ways in which definitions can cause the problem of ethical pre-emption: 1) explicit or 2) implicit normativity. Decomposition of definitions linked to the distinguished ways of causing the conceptual problems shows how exactly single parts of definitions can cause the problems identified.
Conclusion: Current challenges concerning empirical research on sedation in palliative care can be remedied partly by improved definitions in the future, if the content and structure of the definitions used are chosen systematically. In addition, future research should bear in mind that definitions serve distinct purposes. Regarding the ‘data problem’, definitions can be improved by adding supplementary information, checking for implicit understanding, and choosing definitional elements systematically. ‘Ethical pre-emption’, in contrast, is a pseudo-problem if definitions and the relationship between definitions and norms of good practice are understood correctly.


2019 ◽  
Vol 36 (7) ◽  
pp. 2009-2016 ◽  
Author(s):  
Yi Yang ◽  
Xingjie Shi ◽  
Yuling Jiao ◽  
Jian Huang ◽  
Min Chen ◽  
...  

Abstract
Motivation: Although genome-wide association studies (GWAS) have deepened our understanding of the genetic architecture of complex traits, the mechanistic links that underlie how genetic variants cause complex traits remain elusive. To advance our understanding of the underlying mechanistic links, various consortia have collected a vast volume of genomic data that enable us to investigate the role that genetic variants play in gene expression regulation. Recently, a collaborative mixed model (CoMM) was proposed to jointly interrogate genome on complex traits by integrating both the GWAS dataset and the expression quantitative trait loci (eQTL) dataset. Although CoMM is a powerful approach that leverages regulatory information while accounting for the uncertainty in using an eQTL dataset, it requires individual-level GWAS data and cannot fully make use of widely available GWAS summary statistics. Therefore, statistically efficient methods that leverage transcriptome information using only summary statistics from GWAS data are required.
Results: In this study, we propose a novel probabilistic model, CoMM-S2, to examine the mechanistic role that genetic variants play, by using only GWAS summary statistics instead of individual-level GWAS data. Similar to CoMM, which uses individual-level GWAS data, CoMM-S2 combines two models: the first model examines the relationship between gene expression and genotype, while the second model examines the relationship between the phenotype and the predicted gene expression from the first model. Distinct from CoMM, CoMM-S2 requires only GWAS summary statistics. Using both simulation studies and real data analysis, we demonstrate that even though CoMM-S2 utilizes GWAS summary statistics, its performance is comparable to that of CoMM, which uses individual-level GWAS data.
Availability and implementation: The implementation of CoMM-S2 is included in the CoMM package that can be downloaded from https://github.com/gordonliu810822/CoMM.
Supplementary information: Supplementary data are available at Bioinformatics online.

