OPENING THE DOOR TO THE LARGE SCALE USE OF CLINICAL LAB MEASURES FOR ASSOCIATION TESTING: EXPLORING DIFFERENT METHODS FOR DEFINING PHENOTYPES

Normalisr: normalization and association testing for single-cell CRISPR screen and co-expression

10.1101/2021.04.12.439500 ◽

2021 ◽

Author(s):

Lingfei Wang

Keyword(s):

Single Cell ◽

Regulatory Networks ◽

Large Scale ◽

High Sensitivity ◽

Statistical Hypothesis ◽

P Value ◽

Statistical Hypothesis Testing ◽

Experimental Conditions ◽

Library Size ◽

Association Testing

Abstract:

Single-cell RNA sequencing (scRNA-seq) provides unprecedented technical and statistical potential to study gene regulation but is subject to technical variations and sparsity. Here we present Normalisr, a linear-model-based normalization and statistical hypothesis testing framework that unifies single-cell differential expression, co-expression, and CRISPR scRNA-seq screen analyses. By systematically detecting and removing nonlinear confounding from library size, Normalisr achieves high sensitivity, specificity, speed, and generalizability across multiple scRNA-seq protocols and experimental conditions with unbiased P-value estimation. We use Normalisr to reconstruct robust gene regulatory networks from trans-effects of gRNAs in large-scale CRISPRi scRNA-seq screens and gene-level co-expression networks from conventional scRNA-seq.


Distribution of the number of false discoveries in large-scale family-based association testing with application to the association between PTPN1 and hypertension and obesity

Human Genetics ◽

10.1007/s00439-010-0936-y ◽

2010 ◽

Vol 129 (4) ◽

pp. 425-432 ◽

Cited By ~ 1

Author(s):

Wen-Chang Wang ◽

Chao A. Hsiung ◽

Lan-Chao Wang ◽

Lee-Ming Chuang ◽

Thomas Quertermous ◽

...

Keyword(s):

Large Scale ◽

Association Testing ◽

Scale Family ◽

Family Based ◽

False Discoveries


Efficient estimation and applications of cross-validated genetic predictions

10.1101/517821 ◽

2019 ◽

Cited By ~ 2

Author(s):

Joel Mefford ◽

Danny Park ◽

Zhili Zheng ◽

Arthur Ko ◽

Mika Ala-Korpela ◽

...

Keyword(s):

Large Scale ◽

Mixed Model ◽

Efficient Estimation ◽

Risk Scores ◽

Computational Time ◽

Phenotypic Data ◽

Association Testing ◽

Genotyping Platform ◽

Genetic Predictors ◽

Novel Applications

Abstract:

Large-scale cohorts with combined genetic and phenotypic data, coupled with methodological advances, have produced increasingly accurate genetic predictors of complex human phenotypes called polygenic risk scores (PRS). In addition to the potential translational impacts of identifying at-risk individuals, PRS are being utilized for a growing list of scientific applications including causal inference, identifying pleiotropy and genetic correlation, and powerful gene-based and mixed model association tests. Existing PRS approaches rely on external large-scale genetic cohorts that have also measured the phenotype of interest. They further require matching on ancestry and genotyping platform or imputation quality. In this work we present a novel reference-free method to produce PRS that does not rely on an external cohort. We show that naive implementations of reference-free PRS either result in substantial over-fitting or prohibitive increases in computational time. We show that our algorithm avoids both of these issues, and can produce informative in-sample PRS over any existing cohort without over-fitting. We then demonstrate several novel applications of reference-free PRS including detection of pleiotropy across 246 metabolic traits and efficient mixed-model association testing.
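
The notion of an in-sample score that avoids over-fitting can be illustrated with a plain K-fold scheme: each individual is scored only with effect sizes estimated on the other folds. The Python sketch below is an assumption-laden toy, not the authors' algorithm. It refits the model once per fold, which is the straightforward approach rather than the efficient one the abstract describes. The function names (`fit_marginal`, `out_of_fold_scores`), the per-SNP marginal regression, and the simulated data are all hypothetical.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    return sab / math.sqrt(saa * sbb)


def fit_marginal(genotypes, phenotype):
    """Fit each SNP alone by least squares; return (per-SNP means, per-SNP slopes)."""
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    means, slopes = [], []
    for j in range(len(genotypes[0])):
        col = [row[j] for row in genotypes]
        mean_g = sum(col) / n
        sgg = sum((g - mean_g) ** 2 for g in col)
        sgy = sum((g - mean_g) * (y - mean_y) for g, y in zip(col, phenotype))
        means.append(mean_g)
        slopes.append(sgy / sgg if sgg > 0 else 0.0)
    return means, slopes


def out_of_fold_scores(genotypes, phenotype, n_folds=5):
    """Score every individual with effects estimated without that individual's fold."""
    n = len(phenotype)
    scores = [0.0] * n
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        means, slopes = fit_marginal(
            [genotypes[i] for i in train], [phenotype[i] for i in train]
        )
        # Held-out individuals are scored with training-fold means and slopes only.
        for i in range(fold, n, n_folds):
            scores[i] = sum(
                s * (g - m) for s, g, m in zip(slopes, genotypes[i], means)
            )
    return scores


# Simulated toy cohort (hypothetical): 20 SNPs, the first 5 are causal.
rng = random.Random(1)
n_people, n_snps = 300, 20
genotypes = [[rng.choice([0, 1, 2]) for _ in range(n_snps)] for _ in range(n_people)]
phenotype = [0.5 * sum(row[:5]) + rng.gauss(0, 1) for row in genotypes]

scores = out_of_fold_scores(genotypes, phenotype)
cv_r = pearson(scores, phenotype)  # correlation of held-out scores with phenotype
```

Because no individual's phenotype contributes to its own score, `cv_r` is not inflated by fitting and scoring on the same samples. The cost is one refit per fold, which is the kind of computational burden the abstract says a naive implementation incurs.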


Haplotype Structures and Large-Scale Association Testing of the 5' AMP-Activated Protein Kinase Genes PRKAA2, PRKAB1, and PRKAB2 With Type 2 Diabetes

Diabetes ◽

10.2337/diabetes.55.03.06.db05-1418 ◽

2006 ◽

Vol 55 (3) ◽

pp. 849-855 ◽

Cited By ~ 16

Author(s):

M. W. Sun ◽

J. Y. Lee ◽

P. I.W. de Bakker ◽

N. P. Burtt ◽

P. Almgren ◽

...

Keyword(s):

Type 2 Diabetes ◽

Protein Kinase ◽

Large Scale ◽

Association Testing ◽

Amp Activated Protein Kinase


Single-cell normalization and association testing unifying CRISPR screen and gene co-expression analyses with Normalisr

Nature Communications ◽

10.1038/s41467-021-26682-1 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Lingfei Wang

Keyword(s):

Single Cell ◽

Regulatory Networks ◽

Large Scale ◽

Linear Models ◽

P Value ◽

Statistical Association ◽

Experimental Conditions ◽

Library Size ◽

Association Testing ◽

Crispr Screen

Abstract:

Single-cell RNA sequencing (scRNA-seq) provides unprecedented technical and statistical potential to study gene regulation but is subject to technical variations and sparsity. Furthermore, statistical association testing remains difficult for scRNA-seq. Here we present Normalisr, a normalization and statistical association testing framework that unifies single-cell differential expression, co-expression, and CRISPR screen analyses with linear models. By systematically detecting and removing nonlinear confounders arising from library size at mean and variance levels, Normalisr achieves high sensitivity, specificity, speed, and generalizability across multiple scRNA-seq protocols and experimental conditions with unbiased p-value estimation. The superior scalability allows us to reconstruct robust gene regulatory networks from trans-effects of guide RNAs in large-scale single-cell CRISPRi screens. On conventional scRNA-seq, Normalisr recovers gene-level co-expression networks that recapitulated known gene functions.
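
The general idea of removing a library-size confounder with a linear model before testing association can be sketched in a few lines of Python. This is not Normalisr's implementation or API. The function names (`residualize`, `pearson`, `partial_correlation`) and the simulated data are hypothetical, and the sketch handles only one linear covariate, whereas the abstract describes nonlinear confounders at both mean and variance levels.

```python
import math
import random


def residualize(values, covariate):
    """Residuals of values after a least-squares fit on covariate plus intercept."""
    n = len(values)
    mean_v = sum(values) / n
    mean_c = sum(covariate) / n
    scc = sum((c - mean_c) ** 2 for c in covariate)
    scv = sum((c - mean_c) * (v - mean_v) for c, v in zip(covariate, values))
    slope = scv / scc
    return [(v - mean_v) - slope * (c - mean_c) for c, v in zip(covariate, values)]


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    return sab / math.sqrt(saa * sbb)


def partial_correlation(x, y, covariate):
    """Correlation of x and y after regressing the covariate out of both."""
    return pearson(residualize(x, covariate), residualize(y, covariate))


# Simulated toy data (hypothetical): two genes that share no regulation,
# but both scale with the cell's log library size.
rng = random.Random(0)
log_libsize = [rng.gauss(0, 1) for _ in range(500)]
gene_a = [c + rng.gauss(0, 0.5) for c in log_libsize]
gene_b = [c + rng.gauss(0, 0.5) for c in log_libsize]

raw_r = pearson(gene_a, gene_b)                           # inflated by library size
adj_r = partial_correlation(gene_a, gene_b, log_libsize)  # confounder removed
```

In this toy setting the raw correlation between the two genes is driven entirely by the shared covariate, so it largely disappears once library size is regressed out of both.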


BIGwas: Single-command quality control and association testing for multi-cohort and biobank-scale GWAS/PheWAS data

GigaScience ◽

10.1093/gigascience/giab047 ◽

2021 ◽

Vol 10 (6) ◽

Cited By ~ 1

Author(s):

Jan Christian Kässens ◽

Lars Wienbrandt ◽

David Ellinghaus

Keyword(s):

Quality Control ◽

High Performance ◽

Large Scale ◽

Association Studies ◽

Population Based ◽

Genome Wide Association Studies ◽

Association Testing ◽

Genome Wide ◽

Software Execution ◽

Automated Quality Control

Abstract:

Background: Genome-wide association studies (GWAS) and phenome-wide association studies (PheWAS) involving 1 million GWAS samples from dozens of population-based biobanks present a considerable computational challenge and are carried out by large scientific groups under great expenditure of time and personnel. Automating these processes requires highly efficient and scalable methods and software, but so far there is no workflow solution to easily process 1 million GWAS samples.

Results: Here we present BIGwas, a portable, fully automated quality control and association testing pipeline for large-scale binary and quantitative trait GWAS data provided by biobank resources. By using Nextflow workflow and Singularity software container technology, BIGwas performs resource-efficient and reproducible analyses on a local computer or any high-performance compute (HPC) system with just 1 command, with no need to manually install a software execution environment or various software packages. For a single-command GWAS analysis with 974,818 individuals and 92 million genetic markers, BIGwas takes ∼16 days on a small HPC system with only 7 compute nodes to perform a complete GWAS QC and association analysis protocol. Our dynamic parallelization approach enables shorter runtimes for large HPCs.

Conclusions: Researchers without extensive bioinformatics knowledge and with few computer resources can use BIGwas to perform multi-cohort GWAS with 1 million GWAS samples and, if desired, use it to build their own (genome-wide) PheWAS resource. BIGwas is freely available for download from http://github.com/ikmb/gwas-qc and http://github.com/ikmb/gwas-assoc.


SGI: Automatic clinical subgroup identification in omics datasets

10.1101/2021.03.12.435108 ◽

2021 ◽

Author(s):

Mustafa Buyukozkan ◽

Karsten Suhre ◽

Jan Krumsiek

Keyword(s):

Large Scale ◽

R Package ◽

Metabolomics Data ◽

Source Codes ◽

Link Type ◽

Association Testing ◽

Subgroup Identification ◽

Hands On ◽

Control Study

Summary:

The ‘Subgroup Identification’ (SGI) toolbox provides an algorithm to automatically detect clinical subgroups of samples in large-scale omics datasets. It is based on hierarchical clustering trees in combination with a specifically designed association testing and visualization framework that can process an arbitrary number of clinical parameters and outcomes in a systematic fashion. A multi-block extension allows for the simultaneous use of multiple omics datasets on the same samples. In this paper, we describe the functionality of the toolbox and demonstrate an application example on a blood metabolomics dataset with various clinical biochemistry readouts in a type 2 diabetes case-control study.

Availability and implementation:

SGI is an open-source package implemented in R. Package source codes and hands-on tutorials are available at https://github.com/krumsieklab/sgi. The QMdiab metabolomics data is included in the package and can be downloaded from https://doi.org/10.6084/m9.figshare.5904022.


Some comments about observations and image processing of comet 29P/Schwassmann-Wachmann 1

International Astronomical Union Colloquium ◽

10.1017/s0252921100031493 ◽

1999 ◽

Vol 173 ◽

pp. 243-248

Author(s):

D. Kubáček ◽

A. Galád ◽

A. Pravda

Keyword(s):

Image Processing ◽

Large Scale ◽

Harvard College ◽

Oak Ridge ◽

Large Scale Structures ◽

Short Period Comet ◽

Short Period ◽

Scale Structures ◽

Harvard College Observatory ◽

Period Comet

Abstract:

Unusual short-period comet 29P/Schwassmann-Wachmann 1 inspired many observers to explain its unpredictable outbursts. In this paper large scale structures and features from the inner part of the coma in time periods around outbursts are studied. CCD images were taken at Whipple Observatory, Mt. Hopkins, in 1989 and at Astronomical Observatory, Modra, from 1995 to 1998. Photographic plates of the comet were taken at Harvard College Observatory, Oak Ridge, from 1974 to 1982. The latter were digitized at first to apply the same techniques of image processing for optimizing the visibility of features in the coma during outbursts. Outbursts and coma structures show various shapes.


Evolution of Large-Scale Coronal Structures

International Astronomical Union Colloquium ◽

10.1017/s0252921100024945 ◽

1994 ◽

Vol 144 ◽

pp. 29-33

Author(s):

P. Ambrož

Keyword(s):

Magnetic Field ◽

Field Line ◽

Large Scale ◽

Coronal Magnetic Field ◽

Source Surface ◽

Solar Cycles ◽

Characteristic Relationship ◽

The Magnetic Field ◽

Coronal Structures

Abstract:

The large-scale coronal structures observed during the sporadically visible solar eclipses were compared with the numerically extrapolated field-line structures of coronal magnetic field. A characteristic relationship between the observed structures of coronal plasma and the magnetic field line configurations was determined. The long-term evolution of large scale coronal structures inferred from photospheric magnetic observations in the course of 11- and 22-year solar cycles is described. Some known parameters, such as the source surface radius, or coronal rotation rate are discussed and actually interpreted. A relation between the large-scale photospheric magnetic field evolution and the coronal structure rearrangement is demonstrated.


Large-scale Motion of Solar Filaments

International Astronomical Union Colloquium ◽

10.1017/s0252921100064502 ◽

2000 ◽

Vol 179 ◽

pp. 205-208

Author(s):

Pavel Ambrož ◽

Alfred Schroll

Keyword(s):

Magnetic Flux ◽

Proper Motion ◽

Time Scale ◽

Large Scale ◽

Accuracy Of Measurements ◽

Solar Filaments ◽

General Movement ◽

Scale Motion ◽

Velocity Scatter

Abstract:

Precise measurements of heliographic position of solar filaments were used for determination of the proper motion of solar filaments on the time-scale of days. The filaments have a tendency to make a shaking or waving of the external structure and to make a general movement of whole filament body, coinciding with the transport of the magnetic flux in the photosphere. The velocity scatter of individual measured points is about one order higher than the accuracy of measurements.
