Characterizing batch effects and binding site-specific variability in ChIP-seq data

2021 ◽  
Vol 3 (4) ◽  
Author(s):  
Mingxiang Teng ◽  
Dongliang Du ◽  
Danfeng Chen ◽  
Rafael A Irizarry

Abstract Multiple sources of variability can bias ChIP-seq data toward inferring transcription factor (TF) binding profiles. As ChIP-seq datasets increase in public repositories, it is now possible and necessary to account for complex sources of variability in ChIP-seq data analysis. We find that two types of variability, the batch effects by sequencing laboratories and differences between biological replicates, not associated with changes in condition or state, vary across genomic sites. This implies that observed differences between samples from different conditions or states, such as cell-type, must be assessed statistically, with an understanding of the distribution of obscuring noise. We present a statistical approach that characterizes both differences of interest and these sources of variability through the parameters of a mixed effects model. We demonstrate the utility of our approach on a CTCF binding dataset composed of 211 samples representing 90 different cell-types measured across three different laboratories. The results revealed that sites exhibiting large variability were associated with sequence characteristics such as GC-content and low complexity. Finally, we identified TFs associated with high-variance CTCF sites using TF motifs documented in public databases, pointing to the possibility of these being false positives if the sources of variability are not properly accounted for.
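The abstract above separates laboratory (batch) variability from replicate variability at each binding site. As a rough illustration only, the sketch below estimates those two variance components for a single site with the classical method-of-moments (ANOVA) estimator for a balanced one-way random-effects model. This is a simplified stand-in, not the authors' mixed effects model: the function name `variance_components`, the balanced design, and the toy signal values are all assumptions made for the example.

```python
from statistics import mean


def variance_components(groups):
    """ANOVA (method-of-moments) estimates for a balanced one-way
    random-effects model: signal = site mean + lab effect + replicate noise.

    groups: list of equal-length lists, one list per laboratory, holding the
    (hypothetical) normalized signal of one binding site in each sample.
    Returns (between_lab_variance, within_lab_variance).
    """
    k = len(groups)       # number of laboratories
    n = len(groups[0])    # samples per laboratory (balanced design assumed)
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need at least 2 balanced groups of at least 2 samples")

    grand_mean = mean(x for g in groups for x in g)
    lab_means = [mean(g) for g in groups]

    # Mean square between laboratories and mean square within laboratories.
    ms_between = n * sum((m - grand_mean) ** 2 for m in lab_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, lab_means) for x in g
    ) / (k * (n - 1))

    # E[ms_between] = within + n * between, so solve for the lab component
    # and truncate negative estimates at zero.
    between = max(0.0, (ms_between - ms_within) / n)
    return between, ms_within


# Toy data: two labs, two samples each, for one site (values are made up).
between_lab, within_lab = variance_components([[1.0, 3.0], [5.0, 7.0]])
```

Running this per site would give one pair of variance estimates per site, which is the kind of site-specific summary the abstract describes; an unbalanced design or additional fixed effects such as cell type would need a proper mixed-model fit instead.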

Author(s):  
Massimo Andreatta ◽  
Santiago J. Carmona

Abstract Computational tools for the integration of single-cell transcriptomics data are designed to correct batch effects between technical replicates or different technologies applied to the same population of cells. However, they have inherent limitations when applied to heterogeneous sets of data with moderate overlap in cell states or sub-types. STACAS is a package for the identification of integration anchors in the Seurat environment, optimized for the integration of datasets that share only a subset of cell types. We demonstrate that by i) correcting batch effects while preserving relevant biological variability across datasets, ii) filtering aberrant integration anchors with a quantitative distance measure, and iii) constructing optimal guide trees for integration, STACAS can accurately align scRNA-seq datasets composed of only partially overlapping cell populations. We anticipate that the algorithm will be a useful tool for the construction of comprehensive single-cell atlases by integration of the growing amount of single-cell data becoming available in public repositories. Code availability: R package: https://github.com/carmonalab/STACAS Docker image: https://hub.docker.com/repository/docker/mandrea1/stacas_demo


2018 ◽  
Author(s):  
Sirui Liu ◽  
Ling Zhang ◽  
Hui Quan ◽  
Hao Tian ◽  
Luming Meng ◽  
...  

Abstract The high-order chromatin structure plays a non-negligible role in gene regulation. However, the mechanism for the formation of different chromatin structures in different cells and the sequence dependence of this process remain to be elucidated. As the nucleotide distributions in human and mouse genomes are highly uneven, we identified CGI forest and prairie genomic domains based on CGI density, which better segregates genomic elements along the genome than GC content. The genome is then divided into two sequentially, epigenetically, and transcriptionally distinct regions. These two types of megabase-sized domains spatially segregate, but to a different extent in different cell types. Overall, the forests and prairies gradually segregate from each other in development, differentiation, and senescence. The multi-scale forest-prairie spatial intermingling is cell-type specific and increases in differentiation, and thus helps define cell identity. We propose that the phase separation of the 1D mosaic sequence in space, serving as a potential driving force, together with cell type specific epigenetic marks and transcription factors, shapes the chromatin structure in different cell types and gives them distinct genomic properties. The mosaicity of the genome manifested in terms of alternative forests and prairies of a species could be related to its biological processes such as differentiation, aging and body temperature control.


2020 ◽  
Author(s):  
Songting Shi

Abstract We propose a method for data integration and cell type annotation (Dincta) of single-cell transcriptomes in a unified framework. Dincta can handle three cases. In the first case, cell types have been annotated for all cells; Dincta integrates the data into a common low-dimensional embedding space such that cells of different cell types separate while cells from different batches but of the same cell type cluster together. In the second case, only part of the cells, such as one sample, have been annotated; Dincta integrates the data into the same kind of embedding space and, moreover, infers the known or novel cell type of the cells whose type was initially unknown. In the third case, no cell type information is available, and Dincta can be run in an unsupervised way: it infers the number of new cell types, assigns each cell to its corresponding cell type, and integrates the data, keeping cells of different cell types separate while removing batch effects so that cells of the same cell type mix. Dincta is simple, accurate and efficient at integrating data: it preserves cell type information while removing batch effects, and infers the known or novel cell types of cells.


2018 ◽  
Vol 2018 ◽  
pp. 1-17 ◽  
Author(s):  
Svetlana V. Kostyuk ◽  
Lev N. Porokhovnik ◽  
Elizaveta S. Ershova ◽  
Elena M. Malinovskaya ◽  
Marina S. Konkova ◽  
...  

Cell-free DNA (cfDNA) is a circulating DNA of nuclear and mitochondrial origin mainly derived from dying cells. Recent studies have shown that cfDNA is a stress signaling DAMP (damage-associated molecular pattern) molecule. We report here that the expression profiles of cfDNA-induced factors NRF2 and NF-κB are distinct depending on the target cell’s type and the GC-content and oxidation rate of the cfDNA. Stem cells (MSC) have shown higher expression of NRF2 without inflammation in response to cfDNA. In contrast, inflammatory response launched by NF-κB was dominant in differentiated cells HUVEC, MCF7, and fibroblasts, with a possibility of transition to massive apoptosis. In each cell type examined, the response for oxidized cfDNA was more acute with higher peak intensity and faster resolution than that for nonoxidized cfDNA. GC-rich nonoxidized cfDNA evoked a weaker and prolonged response with proinflammatory component (NF-κB) as predominant. The exploration of apoptosis rates after adding cfDNA showed that cfDNA with moderately increased GC-content and lightly oxidized DNA promoted cell survival in a hormetic manner. Novel potential therapeutic approaches are proposed, which depend on the current cfDNA content: either preconditioning with low doses of cfDNA before a planned adverse impact or eliminating (binding, etc.) cfDNA when its content has already become high.


Author(s):  
U. Aebi ◽  
P. Rew ◽  
T.-T. Sun

Various types of intermediate-sized (10-nm) filaments have been found and described in many different cell types during the past few years. Despite the differences in the chemical composition among the different types of filaments, they all yield common structural features: they are usually up to several microns long and have a diameter of 7 to 10 nm; there is evidence that they are made of several 2 to 3.5 nm wide protofilaments which are helically wound around each other; the secondary structure of the polypeptides constituting the filaments is rich in α-helix. However, a detailed description of their structural organization is lacking to date.


1992 ◽  
Vol 67 (01) ◽  
pp. 154-160 ◽  
Author(s):  
P Meulien ◽  
M Nishino ◽  
C Mazurier ◽  
K Dott ◽  
G Piétu ◽  
...  

Summary The cloning of the cDNA encoding von Willebrand factor (vWF) has revealed that it is synthesized as a large precursor (pre-pro-vWF) molecule and it is now clear that the prosequence or vWAgII is responsible for the intracellular multimerization of vWF. We have cloned the complete vWF cDNA and expressed it using a recombinant vaccinia virus as vector. We have characterized the structure and function of the recombinant vWF (rvWF) secreted from five different cell types: baby hamster kidney (BHK), Chinese hamster ovary (CHO), human fibroblasts (143B), mouse fibroblasts (L) and primary embryonic chicken cells. Forty-eight hours after infection, the quantity of vWF antigen found in the cell supernatant varied from 3 to 12 U/dl depending on the cell type. By SDS-agarose gel electrophoresis, the percentage of high molecular weight forms of vWF varied from 39 to 49% relative to normal plasma for BHK, CHO, 143B and chicken cells but was less than 10% for L cells. In all cell types, the two anodic subbands of each multimer were missing. The two cathodic subbands were easily detected only in BHK and L cells. By SDS-PAGE of reduced samples, pro-vWF was present in similar quantity to the fully processed vWF subunit in L cells, present in moderate amounts in BHK and CHO and in very low amounts in 143B and chicken cells. rvWF from all cells bound to collagen and to platelets in the presence of ristocetin, the latter showing a high correlation between binding efficiency and degree of multimerization. rvWF from all cells was also shown to bind to purified FVIII and in this case binding appeared to be independent of the degree of multimerization. We conclude that whereas vWF is naturally synthesized only by endothelial cells and megakaryocytes, it can be expressed in a biologically active form from various other cell types.


Acta Naturae ◽  
2016 ◽  
Vol 8 (2) ◽  
pp. 79-86 ◽  
Author(s):  
P. V. Elizar’ev ◽  
D. V. Lomaev ◽  
D. A. Chetverina ◽  
P. G. Georgiev ◽  
M. M. Erokhin

Maintenance of the individual patterns of gene expression in different cell types is required for the differentiation and development of multicellular organisms. Expression of many genes is controlled by Polycomb (PcG) and Trithorax (TrxG) group proteins that act through association with chromatin. PcG/TrxG are assembled on the DNA sequences termed PREs (Polycomb Response Elements), the activity of which can be modulated and switched from repression to activation. In this study, we analyzed the influence of transcriptional read-through on PRE activity switch mediated by the yeast activator GAL4. We show that a transcription terminator inserted between the promoter and PRE does not prevent switching of PRE activity from repression to activation. We demonstrate that, independently of PRE orientation, high levels of transcription fail to dislodge PcG/TrxG proteins from PRE in the absence of a terminator. Thus, transcription is not the main factor required for PRE activity switch.


2020 ◽  
Vol 19 (4) ◽  
pp. 248-256
Author(s):  
Yangmin Zheng ◽  
Ziping Han ◽  
Haiping Zhao ◽  
Yumin Luo

Conclusion: Stroke is a complex disease caused by genetic and environmental factors, and its etiological mechanism has not yet been fully clarified, which poses great challenges to its effective prevention and treatment. The MAPK signaling pathway regulates gene expression in eukaryotic cells and basic cellular processes such as cell proliferation, differentiation, migration, metabolism and apoptosis, which are considered therapeutic targets for many diseases. Mounting evidence has shown that the MAPK signaling pathway is involved in the pathogenesis and development of ischemic stroke. However, because the upstream and downstream kinases of the MAPK signaling pathway are complex and the influencing factors are numerous, the exact role of the pathway in the pathogenesis of ischemic stroke has not been fully elucidated. MAPK signaling molecules in different cell types in the brain respond differently after stroke injury; therefore, the present review summarizes the pathological processes of the different cell types participating in stroke and discusses the mechanism by which MAPK participates in stroke. We further elucidate that MAPK signaling pathway molecules can be used as therapeutic targets for stroke, thus promoting the prevention and treatment of stroke.


Viruses ◽  
2021 ◽  
Vol 13 (2) ◽  
pp. 257
Author(s):  
Zuzanna Drulis-Kawa ◽  
Barbara Maciejewska

Biofilms are a community of surface-associated microorganisms characterized by the presence of different cell types in terms of physiology and phenotype [...]


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Dvir Gur ◽  
Emily J. Bain ◽  
Kory R. Johnson ◽  
Andy J. Aman ◽  
H. Amalia Pasolli ◽  
...  

Abstract Skin color patterns are ubiquitous in nature and impact social behavior, predator avoidance, and protection from ultraviolet irradiation. A leading model system for vertebrate skin patterning is the zebrafish; its alternating blue stripes and yellow interstripes depend on light-reflecting cells called iridophores. It was suggested that the zebrafish’s color pattern arises from a single type of iridophore migrating differentially to stripes and interstripes. However, here we find that iridophores do not migrate between stripes and interstripes but instead differentiate and proliferate in place, based on their micro-environment. RNA-sequencing analysis further reveals that stripe and interstripe iridophores have different transcriptomic states, while cryogenic-scanning-electron-microscopy and micro-X-ray diffraction identify different crystal-array architectures, indicating that stripe and interstripe iridophores are different cell types. Based on these results, we present an alternative model of skin patterning in zebrafish in which distinct iridophore crystallotypes containing specialized, physiologically responsive organelles arise in stripe and interstripe by in-situ differentiation.

