A merged microarray meta-dataset for transcriptionally profiling colorectal neoplasm formation and progression

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Michael Rohr ◽  
Jordan Beardsley ◽  
Sai Preethi Nakkina ◽  
Xiang Zhu ◽  
Jihad Aljabban ◽  
...  

Abstract Transcriptional profiling of pre- and post-malignant colorectal cancer (CRC) lesions enables temporal monitoring of molecular events underlying neoplastic progression. However, the most widely used transcriptomic dataset for CRC, TCGA-COAD, is devoid of adenoma samples, which increases reliance on an assortment of disparate microarray studies and hinders consensus building. To address this, we developed a microarray meta-dataset comprising 231 healthy, 132 adenoma, and 342 CRC tissue samples from twelve independent studies. Utilizing a stringent analytic framework, select datasets were downloaded from the Gene Expression Omnibus, normalized by frozen robust multiarray averaging, and subsequently merged. Batch effects were then identified and removed by empirical Bayes estimation (ComBat). Finally, the meta-dataset was filtered for low-variance probes, enabling downstream differential expression as well as quantitative and functional validation through cross-platform correlation and enrichment analyses, respectively. Overall, our meta-dataset provides a robust tool for investigating colorectal adenoma formation and malignant transformation at the transcriptional level with a pipeline that is modular and readily adaptable for similar analyses in other cancer types.
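The final filtering step in the pipeline above can be illustrated with a minimal sketch. The abstract does not say how low-variance probes were identified or what cutoff was used, so the function name `filter_low_variance_probes`, the `keep_fraction` parameter, and the toy probe IDs are all hypothetical; this is one plausible way to rank probes by variance, not the authors' implementation.

```python
from statistics import pvariance


def filter_low_variance_probes(expr, keep_fraction=0.5):
    """Keep the most variable probes.

    expr: dict mapping probe ID -> list of expression values across samples.
    keep_fraction: hypothetical cutoff (fraction of probes to retain);
    the abstract does not report the threshold actually used.
    """
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    # Variance of each probe across all samples.
    variances = {probe: pvariance(values) for probe, values in expr.items()}
    n_keep = max(1, int(len(variances) * keep_fraction))
    # Rank probes from most to least variable and keep the top n_keep.
    ranked = sorted(variances, key=variances.get, reverse=True)
    kept = set(ranked[:n_keep])
    # Preserve the original probe order in the output.
    return {probe: values for probe, values in expr.items() if probe in kept}


# Toy example with made-up probe IDs and values.
toy = {
    "probe_a": [1.0, 1.0, 1.0, 1.0],
    "probe_b": [1.0, 5.0, 1.0, 5.0],
    "probe_c": [2.0, 2.1, 2.0, 2.1],
    "probe_d": [0.0, 10.0, 0.0, 10.0],
}
filtered = filter_low_variance_probes(toy, keep_fraction=0.5)
```

In a real analysis this step would run after normalization and batch correction, since batch effects would otherwise inflate the apparent variance of many probes.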

Reproduction ◽  
2010 ◽  
Vol 139 (5) ◽  
pp. 809-823 ◽  
Author(s):  
Piraye Yurttas ◽  
Eric Morency ◽  
Scott A Coonrod

As IVF becomes an increasingly popular method for human reproduction, it is more critical than ever to understand the unique molecular composition of the mammalian oocyte. DNA microarray studies have successfully provided valuable information regarding the identity and dynamics of factors at the transcriptional level. However, the oocyte transcribes and stores a large amount of material that plays no obvious role in oogenesis, but instead is required to regulate embryogenesis. Therefore, an accurate picture of the functional state of the oocyte requires both transcriptional profiling and proteomics. Here, we summarize our previous studies of the oocyte proteome, and present new panels of oocyte proteins that we recently identified in screens of metaphase II-arrested mouse oocytes. Importantly, our studies indicate that several abundant oocyte proteins are not, as one might predict, ubiquitous housekeeping proteins, but instead are unique to the oocyte. Furthermore, mouse studies indicate that a number of these factors arise from maternal effect genes (MEGs). One of the identified MEG proteins, peptidylarginine deiminase 6, localizes to and is required for the formation of a poorly characterized, highly abundant cytoplasmic structure: the oocyte cytoplasmic lattices. Additionally, a number of other MEG-derived abundant proteins identified in our proteomic screens have been found by others to localize to another unique oocyte feature: the subcortical maternal complex. Based on these observations, we put forth the hypothesis that the mammalian oocyte contains several unique storage structures, which we have named maternal effect structures, that facilitate the oocyte-to-embryo transition.


2021 ◽  
Vol 11 (10) ◽  
pp. 4429
Author(s):  
Ana Šarčević ◽  
Damir Pintar ◽  
Mihaela Vranić ◽  
Ante Gojsalić

The prediction of sport event results has always drawn attention from a wide variety of groups, such as club managers, coaches, betting companies, and the general population. The specific nature of each sport has an important role in the adaptation of various predictive techniques founded on different mathematical and statistical models. In this paper, a common approach to modeling sports with a strongly defined structure and a rigid scoring system, one that relies on an assumption of independent and identical point distributions, is challenged. It is demonstrated that such models can be improved by introducing dynamics into the match models in the form of sport momentums. Formal mathematical models for implementing these momentums based on conditional probability and empirical Bayes estimation are proposed, which are ultimately combined through a unifying hybrid approach based on Monte Carlo simulation. Finally, the method is applied to real-life volleyball data, demonstrating noticeable improvements over the previous approaches when it comes to predicting match outcomes. The method can be implemented into an expert system to obtain insight into the performance of players at different stages of the match or to study field scenarios that may arise under different circumstances.
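To make the contrast between independent, identically distributed points and momentum-driven points concrete, here is a minimal Monte Carlo sketch of a single volleyball set. It is not the paper's model: the abstract derives momentum from conditional probability and empirical Bayes estimates, whereas this sketch assumes a fixed additive shift (`momentum`, a hypothetical parameter) applied to the rally-win probability of whichever team won the previous rally. The function names and the scoring rule (first to 25, win by two) are illustrative assumptions.

```python
import random


def simulate_set(p_base, momentum, rng, target=25):
    """Simulate one set; return True if team A wins.

    p_base: probability that A wins a rally with no momentum.
    momentum: assumed additive shift toward the winner of the previous rally.
    """
    a = b = 0
    last_winner = None  # no momentum before the first rally
    while True:
        p = p_base
        if last_winner == "A":
            p += momentum
        elif last_winner == "B":
            p -= momentum
        p = min(1.0, max(0.0, p))  # keep p a valid probability
        if rng.random() < p:
            a += 1
            last_winner = "A"
        else:
            b += 1
            last_winner = "B"
        # Rally scoring: reach the target and lead by at least two points.
        if max(a, b) >= target and abs(a - b) >= 2:
            return a > b


def estimate_set_win_probability(p_base, momentum, n_sets=2000, seed=0):
    """Monte Carlo estimate of the probability that team A wins a set."""
    rng = random.Random(seed)
    wins = sum(simulate_set(p_base, momentum, rng) for _ in range(n_sets))
    return wins / n_sets


# momentum=0.0 recovers the i.i.d. point model that the paper challenges.
iid_estimate = estimate_set_win_probability(0.5, 0.0, n_sets=2000, seed=1)
momentum_estimate = estimate_set_win_probability(0.6, 0.05, n_sets=500, seed=2)
```

Setting `momentum` to zero gives the baseline model, so the two settings can be compared on the same simulation code.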


BMC Cancer ◽  
2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Ping Yan ◽  
Zuotian Huang ◽  
Tong Mou ◽  
Yunhai Luo ◽  
Yanyao Liu ◽  
...  

Abstract Background: Hepatocellular carcinoma (HCC) is one of the most common and deadly malignant tumors, with a high rate of recurrence worldwide. This study aimed to investigate the mechanism underlying the progression of HCC and to identify recurrence-related biomarkers. Methods: We first analyzed 132 HCC patients with paired tumor and adjacent normal tissue samples from the Gene Expression Omnibus (GEO) database to identify differentially expressed genes (DEGs). The expression profiles and clinical information of 372 HCC patients from The Cancer Genome Atlas (TCGA) database were next analyzed to further validate the DEGs, construct competing endogenous RNA (ceRNA) networks and discover the prognostic genes associated with recurrence. Finally, several recurrence-related genes were evaluated in two external cohorts, consisting of fifty-two and forty-nine HCC patients, respectively. Results: With the comprehensive strategies of data mining, two potential interactive ceRNA networks were constructed based on the competitive relationships of the ceRNA hypothesis. The ‘upregulated’ ceRNA network consists of 6 upregulated lncRNAs, 3 downregulated miRNAs and 5 upregulated mRNAs, and the ‘downregulated’ network includes 4 downregulated lncRNAs, 12 upregulated miRNAs and 67 downregulated mRNAs. Survival analysis of the genes in the ceRNA networks demonstrated that 20 mRNAs were significantly associated with recurrence-free survival (RFS). Based on the prognostic mRNAs, a four-gene signature (ADH4, DNASE1L3, HGFAC and MELK) was established with the least absolute shrinkage and selection operator (LASSO) algorithm to predict the RFS of HCC patients, the performance of which was evaluated by receiver operating characteristic curves. The signature was also validated in the two external cohorts and displayed effective discrimination and prediction for the RFS of HCC patients.
Conclusions: The present study elucidated the underlying mechanisms of tumorigenesis and progression, provided two visualized ceRNA networks and successfully identified several potential biomarkers for HCC recurrence prediction and targeted therapies.


Pharmaceutics ◽  
2020 ◽  
Vol 13 (1) ◽  
pp. 42
Author(s):  
Walter M. Yamada ◽  
Michael N. Neely ◽  
Jay Bartroff ◽  
David S. Bayard ◽  
James V. Burke ◽  
...  

Population pharmacokinetic (PK) modeling has become a cornerstone of drug development and optimal patient dosing. This approach offers great benefits for datasets with sparse sampling, such as in pediatric patients, and can describe between-patient variability. While most current algorithms assume normal or log-normal distributions for PK parameters, we present a mathematically consistent nonparametric maximum likelihood (NPML) method for estimating multivariate mixing distributions without any assumption about the shape of the distribution. This approach can handle distributions with any shape for all PK parameters. It is shown in convexity theory that the NPML estimator is discrete, meaning that it has a finite number of points with nonzero probability. In fact, there are at most N points, where N is the number of observed subjects. The original infinite NPML problem then becomes the finite-dimensional problem of finding the location and probability of the support points. In the simplest case, each point essentially represents the set of PK parameters for one patient. The probability of the points is found by a primal-dual interior-point method; the location of the support points is found by an adaptive grid method. Our method is able to handle high-dimensional and complex multivariate mixture models. An important application is discussed for the problem of population pharmacokinetics and a nontrivial example is treated. Our algorithm has been successfully applied in hundreds of published pharmacometric studies. In addition to population pharmacokinetics, this research also applies to empirical Bayes estimation and many other areas of applied mathematics. Thereby, this approach presents an important addition to the pharmacometric toolbox for drug development and optimal patient dosing.
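The finite-dimensional subproblem described above, finding the probabilities of a fixed set of support points, can be sketched in a few lines. This sketch deliberately swaps in a simple EM fixed-point iteration in place of the primal-dual interior-point method named in the abstract, keeps the grid fixed instead of adapting it, and uses a one-dimensional normal observation model as a stand-in for a PK model. The function names, the `sd` parameter, and the toy data are all hypothetical.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution, used here as a toy observation model."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def npml_weights_on_grid(observations, grid, sd, n_iter=200):
    """Estimate mixing weights on fixed support points by EM.

    Maximizes sum_i log(sum_k w_k * p(y_i | theta_k)) over weights w on the
    probability simplex. EM is an assumed substitute for the interior-point
    solver described in the abstract.
    """
    # psi[i][k] = likelihood of observation i under support point k.
    psi = [[normal_pdf(y, theta, sd) for theta in grid] for y in observations]
    weights = [1.0 / len(grid)] * len(grid)
    for _ in range(n_iter):
        # Mixture density of each observation under the current weights.
        mixture = [sum(w * p for w, p in zip(weights, row)) for row in psi]
        # EM update: new weight = average posterior responsibility.
        weights = [
            w * sum(row[k] / m for row, m in zip(psi, mixture)) / len(psi)
            for k, w in enumerate(weights)
        ]
    return weights


# Toy data: two clusters of subjects near 0 and near 5 (made-up values).
toy_observations = [-0.2, 0.1, 0.0, 0.3, 4.9, 5.1, 5.2, 4.8]
toy_grid = [0.0, 2.5, 5.0]
toy_weights = npml_weights_on_grid(toy_observations, toy_grid, sd=0.5)
```

On this toy data the weight of the middle grid point shrinks toward zero, which illustrates the discreteness property: only a few support points end up carrying nonzero probability.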


2002 ◽  
Vol 14 (4) ◽  
pp. 435-448 ◽  
Author(s):  
R. J. Karunamuni ◽  
R. S. Singh ◽  
S. Zhang
