Joint Modeling of RNAseq and Radiomics Data for Glioma Molecular Characterization and Prediction

2021 ◽  
Vol 8 ◽  
Author(s):  
Zeina A. Shboul ◽  
Norou Diawara ◽  
Arastoo Vossough ◽  
James Y. Chen ◽  
Khan M. Iftekharuddin

RNA sequencing (RNAseq) is a recent technology that profiles gene expression by measuring the relative frequency of RNAseq reads. RNAseq read count data are increasingly used in oncologic care, while radiology features (radiomics) have been gaining utility in radiology practice for tasks such as disease diagnosis, monitoring, and treatment planning. However, the contemporary literature lacks appropriate RNA-radiomics (henceforth, radiogenomics) joint modeling in which the RNAseq distribution is adaptive and preserves the nature of RNAseq read count data for glioma grading and prediction. The Negative Binomial (NB) distribution may be useful for modeling RNAseq read count data in a way that addresses these potential shortcomings. In this study, we propose a novel radiogenomics-NB model for glioma grading and prediction. Our radiogenomics-NB model is developed based on differentially expressed RNAseq and selected radiomics/volumetric features that characterize tumor volume and sub-regions. The NB distribution is fitted to RNAseq count data, and a log-linear regression model is assumed to link the estimated NB mean to the radiomics features. Three radiogenomics-NB molecular mutation models (IDH mutation, 1p/19q codeletion, and ATRX mutation) are investigated. Additionally, we explore gender-specific effects on the radiogenomics-NB models. Finally, we compare the performance of the three proposed mutation-prediction radiogenomics-NB models with well-known methods in the literature: Negative Binomial Linear Discriminant Analysis (NBLDA), differentially expressed RNAseq with Random Forest (RF-genomics), radiomics and differentially expressed RNAseq with Random Forest (RF-radiogenomics), and Voom-based count transformation combined with the nearest shrinkage classifier (VoomNSC). Our analysis shows that the proposed radiogenomics-NB model significantly outperforms the competing models in the literature (ANOVA test, p < 0.05) for prediction of IDH and ATRX mutations and offers similar performance for prediction of 1p/19q codeletion.
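As an illustration of the modeling idea in this abstract, the following minimal sketch evaluates an NB log-likelihood in which the mean of each sample is tied to its radiomics features through a log link. The mean/dispersion parameterization, the function names, and the parameter `size` are assumptions made for this example; the abstract does not state how the authors parameterize or fit their model.

```python
import math

def nb_log_pmf(y, mu, size):
    """Log-pmf of a negative binomial with mean mu and dispersion parameter size.

    Assumed parameterization: variance = mu + mu**2 / size,
    so a small size means strong overdispersion.
    """
    return (
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * math.log(size / (size + mu))
        + y * math.log(mu / (size + mu))
    )

def nb_mean(features, intercept, coefs):
    """Log-linear link: log(mu) = intercept + sum_j coefs[j] * features[j]."""
    eta = intercept + sum(c * x for c, x in zip(coefs, features))
    return math.exp(eta)

def nb_log_likelihood(counts, feature_rows, intercept, coefs, size):
    """Sum of NB log-pmfs, each sample having its own feature-driven mean."""
    return sum(
        nb_log_pmf(y, nb_mean(x, intercept, coefs), size)
        for y, x in zip(counts, feature_rows)
    )
```

Maximizing `nb_log_likelihood` over the intercept, coefficients, and dispersion (for example with a general-purpose optimizer) would give one possible fit; this is a sketch of the likelihood only, not of the authors' estimation procedure.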

2020 ◽  
Author(s):  
Javad Nazari ◽  
Parnia-Sadat Fathi ◽  
Nahid Sharahi ◽  
Majid Taheri ◽  
Payam Amini ◽  
...  

Abstract Background: Measles is a febrile illness counted among the most infectious viral diseases in the world. Despite the availability of a safe, accessible, affordable and efficient vaccine, measles continues to be a worldwide concern. Methods: This study uses machine learning and time series methods to assess factors that placed people at a higher risk of measles. This historical cohort study included measles incidence data from Markazi Province, in central Iran, from April 1997 to February 2020. Logistic regression, linear discriminant analysis, random forest, artificial neural network, bagging, support vector machine, and naïve Bayes were used for classification. Zero-inflated negative binomial regression for time series was used to assess the development of measles over time. Results: The prevalence of measles was 14.5% over the past 24 years, and a constant trend of almost zero cases was observed from 2002 to 2020. In order of importance, the independent variables were recent years, age, vaccination, rhinorrhea, male sex, contact with measles patients, cough, conjunctivitis, ethnicity, and fever. Younger age, lower probability of contact, and absence of fever were associated with lower odds of zero cases. Only 7 new cases were forecasted for the next two years. Bagging and random forest were the most accurate classification methods. Conclusion: Although the number of new cases has been almost zero in recent years, age and contact were shown to be responsible for the non-occurrence of measles. New cases in 2021 and 2022 are most likely to occur in October and May.


2020 ◽  
Vol 15 ◽  
Author(s):  
Mohanad Mohammed ◽  
Henry Mwambi ◽  
Bernard Omolo

Background: Colorectal cancer (CRC) is the third most common cancer among women and men in the USA, and recent studies have shown an increasing incidence in less developed regions, including Sub-Saharan Africa (SSA). We developed a hybrid (DNA mutation and RNA expression) signature and assessed its predictive properties for the mutation status and survival of CRC patients. Methods: Publicly available microarray and RNASeq data from 54 matched formalin-fixed paraffin-embedded (FFPE) samples from the Affymetrix GeneChip and RNASeq platforms were used to obtain differentially expressed genes between mutant and wild-type samples. We applied support vector machines, artificial neural networks, random forests, k-nearest neighbor, naïve Bayes, negative binomial linear discriminant analysis, and Poisson linear discriminant analysis algorithms for classification. A Cox proportional hazards model was used for survival analysis. Results: Compared to the genelist from each of the individual platforms, the hybrid genelist had the highest accuracy, sensitivity, specificity, and AUC for mutation status across all the classifiers, and it is prognostic for survival in patients with CRC. The NBLDA method was the best performer on the RNASeq data, while the SVM method was the most suitable classifier for CRC across the two data types. Nine genes were found to be predictive of survival. Conclusion: This signature could be useful in clinical practice, especially for colorectal cancer diagnosis and therapy. Future studies should determine the effectiveness of integration in cancer survival analysis and the application to unbalanced data, where the classes are of different sizes, as well as to data with multiple classes.
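The four evaluation measures named in this abstract (accuracy, sensitivity, specificity, and AUC) can be computed for any binary classifier as in the sketch below. The label coding (1 for mutant, 0 for wild type) and all function names are assumptions for illustration; the abstract does not describe how the authors computed these quantities.

```python
def classification_metrics(labels, predictions):
    """Accuracy, sensitivity and specificity from 0/1 labels and 0/1 predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # mutants called mutant
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)  # wild types called wild type
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a data set of the size described here; both functions assume that each class appears at least once.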


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Rowan AlEjielat ◽  
Anas Khaleel ◽  
Amneh H. Tarkhan

Abstract Background Ankylosing spondylitis (AS) is a rare inflammatory disorder affecting the spinal joints. Although we know some of the genetic factors that are associated with the disease, the molecular basis of this illness has not yet been fully elucidated, and the genes involved in AS pathogenesis have not been entirely identified. The current study aimed at constructing a gene network that may serve as an AS gene signature and biomarker, both of which will help in disease diagnosis and the identification of therapeutic targets. Previously published gene expression profiles of 16 AS patients and 16 gender- and age-matched controls that were profiled on the Illumina HumanHT-12 V3.0 Expression BeadChip platform were mined. Patients were Portuguese, 21 to 64 years old, were diagnosed based on the modified New York criteria, and had Bath Ankylosing Spondylitis Disease Activity Index scores > 4 and Bath Ankylosing Spondylitis Functional Index scores > 4. All patients were receiving only NSAIDs and/or sulphasalazine. Functional enrichment and pathway analysis were performed to create an interaction network of differentially expressed genes. Results ITM2A, ICOS, VSIG10L, CD59, TRAC, and CTLA-4 were among the significantly differentially expressed genes in AS, but the most significantly downregulated genes were the HLA-DRB6, HLA-DRB5, HLA-DRB4, HLA-DRB3, HLA-DRB1, HLA-DQB1, ITM2A, and CTLA-4 genes. The genes in this study were mostly associated with the regulation of the immune system processes, parts of cell membrane, and signaling related to T cell receptor and antigen receptor, in addition to some overlaps related to the IL2 STAT signaling, as well as the androgen response. The most significantly over-represented pathways in the data set were associated with the “RUNX1 and FOXP3 which control the development of regulatory T lymphocytes (Tregs)” and the “GABA receptor activation” pathways. Conclusions Comprehensive gene analysis of differentially expressed genes in AS reveals a significant gene network that is involved in a multitude of important immune and inflammatory pathways. These pathways and networks might serve as biomarkers for AS and can potentially help in diagnosing the disease and identifying future targets for treatment.


2020 ◽  
Vol 98 (Supplement_3) ◽  
pp. 165-166
Author(s):  
Elisa B Carvalho ◽  
Letícia P Sanglard ◽  
Karolina B Nascimento ◽  
Javier M Meneses ◽  
Daniel R Casagrande ◽  
...  

Abstract Gestating cows have an increased nutrient demand to meet the needs of the developing fetus, and mid-gestation is a critical period for fetal skeletal muscle development. The aim of this study was to evaluate the skeletal muscle transcriptome in the progeny as a function of maternal protein nutrition during mid-gestation. Eleven Tabapuã cows and their male calves were used in this study. In the first third of gestation (0 to 100 days of gestation; dg), all cows were kept on pasture. From 100 to 200 dg, the control group (CTRL; 7 animals) received a basal diet achieving 5.5% crude protein (CP), whereas the supplemented group (SUPPL; 4 animals) received a basal diet plus protein supplementation (40% CP). After 200 dg, all animals received the same diet. Weaning was performed at 205 ± 7.5 days of age, and animals were kept on pasture until reaching 240 days of age, when they were transferred to a feedlot. Muscle samples were collected at 260 days of age and RNA was extracted for RNA-seq analysis. Gene expression data were analyzed with a negative binomial model to identify differentially expressed genes (DEG) between treatments (q-value ≤ 0.05). A total of 716 DEG were identified (289 up-regulated and 427 down-regulated in the SUPPL group; q-value ≤ 0.05). Among the 10 most significant down-regulated DEG in the SUPPL group, two genes associated with the apoptotic process were identified: MAPK8IP1 and GRINA, with log2 Fold-Changes (log2FC) of 1.04 and 0.49, respectively. Among the 10 most significant up-regulated DEG in the SUPPL group, mTOR was identified, with log2FC=0.31. This is a well-known gene involved in muscle protein synthesis. In conclusion, maternal protein supplementation during mid-gestation affects the expression of genes related to energy metabolism and muscle development, which can lead to long-term impacts on production efficiency.


Sensors ◽  
2019 ◽  
Vol 19 (20) ◽  
pp. 4523 ◽  
Author(s):  
Carlos Cabo ◽  
Celestino Ordóñez ◽  
Fernando Sánchez-Lasheras ◽  
Javier Roca-Pardiñas ◽  
Javier de Cos-Juez

We analyze the utility of multiscale supervised classification algorithms for object detection and extraction from laser scanning or photogrammetric point clouds. Only the geometric information (the point coordinates) was considered, thus making the method independent of the systems used to collect the data. A maximum of five features (input variables) was used, four of them related to the eigenvalues obtained from a principal component analysis (PCA). PCA was carried out at six scales, defined by the diameter of a sphere around each observation. Four multiclass supervised classification models were tested (linear discriminant analysis, logistic regression, support vector machines, and random forest) in two different scenarios, urban and forest, formed by artificial and natural objects, respectively. The results obtained were accurate (overall accuracy over 80% for the urban dataset, and over 93% for the forest dataset), in the range of the best results found in the literature, regardless of the classification method. For both datasets, the random forest algorithm provided the best solution/results when discrimination capacity, computing time, and the ability to estimate the relative importance of each variable are considered together.
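The multiscale eigenvalue features described in this abstract can be sketched as follows: for each sphere diameter, the neighbours of a point are collected, their 3x3 covariance matrix is formed, and its eigenvalues are turned into shape descriptors. The three descriptors used here (linearity, planarity, sphericity) are common choices in the point-cloud literature and are an assumption; the abstract only says that four of the features are related to PCA eigenvalues. All function names are hypothetical.

```python
import math

def covariance_3d(points):
    """Population covariance matrix (3x3 nested lists) of (x, y, z) tuples."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    return [
        [sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
         for j in range(3)]
        for i in range(3)
    ]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )

def symmetric_eigenvalues(a):
    """Eigenvalues of a symmetric 3x3 matrix, largest first.

    Closed-form trigonometric solution; it loses some accuracy when
    eigenvalues nearly coincide, so a library eigensolver is preferable
    in practice.
    """
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # matrix is already diagonal
        return sorted((a[0][0], a[1][1], a[2][2]), reverse=True)
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    r = max(-1.0, min(1.0, det3(b) / 2.0))  # clamp against rounding error
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return [e1, 3.0 * q - e1 - e3, e3]

def shape_features(points, center, diameter):
    """Eigenvalue features of the points inside a sphere around center."""
    radius = diameter / 2.0
    nearby = [p for p in points if math.dist(p, center) <= radius]
    if len(nearby) < 3:
        return None  # too few neighbours to describe a shape
    l1, l2, l3 = symmetric_eigenvalues(covariance_3d(nearby))
    if l1 <= 0.0:
        return None  # all neighbours coincide
    return {
        "linearity": (l1 - l2) / l1,
        "planarity": (l2 - l3) / l1,
        "sphericity": l3 / l1,
    }

def multiscale_features(points, center, diameters):
    """One feature dictionary (or None) per sphere diameter."""
    return {d: shape_features(points, center, d) for d in diameters}
```

Calling `multiscale_features` with six diameters would give one feature set per scale, which could then be concatenated into the input vector of any of the classifiers named above. The linear scan over all points is for clarity only; a spatial index would be needed for real point clouds.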


2021 ◽  
Vol 95 ◽  
Author(s):  
A. Čeirāns ◽  
E. Gravele ◽  
I. Gavarane ◽  
M. Pupins ◽  
L. Mezaraupe ◽  
...  

Abstract Helminth infracommunities were studied at 174 sites in Latvia in seven hosts from six amphibian taxa of different taxonomic, ontogenetic and ecological groups. They were described using a standard set of parasitological parameters and compared by ecological indices and linear discriminant analysis. Species associations were identified by Kendall's rank correlation, and relationships with host size and waterbody area were analysed by zero-inflated Poisson and zero-inflated negative binomial regressions. The richest communities (25 species) were found in post-metamorphic semi-aquatic Pelophylax spp. frogs, which were dominated by trematode species of both adult and larval stages. Both larval and terrestrial hosts yielded depauperate trematode communities with accession of aquatic and soil-transmitted nematode species, respectively. Nematode loads peaked in terrestrial Bufo bufo. Helminth infracommunities suggested some differences in host microhabitat or food object selection not detected by studies of their ecology. Associations were present in 96% of helminth species (on average, 7.3 associations per species), and positive associations dominated. Species richness and abundances, in most cases, were positively correlated with host size, which could be explained by increasing parasite intake rates over host ontogeny (trematode adult stages) or parasite accumulation (larval Alaria alata). Two larval diplostomid species (Strigea strigis, Tylodelphys excavata) had a negative relationship with host size, which could be caused by parasite-induced host mortality. The adult trematode abundances were higher in larger waterbodies, most likely due to their ecosystem richness, while higher larval abundances in smaller waterbodies could be caused by elevated infection rates under high host densities.
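The two count distributions named in this abstract share one construction: a point mass at zero is mixed with an ordinary count distribution, so that hosts with no parasites can arise either structurally or by chance. The sketch below writes out both probability mass functions. The parameter names (`pi_zero` for the structural-zero probability, `size` for the NB dispersion) and the mean/dispersion form of the NB are assumptions for illustration and are not taken from the paper.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of count k with mean lam (lam > 0)."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def nb_pmf(k, mu, size):
    """Negative binomial probability of count k with mean mu and dispersion size."""
    return math.exp(
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(size / (size + mu))
        + k * math.log(mu / (size + mu))
    )

def zero_inflated(base_prob, k, pi_zero):
    """Mix a structural zero (probability pi_zero) with a base count probability."""
    mixed = (1.0 - pi_zero) * base_prob
    return pi_zero + mixed if k == 0 else mixed

def zip_pmf(k, lam, pi_zero):
    """Zero-inflated Poisson probability of count k."""
    return zero_inflated(poisson_pmf(k, lam), k, pi_zero)

def zinb_pmf(k, mu, size, pi_zero):
    """Zero-inflated negative binomial probability of count k."""
    return zero_inflated(nb_pmf(k, mu, size), k, pi_zero)
```

In a regression setting, the count mean and `pi_zero` would each depend on covariates such as host size or waterbody area through a link function; that step is omitted here.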


GeroPsych ◽  
2011 ◽  
Vol 24 (4) ◽  
pp. 177-185 ◽  
Author(s):  
Graciela Muniz Terrera ◽  
Andrea M. Piccinin ◽  
Fiona Matthews ◽  
Scott M. Hofer

Joint longitudinal-survival models are useful when repeated measures and event time data are available and possibly associated. The application of this joint model in aging research is relatively rare, albeit particularly useful, when there is the potential for nonrandom dropout. In this article we illustrate the method and discuss some issues that may arise when fitting joint models of this type. Using prose recall scores from the Swedish OCTO-Twin Longitudinal Study of Aging, we fitted a joint longitudinal-survival model to investigate the association between risk of mortality and individual differences in rates of change in memory. A model describing change in memory scores as following an accelerating decline trajectory and a Weibull survival model was identified as the best fitting. This model adjusted for random effects representing individual variation in initial memory performance and change in rate of decline as linking terms between the longitudinal and survival models. Memory performance and change in rate of memory decline were significant predictors of proximity to death. Joint longitudinal-survival models permit researchers to gain a better understanding of the association between change functions and risk of particular events, such as disease diagnosis or death. Careful consideration of computational issues may be required because of the complexities of joint modeling methodologies.
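One way to write down a shared-random-effects joint model of the kind described in this abstract is given below. The quadratic form of the memory trajectory, the choice of symbols, and the exact way the random effects enter the hazard are assumptions made for illustration; the abstract states only that the trajectory shows accelerating decline, that the survival model is Weibull, and that random effects for initial performance and for change in rate of decline link the two parts.

```latex
% Longitudinal part: memory score of person i at occasion j, at time t_{ij}
y_{ij} = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t_{ij}
       + (\beta_2 + b_{2i})\, t_{ij}^{2} + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim N(0, \sigma^{2})

% Survival part: Weibull baseline hazard with shape k and scale \lambda,
% shifted by the person-specific random effects b_{0i} and b_{2i}
h_i(t) = \frac{k}{\lambda} \left( \frac{t}{\lambda} \right)^{k-1}
         \exp\!\left( \gamma_0\, b_{0i} + \gamma_2\, b_{2i} \right)
```

Under this sketch, a nonzero \gamma_0 or \gamma_2 would mean that individual differences in initial memory or in acceleration of decline are associated with mortality risk, which is the kind of association the abstract reports.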


2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 264-265
Author(s):  
Duy Ngoc Do ◽  
Guoyu Hu ◽  
Younes Miar

Abstract American mink (Neovison vison) is the major source of fur for the fur industries worldwide, and Aleutian disease (AD) is causing severe financial losses to the mink industry. Different methods have been used to diagnose AD in mink, but a combination of several methods can be the most appropriate approach for the selection of AD-resilient mink. The iodine agglutination test (IAT) and counterimmunoelectrophoresis (CIEP) methods are commonly employed in the test-and-remove strategy, while the enzyme-linked immunosorbent assay (ELISA) and packed-cell volume (PCV) methods are complementary. However, using multiple methods is expensive, which hinders the correct use of AD tests in selection. This research presents an assessment of AD classification based on machine learning algorithms. Aleutian disease was tested in 1,830 individuals using these tests in an AD-positive mink farm (Canadian Centre for Fur Animal Research, NS, Canada). The accuracy of classification for CIEP was evaluated using sex information and IAT, ELISA, and PCV test results as inputs to seven machine learning classification algorithms (Random Forest, Artificial Neural Networks, C50Tree, Naive Bayes, Generalized Linear Models, Boost, and Linear Discriminant Analysis) using the caret package in R. The accuracy of prediction varied among the methods. Overall, Random Forest was the best-performing algorithm for the current dataset, with an accuracy of 0.89 in the training data and 0.94 in the testing data. Our work demonstrates the utility and relative ease of using machine learning algorithms to assess the CIEP information and, consequently, to reduce the cost of AD tests. However, further work should include production and reproduction information in the models and extend phenotypic collection to increase the accuracy of current methods.

