Multi-Run Concrete Autoencoder to Identify Prognostic lncRNAs for 12 Cancers

2021 ◽  
Vol 22 (21) ◽  
pp. 11919
Author(s):  
Abdullah Al Mamun ◽  
Raihanul Bari Tanvir ◽  
Masrur Sobhan ◽  
Kalai Mathee ◽  
Giri Narasimhan ◽  
...  

Background: Long non-coding RNA plays a vital role in changing the expression profiles of various target genes that lead to cancer development. Thus, identifying prognostic lncRNAs related to different cancers might help in developing cancer therapy. Method: To discover the critical lncRNAs that can identify the origin of different cancers, we propose the use of the state-of-the-art deep learning algorithm concrete autoencoder (CAE) in an unsupervised setting, which efficiently identifies a subset of the most informative features. However, because of its stochastic nature, CAE does not identify reproducible features across runs. To address this issue, we propose a multi-run CAE (mrCAE) that identifies a stable set of features. The assumption is that a feature appearing in multiple runs carries more meaningful information about the data under consideration. The genome-wide lncRNA expression profiles of 12 different types of cancers, with a total of 4768 samples available in The Cancer Genome Atlas (TCGA), were analyzed to discover the key lncRNAs. The lncRNAs identified by multiple runs of CAE were added to a final list of key lncRNAs that are capable of identifying 12 different cancers. Results: Our results showed that mrCAE performs better in feature selection than single-run CAE, standard autoencoder (AE), and other state-of-the-art feature selection techniques. This study revealed a set of 128 top-ranking lncRNAs that could identify the origin of 12 different cancers with an accuracy of 95%. Survival analysis showed that 76 of the 128 lncRNAs have the prognostic capability to differentiate high- and low-risk groups of patients with different cancers. Conclusion: The proposed mrCAE, which selects actual features, outperformed the AE even though the AE selects latent or pseudo-features. By selecting actual features instead of pseudo-features, mrCAE can be valuable for precision medicine. The identified prognostic lncRNAs can be further studied to develop therapies for different cancers.
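The multi-run idea described in this abstract (keep the features that recur across stochastic runs) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `stable_features`, the `min_runs` threshold, and the toy feature names are all assumptions made for illustration, and the per-run selections are hard-coded here rather than produced by a concrete autoencoder.

```python
from collections import Counter


def stable_features(runs, min_runs=2):
    """Return (feature, count) pairs for features chosen in >= min_runs runs.

    `runs` is a list of per-run selections (hypothetical input; in the
    abstract each selection would come from one CAE training run).
    """
    counts = Counter()
    for selected in runs:
        # Count each feature at most once per run.
        counts.update(set(selected))
    kept = [(feat, n) for feat, n in counts.items() if n >= min_runs]
    # Most frequent first; ties broken by name for a deterministic order.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return kept


# Toy selections from three hypothetical runs.
runs = [
    ["lnc_A", "lnc_B", "lnc_C"],
    ["lnc_B", "lnc_C", "lnc_D"],
    ["lnc_C", "lnc_E", "lnc_B"],
]
print(stable_features(runs, min_runs=2))
```

Raising `min_runs` trades a longer candidate list for higher stability; the abstract does not state which threshold or number of runs was used.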

2021 ◽  
Author(s):  
Abdullah Al Mamun ◽  
Raihanul Bari Tanvir ◽  
Masrur Sobhan ◽  
Kalai Mathee ◽  
Giri Narasimhan ◽  
...  

Long non-coding RNA plays a vital role in changing the expression profiles of various target genes that lead to cancer development. Thus, identifying prognostic lncRNAs related to different cancers might help in developing cancer therapy. To discover the critical lncRNAs that can identify the origin of different cancers, we proposed to use the state-of-the-art deep learning algorithm Concrete Autoencoder (CAE) in an unsupervised setting, which efficiently identifies a subset of the most informative features. However, because of its stochastic nature, CAE does not identify reproducible features across runs. To address this issue, we proposed a multi-run CAE (mrCAE) that identifies a stable set of features. The assumption is that a feature appearing in multiple runs carries more meaningful information about the data under consideration. The genome-wide lncRNA expression profiles of 12 different types of cancers, a total of 4,768 samples available in The Cancer Genome Atlas (TCGA), were analyzed to discover the key lncRNAs. The lncRNAs identified by multiple runs of CAE were added to the final list of key lncRNAs, which are capable of identifying 12 different cancers. Our results showed that mrCAE performs better in feature selection than single-run CAE, standard autoencoder (AE), and other state-of-the-art feature selection techniques. This study discovered a set of 128 top-ranking lncRNAs that could identify the origin of 12 different cancers with an accuracy of 95%. Survival analysis showed that 76 of the 128 lncRNAs have the prognostic capability to differentiate high- and low-risk groups of patients in different cancers. The proposed mrCAE outperformed the standard autoencoder, which selects the latent features and is thought to be the upper limit in dimension reduction. Since the proposed mrCAE selects actual features and outperformed AE, it has the potential to provide information that can be used for precision medicine, such as identifying prognostic lncRNAs (this work) and mRNAs, miRNAs, and DNA methylated genes (future work) for different cancers.


Agronomy ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 2364
Author(s):  
Ali Mirzazadeh ◽  
Afshin Azizi ◽  
Yousef Abbaspour-Gilandeh ◽  
José Luis Hernández-Hernández ◽  
Mario Hernández-Hernández ◽  
...  

Estimation of crop damage plays a vital role in the management of fields in the agriculture sector. An accurate measure of it provides key guidance to support agricultural decision-making systems. The objective of the study was to propose a novel technique for classifying damaged crops based on a state-of-the-art deep learning algorithm. To this end, a dataset of rapeseed field images was gathered from the field after bird attacks. The dataset consisted of three classes: undamaged, partially damaged, and fully damaged crops. VGG16 and ResNet50, two pre-trained deep convolutional neural networks, were used to classify these classes. The overall classification accuracy reached 93.7% and 98.2% for VGG16 and ResNet50, respectively. The results indicated that deep neural networks are highly capable of distinguishing and categorizing image-based datasets of rapeseed. The findings also revealed the great potential of deep learning-based models for classifying other damaged crops.


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Yuntao Shi ◽  
Yingying Zhuang ◽  
Jialing Zhang ◽  
Mengxue Chen ◽  
Shangnong Wu

Objective. Although noncoding RNAs, especially microRNAs, have been found to play key roles in CRC development in intestinal tissue, the specific mechanism of these microRNAs is not fully understood. Methods. The GEO and TCGA databases were used to explore the microRNA expression profiles of normal mucosa, adenoma, and carcinoma, and differentially expressed genes were selected. Computationally, we built an SVM model and a multivariable Cox regression model to evaluate the performance of tumorigenic microRNAs in discriminating adenomas from normal tissues and in risk prediction. Results. In this study, we identified 20 miRNA biomarkers dysregulated in colon adenomas. The functional enrichment analysis showed that MAPK activity and MAPK cascade were highly enriched for these tumorigenic microRNAs. We also investigated the target genes of the tumorigenic microRNAs. Eleven genes, including PIGF, TPI1, KLF4, RARS, PCBP2, EIF5A, HK2, RAVER2, HMGN1, MAPK6, and NDUFA2, were identified as frequently targeted by the tumorigenic microRNAs. The high AUC value and distinct overall survival rates between the two risk groups suggested that these tumorigenic microRNAs have potential diagnostic and prognostic value in CRC. Conclusions. The present study revealed possible mechanisms and pathways that may contribute to tumorigenesis of CRC, which could be used not only as biomarkers for early CRC detection but also in studies of tumorigenesis mechanisms.


2021 ◽  
Vol 11 (8) ◽  
pp. 1288-1298
Author(s):  
Liang Wang ◽  
Fengxia Xue

Endometrial cancer is one of the most common gynecological malignancies, and DNA methylation plays a vital role in its occurrence and development. In this study, we collected the relevant data on endometrial cancer from The Cancer Genome Atlas database and the UCSC website. By screening and processing the data, we obtained 410 samples and 16,381 methylation sites. Endometrial carcinoma can be divided into seven molecular subtypes using a consensus clustering method. Based on the analysis of the differences among subtypes, the methylation degree of different sites was obtained, and a prognostic model based on methylation sites was established. Based on the median value of the training group, the training and test groups were divided into high- and low-risk groups. Survival differed between the high- and low-risk groups, which also showed that this model can predict the survival of patients with better accuracy. In conclusion, the tumor subtypes based on methylation sites can provide better guidance for the treatment, relapse, and prognosis of endometrial cancer. In this study, magnetic nanoparticles, owing to their paramagnetism and biocompatibility, can be used to extract genomic DNA and total RNA, after which high-throughput transcriptome sequencing was performed. The results may serve as potential cancer immune biomarker targets for developing future oncological treatments.
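The median-based stratification mentioned in this abstract can be sketched as follows. This is an illustrative assumption, not the authors' code: the function name `split_by_train_median`, the toy risk scores, and the choice to label scores strictly above the cutoff as high-risk are all hypothetical. The key point it shows is that the cutoff is computed on the training group only and then reused for the test group.

```python
from statistics import median


def split_by_train_median(train_scores, test_scores):
    """Label samples as 'high' or 'low' risk using the training-set median."""
    cutoff = median(train_scores)  # computed on the training group only

    def label(score):
        # Assumption: scores strictly above the cutoff count as high risk.
        return "high" if score > cutoff else "low"

    train_labels = [label(s) for s in train_scores]
    test_labels = [label(s) for s in test_scores]
    return cutoff, train_labels, test_labels


# Toy risk scores (hypothetical values, not from the study).
cutoff, train_labels, test_labels = split_by_train_median(
    [0.2, 0.8, 0.5, 1.1], [0.6, 0.7]
)
print(cutoff, train_labels, test_labels)
```

Reusing the training cutoff on the test group avoids letting test data influence the threshold, which is the usual reason for this design.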


BMC Cancer ◽  
2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Chao Yang ◽  
Shuoyang Huang ◽  
Fengyu Cao ◽  
Yongbin Zheng

Background and aim: Lipid metabolic reprogramming is considered to be a new hallmark of malignant tumors. The purpose of this study was to explore the expression profiles of lipid metabolism-related genes (LMRG) in colorectal cancer (CRC). Methods: The lipid metabolism statuses of 500 CRC patients from The Cancer Genome Atlas (TCGA) and 523 from the Gene Expression Omnibus (GEO GSE39582) database were analyzed. The risk signature was constructed by univariate Cox regression and least absolute shrinkage and selection operator (LASSO) Cox regression. Results: A novel four-LMRG signature (PROCA1, CCKBR, CPT2, and FDFT1) was constructed to predict clinical outcomes in CRC patients. The risk signature was shown to be an independent prognostic factor for CRC and was associated with tumor malignancy. Principal components analysis demonstrated that the risk signature could distinguish between low- and high-risk patients. There were significant differences in the abundance of tumor-infiltrating immune cells and in the mutational landscape between the two risk groups. Patients in the low-risk group were more likely to have higher tumor mutational burden, stem cell characteristics, and higher PD-L1 expression levels. Furthermore, a genomic-clinicopathologic nomogram was established and shown to be a more effective risk stratification tool than any clinical parameter alone. Conclusions: This study demonstrated the prognostic value of LMRG and showed that they may be partially involved in the formation of the suppressive immune microenvironment.


Author(s):  
Usman Ahmed ◽  
Jerry Chun-Wei Lin ◽  
Gautam Srivastava

Deep learning methods have led to state-of-the-art medical applications, such as image classification and segmentation. Data-driven deep learning applications can help stakeholders collaborate. However, limited labelled data restricts the ability of a deep learning algorithm to generalize from one domain to another. To handle this problem, meta-learning helps to learn from a small set of data. We propose a meta-learning-based image segmentation model that combines the learning of state-of-the-art models and then uses it to achieve domain adaptation and high accuracy. We also propose a preprocessing algorithm to increase the usability of the segmented parts and remove noise from new test images. The proposed model achieves 0.94 precision and 0.92 recall, a 3.3% increase over state-of-the-art algorithms.


BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Zhixun Zhao ◽  
Xiaocai Zhang ◽  
Fang Chen ◽  
Liang Fang ◽  
Jinyan Li

Background: DNA N4-methylcytosine (4mC) is a critical epigenetic modification and has various roles in the restriction-modification system. Due to the high cost of experimental laboratory detection, computational methods using sequence characteristics and machine learning algorithms have been explored to identify 4mC sites from DNA sequences. However, state-of-the-art methods have limited performance because of the lack of effective sequence features and the ad hoc choice of learning algorithms to cope with this problem. This paper aims to propose a new sequence feature space and a machine learning algorithm with a feature selection scheme to address the problem. Results: The feature importance score distributions in datasets of six species are first reported and analyzed. Then the impact of feature selection on model performance is evaluated by independent testing on benchmark datasets, where the ACC and MCC measurements after feature selection increase by 2.3% to 9.7% and 0.05 to 0.19, respectively. The proposed method is compared with three state-of-the-art predictors using an independent test and 10-fold cross-validation, and our method outperforms them on all datasets, improving the ACC by 3.02% to 7.89% and the MCC by 0.06 to 0.15 in the independent test. Two detailed case studies using the proposed method have confirmed the excellent overall performance and correctly identified 24 of 26 4mC sites from the C. elegans gene, and 126 of 137 4mC sites from the D. melanogaster gene. Conclusions: The results show that the proposed feature space and learning algorithm with feature selection can improve the performance of DNA 4mC prediction on the benchmark datasets. The two case studies prove the effectiveness of our method in practical situations.
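For readers unfamiliar with the two metrics reported in this abstract, the following sketch computes accuracy (ACC) and the Matthews correlation coefficient (MCC) from binary confusion-matrix counts using their standard definitions. The function name `acc_and_mcc` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def acc_and_mcc(tp, tn, fp, fn):
    """Compute accuracy and MCC from binary confusion-matrix counts."""
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc


# Hypothetical counts for illustration only.
acc, mcc = acc_and_mcc(tp=40, tn=45, fp=5, fn=10)
print(f"ACC={acc:.3f} MCC={mcc:.3f}")
```

Unlike ACC, MCC stays informative when the positive and negative classes are imbalanced, which is presumably why both are reported.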


2019 ◽  
Vol 20 (20) ◽  
pp. 5137 ◽  
Author(s):  
Pengjie Wang ◽  
Xuejin Chen ◽  
Yongchun Guo ◽  
Yucheng Zheng ◽  
Chuan Yue ◽  
...  

C-repeat binding factors (CBFs) are key signaling genes that can be rapidly induced by cold and bind to the C-repeat/dehydration-responsive motif (CRT/DRE) in the promoter region of the downstream cold-responsive (COR) genes, which play a vital role in the plant response to low temperature. However, the CBF family in tea plants has not yet been elucidated, and the possible target genes regulated by this family under low temperature are still unclear. In this study, we identified five CsCBF family genes in the tea plant genome and analyzed their phylogenetic tree, conserved domains and motifs, and cis-elements. These results indicate that CsCBF3 may be unique in the CsCBF family. This is further supported by our findings from the low-temperature treatment: all the CsCBF genes except CsCBF3 were significantly induced after treatment at 4 °C. The expression profiles of eight tea plant tissues showed that CsCBFs were mainly expressed in winter mature leaves, roots and fruits. Furthermore, 685 potential target genes were identified by transcriptome data and CRT/DRE element information. These target genes play a functional role under the low temperatures of winter through multiple pathways, including carbohydrate metabolism, lipid metabolism, cell wall modification, circadian rhythm, calcium signaling, transcriptional cascade, and hormone signaling pathways. Our findings will further the understanding of the stress regulatory network of CsCBFs in tea plants.


2014 ◽  
Vol 641-642 ◽  
pp. 1287-1290
Author(s):  
Lan Zhang ◽  
Yu Feng Nie ◽  
Zhen Hai Wang

Deep neural networks, as a part of deep learning, are a state-of-the-art approach to finding higher-level representations of input data and have been applied successfully to many practical and challenging learning problems. The primary goal of deep learning is to use large amounts of data to help solve a given machine learning task. We propose a methodology for image de-noising based on this model and train it on a large image database to obtain the experimental output. The results show the robustness and efficiency of our algorithm.

