Development and Validation of Epigenetic Modification-Related Signals for the Diagnosis and Prognosis of Hepatocellular Carcinoma

2021 ◽  
Vol 11 ◽  
Author(s):  
Maoqing Lu ◽  
Sheng Qiu ◽  
Xianyao Jiang ◽  
Diguang Wen ◽  
Ronggui Zhang ◽  
...  

Background: Increasing evidence has indicated that abnormal epigenetic factors such as RNA m6A modification, histone modification, DNA methylation, RNA-binding proteins and transcription factors are correlated with hepatocarcinogenesis. However, it is unknown how epigenetic modification-associated genes contribute to the occurrence and clinical outcome of hepatocellular carcinoma (HCC). Thus, we constructed epigenetic modification-associated models that may enhance the diagnosis and prognosis of HCC. Methods: In this study, we focused on the clinical value of epigenetic modification-associated genes for HCC. Our gene expression data were collected from TCGA and from HCC data sets in the GEO database to ensure the reliability of the data. Their functions were analyzed by bioinformatics methods. We used lasso regression, support vector machine (SVM), logistic regression and Cox regression to construct the diagnostic and prognostic models. We also constructed a nomogram to assess the practicability of the above-mentioned prognostic model. The above results were verified in an independent liver cancer data set from the ICGC database and in clinical samples. Furthermore, we carried out a pan-cancer analysis to verify the specificity of the above model and screened a wide range of drug candidates. Results: Many epigenetic modification-associated genes differed significantly between HCC and normal liver tissues. The gene signatures showed a good ability to predict the occurrence and survival of HCC patients, as verified by DCA and ROC curve analysis. Conclusion: Gene signatures based on epigenetic modification-associated genes can be used to identify the occurrence and prognosis of liver cancer.
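The diagnostic models above are evaluated with ROC curve analysis. As a minimal, self-contained illustration of the AUC statistic that such a curve summarizes (the risk scores below are hypothetical, not the study's data):

```python
def auc(scores_pos, scores_neg):
    """ROC AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical risk scores from a diagnostic gene signature:
tumor = [0.9, 0.8, 0.7, 0.4]
normal = [0.3, 0.5, 0.2, 0.1]
print(auc(tumor, normal))  # 0.9375
```

An AUC of 1.0 would mean every tumor sample scores above every normal sample; 0.5 is chance level.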

2021 ◽  
Author(s):  
Diguang Wen ◽  
Sheng Qiu ◽  
Zuojin Liu

Abstract Background: Increasing evidence has indicated that abnormal epigenetic modification, such as RNA m6A modification, histone modification, DNA methylation, RNA-binding proteins and transcription factors, is correlated with hepatocarcinogenesis. However, it is unknown how epigenetic modification-associated genes contribute to the occurrence and clinical outcome of hepatocellular carcinoma (HCC). Thus, we constructed an epigenetic modification-associated model that may enhance the diagnosis and prognosis of HCC. Methods: In this study, we focused on the clinical value of epigenetic modification-associated genes for HCC. Our gene expression data were collected from TCGA and from HCC data sets in the GEO database to ensure the reliability of the data. Their function was analyzed by bioinformatics methods. We used lasso regression, support vector machine (SVM), logistic regression and Cox regression to construct the diagnostic and prognostic models. We also constructed a nomogram to assess the practicability of the above-mentioned prognostic model. The above results were verified in an independent liver cancer data set from the ICGC database. Furthermore, we carried out a pan-cancer analysis to verify the specificity of the above model. Results: A large number of epigenetic modification-associated genes differed significantly between HCC and normal liver tissues. The gene signatures showed good performance for predicting the occurrence and survival of HCC patients, as verified by DCA and ROC curve analysis. Conclusion: Gene signatures based on epigenetic modification-associated genes can be used to identify the occurrence and prognosis of liver cancer.


PLoS ONE ◽  
2011 ◽  
Vol 6 (9) ◽  
pp. e24582 ◽  
Author(s):  
Irena Ivanovska ◽  
Chunsheng Zhang ◽  
Angela M. Liu ◽  
Kwong F. Wong ◽  
Nikki P. Lee ◽  
...  

2021 ◽  
Vol 12 (3) ◽  
pp. 1728-1737
Author(s):  
Juliana Wahid et al.

The fourth most frequent cause of cancer death in women is cervical cancer. No signs are observable in the early stages of the disease, and the cervical cancer diagnosis methods used in health centers are time-consuming and costly. Data classification has been widely applied in the diagnosis of cervical cancer for knowledge acquisition. However, none of the existing intelligent methods are comprehensible; they look like a black box to clinicians. In this paper, an ant colony optimization-based classification algorithm, Ant-Miner, is applied to analyze a cervical cancer data set obtained from the repository of the University of California, Irvine (UCI). The proposed algorithm outperforms the previous approach in the same domain, a support vector machine, in terms of classification accuracy. The proposed method is implemented as an engine in a prototype system named the cervical cancer detection system. Evaluation of the prototype system demonstrates good usability and functionality.
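The term-selection step at the heart of Ant-Miner can be sketched as follows. This illustrates the classic Ant-Miner formula (probability of adding a term proportional to pheromone times heuristic value), not the paper's implementation, and the numeric values are hypothetical:

```python
import random

def term_probabilities(pheromone, heuristic):
    """Ant-Miner style term selection: the probability of an ant adding
    term i to the rule under construction is proportional to
    pheromone[i] * heuristic[i] (heuristic is typically based on
    information gain). A sketch of the classic formula."""
    weights = [t * e for t, e in zip(pheromone, heuristic)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical pheromone and heuristic values for three candidate terms:
probs = term_probabilities([1.0, 1.0, 2.0], [0.5, 0.3, 0.1])
choice = random.choices(range(3), weights=probs)[0]  # stochastic term choice
```

After each ant builds and prunes a rule, pheromone on the terms the rule uses is reinforced, biasing later ants toward promising, human-readable rules; this is what makes the discovered model comprehensible, unlike a black-box classifier.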


2020 ◽  
Author(s):  
Mazin Mohammed ◽  
Karrar Hameed Abdulkareem ◽  
Mashael S. Maashi ◽  
Salama A. Mostafa ◽  
Abdullah Baz ◽  
...  

BACKGROUND In recent times, global concern has been caused by the coronavirus (COVID-19), which is considered a global health threat due to its rapid spread across the globe. Machine learning (ML) is a computational method that can be used to automatically learn from experience and improve the accuracy of predictions. OBJECTIVE In this study, machine learning was applied to a coronavirus data set of 50 X-ray images to develop detection modalities and identify risk factors. The data set contains a range of samples of COVID-19 cases alongside SARS, MERS, and ARDS. The experiment was carried out using a total of 50 X-ray images, of which 25 were positive COVID-19 cases and the other 25 were normal cases. METHODS The Orange data-mining tool was used for data manipulation and was employed to develop and analyse seven types of predictive models for classifying patients as coronavirus carriers or non-carriers. The models used in this study were an artificial neural network (ANN), support vector machines (SVM) with linear and radial basis function (RBF) kernels, k-nearest neighbour (k-NN), decision tree (DT), and the CN2 rule inducer. Furthermore, the standard InceptionV3 model was used for feature extraction. RESULTS The various machine learning techniques were trained on the coronavirus disease 2019 (COVID-19) data set with improved parameters. The data set was divided into two parts, training and testing: the models were trained using 70% of the data set, while the remaining 30% was used for testing. The results show that the improved SVM achieved an F1 score of 97% and an accuracy of 98%. CONCLUSIONS In this study, seven models were developed to aid the detection of coronavirus. In such cases, the learning performance can be improved through knowledge transfer, whereby time-consuming data labelling efforts are not required. The evaluations of all the models were done in terms of different parameters; all the models performed well, but the SVM demonstrated the best result on the accuracy metric. Future work will compare classical approaches with deep learning ones and try to obtain better results. CLINICALTRIAL None
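The reported accuracy and F1 score are standard confusion-matrix metrics. A minimal sketch of how they are computed (the counts below are hypothetical, chosen only for illustration, since the abstract does not report the study's confusion matrix):

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Accuracy and F1 score from a binary confusion matrix
    (tp/fp = true/false positives, fn/tn = false/true negatives)."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1

# Hypothetical counts for a 50-image evaluation:
acc, f1 = accuracy_and_f1(tp=24, fp=1, fn=0, tn=25)
print(round(acc, 2), round(f1, 2))  # 0.98 0.98
```

Unlike accuracy, F1 ignores true negatives, which is why the two metrics can diverge on imbalanced data even though they coincide here.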


2015 ◽  
Vol 54 (05) ◽  
pp. 455-460 ◽  
Author(s):  
M. Ganzinger ◽  
T. Muley ◽  
M. Thomas ◽  
P. Knaup ◽  
D. Firnkorn

Summary Objective: Joint data analysis is a key requirement in medical research networks. Data are available in heterogeneous formats at each network partner, and their harmonization is often rather complex. The objective of our paper is to provide a generic approach for the harmonization process in research networks. We applied the process when harmonizing data from three sites for the Lung Cancer Phenotype Database within the German Center for Lung Research. Methods: We developed a spreadsheet-based solution as a tool to support the harmonization process for lung cancer data, and a data integration procedure based on Talend Open Studio. Results: The harmonization process consists of eight steps describing a systematic approach for defining and reviewing source data elements and standardizing common data elements. The steps for defining common data elements and harmonizing them with local data definitions are repeated until consensus is reached. Application of this process for building the phenotype database led to a common basic data set on lung cancer with 285 structured parameters. The Lung Cancer Phenotype Database was realized as an i2b2 research data warehouse. Conclusion: Data harmonization is a challenging task requiring informatics skills as well as domain knowledge. Our approach facilitates data harmonization by providing guidance through a uniform process that can be applied in a wide range of projects.
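One core step of such harmonization, mapping site-specific codings onto an agreed common data element, can be sketched as follows. The element and code names here are invented for illustration; they are not from the project's actual 285-parameter data set:

```python
# Agreed common data elements and their permitted values
# (a hypothetical example, not the project's real catalogue):
COMMON_ELEMENTS = {"smoking_status": {"never", "former", "current", "unknown"}}

# One site's local coding for the same concept:
SITE_A_CODES = {"0": "never", "1": "current", "2": "former"}

def harmonize(element, raw_value, site_codes):
    """Map a site-specific raw value onto the common data element,
    falling back to 'unknown' for unmapped codes."""
    value = site_codes.get(raw_value, "unknown")
    if value not in COMMON_ELEMENTS[element]:
        raise ValueError(f"{value!r} is not allowed for {element}")
    return value

print(harmonize("smoking_status", "1", SITE_A_CODES))  # current
```

In practice this mapping table is exactly what the spreadsheet-based solution captures per site, and what an ETL tool such as Talend Open Studio then executes when loading the i2b2 warehouse.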


2013 ◽  
Vol 2013 ◽  
pp. 1-10 ◽  
Author(s):  
Jianwei Liu ◽  
Shuang Cheng Li ◽  
Xionglin Luo

Support vector machine (SVM) is an effective classification and regression method that uses machine learning theory to maximize predictive accuracy while avoiding overfitting of the data. L2 regularization has been commonly used. If the training data set contains many noise variables, L1 regularization SVM will provide better performance. However, neither L1 nor L2 is the optimal regularization method when handling a large number of redundant variables and only a small number of data points is useful for machine learning. We have therefore proposed an adaptive learning algorithm using the iterative reweighted p-norm regularization support vector machine for 0 < p ≤ 2. A simulated data set was created to evaluate the algorithm. It was shown that a p value of 0.8 was able to produce a better feature selection rate with high accuracy. Four cancer data sets from public data banks were also used for the evaluation. All four evaluations show that the new adaptive algorithm was able to achieve the optimal prediction error using a p value less than 1 (the L1 norm). Moreover, we observe that the proposed Lp penalty is more robust to noise variables than the L1 and L2 penalties.
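The reweighting idea can be sketched in a few lines. This is the generic form of iterative reweighting for an Lp penalty under the usual quadratic approximation, not the authors' exact update rule:

```python
def reweight(beta, p, eps=1e-8):
    """One step of iterative reweighting for an Lp penalty, 0 < p <= 2:
    sum(|b_j|**p) is locally approximated by sum(w_j * b_j**2) with
    w_j = |b_j|**(p - 2); eps guards the singularity at b_j = 0.
    A generic sketch of the scheme, not the paper's implementation."""
    return [(abs(b) + eps) ** (p - 2) for b in beta]

# With p = 0.8 (the value the paper found effective), near-zero
# coefficients receive far larger weights than large ones, driving
# them toward zero and yielding the sparse feature selection the
# abstract describes:
weights = reweight([1.0, 0.01], p=0.8)
```

Each iteration solves a weighted L2-regularized SVM with these weights and then recomputes them from the new coefficients, so the non-convex Lp penalty is handled as a sequence of convex problems.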


Author(s):  
I.A. Borisova ◽  
O.A. Kutnenko

The problem of outlier detection is one of the important problems in data mining of biomedical data sets, particularly when there could be misclassified objects caused by diagnostic pitfalls at the data-collection stage. The occurrence of such objects complicates and slows down data set processing, distorts and corrupts the detected regularities, and reduces their accuracy score. We propose a censoring algorithm that can detect misclassified objects, after which they are either removed from the data set or the class attribute of such objects is corrected. The correction procedure keeps the volume of the analyzed data set as large as possible. This quality is very useful when analyzing small data sets, where every bit of information can be important. The base concept of the presented work is a measure of similarity of an object with its surroundings. To evaluate the local similarity of an object with its closest neighbors, a ternary relative measure called the function of rival similarity (FRiS-function) is used. The mean of the similarity values of all objects in the data set gives a notion of a class's separability: how close objects from the same class are to each other, and how far they are from objects of different classes (with different diagnoses) in the attribute space. Misclassified objects are supposed to be more similar to objects from rival classes than to their own class, so their elimination from the data set, or correction of the target attribute, should increase the data separability value. The procedure of filtering and correcting misclassified objects is based on observing the changes in the evaluation of data separability calculated before and after making corrections to the data set. The censoring process continues until the inflection point of the separability function is reached. The proposed algorithm was tested on a wide range of model tasks of different complexity. It was also tested on biomedical tasks such as the Pima Indians Diabetes, Breast Cancer, and Parkinson data sets. On these tasks, the censoring algorithm showed high sensitivity to misclassification. The increase in accuracy score and the preservation of data set volume after the censoring procedure support our base assumptions and the effectiveness of the algorithm.
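The FRiS-function used above is commonly given in the following form; this is a sketch under that assumption, not the authors' implementation:

```python
def fris(r_own, r_rival):
    """FRiS-function value for an object: r_own is the distance to its
    nearest same-class neighbour, r_rival the distance to its nearest
    rival-class neighbour. Values lie in (-1, 1); negative values flag
    objects that sit closer to a rival class, i.e. candidates for
    correction or removal by the censoring procedure."""
    return (r_rival - r_own) / (r_rival + r_own)

print(fris(1.0, 3.0))  # 0.5  -> object fits its own class
print(fris(3.0, 1.0))  # -0.5 -> likely misclassified object
```

Averaging these values over the whole data set gives the separability score whose growth the censoring algorithm monitors after each correction.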


2018 ◽  
Author(s):  
Paul F. Harrison ◽  
Andrew D. Pattison ◽  
David R. Powell ◽  
Traude H. Beilharz

Abstract Background: A differential gene expression analysis may produce a set of significantly differentially expressed genes that is too large to investigate easily, so a means of ranking genes by their biological interest level is desirable. The life sciences have grappled with the abuse of p-values to rank genes for this purpose. As an alternative, a lower confidence bound on the magnitude of the log fold change (LFC) could be used to rank genes, but it has been unclear how to reconcile this with the need to perform false discovery rate (FDR) correction. The TREAT test of McCarthy and Smyth is a step in this direction, finding genes significantly exceeding a specified LFC threshold. Here we describe the use of test inversion on TREAT to present genes ranked by a confidence bound on the LFC, while still controlling FDR. Results: Testing the Topconfects R package with simulated gene expression data shows the method outperforming current statistical approaches across a wide range of experiment sizes in the identification of genes with the largest LFCs. Applying the method to a TCGA breast cancer data set shows that it ranks some genes with large LFC higher than traditional ranking by p-value would. Importantly, these two ranking methods lead to a different biological emphasis, in terms both of specific highly ranked genes and of gene-set enrichment. Conclusions: The choice of ranking method in differential expression analysis can affect the biological interpretation. The common default of ranking by p-value is implicitly a ranking by an effect size in which each gene is standardized to its own variability, rather than a comparison of genes on a common scale, which may not be appropriate. The Topconfects approach of presenting genes ranked by confident LFC effect size is a variation on the TREAT method with improved usability, removing the need to fine-tune a threshold parameter and removing the temptation to abuse p-values as a de facto effect size.
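The core idea of ranking by a confident effect size can be sketched with a plain normal-theory bound. This is an illustrative simplification of Topconfects' FDR-controlled test inversion, not the package's actual method, and the LFC/standard-error values are hypothetical:

```python
def confident_lfc(lfc, se, z=1.96):
    """Lower confidence bound on the magnitude of a log fold change:
    rank genes by a confidently-exceeded effect size rather than by
    p-value. The simple normal approximation and fixed z are
    illustrative simplifications of Topconfects' test inversion."""
    return max(abs(lfc) - z * se, 0.0)

# A large but noisy LFC can rank below a moderate but precise one,
# whereas a p-value ranking would favour the precise gene regardless
# of how small its fold change is:
print(round(confident_lfc(4.0, 2.0), 3))  # 0.08
print(round(confident_lfc(2.0, 0.2), 3))  # 1.608
```

Sorting genes by this bound in descending order gives a ranking on a common LFC scale, rather than one in which each gene is standardized to its own variability.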


2021 ◽  
Vol 10 ◽  
Author(s):  
Rong Liang ◽  
Jinyan Zhang ◽  
Zhihui Liu ◽  
Ziyu Liu ◽  
Qian Li ◽  
...  

RNA-binding motif protein 8A (RBM8A) is abnormally overexpressed in hepatocellular carcinoma (HCC) and involved in the epithelial-mesenchymal transition (EMT). The EMT plays an important role in the development of drug resistance, suggesting that RBM8A may be involved in the regulation of oxaliplatin (OXA) resistance in HCC. Here we examined the potential involvement of RBM8A and its downstream pathways in OXA resistance using in vitro and in vivo models. RBM8A overexpression induced the EMT in OXA-resistant HCC cells, altering cell proliferation, apoptosis, migration, and invasion. Moreover, whole-genome microarrays combined with bioinformatics analysis revealed that RBM8A has a wide range of transcriptional regulatory capabilities in OXA-resistant HCC, including the ability to regulate several important tumor-related signaling pathways. In particular, histone deacetylase 9 (HDAC9) emerged as an important mediator of RBM8A activity related to OXA resistance. These data suggest that RBM8A and its related regulatory pathways represent potential markers of OXA resistance and therapeutic targets in HCC.


2021 ◽  
Author(s):  
Robin J Borchert ◽  
Tiago Azevedo ◽  
Amanpreet Badhwar ◽  
Jose Bernal ◽  
Matthew Betts ◽  
...  

Introduction: Recent developments in artificial intelligence (AI) and neuroimaging offer new opportunities for improving the diagnosis and prognosis of dementia. To synthesise the available literature, we performed a systematic review. Methods: We systematically reviewed primary research publications up to January 2021 that used AI on neuroimaging to predict diagnosis and/or prognosis in cognitive neurodegenerative diseases. After initial screening, data were extracted from each study, including demographic information, AI methods, neuroimaging features, and results. Results: We found 2709 reports, with 252 eligible papers remaining after screening. Most studies relied on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset (n=178), with no other individual dataset used more than 5 times. Algorithmic classifiers, such as the support vector machine (SVM), were the most commonly used AI method (47%), followed by discriminative (32%) and generative (11%) classifiers. Structural MRI was used in 71% of studies, with a wide range of accuracies for the diagnosis of neurodegenerative diseases and the prediction of prognosis. Lower accuracy was found in studies using a multi-class classifier or an external cohort as the validation group. Accuracy improved when neuroimaging modalities were combined, e.g. PET and structural MRI. Only 17 papers studied non-Alzheimer's disease dementias. Conclusion: The use of AI with neuroimaging for diagnosis and prognosis in dementia is a rapidly emerging field. We make a number of recommendations addressing the definition of key clinical questions, the heterogeneity of AI methods, and the availability of appropriate and representative data. We anticipate that addressing these issues will enable the field to move towards meaningful clinical translation.

