Integrative transcriptomic, proteomic, and machine learning approach to identifying feature genes of atrial fibrillation

Abstract Background: Atrial fibrillation (AF) is the most common arrhythmia, and its mechanisms are poorly understood. We aimed to investigate the biological mechanism of AF and to discover feature genes by analyzing multi-omics data and applying a machine learning approach. Methods: At the transcriptomic level, four microarray datasets (GSE41177, GSE79768, GSE115574, GSE14975) were downloaded from the Gene Expression Omnibus database, which included 130 available atrial samples from the AF and sinus rhythm (SR) groups. Microarray meta-analysis was adopted to identify differentially expressed genes (DEGs). At the proteomic level, a qualitative and quantitative proteomic analysis of the left atrial appendage of 18 patients (9 with AF and 9 with SR) was conducted. The machine learning correlation-based feature selection (CFS) method was introduced to select feature genes of AF using the training set of 130 samples involved in the microarray meta-analysis. The Naive Bayes (NB) based classifier constructed using the training set was evaluated on an independent validation test set, GSE2240. Results: 863 DEGs with FDR < 0.05 and 482 differentially expressed proteins (DEPs) with FDR < 0.1 and fold change > 1.2 were obtained from the transcriptomic and proteomic studies, respectively. The DEGs and DEPs were then analyzed together, which identified 30 biomarkers with consistent trends. Further, 10 features, including 8 upregulated genes (CD44, CHGB, FHL2, GGT5, IGFBP2, NRAP, SEPTIN6, YWHAQ) and 2 downregulated genes (TNNT1, TRDN), were selected from the 30 biomarkers through the machine learning CFS method using the training set. The NB based classifier constructed using the training set accurately and reliably classified AF from SR samples in the validation test set, with a precision of 87.5% and an AUC of 0.995. Conclusion: Taken together, our present work might provide novel insights into the molecular mechanism of AF and some promising diagnostic and therapeutic targets.


Integrative transcriptomic, proteomic, and machine learning approach to identifying feature genes of atrial fibrillation using atrial samples from patients with valvular heart disease

BMC Cardiovascular Disorders ◽

10.1186/s12872-020-01819-0 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Yaozhong Liu ◽

Fan Bai ◽

Zhenwei Tang ◽

Na Liu ◽

Qiming Liu

Keyword(s):

Machine Learning ◽

Atrial Fibrillation ◽

Heart Disease ◽

Valvular Heart Disease ◽

Meta Analysis ◽

Learning Approach ◽

Training Set ◽

Test Set ◽

Validation Test ◽

Machine Learning Approach

Abstract Background Atrial fibrillation (AF) is the most common arrhythmia, and its mechanisms are poorly understood. We aimed to investigate the biological mechanism of AF and to discover feature genes by analyzing multi-omics data and applying a machine learning approach. Methods At the transcriptomic level, four microarray datasets (GSE41177, GSE79768, GSE115574, GSE14975) were downloaded from the Gene Expression Omnibus database, which included 130 available atrial samples from AF and sinus rhythm (SR) patients with valvular heart disease. Microarray meta-analysis was adopted to identify differentially expressed genes (DEGs). At the proteomic level, a qualitative and quantitative proteomic analysis of the left atrial appendage of 18 patients (9 with AF and 9 with SR) who underwent cardiac valvular surgery was conducted. The machine learning correlation-based feature selection (CFS) method was introduced to select feature genes of AF using the training set of 130 samples involved in the microarray meta-analysis. The Naive Bayes (NB) based classifier constructed using the training set was evaluated on an independent validation test set, GSE2240. Results 863 DEGs with FDR < 0.05 and 482 differentially expressed proteins (DEPs) with FDR < 0.1 and fold change > 1.2 were obtained from the transcriptomic and proteomic studies, respectively. The DEGs and DEPs were then analyzed together, which identified 30 biomarkers with consistent trends. Further, 10 features, including 8 upregulated genes (CD44, CHGB, FHL2, GGT5, IGFBP2, NRAP, SEPTIN6, YWHAQ) and 2 downregulated genes (TNNI1, TRDN), were selected from the 30 biomarkers through the machine learning CFS method using the training set. The NB based classifier constructed using the training set accurately and reliably classified AF from SR samples in the validation test set, with a precision of 87.5% and an AUC of 0.995.
Conclusion Taken together, our present work might provide novel insights into the molecular mechanism and provide some promising diagnostic and therapeutic targets of AF.
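The correlation-based feature selection step named in the abstract can be illustrated with a minimal sketch. It assumes the standard CFS merit heuristic, merit = k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation of a k-feature subset. The use of absolute Pearson correlation, the greedy forward search, the `min_gain` threshold, and the toy expression values are all assumptions made for illustration; the abstract does not state which correlation measure or search strategy the authors used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def cfs_merit(features, labels, subset):
    """CFS merit: rewards correlation with the class, penalizes redundancy."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(features[f], labels)) for f in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = (
        sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
        if pairs
        else 0.0
    )
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(features, labels, min_gain=1e-12):
    """Greedy forward search: add the best feature while the merit still improves."""
    selected, best = [], 0.0
    remaining = set(features)
    while remaining:
        merit, name = max(
            (cfs_merit(features, labels, selected + [f]), f) for f in sorted(remaining)
        )
        if merit - best <= min_gain:
            break
        selected.append(name)
        remaining.remove(name)
        best = merit
    return selected, best


# Hypothetical toy data: 4 "AF" samples (label 1) and 4 "SR" samples (label 0).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "gene_a": [5, 6, 5, 6, 1, 2, 1, 2],       # informative
    "gene_a_copy": [5, 6, 5, 6, 1, 2, 1, 2],  # redundant duplicate of gene_a
    "gene_noise": [1, 2, 2, 1, 1, 2, 2, 1],   # uncorrelated with the label
}
selected, merit = cfs_forward_select(features, labels)
print(selected, round(merit, 3))
```

In this toy case the search keeps a single gene: the duplicate adds no merit because its redundancy cancels its relevance, and the noise feature lowers the merit.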


COVID-19 ICU mortality prediction: a machine learning approach using SuperLearner algorithm

Journal of Anesthesia, Analgesia and Critical Care ◽

10.1186/s44158-021-00002-x ◽

2021 ◽

Vol 1 (1) ◽

Author(s):

Giulia Lorenzoni ◽

Nicolò Sella ◽

Annalisa Boscolo ◽

Danila Azzolina ◽

Patrizia Bartolotta ◽

...

Keyword(s):

Machine Learning ◽

Mechanical Ventilation ◽

Predictive Models ◽

Invasive Mechanical Ventilation ◽

External Validation ◽

Learning Approach ◽

Training Set ◽

Icu Mortality ◽

Test Set ◽

Machine Learning Approach

Abstract Background Since the beginning of coronavirus disease 2019 (COVID-19), the development of predictive models has sparked relevant interest due to the initial lack of knowledge about diagnosis, treatment, and prognosis. The present study aimed at developing a model, through a machine learning approach, to predict intensive care unit (ICU) mortality in COVID-19 patients based on predefined clinical parameters. Results Observational multicenter cohort study. All COVID-19 adult patients admitted to 25 ICUs belonging to the VENETO ICU network (February 28th, 2020 to April 4th, 2021) were enrolled. Patients admitted to the ICUs before 4th March 2021 were used for model training (“training set”), while patients admitted after the 5th of March 2021 were used for external validation (“test set 1”). A further group of patients (“test set 2”), admitted to the ICU of IRCCS Ca’ Granda Ospedale Maggiore Policlinico of Milan, was used for external validation. A SuperLearner machine learning algorithm was applied for model development, and both internal and external validation were performed. Clinical variables available for the model were (i) age, gender, sequential organ failure assessment score, Charlson Comorbidity Index score (not adjusted for age), Palliative Performance Score; (ii) need of invasive mechanical ventilation, non-invasive mechanical ventilation, O2 therapy, vasoactive agents, extracorporeal membrane oxygenation, continuous veno-venous hemofiltration, tracheostomy, re-intubation, prone position during ICU stay; and (iii) re-admission in ICU. One thousand two hundred ninety-three (80%) patients were included in the “training set”, while 124 (8%) and 199 (12%) patients were included in the “test set 1” and “test set 2,” respectively. Three different predictive models were developed. Each model included different sets of clinical variables.
The three models showed similar predictive performances, with a training balanced accuracy that ranged between 0.72 and 0.90, while the cross-validation performance ranged from 0.75 to 0.85. Age was the leading predictor for all the considered models. Conclusions Our study provides a useful and reliable tool, through a machine learning approach, for predicting ICU mortality in COVID-19 patients. In all the estimated models, age was the variable showing the most important impact on mortality.
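The balanced accuracy reported in this abstract can be illustrated with a short sketch. It assumes the usual binary definition, the mean of sensitivity and specificity; the outcome labels below are hypothetical and are not taken from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on 1s) and specificity (recall on 0s)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present in y_true")
    return 0.5 * (tp / pos + tn / neg)


def accuracy(y_true, y_pred):
    """Plain accuracy, shown for comparison."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical outcomes: 1 = died in ICU, 0 = survived (imbalanced on purpose).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_survive = [0] * len(y_true)

print(accuracy(y_true, always_survive))           # 0.8
print(balanced_accuracy(y_true, always_survive))  # 0.5
```

A model that always predicts survival scores 0.8 on plain accuracy here but only 0.5 on balanced accuracy, which is why the latter is a common choice when one outcome is much rarer than the other.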


Revisiting the dynamic risk profile of cardiovascular/non‐cardiovascular multimorbidity in incident atrial fibrillation patients and five cardiovascular/non‐cardiovascular outcomes: A machine‐learning approach

Journal of Arrhythmia ◽

10.1002/joa3.12555 ◽

2021 ◽

Author(s):

Gregory Y. H. Lip ◽

George Tran ◽

Ash Genaidy ◽

Patricia Marroquin ◽

Cara Estes

Keyword(s):

Machine Learning ◽

Atrial Fibrillation ◽

Risk Profile ◽

Cardiovascular Outcomes ◽

Learning Approach ◽

Dynamic Risk ◽

Machine Learning Approach


kScore: a novel machine learning approach that is not dependent on the data structure of the training set

Journal of Computer-Aided Molecular Design ◽

10.1007/s10822-007-9108-0 ◽

2007 ◽

Vol 21 (1-3) ◽

pp. 87-95 ◽

Cited By ~ 4

Author(s):

Scott Oloff ◽

Ingo Muegge

Keyword(s):

Machine Learning ◽

Data Structure ◽

Learning Approach ◽

Training Set ◽

Machine Learning Approach


Real time object detection in images based on an AdaBoost machine learning approach and a small training set

10.22215/etd/2005-07497 ◽

2005 ◽

Author(s):

Miloš Stojmenović

Keyword(s):

Machine Learning ◽

Object Detection ◽

Real Time ◽

Learning Approach ◽

Training Set ◽

Machine Learning Approach


Prediction of sepsis patients using machine learning approach: A meta-analysis

Computer Methods and Programs in Biomedicine ◽

10.1016/j.cmpb.2018.12.027 ◽

2019 ◽

Vol 170 ◽

pp. 1-9 ◽

Cited By ~ 24

Author(s):

Md. Mohaimenul Islam ◽

Tahmina Nasrin ◽

Bruno Andreas Walther ◽

Chieh-Chen Wu ◽

Hsuan-Chia Yang ◽

...

Keyword(s):

Machine Learning ◽

Meta Analysis ◽

Learning Approach ◽

Machine Learning Approach


P-Wave Area Predicts New Onset Atrial Fibrillation in Mitral Stenosis: A Machine Learning Approach

Frontiers in Bioengineering and Biotechnology ◽

10.3389/fbioe.2020.00479 ◽

2020 ◽

Vol 8 ◽

Cited By ~ 2

Author(s):

Gary Tse ◽

Ishan Lakhani ◽

Jiandong Zhou ◽

Ka Hou Christien Li ◽

Sharen Lee ◽

...

Keyword(s):

Machine Learning ◽

Atrial Fibrillation ◽

Mitral Stenosis ◽

P Wave ◽

Learning Approach ◽

Machine Learning Approach ◽

Onset Atrial Fibrillation ◽

New Onset


Plagiarism Detection in Programming Assignments using Machine Learning

Journal of Artificial Intelligence and Capsule Networks - September 2019 ◽

10.36548/jaicn.2020.3.005 ◽

2020 ◽

Vol 2 (3) ◽

pp. 177-184

Author(s):

Nishesh Awale ◽

Mitesh Pandey ◽

Anish Dulal ◽

Bibek Timsina

Keyword(s):

Machine Learning ◽

Source Code ◽

Similarity Score ◽

Support Vector ◽

Learning Approach ◽

Accuracy Score ◽

Plagiarism Detection ◽

Test Set ◽

Vector Machines ◽

Machine Learning Approach

Plagiarism in programming assignments has been increasing, which affects the evaluation of students. This paper proposes a machine learning approach for plagiarism detection in programming assignments. Different features related to source code are computed based on the similarity score of n-grams, code style similarity, and dead code. An XGBoost model is then trained to predict whether a pair of source files is plagiarised or not. Many plagiarism detection techniques ignore dead code, such as unused variables and functions, in their prediction tasks, but the number of unused variables and functions in the source code is considered in this paper. Using our features, the model achieved an accuracy score of 94% and an average F1-score of 0.905 on the test set. We also compared the result of the XGBoost model with support vector machines (SVM) and report that the XGBoost model performed better on our dataset.
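One of the features mentioned above, an n-gram similarity score between two source files, might be sketched as follows. The regex tokenizer, the choice of token trigrams, and the Jaccard index are assumptions made for illustration; the abstract does not specify how the authors compute their similarity score.

```python
import re


def tokenize(source):
    """Split source text into identifiers, integer literals, and single symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)


def ngrams(tokens, n=3):
    """Set of contiguous token n-grams (empty if the input has fewer than n tokens)."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_similarity(source_a, source_b, n=3):
    """Jaccard index of the two token n-gram sets, in the range [0, 1]."""
    grams_a = ngrams(tokenize(source_a), n)
    grams_b = ngrams(tokenize(source_b), n)
    union = grams_a | grams_b
    if not union:
        return 0.0  # too little code on both sides to compare
    return len(grams_a & grams_b) / len(union)


# Renaming one variable keeps most of the token trigrams intact.
print(ngram_similarity("total = a + b", "s = a + b"))  # 0.5
```

A score like this would be one input column among several (style similarity, counts of unused variables and functions) for the pairwise classifier described in the abstract.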
