Machine learning approach for the search of resonances with topological features at the Large Hadron Collider

Author(s):  
Salah-Eddine Dahbi ◽  
Joshua Choma ◽  
Gaogalalwe Mokgatitswane ◽  
Xifeng Ruan ◽  
Benjamin Lieberman ◽  
...  
2020 ◽  
Author(s):  
Jose Arturo Molina-Mora ◽  
Pablo Montero-Manso ◽  
Raquel García Batán ◽  
Rebeca Campos Sánchez ◽  
Jose Vilar Fernández ◽  
...  

Abstract: Tolerance to stress conditions is vital for organismal survival, including for bacteria exposed to specific environmental conditions, antibiotics and other perturbations. Some studies have described common modulation and shared genes during the stress response to different types of disturbances (termed the perturbome), leading to the idea of central control at the molecular level. We implemented a robust machine learning approach to identify and describe genes associated with multiple perturbations, or the perturbome, in a Pseudomonas aeruginosa PAO1 model. Using public transcriptomic data, we evaluated six approaches to rank and select genes: for each of two data partitioning methodologies, a single partition (SP method) or multiple partitions (MP method) into training and testing datasets, we evaluated three classification algorithms (SVM, Support Vector Machine; KNN, K-Nearest Neighbor; and RF, Random Forest). Gene expression patterns and topological features at the systems level were included to describe the perturbome elements. We were able to select and describe 46 core response genes associated with multiple perturbations in Pseudomonas aeruginosa PAO1, which can be considered a first report of the P. aeruginosa perturbome. Molecular annotations, patterns in expression levels and topological features in molecular networks revealed biological functions of biosynthesis, binding and metabolism, many of them related to DNA damage repair and aerobic respiration in the context of tolerance to stress. We also discuss issues related to the implemented and assessed algorithms, including normalization analysis, data partitioning, classification approaches and metrics. Altogether, this work offers a different and robust framework to select genes using a machine learning approach.
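
The contrast between a single train/test partition (SP) and multiple partitions (MP) can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy two-feature data, the 30% test fraction, the 20 repetitions, the plain-Python K-nearest-neighbor classifier and all function names are assumptions made for illustration, and the gene ranking step is not reproduced.

```python
import random

def knn_predict(train, query, k=3):
    """Majority vote among the k training samples nearest to `query`."""
    nearest = sorted(
        train,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], query)),
    )[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

def accuracy_on_partition(samples, rng, test_fraction=0.3, k=3):
    """Shuffle, split into train/test once, and score KNN on the test part."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

def single_partition(samples, seed=0):
    """SP-style evaluation (assumed form): one random train/test split."""
    return accuracy_on_partition(samples, random.Random(seed))

def multiple_partitions(samples, n_partitions=20, seed=0):
    """MP-style evaluation (assumed form): average over repeated random splits."""
    rng = random.Random(seed)
    scores = [accuracy_on_partition(samples, rng) for _ in range(n_partitions)]
    return sum(scores) / len(scores)

# Toy data standing in for expression profiles: label 0 = control, 1 = perturbed.
data_rng = random.Random(42)
control = [([data_rng.gauss(0, 1), data_rng.gauss(0, 1)], 0) for _ in range(20)]
perturbed = [([data_rng.gauss(5, 1), data_rng.gauss(5, 1)], 1) for _ in range(20)]
samples = control + perturbed

sp_accuracy = single_partition(samples)
mp_accuracy = multiple_partitions(samples)
print(f"SP accuracy: {sp_accuracy:.2f}, MP mean accuracy: {mp_accuracy:.2f}")
```

Averaging over several partitions makes the score less dependent on one particular split, which is one plausible reason to compare the two schemes.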


Diabetes ◽  
2020 ◽  
Vol 69 (Supplement 1) ◽  
pp. 1552-P
Author(s):  
KAZUYA FUJIHARA ◽  
MAYUKO H. YAMADA ◽  
YASUHIRO MATSUBAYASHI ◽  
MASAHIKO YAMAMOTO ◽  
TOSHIHIRO IIZUKA ◽  
...  

2020 ◽  
Author(s):  
Clifford A. Brown ◽  
Jonny Dowdall ◽  
Brian Whiteaker ◽  
Lauren McIntyre

2017 ◽  
Author(s):  
Sabrina Jaeger ◽  
Simone Fulle ◽  
Samo Turk

Inspired by natural language processing techniques, we introduce Mol2vec, an unsupervised machine learning approach to learn vector representations of molecular substructures. Similar to Word2vec models, where vectors of closely related words are in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that point in similar directions for chemically related substructures. Compounds can then be encoded as vectors by summing the vectors of their individual substructures and, for instance, fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pre-trained once, yields dense vector representations and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. The prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as the reference compound representation. Mol2vec can be easily combined with ProtVec, which applies the same Word2vec concept to protein sequences, resulting in a proteochemometric approach that is alignment independent and can thus also be easily used for proteins with low sequence similarities.
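
The encoding step described above, a compound vector formed by summing substructure vectors, can be sketched as follows. This is an illustration only and does not use the Mol2vec package: the substructure identifiers, the three-dimensional vectors and the function names are hypothetical, and skipping unknown substructures is an assumption rather than the published behavior.

```python
import math

# Hypothetical toy embeddings: substructure identifier -> dense vector.
# Real embeddings are learned from a compound corpus; these numbers are made up.
SUBSTRUCTURE_VECTORS = {
    "sub_a": [0.9, 0.1, 0.0],
    "sub_b": [0.8, 0.2, 0.1],
    "sub_c": [0.0, 0.1, 0.9],
}

def compound_vector(substructures, table, dim=3):
    """Encode a compound as the element-wise sum of its substructure vectors."""
    total = [0.0] * dim
    for name in substructures:
        vec = table.get(name)
        if vec is None:
            continue  # assumption: substructures without a vector are skipped
        total = [t + v for t, v in zip(total, vec)]
    return total

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

compound_1 = compound_vector(["sub_a", "sub_b"], SUBSTRUCTURE_VECTORS)
compound_2 = compound_vector(["sub_a", "sub_c"], SUBSTRUCTURE_VECTORS)
print(compound_1, compound_2, cosine_similarity(compound_1, compound_2))
```

The resulting fixed-length vectors are dense, so they could be passed directly as features to a supervised model, in contrast to sparse bit-vector fingerprints.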

