Finding Signal Peptides in Human Protein Sequences Using Recurrent Neural Networks

Author(s):  
Martin Reczko ◽  
Petko Fiziev ◽  
Eike Staub ◽  
Artemis Hatzigeorgiou
1992 ◽  
Vol 03 (supp01) ◽  
pp. 221-226 ◽  
Author(s):  
Edgardo A. Ferrán ◽  
Pascual Ferrara

In previous work we described a method, based on Kohonen’s unsupervised-learning algorithm, for clustering a set of known protein sequences into families. Here we show examples of how a network trained with 1758 human protein sequences can quickly and correctly classify sequences it was not trained on (nonhuman sequences, and small fragments or random amino acid mutations of human sequences).


2020 ◽  
Author(s):  
Dean Sumner ◽  
Jiazhen He ◽  
Amol Thakkar ◽  
Ola Engkvist ◽  
Esben Jannik Bjerrum

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call “Levenshtein augmentation”, which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: transformer and sequence-to-sequence based recurrent neural networks with attention. When used to train baseline models, Levenshtein augmentation increased performance over both non-augmented data and data augmented with conventional SMILES randomization. Furthermore, Levenshtein augmentation seemingly results in what we define as attentional gain: an enhancement in the underlying network’s ability to recognize molecular motifs.
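
The abstract does not spell out the pairing procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that several equivalent SMILES strings are already available for a reactant and for its product, and that a training pair is chosen by minimizing whole-string Levenshtein (edit) distance; the abstract speaks of local sub-sequence similarity, so this is a simplification. The function names `levenshtein` and `closest_pair` and the hand-written SMILES variants are hypothetical and serve illustration only.

```python
from itertools import product


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def closest_pair(reactant_variants, product_variants):
    """Return the (reactant, product) string pair with the smallest edit distance.

    Assumption: both arguments are non-empty lists of equivalent SMILES strings.
    """
    return min(product(reactant_variants, product_variants),
               key=lambda pair: levenshtein(pair[0], pair[1]))


# Hand-written, illustrative variants (ethanol and acetaldehyde), not toolkit output.
reactants = ["CCO", "OCC", "C(O)C"]
products = ["CC=O", "O=CC", "C(=O)C"]
pair = closest_pair(reactants, products)
print(pair, levenshtein(*pair))
```

In practice the variant lists would come from a cheminformatics toolkit's SMILES randomization rather than being typed by hand, and the similarity measure could be restricted to sub-sequences as the abstract suggests.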


Author(s):  
Faisal Ladhak ◽  
Ankur Gandhe ◽  
Markus Dreyer ◽  
Lambert Mathias ◽  
Ariya Rastrow ◽  
...  
