Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM

2021 ◽  
Author(s):  
Deepak Narayanan ◽  
Mohammad Shoeybi ◽  
Jared Casper ◽  
Patrick LeGresley ◽  
Mostofa Patwary ◽  
...  
2020 ◽  
Vol 36 (10) ◽  
pp. 3011-3017 ◽  
Author(s):  
Olga Mineeva ◽  
Mateo Rojas-Carulla ◽  
Ruth E Ley ◽  
Bernhard Schölkopf ◽  
Nicholas D Youngblut

Abstract

Motivation: Methodological advances in metagenome assembly are rapidly increasing the number of published metagenome assemblies. However, identifying misassemblies is challenging due to a lack of closely related reference genomes that can act as pseudo ground truth. Existing reference-free methods are no longer maintained, can make strong assumptions that may not hold across a diversity of research projects, and have not been validated on large-scale metagenome assemblies.

Results: We present DeepMAsED, a deep learning approach for identifying misassembled contigs without the need for reference genomes. Moreover, we provide an in silico pipeline for generating large-scale, realistic metagenome assemblies for comprehensive model training and testing. DeepMAsED accuracy substantially exceeds the state-of-the-art when applied to large and complex metagenome assemblies. Our model estimates a 1% contig misassembly rate in two recent large-scale metagenome assembly publications.

Conclusions: DeepMAsED accurately identifies misassemblies in metagenome-assembled contigs from a broad diversity of bacteria and archaea without the need for reference genomes or strong modeling assumptions. Running DeepMAsED is straightforward, as is model re-training with our dataset generation pipeline. Therefore, DeepMAsED is a flexible misassembly classifier that can be applied to a wide range of metagenome assembly projects.

Availability and implementation: DeepMAsED is available from GitHub at https://github.com/leylabmpi/DeepMAsED.

Supplementary information: Supplementary data are available at Bioinformatics online.
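The abstract describes classifying contigs from read-alignment evidence alone, without a reference genome. The following is a minimal toy sketch of that idea, with entirely hypothetical feature names and thresholds; DeepMAsED itself uses a trained deep model on richer per-position features, not this hand-written rule.

```python
# Toy sketch (hypothetical features/thresholds): flag a contig as suspect when
# many positions show anomalous read-alignment evidence (coverage drops or
# discordant read pairs), the kind of reference-free signal the abstract describes.

def misassembly_score(coverage, discordant_pairs):
    """Return the fraction of contig positions with anomalous alignment evidence."""
    assert len(coverage) == len(discordant_pairs)
    mean_cov = sum(coverage) / len(coverage)
    anomalous = sum(
        1 for c, d in zip(coverage, discordant_pairs)
        # a position is anomalous if coverage collapses or discordant pairs dominate
        if c < 0.5 * mean_cov or d > 0.5 * max(c, 1)
    )
    return anomalous / len(coverage)

# A clean contig scores 0.0; a contig with a breakpoint-like position scores higher.
clean = misassembly_score([10, 10, 10, 10], [0, 0, 0, 0])
suspect = misassembly_score([10, 10, 10, 1], [0, 0, 0, 5])
```

In the real approach, per-position features like these are the inputs to a learned classifier rather than to a fixed threshold rule.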


2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Li Wang ◽  
Wenjie Pan ◽  
QingHua Wang ◽  
Heming Bai ◽  
Wei Liu ◽  
...  

Drug-drug interactions (DDIs) are an important factor leading to adverse drug reactions. Considering the unique structure of Food and Drug Administration Adverse Event Reporting System (FDA AERS) reports, we changed the scope of the window in the original skip-gram algorithm and propose a language concept representation model that extracts drug-name and reaction features from large-scale AERS reports. Our scheme was validated by comparison with vectors derived from the co-occurrence matrix under tenfold cross-validation. For verifying the enrichment of descriptions in the DrugBank DDI database, accuracy was used as the measure. The average area under the receiver operating characteristic curve of logistic regression classifiers based on the proposed language model is 6% higher than that of the co-occurrence matrix. At the same time, the average accuracy across five severe adverse event classes is 88%. These results indicate that our language model can be useful for extracting drug and reaction features from large-scale AERS reports.
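The abstract's key modification is to the skip-gram window scope: because drug names and reactions within one AERS report are related regardless of their distance, a natural reading is that the context window spans the whole report rather than a fixed number of neighboring tokens. A minimal sketch of generating skip-gram training pairs under that assumption (the function name and example tokens are hypothetical, not from the paper):

```python
# Sketch (assumption: the modified window spans an entire report): every
# drug/reaction token in a report is paired with every other token in that
# report, instead of only tokens within a fixed-size skip-gram window.

def report_skipgram_pairs(report_tokens):
    """Yield (target, context) pairs treating the whole report as the window."""
    pairs = []
    for i, target in enumerate(report_tokens):
        for j, context in enumerate(report_tokens):
            if i != j:
                pairs.append((target, context))
    return pairs

# Hypothetical single-report token list: two drugs and one reaction term.
report = ["warfarin", "aspirin", "gastrointestinal_haemorrhage"]
pairs = report_skipgram_pairs(report)
# Each of the 3 tokens is paired with the other 2, giving 6 training pairs.
```

These pairs would then be fed to a standard skip-gram trainer to produce the drug/reaction embeddings that the classifiers consume.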


2014 ◽  
Vol 40 (3) ◽  
pp. 687-723 ◽  
Author(s):  
Cyril Allauzen ◽  
Bill Byrne ◽  
Adrià de Gispert ◽  
Gonzalo Iglesias ◽  
Michael Riley

This article describes the use of pushdown automata (PDA) in the context of statistical machine translation and alignment under a synchronous context-free grammar. We use PDAs to compactly represent the space of candidate translations generated by the grammar when applied to an input sentence. General-purpose PDA algorithms for replacement, composition, shortest path, and expansion are presented. We describe HiPDT, a hierarchical phrase-based decoder using the PDA representation and these algorithms. We contrast the complexity of this decoder with that of a decoder based on a finite-state automata representation, showing that PDAs provide a more suitable framework for achieving exact decoding with larger synchronous context-free grammars and smaller language models. We assess this experimentally on a large-scale Chinese-to-English alignment and translation task. In translation, we propose a two-pass decoding strategy involving a weaker language model in the first pass, motivated by the results of the PDA complexity analysis. We study in depth the experimental conditions and tradeoffs under which HiPDT can achieve state-of-the-art performance for large-scale SMT.
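The central contrast in the abstract is between fully expanding a grammar's candidate space into a finite-state automaton and keeping nonterminal calls on a stack, the PDA view, so the space is explored lazily and represented compactly. A minimal sketch of that PDA-style membership test, using a tiny hypothetical grammar (not from the paper, and omitting the weights, composition, and shortest-path machinery of a real decoder):

```python
# Sketch (hypothetical toy grammar): PDA-style membership testing.
# Nonterminals are expanded lazily via an explicit stack of pending symbol
# sequences, rather than precomputing the full finite-state expansion.

GRAMMAR = {
    "S": [["X", "translation"]],   # S -> X translation
    "X": [["the"], ["a"]],         # X -> the | a
}

def accepts(tokens):
    """Return True if the grammar derives exactly this token sequence."""
    # Frontier of (remaining grammar symbols, position consumed in tokens).
    frontier = [(("S",), 0)]
    while frontier:
        symbols, pos = frontier.pop()
        if not symbols:
            if pos == len(tokens):
                return True          # all symbols consumed, all tokens matched
            continue
        head, rest = symbols[0], symbols[1:]
        if head in GRAMMAR:
            # Nonterminal: push each alternative right-hand side (lazy expansion).
            for rhs in GRAMMAR[head]:
                frontier.append((tuple(rhs) + rest, pos))
        elif pos < len(tokens) and tokens[pos] == head:
            # Terminal: consume one input token.
            frontier.append((rest, pos + 1))
    return False
```

The FSA-style alternative would enumerate every string the grammar generates up front, which is exactly what becomes infeasible for the larger grammars the article targets.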

