sequence generation
Recently Published Documents


TOTAL DOCUMENTS: 579 (FIVE YEARS: 162)
H-INDEX: 30 (FIVE YEARS: 6)

2022
Author(s): Tong Guo

In industrial deep learning applications, our manually labeled data contains a certain amount of noisy (mislabeled) data. To solve this problem and achieve a score above 90 on the dev dataset, we present a simple method: find the noisy data and have humans re-label it, using the model's predictions as references during labeling. In this paper, we illustrate our idea for a broad set of deep learning tasks, including classification, sequence tagging, object detection, sequence generation, and click-through rate prediction. The experimental results and human evaluation results verify our idea.
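
A minimal sketch of the find-and-re-label loop the abstract describes, under our own assumptions (the paper does not publish this code): train on the noisy labels, flag examples where the model confidently disagrees with the assigned label, and queue those for human review with the model's prediction shown as a reference. The function name and confidence threshold are illustrative.

```python
import numpy as np

def find_noisy_candidates(probs, labels, confidence=0.9):
    """Flag examples where the model confidently disagrees with the
    assigned label; these are candidates for human re-labeling.

    probs  : (n_examples, n_classes) predicted class probabilities
    labels : (n_examples,) manually assigned class indices
    """
    preds = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    # Confident prediction that contradicts the manual label.
    suspect = (preds != labels) & (conf >= confidence)
    return np.flatnonzero(suspect)

# Human annotators then review the flagged examples, with the model's
# prediction shown as a reference, and the corrected labels are merged
# back into the training set before retraining.
```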


2021
Author(s): Richard W. Shuai, Jeffrey A. Ruffolo, Jeffrey J. Gray

Successful development of monoclonal antibodies (mAbs) for therapeutic applications is hindered by developability issues such as low solubility, low thermal stability, high aggregation, and high immunogenicity. The discovery of more developable mAb candidates relies on high-quality antibody libraries for isolating candidates with desirable properties. We present the Immunoglobulin Language Model (IgLM), a deep generative language model for generating synthetic libraries by re-designing variable-length spans of antibody sequences. IgLM formulates antibody design as an autoregressive sequence generation task based on text infilling in natural language. We trained IgLM on approximately 558M antibody heavy- and light-chain variable sequences, conditioning on each sequence's chain type and species of origin. We demonstrate that IgLM can be applied to generate synthetic libraries that may accelerate the discovery of therapeutic antibody candidates.
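
The infilling formulation described above can be sketched as follows: a span of the sequence is masked and moved to the end, so a left-to-right model learns to generate it conditioned on the surrounding context plus chain-type and species tags. This is a generic illustration of infilling-style data preparation; the tag tokens and span-length choices here are assumptions, not IgLM's actual vocabulary.

```python
import random

def make_infilling_example(sequence, chain_tag, species_tag,
                           min_span=5, max_span=15):
    """Re-arrange a sequence for text-infilling-style training:
    the masked span is moved to the end so a left-to-right model
    can generate it conditioned on its surrounding context."""
    span_len = random.randint(min_span, min(max_span, len(sequence) - 1))
    start = random.randint(0, len(sequence) - span_len)
    prefix = sequence[:start]
    span = sequence[start:start + span_len]
    suffix = sequence[start + span_len:]
    # The model is trained to predict `span` after the [SEP]-style marker.
    return f"{chain_tag} {species_tag} {prefix}[MASK]{suffix}[SEP]{span}[END]"

print(make_infilling_example("EVQLVESGGGLVQPGGSLRLSCAAS", "[HEAVY]", "[HUMAN]"))
```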


Author(s): Wim Ectors, Bruno Kochan, Davy Janssens, Tom Bellemans, Geert Wets

Previous work has established that rank-ordered single-day activity sequences from various study areas exhibit a universal power law distribution called Zipf's law. By analyzing datasets from across the world, evidence was provided that it is in fact a universal distribution. This study focuses on a potential mechanism that leads to this previously discovered power law distribution. It makes use of 15 household travel survey (HTS) datasets from study areas all over the world to demonstrate that reasonably accurate sets of activity sequences (or "schedules") can be generated with extremely little information: the model requires no input data and contains few tunable parameters. The activity sequence generation mechanism is based on sequential sampling from two universal distributions: (i) the distribution of the number of activities (trips) and (ii) the distribution of activity types (trip purposes). This paper also attempts to demonstrate the universal nature of these distributions by fitting several equations to the 15 HTS datasets. The lightweight activity sequence generation model can be implemented in any (lightweight) transportation model to create a basic set of activity sequences, saving effort and cost in data collection and in model development and calibration.
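
The two-step sequential sampling mechanism lends itself to a very small sketch. The probabilities below are made up for illustration; the paper instead fits universal distributions to the 15 HTS datasets.

```python
import random

# Illustrative probabilities only; the paper fits universal distributions
# to 15 household travel surveys rather than using these made-up numbers.
N_ACTIVITIES_PMF = {1: 0.10, 2: 0.30, 3: 0.25, 4: 0.20, 5: 0.15}
ACTIVITY_TYPE_PMF = {"home": 0.35, "work": 0.25, "shop": 0.15,
                     "leisure": 0.15, "other": 0.10}

def sample(pmf):
    return random.choices(list(pmf), weights=list(pmf.values()), k=1)[0]

def generate_schedule():
    """Two-step sequential sampling: first the number of activities,
    then the type of each activity."""
    n = sample(N_ACTIVITIES_PMF)
    return [sample(ACTIVITY_TYPE_PMF) for _ in range(n)]

print(generate_schedule())  # e.g. ['home', 'work', 'shop']
```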


2021
Vol 11 (24), pp. 11635
Author(s): Raymond Ian Osolo, Zhan Yang, Jun Long

In the quest to make deep learning systems more capable, a number of more complex, more computationally expensive, and more memory-intensive algorithms have been proposed. This switchover glosses over the ability of many of the simpler systems, or modules within them, to adequately address current and future problems, and has made some deep learning research inaccessible to researchers who do not possess top-of-the-line hardware. The use of simple feed-forward networks has not been explicitly explored in the current transformer-based vision-language field. In this paper, we use a series of feed-forward layers to encode image features and caption embeddings, alleviating some of the computational complexity that accompanies the use of the self-attention mechanism and limits its application in long-sequence task scenarios. We demonstrate that a decoder does not require masking for conditional short-sequence generation where the task depends not only on the previously generated sequence but also on another input, such as image features. We perform an empirical and qualitative analysis of the use of linear transforms in place of self-attention layers in vision-language models and obtain competitive results on the MSCOCO dataset. Our best feed-forward model achieves average scores of over 90% of those of the current state-of-the-art pre-trained Oscar model on conventional image captioning metrics. We also demonstrate that the proposed models train in less time and use less memory at larger batch sizes and longer sequence lengths.
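
As a rough illustration of replacing self-attention with linear transforms, the block below mixes information across sequence positions with a plain linear layer and across feature channels with a per-token feed-forward layer (in the spirit of MLP-Mixer-style designs). It is a generic sketch, not the authors' exact architecture; all dimensions are illustrative.

```python
import torch
import torch.nn as nn

class FeedForwardMixer(nn.Module):
    """Drop-in replacement for a self-attention block: a linear
    transform over the sequence dimension followed by a per-token
    feed-forward layer."""
    def __init__(self, seq_len, d_model, d_hidden=2048):
        super().__init__()
        self.token_mix = nn.Linear(seq_len, seq_len)   # mixes across positions
        self.channel_mix = nn.Sequential(              # mixes across features
            nn.Linear(d_model, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):                  # x: (batch, seq_len, d_model)
        y = self.token_mix(self.norm1(x).transpose(1, 2)).transpose(1, 2)
        x = x + y                          # residual over position mixing
        return x + self.channel_mix(self.norm2(x))

out = FeedForwardMixer(seq_len=50, d_model=512)(torch.randn(2, 50, 512))
```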


2021
Author(s): Muhammad Gad, Mostafa Aboelmaged, Maggie Mashaly, Mohamed A. Abd el Ghany

2021
Author(s): Tim Kucera, Matteo Togninalli, Laetitia Meng-Papaxanthos

Motivation: Protein design has become increasingly important for medical and biotechnological applications. Because of the complex mechanisms underlying protein formation, the creation of a novel protein requires tedious and time-consuming computational or experimental protocols. At the same time, machine learning has made it possible to solve complex problems by leveraging large amounts of available data, most recently with great improvements in the domain of generative modeling. Yet generative models have mainly been applied to specific sub-problems of protein design. Results: Here we approach the problem of general-purpose protein design conditioned on functional labels from the hierarchical Gene Ontology. Since a canonical way to evaluate generative models in this domain is missing, we devise an evaluation scheme comprising several biologically and statistically inspired metrics. We then develop the conditional generative adversarial network ProteoGAN and show that it outperforms several classic and more recent deep learning baselines for protein sequence generation. We further give insights into the model by analyzing hyperparameters and ablation baselines. Lastly, we hypothesize that a functionally conditioned model could create proteins with novel functions by combining labels, and we provide first steps in this direction of research.
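
A conditional GAN generator of the general kind described can be sketched as below: noise concatenated with a multi-hot Gene Ontology label vector is decoded into per-position amino-acid logits. All sizes and layer choices are assumptions for illustration; ProteoGAN's actual architecture differs in its details.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Generic conditional-GAN generator sketch: noise concatenated with
    a multi-hot Gene Ontology label vector, decoded to per-position
    amino-acid logits (20 amino acids plus a padding symbol)."""
    def __init__(self, noise_dim=128, n_labels=50, seq_len=256, n_aa=21):
        super().__init__()
        self.seq_len, self.n_aa = seq_len, n_aa
        self.net = nn.Sequential(
            nn.Linear(noise_dim + n_labels, 512), nn.ReLU(),
            nn.Linear(512, 1024), nn.ReLU(),
            nn.Linear(1024, seq_len * n_aa))

    def forward(self, z, labels):          # labels: multi-hot GO terms
        h = self.net(torch.cat([z, labels], dim=1))
        return h.view(-1, self.seq_len, self.n_aa)  # per-position logits

g = ConditionalGenerator()
logits = g(torch.randn(4, 128), torch.zeros(4, 50))
```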


2021
Author(s): Aaron D. Milstein, Sarah Tran, Grace Ng, Ivan Soltesz

During spatial exploration, neural circuits in the hippocampus store memories of sequences of sensory events encountered in the environment. When sensory information is absent during "offline" resting periods, brief neuronal population bursts can "replay" sequences of activity that resemble bouts of sensory experience. These sequences can occur in either forward or reverse order, and can even include spatial trajectories that have not been experienced, but are consistent with the topology of the environment. The neural circuit mechanisms underlying this variable and flexible sequence generation are unknown. Here we demonstrate in a recurrent spiking network model of hippocampal area CA3 that experimental constraints on network dynamics such as spike rate adaptation, population sparsity, stimulus selectivity, and rhythmicity enable additional emergent properties, including variable offline memory replay. In an online stimulus-driven state, we observed the emergence of neuronal sequences that swept from representations of past to future stimuli on the timescale of the theta rhythm. In an offline state driven only by noise, the network generated both forward and reverse neuronal sequences, and recapitulated the experimental observation that offline memory replay events tend to include salient locations like the site of a reward. These results demonstrate that biological constraints on the dynamics of recurrent neural circuits are sufficient to enable memories of sensory events stored in the strengths of synaptic connections to be flexibly read out during rest and sleep, which is thought to be important for memory consolidation and planning of future behavior.
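
One of the constraints the abstract highlights, spike rate adaptation, can be illustrated with a standard adaptive leaky integrate-and-fire neuron: each spike increments an adaptation current that suppresses subsequent firing, so the rate declines under sustained input. This is a textbook sketch with illustrative parameters, not the paper's CA3 network model.

```python
import numpy as np

def simulate_adaptive_lif(inp, dt=1.0, tau_m=20.0, tau_w=100.0,
                          v_th=1.0, b=0.2):
    """Leaky integrate-and-fire neuron with an adaptation current `w`:
    each spike increments w, which subtracts from the drive and lowers
    the firing rate under sustained input (spike rate adaptation)."""
    v, w, spikes = 0.0, 0.0, []
    for t, i_t in enumerate(inp):
        v += dt / tau_m * (-v + i_t - w)   # leaky membrane integration
        w += dt / tau_w * (-w)             # adaptation current decays
        if v >= v_th:                      # threshold crossing -> spike
            spikes.append(t)
            v = 0.0                        # reset membrane potential
            w += b                         # increment adaptation current
    return spikes

# Constant drive: inter-spike intervals lengthen as adaptation builds up.
print(simulate_adaptive_lif(np.full(500, 1.5)))
```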

