Deep learning of genomic variation and regulatory network data

2018 ◽  
Vol 27 (Supplement_R1) ◽  
pp. R63-R71 ◽  
Author(s):  
Amalio Telenti ◽  
Christoph Lippert ◽  
Pi-Chuan Chang ◽  
Mark DePristo

Abstract The human genome is now investigated through high-throughput functional assays, and through the generation of population genomic data. These advances support the identification of functional genetic variants and the prediction of traits (e.g. deleterious variants and disease). This review summarizes lessons learned from the large-scale analyses of genome and exome data sets, modeling of population data and machine-learning strategies to resolve complex genomic sequence regions. The review also portrays the rapid adoption of artificial intelligence/deep neural networks in genomics; in particular, deep learning approaches are well suited to model the complex dependencies in the regulatory landscape of the genome, and to provide predictors for genetic variant calling and interpretation.

2021 ◽  
Author(s):  
Nae-Chyun Chen ◽  
Alexey Kolesnikov ◽  
Sidharth Goel ◽  
Taedong Yun ◽  
Pi-Chuan Chang ◽  
...  

Large-scale population variant data is often used to filter and aid interpretation of variant calls in a single sample. These approaches do not incorporate population information directly into the process of variant calling, and are often limited to filtering which trades recall for precision. In this study, we modify DeepVariant to add a new channel encoding population allele frequencies from the 1000 Genomes Project. We show that this model reduces variant calling errors, improving both precision and recall. We assess the impact of using population-specific or diverse reference panels. We achieve the greatest accuracy with diverse panels, suggesting that large, diverse panels are preferable to individual populations, even when the population matches sample ancestry. Finally, we show that this benefit generalizes to samples with different ancestry from the training data even when the ancestry is also excluded from the reference panel.
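A minimal sketch of the population-channel idea described above (not DeepVariant's actual implementation; the helper name, the `AF` INFO field and the 0-254 intensity scaling are our assumptions): look up panel allele frequencies per reference position and encode them as an extra image-like pileup channel.

```python
# Hedged sketch: derive an allele-frequency "channel" for a pileup window
# from a reference-panel VCF such as 1000 Genomes. Assumes the panel VCF
# is indexed and carries an `AF` INFO field (hypothetical helper, not
# DeepVariant code).
import numpy as np
import pysam

def allele_frequency_channel(panel_vcf_path, contig, start, width):
    """Return a (width,) uint8 row of panel allele frequencies."""
    channel = np.zeros(width, dtype=np.float32)
    vcf = pysam.VariantFile(panel_vcf_path)
    for rec in vcf.fetch(contig, start, start + width):
        col = rec.pos - 1 - start                      # rec.pos is 1-based
        if 0 <= col < width:
            channel[col] = rec.info.get("AF", (0.0,))[0]  # first ALT allele
    # Scale to the 0-254 range used by the other image-like channels.
    return np.round(channel * 254).astype(np.uint8)
```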


2020 ◽  
Author(s):  
Yuan Yuan ◽  
Lei Lin

Satellite image time series (SITS) classification is a major research topic in remote sensing and is relevant for a wide range of applications. Deep learning approaches have been commonly employed for SITS classification and have provided state-of-the-art performance. However, deep learning methods suffer from overfitting when labeled data is scarce. To address this problem, we propose a novel self-supervised pre-training scheme to initialize a Transformer-based network by utilizing large-scale unlabeled data. In detail, the model is asked to predict randomly contaminated observations given an entire time series of a pixel. The main idea of our proposal is to leverage the inherent temporal structure of satellite time series to learn general-purpose spectral-temporal representations related to land cover semantics. Once pre-training is completed, the pre-trained network can be further adapted to various SITS classification tasks by fine-tuning all the model parameters on small-scale task-related labeled data. In this way, the general knowledge and representations about SITS can be transferred to a label-scarce task, thereby improving the generalization performance of the model as well as reducing the risk of overfitting. Comprehensive experiments have been carried out on three benchmark datasets over large study areas. Experimental results demonstrate the effectiveness of the proposed method, improving classification accuracy by 1.91% to 6.69%.
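To make the pre-training objective concrete, here is a hedged PyTorch sketch of the masked-reconstruction idea (module names, sizes and the mask ratio are illustrative assumptions, not the authors' configuration); positional/temporal encodings are omitted for brevity but would be required in practice.

```python
# Illustrative sketch: randomly "contaminate" observations in a pixel's
# time series and train a Transformer encoder to reconstruct the original
# values at those steps. NOTE: a temporal/positional encoding is omitted
# here for brevity but is needed for the encoder to order time steps.
import torch
import torch.nn as nn

class SITSPretrainer(nn.Module):
    def __init__(self, n_bands=10, d_model=64, n_layers=3, n_heads=4):
        super().__init__()
        self.embed = nn.Linear(n_bands, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_bands)       # reconstruct spectra

    def forward(self, x):                             # x: (batch, time, bands)
        return self.head(self.encoder(self.embed(x)))

def pretrain_step(model, x, mask_ratio=0.15):
    mask = torch.rand(x.shape[:2], device=x.device) < mask_ratio
    corrupted = x.clone()
    corrupted[mask] = torch.randn_like(x)[mask]       # contaminate observations
    pred = model(corrupted)
    # Loss only on the contaminated steps, as in masked modelling.
    return ((pred - x) ** 2)[mask].mean()
```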


2021 ◽  
Author(s):  
Benjamin Schwarz ◽  
Korbinian Sager ◽  
Philippe Jousset ◽  
Gilda Currenti ◽  
Charlotte Krawczyk ◽  
...  

Fiber-optic cables form an integral part of modern telecommunications infrastructure and are ubiquitous, in particular in regions where dedicated seismic instrumentation is traditionally sparse or lacking entirely. Fiber-optic seismology promises to enable affordable and time-extended observations of earth and environmental processes at an unprecedented temporal and spatial resolution. The method's unique potential for combined large-N and large-T observations implies intriguing opportunities but also significant challenges in terms of data storage, data handling and computation.

Our goal is to enable real-time data enhancement, rapid signal detection and wave field characterization without the need for time-demanding user interaction. We therefore combine coherent wave field analysis, an optics-inspired processing framework developed in controlled-source seismology, with state-of-the-art deep convolutional neural network (CNN) architectures commonly used in visual perception. While conventional deep learning strategies have to rely on manually labeled or purely synthetic training datasets, coherent wave field analysis labels field data based on physical principles and enables large-scale, purely data-driven training of the CNN models. The sheer amount of data already recorded in various settings makes artificial data generation by numerical modeling superfluous; such modeling is often constrained by incomplete knowledge of the embedding medium and an insufficient description of processes at or close to the surface, which are challenging to capture in integrated simulations.

Applications to extensive field datasets acquired with dark-fiber infrastructure at a geothermal field in SW Iceland and in a town at the flank of Mt Etna, Italy, reveal that the suggested framework generalizes well across different observational scales and environments, and sheds new light on the origin of a broad range of physically distinct wave fields that can be sensed with fiber-optic technology. Owing to its real-time applicability on affordable computing infrastructure, our analysis lends itself well to rapid on-the-fly data enhancement, wave field separation and compression strategies, thereby promising to have a positive impact on the full processing chain currently in use in fiber-optic seismology.
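As a rough illustration of how physics-based labels can be derived without manual picking, the following sketch (our own simplification, not the authors' code) scans trial slownesses across neighbouring fiber channels and keeps the best local semblance, a coherence measure in [0, 1] that could serve as a training target for a CNN.

```python
# Conceptual sketch: pointwise semblance of the best-fitting local slope
# across neighbouring channels of a DAS section. A short time window would
# normally be used for smoothing; edge channels are simply clamped here.
import numpy as np

def local_semblance(data, dt, dx, slopes, half_win=5):
    """data: (channels, samples) DAS strain-rate section."""
    n_ch, n_t = data.shape
    best = np.zeros((n_ch, n_t))
    offsets = np.arange(-half_win, half_win + 1)
    t = np.arange(n_t) * dt
    for p in slopes:                                  # trial slownesses (s/m)
        aligned = np.empty((len(offsets), n_ch, n_t))
        for i, off in enumerate(offsets):
            ch = np.clip(np.arange(n_ch) + off, 0, n_ch - 1)
            shift = p * off * dx                      # moveout for this offset
            for j, c in enumerate(ch):
                aligned[i, j] = np.interp(t + shift, t, data[c])
        num = aligned.sum(axis=0) ** 2
        den = (aligned ** 2).sum(axis=0) * len(offsets) + 1e-12
        best = np.maximum(best, num / den)            # semblance in [0, 1]
    return best
```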


2022 ◽  
pp. 27-50
Author(s):  
Rajalaxmi Prabhu B. ◽  
Seema S.

A lot of user-generated data is available these days from huge platforms, blogs, websites, and other review sites. These data are usually unstructured, and analyzing their sentiment automatically is considered an important challenge. Several machine learning algorithms have been implemented to extract opinions from large data sets, and a great deal of research has gone into understanding machine learning approaches to sentiment analysis. Machine learning depends mainly on the data required for model building, and hence suitable feature extraction techniques also need to be applied. In this chapter, several deep learning approaches, their challenges, and future issues will be addressed. Deep learning techniques are considered important in predicting the sentiments of users. This chapter aims to analyze deep-learning techniques for predicting sentiments and to highlight the importance of several approaches for mining opinions and determining sentiment polarity.
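As a concrete instance of the kind of model the chapter surveys, the following minimal PyTorch sketch shows a typical deep-learning sentiment classifier; the architecture and sizes are illustrative assumptions, not taken from the chapter.

```python
# Illustrative sketch: embed tokens, encode with a bidirectional LSTM,
# and predict sentiment polarity from the final hidden states.
import torch
import torch.nn as nn

class SentimentLSTM(nn.Module):
    def __init__(self, vocab_size=20000, embed_dim=100, hidden=128, classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, classes)

    def forward(self, token_ids):                 # (batch, seq_len)
        emb = self.embedding(token_ids)
        _, (h_n, _) = self.lstm(emb)              # h_n: (2, batch, hidden)
        feats = torch.cat([h_n[0], h_n[1]], dim=1)
        return self.classifier(feats)             # polarity logits
```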


Database ◽  
2019 ◽  
Vol 2019 ◽  
Author(s):  
Tao Chen ◽  
Mingfen Wu ◽  
Hexi Li

Abstract The automatic extraction of meaningful relations from biomedical literature or clinical records is crucial in various biomedical applications. Most of the current deep learning approaches for medical relation extraction require large-scale training data to prevent overfitting of the training model. We propose using a pre-trained model and a fine-tuning technique to improve these approaches without additional time-consuming human labeling. Firstly, we show the architecture of Bidirectional Encoder Representations from Transformers (BERT), an approach for pre-training a model on large-scale unstructured text. We then combine BERT with a one-dimensional convolutional neural network (1d-CNN) to fine-tune the pre-trained model for relation extraction. Extensive experiments on three datasets, namely the BioCreative V chemical disease relation corpus, traditional Chinese medicine literature corpus and i2b2 2012 temporal relation challenge corpus, show that the proposed approach achieves state-of-the-art results (giving a relative improvement of 22.2, 7.77, and 38.5% in F1 score, respectively, compared with a traditional 1d-CNN classifier). The source code is available at https://github.com/chentao1999/MedicalRelationExtraction.
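A hedged sketch of the described architecture (module names and filter sizes are ours; the authors' actual code is at the repository linked above): a pre-trained BERT encoder feeds token representations into a 1d-CNN head whose max-pooled features are classified into relation types, with all parameters fine-tuned end to end.

```python
# Sketch of BERT + 1d-CNN fine-tuning for relation extraction.
import torch
import torch.nn as nn
from transformers import AutoModel

class BertCNNRelationClassifier(nn.Module):
    def __init__(self, n_relations, model_name="bert-base-uncased",
                 n_filters=128, kernel_size=3):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        hidden = self.bert.config.hidden_size
        self.conv = nn.Conv1d(hidden, n_filters, kernel_size, padding=1)
        self.out = nn.Linear(n_filters, n_relations)

    def forward(self, input_ids, attention_mask):
        hs = self.bert(input_ids=input_ids,
                       attention_mask=attention_mask).last_hidden_state
        feats = torch.relu(self.conv(hs.transpose(1, 2)))  # (B, F, L)
        pooled = feats.max(dim=2).values                   # max over tokens
        return self.out(pooled)                            # relation logits
```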


2019 ◽  
Vol 277 ◽  
pp. 02007
Author(s):  
Qingzhi Zhang ◽  
Panfeng Wu ◽  
Xiaohui Du ◽  
Hualiang Sun ◽  
Lijia Yu

With the extensive application of deep learning in the field of human rehabilitation, skeleton-based rehabilitation recognition is attracting growing attention, supported by large-scale skeleton data sets. The key factors in this task are two complementary representations: the intra-frame representation of joint co-occurrences and the inter-frame representation of temporal evolution. In this paper, an inter-frame representation method based on RNNs is proposed. The position of each joint is encoded individually, and the per-joint features are then assembled into semantic representations in both the spatial and temporal domains. We introduce a global spatial aggregation scheme, which is able to learn superior joint co-occurrence features compared with local aggregation.
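Since the abstract above is partially reconstructed, the following PyTorch sketch is speculative: it illustrates one plausible reading of the pipeline, with per-joint encoding, a global (all-joints) spatial aggregation per frame, and an RNN over time; all names and sizes are ours.

```python
# Speculative sketch: per-joint encoding, global spatial aggregation,
# then temporal modelling with a GRU.
import torch
import torch.nn as nn

class SkeletonRNN(nn.Module):
    def __init__(self, n_joints=25, coord_dim=3, joint_dim=64,
                 hidden=128, classes=10):
        super().__init__()
        self.joint_encoder = nn.Linear(coord_dim, joint_dim)  # per-joint
        self.rnn = nn.GRU(joint_dim, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, classes)

    def forward(self, x):                   # x: (batch, time, joints, coords)
        joints = torch.relu(self.joint_encoder(x))            # (B, T, J, D)
        # Global spatial aggregation: pool over *all* joints of a frame,
        # rather than a local neighbourhood, to capture joint co-occurrences.
        frames = joints.max(dim=2).values                     # (B, T, D)
        out, _ = self.rnn(frames)
        return self.classifier(out[:, -1])                    # class logits
```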


AI Magazine ◽  
2022 ◽  
Vol 42 (3) ◽  
pp. 7-18
Author(s):  
Harald Steck ◽  
Linas Baltrunas ◽  
Ehtsham Elahi ◽  
Dawen Liang ◽  
Yves Raimond ◽  
...  

Deep learning has profoundly impacted many areas of machine learning. However, it took a while for its impact to be felt in the field of recommender systems. In this article, we outline some of the challenges encountered and lessons learned in using deep learning for recommender systems at Netflix. We first provide an overview of the various recommendation tasks on the Netflix service. We found that different model architectures excel at different tasks. Even though many deep-learning models can be understood as extensions of existing (simple) recommendation algorithms, we initially did not observe significant improvements in performance over well-tuned non-deep-learning approaches. Only when we added numerous features of heterogeneous types to the input data did deep-learning models start to shine in our setting. We also observed that deep-learning methods can exacerbate the problem of offline–online metric (mis-)alignment. After addressing these challenges, deep learning has ultimately resulted in large improvements to our recommendations as measured by both offline and online metrics. On the practical side, integrating deep-learning toolboxes in our system has made it faster and easier to implement and experiment with both deep-learning and non-deep-learning approaches for various recommendation tasks. We conclude this article by summarizing our take-aways that may generalize to other applications beyond Netflix.
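To illustrate the "heterogeneous features" lesson, here is a toy PyTorch sketch (our own, not Netflix's architecture): categorical inputs are embedded and concatenated with dense numeric features before a small scoring MLP.

```python
# Toy sketch: a ranker over heterogeneous inputs (ids, categorical
# context, dense numeric features).
import torch
import torch.nn as nn

class HeterogeneousRanker(nn.Module):
    def __init__(self, n_users, n_items, n_countries,
                 emb=32, n_numeric=8, hidden=64):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, emb)
        self.item_emb = nn.Embedding(n_items, emb)
        self.country_emb = nn.Embedding(n_countries, 8)
        self.mlp = nn.Sequential(
            nn.Linear(2 * emb + 8 + n_numeric, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, user, item, country, numeric):
        # numeric: e.g. recency, device stats, time-of-day features.
        x = torch.cat([self.user_emb(user), self.item_emb(item),
                       self.country_emb(country), numeric], dim=1)
        return self.mlp(x).squeeze(1)          # relevance score
```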


Author(s):  
M. Sester ◽  
Y. Feng ◽  
F. Thiemann

Abstract. Cartographic generalization is a problem which poses interesting challenges to automation. Whereas plenty of algorithms have been developed for the different sub-problems of generalization (e.g. simplification, displacement, aggregation), there are still cases that are not generalized adequately or in a satisfactory way. The main problem is the interplay between different operators. In those cases the benchmark is the human operator, who is able to design an aesthetic and correct representation of the physical reality.

Deep learning methods have shown tremendous success on interpretation problems for which algorithmic methods have deficits. A prominent example is the classification and interpretation of images, where deep learning approaches outperform traditional computer vision methods. In both domains, computer vision and cartography, humans are able to produce a solution; a prerequisite for this is the possibility to generate many training examples for the different cases. Thus, the idea in this paper is to employ deep learning for cartographic generalization tasks, especially for the task of building generalization. An advantage of this task is the fact that many training data sets are available from given map series. The approach is a first attempt using an existing network.

In the paper, the details of the implementation will be reported, together with an in-depth analysis of the results. An outlook on future work will be given.
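A minimal sketch of the image-to-image formulation implied above (a generic encoder-decoder of our own choosing, not the specific existing network the authors adapted): the model maps a rasterised source-scale building map to its generalised target-scale counterpart, with training pairs cut from existing map series.

```python
# Illustrative sketch: encoder-decoder for raster building generalization.
import torch
import torch.nn as nn

class GeneralizationNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, source_map):             # (B, 1, H, W) binary raster
        return torch.sigmoid(self.decoder(self.encoder(source_map)))

# Training pairs come from existing map series at two scales; a pixel-wise
# binary cross-entropy drives the output toward the generalised target.
```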


Author(s):  
Janice Miller-Young

Peer Instruction (PI) is a widely used pedagogy which generally includes the use of two main teaching strategies: student pre-class preparation with an associated online quiz, and active in-class engagement including small-group discussions about conceptual questions. As an instructor trying this pedagogy for the first time, my purpose was to investigate both students’ learning and attitudes in my first/second year engineering dynamics course, using their answers to the reading quizzes as the main source of data. In short, students with the highest quiz marks did well in the course, indicating successful reading and learning strategies. Similarly, students with the lowest quiz marks attained lower overall marks. Students who did less well in the course were also more negative about the PI format (the class size of 17 did not allow for statistical analysis). Negative comments tended to be related to an expectation that the teacher should lecture more, indicating less understanding of cognitive principles. These results will provide a baseline for evaluating future teaching efforts which will include examining whether more directly encouraging deep learning strategies will be more effective for student learning.

