Deep learning of genomic variation and regulatory network data

2018 ◽  
Vol 27 (Supplement_R1) ◽  
pp. R63-R71 ◽  
Author(s):  
Amalio Telenti ◽  
Christoph Lippert ◽  
Pi-Chuan Chang ◽  
Mark DePristo

Abstract The human genome is now investigated through high-throughput functional assays, and through the generation of population genomic data. These advances support the identification of functional genetic variants and the prediction of traits (e.g. deleterious variants and disease). This review summarizes lessons learned from the large-scale analyses of genome and exome data sets, modeling of population data and machine-learning strategies to resolve complex genomic sequence regions. The review also portrays the rapid adoption of artificial intelligence/deep neural networks in genomics; in particular, deep learning approaches are well suited to model the complex dependencies in the regulatory landscape of the genome, and to provide predictors for genetic variant calling and interpretation.

2021 ◽  
Author(s):  
Nae-Chyun Chen ◽  
Alexey Kolesnikov ◽  
Sidharth Goel ◽  
Taedong Yun ◽  
Pi-Chuan Chang ◽  
...  

Large-scale population variant data is often used to filter and aid interpretation of variant calls in a single sample. These approaches do not incorporate population information directly into the process of variant calling, and are often limited to filtering which trades recall for precision. In this study, we modify DeepVariant to add a new channel encoding population allele frequencies from the 1000 Genomes Project. We show that this model reduces variant calling errors, improving both precision and recall. We assess the impact of using population-specific or diverse reference panels. We achieve the greatest accuracy with diverse panels, suggesting that large, diverse panels are preferable to individual populations, even when the population matches sample ancestry. Finally, we show that this benefit generalizes to samples with different ancestry from the training data even when the ancestry is also excluded from the reference panel.
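A minimal sketch of the population-channel idea described above (not DeepVariant's actual implementation; the helper name, the `AF` INFO field and the 0-254 intensity scaling are our assumptions): look up panel allele frequencies per reference position and encode them as an extra image-like pileup channel.

```python
# Hedged sketch: derive an allele-frequency "channel" for a pileup window
# from a reference-panel VCF such as 1000 Genomes. Assumes the panel VCF
# is indexed and carries an `AF` INFO field (hypothetical helper, not
# DeepVariant code).
import numpy as np
import pysam

def allele_frequency_channel(panel_vcf_path, contig, start, width):
    """Return a (width,) uint8 row of panel allele frequencies."""
    channel = np.zeros(width, dtype=np.float32)
    vcf = pysam.VariantFile(panel_vcf_path)
    for rec in vcf.fetch(contig, start, start + width):
        col = rec.pos - 1 - start                      # rec.pos is 1-based
        if 0 <= col < width:
            channel[col] = rec.info.get("AF", (0.0,))[0]  # first ALT allele
    # Scale to the 0-254 range used by the other image-like channels.
    return np.round(channel * 254).astype(np.uint8)
```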


2020 ◽  
Author(s):  
Yuan Yuan ◽  
Lei Lin

Satellite image time series (SITS) classification is a major research topic in remote sensing and is relevant for a wide range of applications. Deep learning approaches have been commonly employed for SITS classification and have provided state-of-the-art performance. However, deep learning methods suffer from overfitting when labeled data is scarce. To address this problem, we propose a novel self-supervised pre-training scheme to initialize a Transformer-based network by utilizing large-scale unlabeled data. In detail, the model is asked to predict randomly contaminated observations given an entire time series of a pixel. The main idea of our proposal is to leverage the inherent temporal structure of satellite time series to learn general-purpose spectral-temporal representations related to land cover semantics. Once pre-training is completed, the pre-trained network can be further adapted to various SITS classification tasks by fine-tuning all the model parameters on small-scale task-related labeled data. In this way, the general knowledge and representations about SITS can be transferred to a label-scarce task, thereby improving the generalization performance of the model as well as reducing the risk of overfitting. Comprehensive experiments have been carried out on three benchmark datasets over large study areas. Experimental results demonstrate the effectiveness of the proposed method, improving classification accuracy by 1.91% to 6.69%.
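To make the pre-training objective concrete, here is a hedged PyTorch sketch of the masked-reconstruction idea (module names, sizes and the mask ratio are illustrative assumptions, not the authors' configuration); positional/temporal encodings are omitted for brevity but would be required in practice.

```python
# Illustrative sketch: randomly "contaminate" observations in a pixel's
# time series and train a Transformer encoder to reconstruct the original
# values at those steps. NOTE: a temporal/positional encoding is omitted
# here for brevity but is needed for the encoder to order time steps.
import torch
import torch.nn as nn

class SITSPretrainer(nn.Module):
    def __init__(self, n_bands=10, d_model=64, n_layers=3, n_heads=4):
        super().__init__()
        self.embed = nn.Linear(n_bands, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_bands)       # reconstruct spectra

    def forward(self, x):                             # x: (batch, time, bands)
        return self.head(self.encoder(self.embed(x)))

def pretrain_step(model, x, mask_ratio=0.15):
    mask = torch.rand(x.shape[:2], device=x.device) < mask_ratio
    corrupted = x.clone()
    corrupted[mask] = torch.randn_like(x)[mask]       # contaminate observations
    pred = model(corrupted)
    # Loss only on the contaminated steps, as in masked modelling.
    return ((pred - x) ** 2)[mask].mean()
```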


2021 ◽  
Author(s):  
Benjamin Schwarz ◽  
Korbinian Sager ◽  
Philippe Jousset ◽  
Gilda Currenti ◽  
Charlotte Krawczyk ◽  
...  

Fiber-optic cables form an integral part of modern telecommunications infrastructure and are ubiquitous, in particular in regions where dedicated seismic instrumentation is traditionally sparse or lacking entirely. Fiber-optic seismology promises to enable affordable and time-extended observations of earth and environmental processes at an unprecedented temporal and spatial resolution. The method's unique potential for combined large-N and large-T observations implies intriguing opportunities but also significant challenges in terms of data storage, data handling and computation.

Our goal is to enable real-time data enhancement, rapid signal detection and wave field characterization without the need for time-demanding user interaction. We therefore combine coherent wave field analysis, an optics-inspired processing framework developed in controlled-source seismology, with state-of-the-art deep convolutional neural network (CNN) architectures commonly used in visual perception. While conventional deep learning strategies have to rely on manually labeled or purely synthetic training datasets, coherent wave field analysis labels field data based on physical principles and enables large-scale, purely data-driven training of the CNN models. The sheer amount of data already recorded in various settings makes artificial data generation by numerical modeling superfluous; such modeling is often constrained by incomplete knowledge of the embedding medium and an insufficient description of processes at or close to the surface, which are challenging to capture in integrated simulations.

Applications to extensive field datasets acquired with dark-fiber infrastructure at a geothermal field in SW Iceland and in a town at the flank of Mt Etna, Italy, reveal that the suggested framework generalizes well across different observational scales and environments, and sheds new light on the origin of a broad range of physically distinct wave fields that can be sensed with fiber-optic technology. Owing to its real-time applicability on affordable computing infrastructure, our analysis lends itself well to rapid on-the-fly data enhancement, wave field separation and compression strategies, thereby promising to have a positive impact on the full processing chain currently in use in fiber-optic seismology.
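As a rough illustration of how physics-based labels can be derived without manual picking, the following sketch (our own simplification, not the authors' code) scans trial slownesses across neighbouring fiber channels and keeps the best local semblance, a coherence measure in [0, 1] that could serve as a training target for a CNN.

```python
# Conceptual sketch: pointwise semblance of the best-fitting local slope
# across neighbouring channels of a DAS section. A short time window would
# normally be used for smoothing; edge channels are simply clamped here.
import numpy as np

def local_semblance(data, dt, dx, slopes, half_win=5):
    """data: (channels, samples) DAS strain-rate section."""
    n_ch, n_t = data.shape
    best = np.zeros((n_ch, n_t))
    offsets = np.arange(-half_win, half_win + 1)
    t = np.arange(n_t) * dt
    for p in slopes:                                  # trial slownesses (s/m)
        aligned = np.empty((len(offsets), n_ch, n_t))
        for i, off in enumerate(offsets):
            ch = np.clip(np.arange(n_ch) + off, 0, n_ch - 1)
            shift = p * off * dx                      # moveout for this offset
            for j, c in enumerate(ch):
                aligned[i, j] = np.interp(t + shift, t, data[c])
        num = aligned.sum(axis=0) ** 2
        den = (aligned ** 2).sum(axis=0) * len(offsets) + 1e-12
        best = np.maximum(best, num / den)            # semblance in [0, 1]
    return best
```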


2022 ◽  
pp. 27-50
Author(s):  
Rajalaxmi Prabhu B. ◽  
Seema S.

A lot of user-generated data is available these days from huge platforms, blogs, websites, and other review sites. These data are usually unstructured, and analyzing their sentiment automatically is considered an important challenge. Several machine learning algorithms have been implemented to extract opinions from large data sets, and a great deal of research has gone into understanding machine learning approaches to sentiment analysis. Machine learning depends mainly on the data required for model building, and hence suitable feature extraction techniques also need to be applied. In this chapter, several deep learning approaches, their challenges, and future issues will be addressed. Deep learning techniques are considered important in predicting the sentiments of users. This chapter aims to analyze deep-learning techniques for predicting sentiments and to highlight the importance of several approaches for mining opinions and determining sentiment polarity.
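As a concrete instance of the kind of model the chapter surveys, the following minimal PyTorch sketch shows a typical deep-learning sentiment classifier; the architecture and sizes are illustrative assumptions, not taken from the chapter.

```python
# Illustrative sketch: embed tokens, encode with a bidirectional LSTM,
# and predict sentiment polarity from the final hidden states.
import torch
import torch.nn as nn

class SentimentLSTM(nn.Module):
    def __init__(self, vocab_size=20000, embed_dim=100, hidden=128, classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, classes)

    def forward(self, token_ids):                 # (batch, seq_len)
        emb = self.embedding(token_ids)
        _, (h_n, _) = self.lstm(emb)              # h_n: (2, batch, hidden)
        feats = torch.cat([h_n[0], h_n[1]], dim=1)
        return self.classifier(feats)             # polarity logits
```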


Database ◽  
2019 ◽  
Vol 2019 ◽  
Author(s):  
Tao Chen ◽  
Mingfen Wu ◽  
Hexi Li

Abstract The automatic extraction of meaningful relations from biomedical literature or clinical records is crucial in various biomedical applications. Most of the current deep learning approaches for medical relation extraction require large-scale training data to prevent overfitting of the training model. We propose using a pre-trained model and a fine-tuning technique to improve these approaches without additional time-consuming human labeling. Firstly, we show the architecture of Bidirectional Encoder Representations from Transformers (BERT), an approach for pre-training a model on large-scale unstructured text. We then combine BERT with a one-dimensional convolutional neural network (1d-CNN) to fine-tune the pre-trained model for relation extraction. Extensive experiments on three datasets, namely the BioCreative V chemical disease relation corpus, traditional Chinese medicine literature corpus and i2b2 2012 temporal relation challenge corpus, show that the proposed approach achieves state-of-the-art results (giving a relative improvement of 22.2, 7.77, and 38.5% in F1 score, respectively, compared with a traditional 1d-CNN classifier). The source code is available at https://github.com/chentao1999/MedicalRelationExtraction.
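A hedged sketch of the described architecture (module names and filter sizes are ours; the authors' actual code is at the repository linked above): a pre-trained BERT encoder feeds token representations into a 1d-CNN head whose max-pooled features are classified into relation types, with all parameters fine-tuned end to end.

```python
# Sketch of BERT + 1d-CNN fine-tuning for relation extraction.
import torch
import torch.nn as nn
from transformers import AutoModel

class BertCNNRelationClassifier(nn.Module):
    def __init__(self, n_relations, model_name="bert-base-uncased",
                 n_filters=128, kernel_size=3):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        hidden = self.bert.config.hidden_size
        self.conv = nn.Conv1d(hidden, n_filters, kernel_size, padding=1)
        self.out = nn.Linear(n_filters, n_relations)

    def forward(self, input_ids, attention_mask):
        hs = self.bert(input_ids=input_ids,
                       attention_mask=attention_mask).last_hidden_state
        feats = torch.relu(self.conv(hs.transpose(1, 2)))  # (B, F, L)
        pooled = feats.max(dim=2).values                   # max over tokens
        return self.out(pooled)                            # relation logits
```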


2019 ◽  
Vol 277 ◽  
pp. 02007
Author(s):  
Qingzhi Zhang ◽  
Panfeng Wu ◽  
Xiaohui Du ◽  
Hualiang Sun ◽  
Lijia Yu

With the extensive application of deep learning in the field of human rehabilitation, skeleton-based rehabilitation recognition is attracting growing attention, supported by large-scale skeleton data sets. The key factors in this task are two complementary representations: the intra-frame representation of joint co-occurrences and the inter-frame representation of temporal evolution. In this paper, an inter-frame representation method based on RNNs is proposed. The position of each joint is encoded individually, and the per-joint features are then assembled into semantic representations in both the spatial and temporal domains. We introduce a global spatial aggregation scheme, which is able to learn superior joint co-occurrence features compared with local aggregation.
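Since the abstract above is partially reconstructed, the following PyTorch sketch is speculative: it illustrates one plausible reading of the pipeline, with per-joint encoding, a global (all-joints) spatial aggregation per frame, and an RNN over time; all names and sizes are ours.

```python
# Speculative sketch: per-joint encoding, global spatial aggregation,
# then temporal modelling with a GRU.
import torch
import torch.nn as nn

class SkeletonRNN(nn.Module):
    def __init__(self, n_joints=25, coord_dim=3, joint_dim=64,
                 hidden=128, classes=10):
        super().__init__()
        self.joint_encoder = nn.Linear(coord_dim, joint_dim)  # per-joint
        self.rnn = nn.GRU(joint_dim, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, classes)

    def forward(self, x):                   # x: (batch, time, joints, coords)
        joints = torch.relu(self.joint_encoder(x))            # (B, T, J, D)
        # Global spatial aggregation: pool over *all* joints of a frame,
        # rather than a local neighbourhood, to capture joint co-occurrences.
        frames = joints.max(dim=2).values                     # (B, T, D)
        out, _ = self.rnn(frames)
        return self.classifier(out[:, -1])                    # class logits
```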


AI Magazine ◽  
2022 ◽  
Vol 42 (3) ◽  
pp. 7-18
Author(s):  
Harald Steck ◽  
Linas Baltrunas ◽  
Ehtsham Elahi ◽  
Dawen Liang ◽  
Yves Raimond ◽  
...  

Deep learning has profoundly impacted many areas of machine learning. However, it took a while for its impact to be felt in the field of recommender systems. In this article, we outline some of the challenges encountered and lessons learned in using deep learning for recommender systems at Netflix. We first provide an overview of the various recommendation tasks on the Netflix service. We found that different model architectures excel at different tasks. Even though many deep-learning models can be understood as extensions of existing (simple) recommendation algorithms, we initially did not observe significant improvements in performance over well-tuned non-deep-learning approaches. Only when we added numerous features of heterogeneous types to the input data did deep-learning models start to shine in our setting. We also observed that deep-learning methods can exacerbate the problem of offline–online metric (mis-)alignment. After addressing these challenges, deep learning has ultimately resulted in large improvements to our recommendations as measured by both offline and online metrics. On the practical side, integrating deep-learning toolboxes in our system has made it faster and easier to implement and experiment with both deep-learning and non-deep-learning approaches for various recommendation tasks. We conclude this article by summarizing our take-aways that may generalize to other applications beyond Netflix.
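To illustrate the "heterogeneous features" lesson, here is a toy PyTorch sketch (our own, not Netflix's architecture): categorical inputs are embedded and concatenated with dense numeric features before a small scoring MLP.

```python
# Toy sketch: a ranker over heterogeneous inputs (ids, categorical
# context, dense numeric features).
import torch
import torch.nn as nn

class HeterogeneousRanker(nn.Module):
    def __init__(self, n_users, n_items, n_countries,
                 emb=32, n_numeric=8, hidden=64):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, emb)
        self.item_emb = nn.Embedding(n_items, emb)
        self.country_emb = nn.Embedding(n_countries, 8)
        self.mlp = nn.Sequential(
            nn.Linear(2 * emb + 8 + n_numeric, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, user, item, country, numeric):
        # numeric: e.g. recency, device stats, time-of-day features.
        x = torch.cat([self.user_emb(user), self.item_emb(item),
                       self.country_emb(country), numeric], dim=1)
        return self.mlp(x).squeeze(1)          # relevance score
```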


Author(s):  
M. Sester ◽  
Y. Feng ◽  
F. Thiemann

Abstract. Cartographic generalization is a problem which poses interesting challenges to automation. Whereas plenty of algorithms have been developed for the different sub-problems of generalization (e.g. simplification, displacement, aggregation), there are still cases that are not generalized adequately or in a satisfactory way. The main problem is the interplay between different operators. In those cases the benchmark is the human operator, who is able to design an aesthetic and correct representation of the physical reality.

Deep learning methods have shown tremendous success on interpretation problems for which algorithmic methods have deficits. A prominent example is the classification and interpretation of images, where deep learning approaches outperform traditional computer vision methods. In both domains, computer vision and cartography, humans are able to produce a solution; a prerequisite for this is the possibility to generate many training examples for the different cases. Thus, the idea in this paper is to employ deep learning for cartographic generalization tasks, especially for the task of building generalization. An advantage of this task is the fact that many training data sets are available from given map series. The approach is a first attempt using an existing network.

In the paper, the details of the implementation will be reported, together with an in-depth analysis of the results. An outlook on future work will be given.
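A minimal sketch of the image-to-image formulation implied above (a generic encoder-decoder of our own choosing, not the specific existing network the authors adapted): the model maps a rasterised source-scale building map to its generalised target-scale counterpart, with training pairs cut from existing map series.

```python
# Illustrative sketch: encoder-decoder for raster building generalization.
import torch
import torch.nn as nn

class GeneralizationNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, source_map):             # (B, 1, H, W) binary raster
        return torch.sigmoid(self.decoder(self.encoder(source_map)))

# Training pairs come from existing map series at two scales; a pixel-wise
# binary cross-entropy drives the output toward the generalised target.
```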


Author(s):  
Janice Miller-Young

Peer Instruction (PI) is a widely used pedagogy which generally includes the use of two main teaching strategies: student pre-class preparation with an associated online quiz, and active in-class engagement including small-group discussions about conceptual questions. As an instructor trying this pedagogy for the first time, my purpose was to investigate both students’ learning and attitudes in my first/second year engineering dynamics course, using their answers to the reading quizzes as the main source of data. In short, students with the highest quiz marks did well in the course, indicating successful reading and learning strategies. Similarly, students with the lowest quiz marks attained lower overall marks. Students who did less well in the course were also more negative about the PI format (the class size of 17 did not allow for statistical analysis). Negative comments tended to be related to an expectation that the teacher should lecture more, indicating less understanding of cognitive principles. These results will provide a baseline for evaluating future teaching efforts which will include examining whether more directly encouraging deep learning strategies will be more effective for student learning.

