Memes: A motif analysis environment in R using tools from the MEME Suite

2021 ◽  
Vol 17 (9) ◽  
pp. e1008991
Author(s):  
Spencer L. Nystrom ◽  
Daniel J. McKay

Identification of biopolymer motifs represents a key step in the analysis of biological sequences. The MEME Suite is a widely used toolkit for comprehensive analysis of biopolymer motifs; however, these tools are poorly integrated within popular analysis frameworks like the R/Bioconductor project, creating barriers to their use. Here we present memes, an R package that provides a seamless R interface to a selection of popular MEME Suite tools. memes provides a novel “data aware” interface to these tools, enabling rapid and complex discriminative motif analysis workflows. In addition to interfacing with popular MEME Suite tools, memes leverages existing R/Bioconductor data structures to store the multidimensional data returned by MEME Suite tools for rapid data access and manipulation. Finally, memes provides data visualization capabilities to facilitate communication of results. memes is available as a Bioconductor package at https://bioconductor.org/packages/memes, and the source code can be found at github.com/snystrom/memes.

2021 ◽  
Author(s):  
Spencer L. Nystrom ◽  
Daniel J. McKay

Abstract Identification of biopolymer motifs represents a key step in the analysis of biological sequences. The MEME Suite is a widely used toolkit for comprehensive analysis of biopolymer motifs; however, these tools are poorly integrated within popular analysis frameworks like the R/Bioconductor project, creating barriers to their use. Here we present memes, an R package that provides a seamless R interface to the MEME Suite. memes provides a novel “data aware” interface to these tools, enabling rapid and complex discriminative motif analysis workflows. In addition to interfacing with popular MEME Suite tools, memes leverages existing R/Bioconductor data structures to store the complex, multidimensional data returned by MEME Suite tools for rapid data access and manipulation. Finally, memes provides data visualization capabilities to facilitate communication of results. memes is available as a Bioconductor package at https://bioconductor.org/packages/memes, and the source code can be found at github.com/snystrom/memes.


2021 ◽  
Author(s):  
Federico Agostinis ◽  
Chiara Romualdi ◽  
Gabriele Sales ◽  
Davide Risso

Summary: We present NewWave, a scalable R/Bioconductor package for the dimensionality reduction and batch effect removal of single-cell RNA sequencing data. To achieve scalability, NewWave uses mini-batch optimization and can work with out-of-memory data, enabling users to analyze datasets with millions of cells. Availability and implementation: NewWave is implemented as an open-source R package available through the Bioconductor project at https://bioconductor.org/packages/NewWave/ Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
Rommel Estores ◽  
Pascal Vercruysse ◽  
Karl Villareal ◽  
Eric Barbian ◽  
Ralph Sanchez ◽  
...  

Abstract The failure analysis community working on highly integrated mixed-signal circuitry is entering an era in which System-on-Chip technologies, denser metallization schemes, on-chip dissipation techniques and intelligent packages are being introduced simultaneously. These innovations bring a great many defect accessibility challenges to the failure analyst. To contend in this era while aiming for higher efficiency and effectiveness, the failure analysis environment must undergo a disruptive evolution. The success or failure of an analysis will be determined by the careful selection of tools, data and techniques in the applied analysis flow. A comprehensive approach is required in which hardware, software, data analysis, traditional FA techniques and expertise are combined in a complementary way [1]. This document demonstrates this by incorporating advanced scan diagnosis methods into the overall analysis flow for digital functionality failures, supporting the enhanced failure analysis methodology. For the testing and diagnosis of the presented cases, compact but powerful scan test FA Lab hardware with its diagnosis software was used [2]. It can therefore easily be combined with traditional FA techniques to provide stimulus for dynamic fault localization [3]. The system combines scan chain information, failure data and layout information into one viewing environment, which provides real analysis power for the failure analyst. Comprehensive data analysis is performed to identify failing cells/nets, to provide a better overview of the failure and its interactions so that the fault can be isolated to a smaller area, or to analyze subtle behavior patterns in order to find and rationalize possible faults that would otherwise go undetected. Three sample cases are discussed in this document to demonstrate specific strengths and advantages of this enhanced FA methodology.


2013 ◽  
Vol 9 (2) ◽  
Author(s):  
Fernando De Assis Rodrigues ◽  
Ricardo César Gonçalves Sant'Ana

Abstract Environments for access to government data via Information and Communication Technologies may expand the possibilities for citizen monitoring, providing feedback for future demands. The aim of this study is to identify, in the data available via active transparency, elements that allow the construction of proposals for dimensional models, enabling the anticipation of demands for data access. As its theoretical-methodological framework, the text uses the concepts of Business Intelligence and Citizen Intelligence. As a result, a dimensional model was proposed based on the daily expenses query available on the Transparency Portal (Portal de Transparência) of the Brazilian Federal Government. Keywords: Public Transparency, Information and Communication Technologies, Data Collection, Citizen Intelligence, Data Warehouse.


2019 ◽  
Vol 4 (1) ◽  
pp. 64-67
Author(s):  
Pavel Kim

One of the fundamental tasks of cluster analysis is the partitioning of multidimensional data samples into clusters: groups of objects that are close in the sense of some given measure of similarity. In some problems the number of clusters is set a priori, but more often it must be determined in the course of clustering. With a large number of clusters, especially if the data are “noisy,” the result becomes difficult for experts to analyze, so the number of clusters under consideration is artificially reduced. Formal means of merging “neighboring” clusters are considered, creating the basis for parameterizing the number of significant clusters in the “natural” clustering model [1].
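
The idea of merging neighboring clusters to control how many clusters remain can be illustrated with a minimal sketch. The abstract does not state the merge criterion, so everything here is an assumption made for illustration: clusters are represented by their centroids, two clusters count as "neighboring" when their centroids lie within a chosen Euclidean distance `threshold`, and the function name `merge_neighboring_clusters` is hypothetical. It is not the method of the cited paper.

```python
import math


def merge_neighboring_clusters(centroids, threshold):
    """Group cluster indices whose centroids are linked by distances <= threshold.

    Assumption: "neighboring" means centroid distance within `threshold`;
    chains of neighbors end up in the same merged group (single linkage).
    """
    parent = list(range(len(centroids)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            if math.dist(centroids[i], centroids[j]) <= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Keep the smaller index as the representative.
                    parent[max(ri, rj)] = min(ri, rj)

    groups = {}
    for i in range(len(centroids)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Five initial clusters; the threshold acts as the parameter that
# controls how many merged clusters remain.
centroids = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (5.2, 5.1), (10.0, 0.0)]
merged = merge_neighboring_clusters(centroids, threshold=1.0)
print(merged)
```

In this sketch, raising `threshold` merges more clusters and lowering it keeps more of them separate, which is one simple way to parameterize the number of clusters that are treated as significant.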


2018 ◽  
Vol 6 (3) ◽  
pp. 669-686 ◽  
Author(s):  
Michael Dietze

Abstract. Environmental seismology is the study of the seismic signals emitted by Earth surface processes. This emerging research field is at the intersection of seismology, geomorphology, hydrology, meteorology, and further Earth science disciplines. It amalgamates a wide variety of methods from across these disciplines and ultimately fuses them in a common analysis environment. This overarching scope of environmental seismology requires coherent yet integrative software that is accepted by many of the involved scientific disciplines. The statistical software R has gained paramount importance in the majority of data science research fields. R has well-justified advantages over other, mostly commercial software, which makes it the ideal language on which to base a comprehensive analysis toolbox. The article introduces the avenues and needs of environmental seismology, and how these are met by the R package eseis. The conceptual structure, example data sets, and available functions are demonstrated. Worked examples illustrate possible applications of the package and give in-depth descriptions of the flexible use of the functions. The package has a registered DOI, is available under the GPL licence on the Comprehensive R Archive Network (CRAN), and is maintained on GitHub.


2019 ◽  
Vol 36 (8) ◽  
pp. 2587-2588 ◽  
Author(s):  
Christopher M Ward ◽  
Thu-Hien To ◽  
Stephen M Pederson

Abstract Motivation: High throughput next generation sequencing (NGS) has become exceedingly cheap, facilitating studies to be undertaken containing large sample numbers. Quality control (QC) is an essential stage during analytic pipelines and the outputs of popular bioinformatics tools such as FastQC and Picard can provide information on individual samples. Although these tools provide considerable power when carrying out QC, large sample numbers can make inspection of all samples and identification of systemic bias a challenge. Results: We present ngsReports, an R package designed for the management and visualization of NGS reports from within an R environment. The available methods allow direct import into R of FastQC reports along with outputs from other tools. Visualization can be carried out across many samples using default, highly customizable plots with options to perform hierarchical clustering to quickly identify outlier libraries. Moreover, these can be displayed in an interactive shiny app or HTML report for ease of analysis. Availability and implementation: The ngsReports package is available on Bioconductor and the GUI shiny app is available at https://github.com/UofABioinformaticsHub/shinyNgsreports. Supplementary information: Supplementary data are available at Bioinformatics online.

