scholarly journals PyEI: A Python package for ecological inference

2021 ◽  
Vol 6 (64) ◽  
pp. 3397
Author(s):  
Karin Knudson ◽  
Gabe Schoenbach ◽  
Amariah Becker
2015 ◽  
Author(s):  
Alejandro Corvalan ◽  
Emerson Melo ◽  
Robert P Sherman ◽  
Matthew Shum

2020 ◽  
pp. 100443
Author(s):  
M. Chalela ◽  
E. Sillero ◽  
L. Pereyra ◽  
M.A. Garcia ◽  
J.B. Cabral ◽  
...  

Author(s):  
Derek J. N. Young ◽  
Sean M. A. Jeronimo ◽  
Derek J. Churchill ◽  
Van R. Kane ◽  
Andrew M. Latimer

GigaScience ◽  
2021 ◽  
Vol 10 (5) ◽  
Author(s):  
Colin Farrell ◽  
Michael Thompson ◽  
Anela Tosevska ◽  
Adewale Oyetunde ◽  
Matteo Pellegrini

Abstract Background Bisulfite sequencing is commonly used to measure DNA methylation. Processing bisulfite sequencing data is often challenging owing to the computational demands of mapping a low-complexity, asymmetrical library and the lack of a unified processing toolset to produce an analysis-ready methylation matrix from read alignments. To address these shortcomings, we have developed BiSulfite Bolt (BSBolt), a fast and scalable bisulfite sequencing analysis platform. BSBolt performs a pre-alignment sequencing read assessment step to improve efficiency when handling asymmetrical bisulfite sequencing libraries. Findings We evaluated BSBolt against simulated and real bisulfite sequencing libraries. We found that BSBolt provides accurate and fast bisulfite sequencing alignments and methylation calls. We also compared BSBolt to several existing bisulfite alignment tools and found BSBolt outperforms Bismark, BSSeeker2, BISCUIT, and BWA-Meth based on alignment accuracy and methylation calling accuracy. Conclusion BSBolt offers streamlined processing of bisulfite sequencing data through an integrated toolset that offers support for simulation, alignment, methylation calling, and data aggregation. BSBolt is implemented as a Python package and command line utility for flexibility when building informatics pipelines. BSBolt is available at https://github.com/NuttyLogic/BSBolt under an MIT license.


2021 ◽  
Vol 6 (1) ◽  
Author(s):  
Jing Wui Yeoh ◽  
Neil Swainston ◽  
Peter Vegh ◽  
Valentin Zulkower ◽  
Pablo Carbonell ◽  
...  

Abstract Advances in hardware automation in synthetic biology laboratories are not yet fully matched by those of their software counterparts. Such automated laboratories, now commonly called biofoundries, require software solutions that would help with many specialized tasks such as batch DNA design, sample and data tracking, and data analysis, among others. Typically, many of the challenges facing biofoundries are shared, yet there is frequent wheel-reinvention where many labs develop similar software solutions in parallel. In this article, we present the first attempt at creating a standardized, open-source Python package. A number of tools will be integrated and developed that we envisage will become the obvious starting point for software development projects within biofoundries globally. Specifically, we describe the current state of available software, present usage scenarios and case studies for common problems, and finally describe plans for future development. SynBiopython is publicly available at the following address: http://synbiopython.org.


Author(s):  
Alireza Vafaei Sadr ◽  
Bruce A. Bassett ◽  
M. Kunz

AbstractAnomaly detection is challenging, especially for large datasets in high dimensions. Here, we explore a general anomaly detection framework based on dimensionality reduction and unsupervised clustering. DRAMA is released as a general python package that implements the general framework with a wide range of built-in options. This approach identifies the primary prototypes in the data with anomalies detected by their large distances from the prototypes, either in the latent space or in the original, high-dimensional space. DRAMA is tested on a wide variety of simulated and real datasets, in up to 3000 dimensions, and is found to be robust and highly competitive with commonly used anomaly detection algorithms, especially in high dimensions. The flexibility of the DRAMA framework allows for significant optimization once some examples of anomalies are available, making it ideal for online anomaly detection, active learning, and highly unbalanced datasets. Besides, DRAMA naturally provides clustering of outliers for subsequent analysis.


2016 ◽  
Vol 24 (2) ◽  
pp. 263-272 ◽  
Author(s):  
Kosuke Imai ◽  
Kabir Khanna

In both political behavior research and voting rights litigation, turnout and vote choice for different racial groups are often inferred using aggregate election results and racial composition. Over the past several decades, many statistical methods have been proposed to address this ecological inference problem. We propose an alternative method to reduce aggregation bias by predicting individual-level ethnicity from voter registration records. Building on the existing methodological literature, we use Bayes's rule to combine the Census Bureau's Surname List with various information from geocoded voter registration records. We evaluate the performance of the proposed methodology using approximately nine million voter registration records from Florida, where self-reported ethnicity is available. We find that it is possible to reduce the false positive rate among Black and Latino voters to 6% and 3%, respectively, while maintaining the true positive rate above 80%. Moreover, we use our predictions to estimate turnout by race and find that our estimates yields substantially less amounts of bias and root mean squared error than standard ecological inference estimates. We provide open-source software to implement the proposed methodology.


Sign in / Sign up

Export Citation Format

Share Document