Using polyan: a Python package for modelling polysome profiles from ribosome density data v2 (protocols.io.bvyfn7tn)

Data streams today are large and change quickly. Their uses range from basic scientific applications to critical business and financial ones. In the online phase, useful information is abstracted from the stream and represented as micro-clusters; in the offline phase, the micro-clusters are merged to form macro-clusters. The DBSTREAM technique captures the density between micro-clusters by means of a shared density graph in the online phase. The density data in this graph is then used in reclustering to improve cluster formation, but DBSTREAM takes more time to handle corrupted data points. In this paper, an early pruning algorithm is applied before pre-processing, and a Bloom filter is used to recognize corrupted information. Our experiments on real-time datasets show that this approach improves the efficiency of macro-clusters by 90% and generates more micro-clusters in a short time.
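The abstract above names a Bloom filter as the mechanism for recognizing corrupted records but does not say how it is configured. The following is a minimal sketch of the general technique only, not the authors' implementation: the class `BloomFilter`, the helper `drop_flagged`, the bit-array size, the number of hash functions, and the sample record names are all hypothetical choices made for illustration.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, num_bits=2048, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple (not memory-optimal).
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with a counter.
        data = repr(item).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(seed.to_bytes(4, "big") + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))


def drop_flagged(stream, flagged):
    """Keep only records the filter does not report as flagged.

    Assumption: corrupted records are identified by an exact key that
    was added to the filter earlier. A false positive would drop a
    clean record, which is the usual trade-off of a Bloom filter.
    """
    return [record for record in stream if record not in flagged]


# Hypothetical usage: register known-corrupted keys, then prune a stream.
corrupted = BloomFilter()
for key in ["bad-1", "bad-2"]:
    corrupted.add(key)
clean = drop_flagged(["ok-1", "bad-1", "ok-2", "bad-2"], corrupted)
```

Because membership tests touch only a fixed number of bit positions, a check of this kind costs the same regardless of how many corrupted keys have been registered, which is one plausible reason to place it before the clustering step.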


pymia: A Python package for data handling and evaluation in deep learning-based medical image analysis

Computer Methods and Programs in Biomedicine ◽

10.1016/j.cmpb.2020.105796 ◽

2021 ◽

Vol 198 ◽

pp. 105796

Author(s):

Alain Jungo ◽

Olivier Scheidegger ◽

Mauricio Reyes ◽

Fabian Balsiger

Keyword(s):

Image Analysis ◽

Deep Learning ◽

Medical Image ◽

Medical Image Analysis ◽

Data Handling ◽

Python Package


GriSPy: A Python package for fixed-radius nearest neighbors search

Astronomy and Computing ◽

10.1016/j.ascom.2020.100443 ◽

2020 ◽

pp. 100443

Author(s):

M. Chalela ◽

E. Sillero ◽

L. Pereyra ◽

M.A. Garcia ◽

J.B. Cabral ◽

...

Keyword(s):

Nearest Neighbors ◽

Fixed Radius ◽

Python Package


BiSulfite Bolt: A bisulfite sequencing analysis platform

GigaScience ◽

10.1093/gigascience/giab033 ◽

2021 ◽

Vol 10 (5) ◽

Author(s):

Colin Farrell ◽

Michael Thompson ◽

Anela Tosevska ◽

Adewale Oyetunde ◽

Matteo Pellegrini

Keyword(s):

Data Aggregation ◽

Bisulfite Sequencing ◽

Low Complexity ◽

Sequencing Analysis ◽

Command Line ◽

Sequencing Data ◽

Bisulfite Sequencing Data ◽

Analysis Platform ◽

Python Package ◽

Bisulfite Sequencing Analysis

Abstract Background: Bisulfite sequencing is commonly used to measure DNA methylation. Processing bisulfite sequencing data is often challenging owing to the computational demands of mapping a low-complexity, asymmetrical library and the lack of a unified processing toolset to produce an analysis-ready methylation matrix from read alignments. To address these shortcomings, we have developed BiSulfite Bolt (BSBolt), a fast and scalable bisulfite sequencing analysis platform. BSBolt performs a pre-alignment sequencing read assessment step to improve efficiency when handling asymmetrical bisulfite sequencing libraries. Findings: We evaluated BSBolt against simulated and real bisulfite sequencing libraries. We found that BSBolt provides accurate and fast bisulfite sequencing alignments and methylation calls. We also compared BSBolt to several existing bisulfite alignment tools and found BSBolt outperforms Bismark, BSSeeker2, BISCUIT, and BWA-Meth based on alignment accuracy and methylation calling accuracy. Conclusion: BSBolt offers streamlined processing of bisulfite sequencing data through an integrated toolset that offers support for simulation, alignment, methylation calling, and data aggregation. BSBolt is implemented as a Python package and command line utility for flexibility when building informatics pipelines. BSBolt is available at https://github.com/NuttyLogic/BSBolt under an MIT license.


SynBiopython: an open-source software library for Synthetic Biology

Synthetic Biology ◽

10.1093/synbio/ysab001 ◽

2021 ◽

Vol 6 (1) ◽

Author(s):

Jing Wui Yeoh ◽

Neil Swainston ◽

Peter Vegh ◽

Valentin Zulkower ◽

Pablo Carbonell ◽

...

Keyword(s):

Synthetic Biology ◽

Open Source ◽

Open Source Software ◽

Development Projects ◽

Software Library ◽

Current State ◽

Starting Point ◽

Common Problems ◽

Data Tracking ◽

Python Package

Abstract Advances in hardware automation in synthetic biology laboratories are not yet fully matched by those of their software counterparts. Such automated laboratories, now commonly called biofoundries, require software solutions that would help with many specialized tasks such as batch DNA design, sample and data tracking, and data analysis, among others. Typically, many of the challenges facing biofoundries are shared, yet there is frequent wheel-reinvention where many labs develop similar software solutions in parallel. In this article, we present the first attempt at creating a standardized, open-source Python package. A number of tools will be integrated and developed that we envisage will become the obvious starting point for software development projects within biofoundries globally. Specifically, we describe the current state of available software, present usage scenarios and case studies for common problems, and finally describe plans for future development. SynBiopython is publicly available at the following address: http://synbiopython.org.


Analysis of the Vertical Air Motions and Raindrop Size Distribution Retrievals of a Squall Line Based on Cloud Radar Doppler Spectral Density Data

Atmosphere ◽

10.3390/atmos12030348 ◽

2021 ◽

Vol 12 (3) ◽

pp. 348

Author(s):

Ningkun Ma ◽

Liping Liu ◽

Yichen Chen ◽

Yang Zhang

Keyword(s):

Spectral Density ◽

Size Distribution ◽

Squall Line ◽

Vertical Profiles ◽

Air Velocity ◽

Density Data ◽

Cloud Radar ◽

Raindrop Size Distribution ◽

Raindrop Size ◽

Reflectivity Factor

A squall line is a type of strongly organized mesoscale convective system that can cause severe weather disasters. Thus, it is crucial to explore the dynamic structure and hydrometeor distributions in squall lines. This study analyzed a squall line over Guangdong Province on 6 May 2016 that was observed using a Ka-band millimeter-wave cloud radar (CR) and an S-band dual-polarization radar (PR). Doppler spectral density data obtained by the CR were used to retrieve the vertical air motions and raindrop size distribution (DSD). The results showed the following: First, the CR detected detailed vertical profiles and their evolution before and during the squall line passage. In the convection time segment (segment B), heavy rain existed with a reflectivity factor exceeding 35 dBZ and a velocity spectrum width exceeding 1.3 m s−1. In the PR detection, the differential reflectivity factor (Zdr) was 1–2 dB, and the large specific differential phase (Kdp) also represented large liquid water content. In the transition and stratiform cloud time segments (segments C and D), the rain stabilized gradually, with decreasing cloud tops, stable precipitation, and a 0 °C layer bright band. Smaller Kdp values (less than 0.9) were distributed around the 0 °C layer, which may have been caused by the melting of ice crystal particles. Second, from the CR-retrieved vertical air velocity, before squall line passage, downdrafts dominated in local convection and weak updrafts existed in higher-altitude altostratus clouds. In segment B, the updraft air velocity reached more than 8 m s−1 below the 0 °C layer. From segments C to D, the updrafts changed gradually into weak and wide-ranging downdrafts. Third, in the comparison of DSD values retrieved at 1.5 km and DSD values on the ground, the retrieved DSD line was lower than that of the disdrometer, the overall magnitude of the retrieved DSD was smaller, and the difference decreased from segments C to D. The standardized intercept parameter (Nw) and shape parameter (μ) of the DSD retrieved at 1.8 km showed good agreement with the disdrometer results, and the mass-weighted mean diameter (Dm) was smaller than that on the ground, but very close to the PR-retrieved Dm result at 2 km. Therefore, compared with the DSD retrieved at around 2 km, the overall number concentration remained unchanged and Dm became larger on the ground, possibly reflecting the process of raindrop coalescence. Lastly, the average vertical profiles of several quantities in all segments showed that, first of all, the decrease of Nw and Dm with height in segments C and D was similar, reflecting the collision effect of falling raindrops. The trends were opposite in segment B, indicating that raindrops underwent intense mixing and rapid collision and growth in this segment. Then, PR-retrieved Dm profiles can verify the rationality of the CR-retrieved Dm. Finally, a vertical velocity profile peak generated a larger Dm, especially in segments C and D.
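The abstract above compares the mass-weighted mean diameter (Dm) between retrieval heights and the ground. As a reminder of how that quantity is conventionally defined, the sketch below computes Dm from a binned DSD as the ratio of the fourth to the third moment of N(D). This is an illustration of the standard definition, not the authors' retrieval code; the function name, argument names, and units are assumptions.

```python
def mass_weighted_mean_diameter(diameters_mm, number_conc, bin_widths_mm):
    """Return Dm (mm) for a binned drop size distribution.

    Dm = sum(N(D) * D**4 * dD) / sum(N(D) * D**3 * dD)

    diameters_mm  : bin-centre diameters D, in mm
    number_conc   : N(D) per bin, e.g. in m^-3 mm^-1 (assumed units)
    bin_widths_mm : bin widths dD, in mm
    """
    bins = list(zip(diameters_mm, number_conc, bin_widths_mm))
    third_moment = sum(n * d ** 3 * w for d, n, w in bins)
    fourth_moment = sum(n * d ** 4 * w for d, n, w in bins)
    if third_moment == 0:
        raise ValueError("DSD contains no drops; Dm is undefined")
    return fourth_moment / third_moment
```

Because the fourth moment weights large drops more heavily than the third, moving drop mass from small bins into larger ones raises Dm, which is consistent with the coalescence interpretation given in the abstract.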


TAILOR-MS, a Python Package that Deciphers Complex Triacylglycerol Fatty Acyl Structures: Applications for Bovine Milk and Infant Formulas

Analytical Chemistry ◽

10.1021/acs.analchem.0c04373 ◽

2021 ◽

Author(s):

Kang-Yu Peng ◽

Malinda Salim ◽

Joseph Pelle ◽

Gisela Ramirez ◽

Ben J. Boyd

Keyword(s):

Bovine Milk ◽

Fatty Acyl ◽

Infant Formulas ◽

Python Package


A flexible framework for anomaly Detection via dimensionality reduction

Neural Computing and Applications ◽

10.1007/s00521-021-05839-5 ◽

2021 ◽

Author(s):

Alireza Vafaei Sadr ◽

Bruce A. Bassett ◽

M. Kunz

Keyword(s):

Anomaly Detection ◽

Dimensionality Reduction ◽

Dimensional Space ◽

High Dimensions ◽

Detection Algorithms ◽

Latent Space ◽

Wide Range ◽

Flexible Framework ◽

Online Anomaly Detection ◽

Python Package

Abstract Anomaly detection is challenging, especially for large datasets in high dimensions. Here, we explore a general anomaly detection framework based on dimensionality reduction and unsupervised clustering. DRAMA is released as a general Python package that implements the general framework with a wide range of built-in options. This approach identifies the primary prototypes in the data with anomalies detected by their large distances from the prototypes, either in the latent space or in the original, high-dimensional space. DRAMA is tested on a wide variety of simulated and real datasets, in up to 3000 dimensions, and is found to be robust and highly competitive with commonly used anomaly detection algorithms, especially in high dimensions. The flexibility of the DRAMA framework allows for significant optimization once some examples of anomalies are available, making it ideal for online anomaly detection, active learning, and highly unbalanced datasets. Besides, DRAMA naturally provides clustering of outliers for subsequent analysis.
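The pipeline described above (reduce dimensionality, find prototypes by clustering, score points by distance to the nearest prototype) can be sketched in a few lines. The code below is not the DRAMA API and does not claim to match its built-in options: it assumes a one-component PCA computed by power iteration as the reduction step and a plain one-dimensional k-means as the clustering step, and the function names `fit_prototypes` and `anomaly_scores` are hypothetical. Prototypes are fitted on reference data and new points are scored afterwards, in the latent space only.

```python
import math


def fit_prototypes(train, k=2, n_iter=50):
    """Fit a 1-D latent projection and k prototypes on reference data."""
    means = [sum(col) / len(train) for col in zip(*train)]
    rows = [[x - m for x, m in zip(row, means)] for row in train]
    n_features = len(means)

    # Power iteration for the leading principal direction of the data.
    # Caveat: this fails if the start vector is orthogonal to it.
    v = [1.0 / math.sqrt(n_features)] * n_features
    for _ in range(n_iter):
        proj = [sum(x * vi for x, vi in zip(row, v)) for row in rows]
        w = [sum(p * row[j] for p, row in zip(proj, rows))
             for j in range(n_features)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]

    # Latent coordinates of the reference data.
    latent = [sum(x * vi for x, vi in zip(row, v)) for row in rows]

    # 1-D k-means with evenly spaced starting centres (deterministic).
    lo, hi = min(latent), max(latent)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for z in latent:
            nearest = min(range(k), key=lambda i: abs(z - centers[i]))
            groups[nearest].append(z)
        centers = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
    return means, v, centers


def anomaly_scores(model, points):
    """Score = latent-space distance to the nearest prototype."""
    means, v, centers = model
    scores = []
    for p in points:
        z = sum((x - m) * vi for x, m, vi in zip(p, means, v))
        scores.append(min(abs(z - c) for c in centers))
    return scores


# Hypothetical usage: two tight reference clusters, then three queries.
reference = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0],
             [10, 10, 0], [11, 10, 0], [10, 11, 0], [11, 11, 0]]
model = fit_prototypes(reference, k=2)
scores = anomaly_scores(model, [[0, 0, 0], [5.5, 5.5, 0], [30, 30, 0]])
```

Fitting the prototypes on reference data rather than on the points being scored avoids a known weakness of plain k-means, where an extreme outlier can capture a cluster centre of its own and so receive a score near zero. Scoring in the latent space also ignores deviations orthogonal to the retained component; the abstract notes that distances can alternatively be measured in the original space.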
