ScaffoldGraph: an open-source library for the generation and analysis of molecular scaffold networks and scaffold trees

2020 ◽  
Vol 36 (12) ◽  
pp. 3930-3931 ◽  
Author(s):  
Oliver B Scott ◽  
A W Edith Chan

Summary: ScaffoldGraph (SG) is an open-source Python library and command-line tool for the generation and analysis of molecular scaffold networks and trees, with the capability of processing large sets of input molecules. With the increase in high-throughput screening data, scaffold graphs have proven useful for the navigation and analysis of chemical space, being used for visualization, clustering, scaffold-diversity analysis and active-series identification. Built on RDKit and NetworkX, SG integrates scaffold graph analysis into the growing scientific/cheminformatics Python stack, increasing the flexibility and extendibility of the tool compared to existing software. Availability and implementation: SG is freely available and released under the MIT licence at https://github.com/UCLCheminformatics/ScaffoldGraph.

2011 ◽  
Vol 3 (1) ◽  
Author(s):  
Georg Hinselmann ◽  
Lars Rosenbaum ◽  
Andreas Jahn ◽  
Nikolas Fechner ◽  
Andreas Zell

PLoS Biology ◽  
2018 ◽  
Vol 16 (3) ◽  
pp. e2003904 ◽  
Author(s):  
M. Flori Sassano ◽  
Eric S. Davis ◽  
James E. Keating ◽  
Bryan T. Zorn ◽  
Tavleen K. Kochar ◽  
...  

2015 ◽  
Vol 14 ◽  
pp. CIN.S26470 ◽  
Author(s):  
Richard P. Finney ◽  
Qing-Rong Chen ◽  
Cu V. Nguyen ◽  
Chih Hao Hsu ◽  
Chunhua Yan ◽  
...  

The name Alview is a contraction of the term Alignment Viewer. Alview is a software tool, compiled to native code, for visualizing the alignment of sequencing data. Inputs are files of short-read sequences aligned to a reference genome in the SAM/BAM format and files containing reference genome data. Outputs are visualizations of these aligned short reads. Alview is written in portable C, with optional graphical user interface (GUI) code written in C, C++, and Objective-C. The application can run in three different ways: as a web server, as a command-line tool, or as a native GUI program. Alview is compatible with Microsoft Windows, Linux, and Apple OS X. It is available as a web demo at https://cgwb.nci.nih.gov/cgi-bin/alview. The source code and Windows/Mac/Linux executables are available via https://github.com/NCIP/alview.
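To make the input format concrete, here is a minimal Python sketch of reading one alignment line in the text-based SAM format, which has eleven mandatory tab-separated fields. It is not Alview code (Alview itself is written in C). The names `SamRecord`, `parse_sam_line` and `is_reverse_strand` and the example read are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    """The eleven mandatory fields of a SAM alignment line."""
    qname: str   # read name
    flag: int    # bitwise flags
    rname: str   # reference sequence name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # alignment description
    rnext: str   # reference name of the mate
    pnext: int   # position of the mate
    tlen: int    # observed template length
    seq: str     # read sequence
    qual: str    # per-base quality string


def parse_sam_line(line: str) -> SamRecord:
    """Parse one non-header SAM line; optional tags after field 11 are ignored."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    return SamRecord(
        fields[0], int(fields[1]), fields[2], int(fields[3]), int(fields[4]),
        fields[5], fields[6], int(fields[7]), int(fields[8]),
        fields[9], fields[10],
    )


def is_reverse_strand(record: SamRecord) -> bool:
    """Flag bit 0x10 marks a read aligned to the reverse strand."""
    return bool(record.flag & 0x10)


# Hypothetical example read, invented for this sketch.
example = "read1\t16\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII"
record = parse_sam_line(example)
```

A viewer would use `rname`, `pos` and `cigar` to place each read against the reference before drawing it.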


Author(s):  
Kai Kruse ◽  
Clemens B. Hug ◽  
Juan M. Vaquerizas

Chromosome conformation capture data, particularly from high-throughput approaches such as Hi-C and its derivatives, are typically very complex to analyse. Existing analysis tools are often single-purpose, or limited in compatibility to a small number of data formats, frequently making Hi-C analyses tedious and time-consuming. Here, we present FAN-C, an easy-to-use command-line tool and powerful Python API with a broad feature set covering matrix generation, analysis, and visualisation for C-like data (https://github.com/vaquerizaslab/fanc). Due to its comprehensiveness and compatibility with the most prevalent Hi-C storage formats, FAN-C can be used in combination with a large number of existing analysis tools, thus greatly simplifying Hi-C matrix analysis.


2021 ◽  
Author(s):  
Adarsh Kalikadien ◽  
Evgeny A. Pidko ◽  
Vivek Sinha

Local chemical space exploration of an experimentally synthesized material can be done by making slight structural variations of the synthesized material. This generation of many molecular structures of reasonable quality that resemble an existing, purposeful (chemical) material is needed for high-throughput screening in material design. Large databases of geometry and chemical properties of transition metal complexes are not readily available, although these complexes are widely used in homogeneous catalysis. A Python-based workflow, ChemSpaX, that is aimed at automating local chemical space exploration for any type of molecule is introduced. The overall computational workflow of ChemSpaX is explained in more detail. ChemSpaX uses 3D information to place functional groups on an input structure. For example, the input structure can be a catalyst for which one wants to use high-throughput screening to investigate whether the catalytic activity can be improved. The newly placed substituents are optimized using a computationally cheap force-field optimization method. After placement of new substituents, higher-level optimizations using xTB or DFT instead of force-field optimization are also possible in the current workflow. In representative applications of ChemSpaX, it is shown that the structures generated by ChemSpaX are of reasonable quality for use in high-throughput screening applications. These applications investigate various adducts on functionalized Mn-based pincer complexes, hydrogenation of Ru-based pincer complexes, functionalization of cobalt porphyrin complexes and functionalization of a bipyridyl-functionalized cobalt porphyrin trapped in an M2L4-type cage complex. Descriptors such as the Gibbs free energy of reaction and the HOMO-LUMO gap, which can be used in data-driven design and discovery of catalysts, were selected and studied in more detail for the selected use cases. The relatively fast GFN2-xTB method was used to calculate these descriptors, and a comparison was made against DFT-calculated descriptors. ChemSpaX is open-source and aims to bolster the efforts of the scientific community towards data-driven material discovery.


2018 ◽  
Author(s):  
Isabelle Heath-Apostolopoulos ◽  
Liam Wilbraham ◽  
Martijn Zwijnenburg

We discuss a low-cost computational workflow for the high-throughput screening of polymeric photocatalysts and demonstrate its utility by applying it to a number of challenging problems that would be difficult to tackle otherwise. Specifically, we show how having access to a low-cost method allows one to screen a vast chemical space, as well as to probe the effects of conformational degrees of freedom and sequence isomerism. Finally, we discuss both the opportunities and the biggest challenges of computational screening in the search for polymer photocatalysts.


2020 ◽  
Author(s):  
Marcelo Inuzuka ◽  
Hugo Do Nascimento ◽  
Fernando Almeida ◽  
Bruno Barros ◽  
Walid Jradi

This article introduces Doclass, free and open-source Web software that aims to assist in labeling and classifying large sets of documents. The research followed a design science research methodology, guided by the real demands of a legal text processing company. The architecture, several design decisions and the current development stage of the software are presented. Preliminary user experiments for evaluating interactive document labeling are described. As a result, the first version of the system was published, with an architecture composed of a mobile frontend that communicates with a backend through a REST API; the applicant evaluated its performance as satisfactory. Other results involve the use of active learning techniques to reduce human effort when classifying documents, with the Uncertainty strategy used to choose the next document to be labeled. The effectiveness of a confidence-level-based stop criterion for the active learning technique was tested and proved unsatisfactory; improving it remains future work.
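The Uncertainty strategy and a confidence-based stop criterion can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Doclass code. The abstract does not say which uncertainty variant is used, so the sketch assumes the common least-confidence rule. The function names, the 0.9 threshold and the example probabilities are all hypothetical.

```python
from typing import Dict, List


def least_confident(probabilities: Dict[str, List[float]]) -> str:
    """Pick the unlabeled document whose most likely class has the lowest probability.

    Assumption: "uncertainty" means least-confidence sampling.
    """
    return min(probabilities, key=lambda doc_id: max(probabilities[doc_id]))


def should_stop(probabilities: Dict[str, List[float]], threshold: float = 0.9) -> bool:
    """Stop asking for labels once every remaining prediction is confident enough.

    The threshold value is a hypothetical placeholder.
    """
    return all(max(p) >= threshold for p in probabilities.values())


# Made-up class probabilities for three unlabeled documents.
predicted = {
    "doc_a": [0.95, 0.05],
    "doc_b": [0.55, 0.45],
    "doc_c": [0.80, 0.20],
}

next_doc = least_confident(predicted)   # the document to show the human labeler
finished = should_stop(predicted)       # whether labeling could end now
```

Here the classifier is least sure about `doc_b`, so it is the one sent for labeling. A stop rule of this kind depends on the classifier's confidence estimates being reliable, which is consistent with the abstract's report that the tested criterion was unsatisfactory.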


Author(s):  
Poonam Nandal ◽  
Deepa Bura ◽  
Meeta Singh

In today's world, where data is accumulating at an ever-increasing rate, processing this big data has become a necessity. This requires tools for processing and analyzing the data in order to obtain meaningful results. Many tools are available on the market for processing big data, but the main focus of this chapter is Apache Hadoop, an open-source software framework that can be efficiently deployed for processing, storing and analyzing large sets of data and for producing meaningful insights from them. If the exponential increase of data is a processing challenge, then Hadoop can be considered one of the effective solutions for processing, managing, analyzing and storing this big data. Hadoop versions and components are also illustrated in a later section of the chapter. This chapter mainly focuses on the techniques, methodologies and components adopted by the Apache Hadoop software framework for big data processing.
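MapReduce is one of Hadoop's core processing components, and its programming model can be illustrated with a single-process Python sketch of the classic word count. The abstract does not name MapReduce or show any code, so this is an assumption about what such a chapter would cover. The sketch does not use the Hadoop API, and the function names and input lines are hypothetical. On a real cluster the map and reduce steps run in parallel across machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(line: str) -> List[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: combine all values of one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence within a single process."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Made-up input lines for the sketch.
counts = word_count(["big data needs big tools", "data tools"])
```

Because each line is mapped independently and each key is reduced independently, the same logic can be spread over many machines.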


2020 ◽  
Vol 36 (10) ◽  
pp. 3263-3265 ◽  
Author(s):  
Lucas Czech ◽  
Pierre Barbera ◽  
Alexandros Stamatakis

Summary: We present genesis, a library for working with phylogenetic data, and gappa, an accompanying command-line tool for conducting typical analyses on such data. The tools target phylogenetic trees and phylogenetic placements, sequences, taxonomies and other relevant data types, offer high-level simplicity as well as low-level customizability, and are computationally efficient, well-tested and field-proven. Availability and implementation: Both genesis and gappa are written in modern C++11, and are freely available under GPLv3 at http://github.com/lczech/genesis and http://github.com/lczech/gappa. Supplementary information: Supplementary data are available at Bioinformatics online.

