Pygenprop: a Python library for programmatic exploration and comparison of organism genome properties

2019 ◽  
Vol 35 (23) ◽  
pp. 5063-5065
Author(s):  
Lee H Bergstrand ◽  
Josh D Neufeld ◽  
Andrew C Doxey

Abstract
Summary: A critical step in comparative genomics is the identification of differences in the presence/absence of encoded biochemical pathways among organisms. Our library, Pygenprop, facilitates these comparisons using data from the Genome Properties database. Pygenprop is written in Python and, unlike existing libraries, it is compatible with a variety of tools in the Python data science ecosystem, such as Jupyter Notebooks for interactive analyses and scikit-learn for machine learning. Pygenprop assigns YES, NO, or PARTIAL support for each property based on InterProScan annotations of open reading frames from an organism’s genome. The library contains classes for representing the Genome Properties database as a whole and methods for detecting differences in property assignments between organisms. As the Genome Properties database grows, we anticipate widespread adoption of Pygenprop for routine genome analyses and integration within third-party bioinformatics software.
Availability and implementation: Pygenprop is written in Python and is compatible with versions 3.6 or higher. Source code is available under Apache Licence Version 2 at https://github.com/Micromeda/pygenprop. The package can be installed from both PyPI (https://pypi.org/project/pygenprop) and Anaconda (https://anaconda.org/lbergstrand/pygenprop). Documentation is available on Read the Docs (http://pygenprop.rtfd.io/).
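The YES/NO/PARTIAL assignment and the between-organism comparison described in the abstract can be illustrated with a small, self-contained sketch. This is not Pygenprop's API: the function names, the property and signature identifiers, and the simplified rule (all steps satisfied gives YES, some gives PARTIAL, none gives NO) are assumptions made for illustration only. The real Genome Properties rules may be more detailed.

```python
from typing import Dict, List, Set


def assign_support(steps: Dict[str, Set[str]], observed: Set[str]) -> str:
    """Assign YES, PARTIAL or NO to one property (simplified, hypothetical rule).

    steps maps each step name to the signature accessions that can satisfy it;
    observed is the set of signatures annotated on the organism's ORFs.
    """
    if not steps:
        raise ValueError("a property needs at least one step")
    # A step counts as satisfied when any of its signatures was observed.
    satisfied = sum(1 for signatures in steps.values() if signatures & observed)
    if satisfied == len(steps):
        return "YES"
    return "PARTIAL" if satisfied else "NO"


def differing_properties(assignments: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the sorted property identifiers whose assignment differs between organisms."""
    property_ids: Set[str] = set()
    for results in assignments.values():
        property_ids.update(results)
    # Assumption: a property missing for an organism is treated as NO.
    return sorted(
        prop
        for prop in property_ids
        if len({results.get(prop, "NO") for results in assignments.values()}) > 1
    )


# Hypothetical properties and signature accessions, not real database entries.
PROPERTY_STEPS = {
    "PROP_A": {"step_1": {"SIG_1"}, "step_2": {"SIG_2", "SIG_3"}},
    "PROP_B": {"step_1": {"SIG_4"}},
}
ORGANISM_SIGNATURES = {
    "organism_x": {"SIG_1", "SIG_3", "SIG_4"},
    "organism_y": {"SIG_1", "SIG_4"},
}

ASSIGNMENTS = {
    organism: {
        prop: assign_support(steps, signatures)
        for prop, steps in PROPERTY_STEPS.items()
    }
    for organism, signatures in ORGANISM_SIGNATURES.items()
}

print(ASSIGNMENTS)
print(differing_properties(ASSIGNMENTS))
```

Keeping the assignments in a plain nested dictionary makes it straightforward to load them into a tabular structure for further analysis, which is in the spirit of the data science compatibility the abstract emphasises.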

2003 ◽  
Vol 12 (02) ◽  
pp. 241-273 ◽  
Author(s):  
ANA L. C. BAZZAN ◽  
ROGÉRIO DUARTE ◽  
ABNER N. PITINGA ◽  
LUCIANA F. SCHROEDER ◽  
FARLON DE A. SOUTO ◽  
...  

This work reports on the ATUCG environment (Agent-based environmenT for aUtomatiC annotation of Genomes). It consists of three layers, each having several agents in charge of performing repetitive and time-consuming tasks. Layer I aims at automating the tasks behind the process of finding ORFs (Open Reading Frames). Layer II (the core of our approach) is associated with three main tasks: extraction and formatting of data, automatic annotation of data regarding profiles or families of proteins, and generation and validation of rules to automatically annotate the Keywords field in the SWISS-PROT database. Layer III permits the user to check the correctness of the automatic annotation. This environment is being designed with the sequencing of Mycoplasma hyopneumoniae in mind. Thus, examples are presented using data from organisms of the Mycoplasmataceae family. We have concentrated development on Layer II because it is the most general layer and because it focuses on machine learning algorithms, a characteristic that is unusual in annotation systems. Results for this layer show that with learning (individual or collaborative), agents are able to generate annotation rules that achieve better results than those reported in the literature.


2021 ◽  
Vol 23 (06) ◽  
pp. 1672-1681
Author(s):  
Vinay Balamurali ◽  
Prof. Venkatesh S ◽  

Servers are required to monitor the health of the various I/O cards connected to them so that the appropriate personnel can be alerted to service these cards. The Data Collection Unit (DCU) is responsible for detecting the I/O cards, sending their inventory, and monitoring their health. Currently, the keys required to detect these I/O cards are manually coded into the source code, a task that is laborious and time-consuming. To eliminate this manual work, a Software Pluggable Module was devised that reads the I/O card-related information from the I/O component list. This software design uses data science and object-oriented programming (OOP) concepts to automate certain tasks on server systems. The proposed methodology is implemented on a Linux system. The software design is modular and extensible to accommodate future requirements. Such an automation framework can be used to track information maintained in Excel spreadsheets and to access it through an Application Programming Interface (API).
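The idea of reading detection keys from a component list instead of hard-coding them can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' module: the column names (`key`, `name`, `vendor`), the sample rows, and the function names are all hypothetical, and a CSV export stands in for the Excel spreadsheet.

```python
import csv
import io
from dataclasses import dataclass
from typing import Dict, Optional, TextIO


@dataclass(frozen=True)
class IOCard:
    """One row of the (hypothetical) I/O component list."""
    key: str
    name: str
    vendor: str


def load_component_list(handle: TextIO) -> Dict[str, IOCard]:
    """Build a key -> card lookup from a CSV component list."""
    cards: Dict[str, IOCard] = {}
    for row in csv.DictReader(handle):
        card = IOCard(
            key=row["key"].strip().lower(),  # normalise keys for matching
            name=row["name"].strip(),
            vendor=row["vendor"].strip(),
        )
        cards[card.key] = card
    return cards


def detect_card(cards: Dict[str, IOCard], key: str) -> Optional[IOCard]:
    """Return the card registered under this key, or None if it is unknown."""
    return cards.get(key.strip().lower())


# Hypothetical sample data; a real list would be exported from the spreadsheet.
SAMPLE_LIST = (
    "key,name,vendor\n"
    "0xAB12,Example Network Card,ExampleCorp\n"
    "0xCD34,Example Storage Card,SampleWorks\n"
)

CARDS = load_component_list(io.StringIO(SAMPLE_LIST))
print(detect_card(CARDS, "0xab12"))
print(detect_card(CARDS, "0xFFFF"))
```

With this arrangement, supporting a new card means adding a row to the component list rather than editing and rebuilding the source code.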


2005 ◽  
Vol 86 (10) ◽  
pp. 2661-2672 ◽  
Author(s):  
Anna M. Likos ◽  
Scott A. Sammons ◽  
Victoria A. Olson ◽  
A. Michael Frace ◽  
Yu Li ◽  
...  

Human monkeypox was first recognized outside Africa in 2003 during an outbreak in the USA that was traced to imported monkeypox virus (MPXV)-infected West African rodents. Unlike the smallpox-like disease described in the Democratic Republic of the Congo (DRC; a Congo Basin country), disease in the USA appeared milder. Here, analyses compared clinical, laboratory and epidemiological features of confirmed human monkeypox case-patients, using data from outbreaks in the USA and the Congo Basin, and the results suggested that human disease pathogenicity was associated with the viral strain. Genomic sequencing of USA, Western and Central African MPXV isolates confirmed the existence of two MPXV clades. A comparison of open reading frames between MPXV clades permitted prediction of viral proteins that could cause the observed differences in human pathogenicity between these two clades. Understanding the molecular pathogenesis and clinical and epidemiological properties of MPXV can improve monkeypox prevention and control.


Author(s):  
Shaveta Bhatia

The era of big data presents many opportunities for development in data science, biomedical research, cyber security, and cloud computing. As big data has gained popularity, it has also raised many challenges and consequences for security and privacy. Various threats and attacks, such as data leakage, unauthorized third-party access, viruses, and vulnerabilities, endanger the security of big data. This paper discusses these security threats and the methods for approaching them in the fields of biomedical research, cyber security, and cloud computing.

