genome data
Recently Published Documents



2022 ◽  
Author(s):  
Toshihisa Ohshima ◽  
Taketo Ohmori ◽  
Masaki Tanaka

Abstract L-Arginine dehydrogenase (L-ArgDH, EC 1.4.1.25) is an amino acid dehydrogenase that catalyzes the reversible oxidative deamination of L-arginine to its oxo analog in the presence of NADP. Although the enzyme activity has been detected in cell extracts of Pseudomonas aeruginosa, the enzyme has not been purified or characterized to date. Here we found a gene homolog of L-ArgDH in the genome data of Pseudomonas veronii and succeeded in expressing the P. veronii JCM11942 gene in E. coli. The gene product exhibited strong NADP-dependent L-ArgDH activity. The crude enzyme was unstable under neutral pH conditions but was markedly stabilized by the addition of 10% glycerol. The enzyme was purified to homogeneity in a single Ni-chelate affinity chromatography step and was a homodimeric protein with a molecular mass of about 65 kDa. The enzyme selectively catalyzed L-arginine oxidation in the presence of NADP, with maximal activity at pH 9.5. The apparent Km values for L-arginine and NADP were 2.5 and 0.21 mM, respectively. The nucleotide sequence of the enzyme gene was determined, and the amino acid sequence was deduced from it. As an application of the enzyme, a simple colorimetric microassay for L-arginine was developed.
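To make the reported apparent Km values concrete, the following minimal sketch evaluates the fraction of maximal velocity at a few substrate concentrations. It assumes simple Michaelis-Menten kinetics for each substrate with the other at saturation, which the abstract does not state; the function name and the example concentrations are hypothetical, and only the two Km values come from the abstract.

```python
def fractional_rate(substrate_mM: float, km_mM: float) -> float:
    """Return v/Vmax from the Michaelis-Menten equation v = Vmax*[S]/(Km + [S]).

    Assumption: single-substrate Michaelis-Menten behaviour, which the
    abstract does not confirm for this enzyme.
    """
    if substrate_mM < 0 or km_mM <= 0:
        raise ValueError("concentration must be >= 0 and Km must be > 0")
    return substrate_mM / (km_mM + substrate_mM)


# Apparent Km values reported in the abstract (mM).
KM_L_ARGININE_MM = 2.5
KM_NADP_MM = 0.21

if __name__ == "__main__":
    # Hypothetical L-arginine concentrations, chosen only for illustration.
    for conc in (0.25, 2.5, 25.0):
        frac = fractional_rate(conc, KM_L_ARGININE_MM)
        print(f"[L-arginine] = {conc} mM -> v/Vmax = {frac:.2f}")
```

By definition, the rate is half-maximal when the substrate concentration equals Km, so under this assumption the enzyme would run at half of Vmax at 2.5 mM L-arginine.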


Life ◽  
2022 ◽  
Vol 12 (1) ◽  
pp. 61
Author(s):  
Ruitao Yu ◽  
Leining Feng ◽  
Christopher H. Dietrich ◽  
Xiangqun Yuan

To explore the phylogenetic relationships of the subfamily Centrotinae using mitochondrial genome data, four complete mitogenomes (Anchon lineatus, Anchon yunnanensis, Gargara genistae and Tricentrus longivalvulatus) were sequenced and analyzed. All the newly sequenced mitogenomes contain 37 genes. Among the 13 protein-coding genes (PCGs) of the Centrotinae mitogenomes, a sliding window analysis and the Ka/Ks ratio suggest that atp8 is a relatively fast-evolving gene, while cox1 is the slowest. All PCGs start with ATN, except for nad5 (which starts with TTG), and stop with TAA or the incomplete stop codon T, except for nad2 and cytb (which terminate with TAG). All tRNAs can fold into the typical cloverleaf secondary structure, except for trnS1, which lacks the dihydrouridine (DHU) arm. The BI and ML phylogenetic analyses of concatenated alignments of the 13 mitochondrial PCGs among the major lineages produce a well-resolved framework. Phylogenetic analyses show that Membracoidea, Smiliinae and Centrotinae, together with the tribes Centrotypini and Leptobelini, are recovered as well-supported monophyletic groups. The tribe Gargarini (sensu Wallace et al.) and its monophyly are supported.


Author(s):  
Douglas Sachito ◽  
Luciana de Oliveira

Terpenes are the most abundant class of natural products. They have a myriad of industrial applications, including pharmaceuticals, perfumery and flavors, bulk chemicals, and fuel. Intriguingly, to date the vast majority of characterized terpenoids have been isolated from plants and fungi, and only in recent years have bacteria been found to be a representative reservoir of terpenoid molecules. Mining Streptomyces sp. CBMAI 2042 genome data revealed the presence of five terpene cyclase genes. Chemical analysis of the mycelium extract of this bacterial strain revealed at least 28 volatile terpene molecules, for whose biosynthesis three genes encoding sesquiterpene cyclases (STCs) are apparently responsible. The cyclic products obtained by incubating these three purified recombinant STCs with farnesyl diphosphate (FPP) were analyzed by gas chromatography-mass spectrometry (GC-MS) and identified using the Van den Dool and Kratz equation.
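The Van den Dool and Kratz equation gives a linear retention index for temperature-programmed GC by interpolating an analyte's retention time between the two n-alkanes that bracket it: I = 100 × [n + (t_x − t_n) / (t_{n+1} − t_n)]. The sketch below is a generic illustration of that formula, not the authors' workflow; the function name and all retention times are hypothetical.

```python
from bisect import bisect_right


def linear_retention_index(t_x: float, alkane_times: dict[int, float]) -> float:
    """Van den Dool and Kratz linear retention index.

    alkane_times maps n-alkane carbon number -> retention time (same units
    as t_x). The analyte must elute within the range of the alkane ladder.
    """
    carbons = sorted(alkane_times)
    times = [alkane_times[c] for c in carbons]
    if len(times) < 2 or any(a >= b for a, b in zip(times, times[1:])):
        raise ValueError("need >= 2 alkanes with strictly increasing times")
    if not times[0] <= t_x <= times[-1]:
        raise ValueError("analyte elutes outside the alkane ladder")
    # Index of the alkane eluting at or just before the analyte.
    i = min(bisect_right(times, t_x) - 1, len(times) - 2)
    n, n_next = carbons[i], carbons[i + 1]
    # (n_next - n) is 1 for consecutive alkanes, giving the textbook form.
    return 100.0 * (n + (n_next - n) * (t_x - times[i]) / (times[i + 1] - times[i]))


if __name__ == "__main__":
    # Hypothetical alkane ladder (C14 to C16) and analyte retention time, in minutes.
    ladder = {14: 20.0, 15: 24.0, 16: 28.0}
    print(linear_retention_index(22.0, ladder))
```

In practice the computed index is compared against literature or library values for the same stationary phase, which is presumably how the cyclization products were assigned here.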


2021 ◽  
Author(s):  
Tianqing Zheng ◽  
Yinghui Li ◽  
Yanfei Li ◽  
Shengrui Zhang ◽  
Chunchao Wang ◽  
...  

In the Chinese National Soybean GeneBank (CNSGB), we have collected more than 30,000 soybean accessions. However, data sharing for soybean remains an especially sensitive issue, and how to share genome variation data within the applicable rules has long troubled soybean germplasm workers. Here we release a big-data resource named the Soybean Functional Genomics & Breeding database (SoyFGB v2.0) (https://sfgb.rmbreeding.cn/), which embeds a core collection of 2,214 resequenced soybean genomes (2K-SG) from the CNSGB germplasm. This resource offers an example of three major functions for mining multiple genome data that are of general interest to plant researchers. 1) Online analysis tools are provided by the Analysis module for haplotype mining in high-throughput genotyped germplasm with different methods. 2) Variations for 2K-SG are provided by the Browse module, which contains two functions, SNP and InDel. Together with the Gene (SNP & InDel) function embedded in the Search module, these make the genotypic information of 2K-SG for a target gene or region accessible. 3) Scaled phenotype data for 42 traits, including 9 quality and 33 quantitative traits, are provided by SoyFGB v2.0. With the scaled-phenotype data search and seed request tools under a control list, germplasm information can be shared without directly downloading unpublished phenotypic data or information on sensitive germplasm. In short, the mode of data mining and sharing underlying SoyFGB v2.0 may inspire more ideas for work on genome resources, not only of soybean but also of other plants.


2021 ◽  
Author(s):  
Qing-Miao Yuan ◽  
Xu Luo ◽  
Jing Cao ◽  
Yu-Bao Duan

Abstract Background: Nuthatches (genus Sitta) comprise a group of Passeriformes. With the publication of more mitochondrial genome data, there has been considerable focus on the taxonomic status of the nuthatches. To understand the phylogenetic position of Sitta and the phylogenetic relationships within this genus, we sequenced and analyzed the complete mitochondrial genomes of three species, S. himalayensis, S. nagaensis and S. yunnanensis, making this the first account of complete mitochondrial genomes (mitogenomes) for this genus. Results: The mitochondrial genomes of the three Sitta species are 16,822-16,830 bp in length and consist of 37 genes and a control region. This study recovered the same gene arrangement found in the mitogenome of Gallus gallus, which is considered the typical ancestral avian gene order. All tRNAs were predicted to form the typical cloverleaf secondary structure. Bayesian inference and maximum likelihood phylogenetic analyses of sequences from 18 species yielded a well-supported topology. The family Sittidae is the sister group of Troglodytidae, and the genus Sitta can be divided into 3 major clades. We demonstrated the phylogenetic relationships within the genus Sitta (S. carolinensis + (S. villosa + S. yunnanensis + (S. himalayensis + (S. europaea + S. nagaensis)))).


2021 ◽  
Vol 9 ◽  
Author(s):  
Elakkiya R. ◽  
Deepak Kumar Jain ◽  
Ketan Kotecha ◽  
Sharnil Pandya ◽  
Sai Siddhartha Reddy ◽  
...  

Over the last decade, the field of bioinformatics has grown rapidly. Robust bioinformatics tools will play a vital role in future progress. Scientists working in bioinformatics conduct a large amount of research to extract knowledge from the available biological data. Several bioinformatics problems have arisen from the creation of massive amounts of imbalanced data. The classification of precursor microRNA (pre-miRNA) from imbalanced RNA genome data is one such problem. Studies have shown that pre-miRNAs can act as oncogenes or tumor suppressors in various cancer types. This paper introduces a Hybrid Deep Neural Network framework (H-DNN) for the classification of pre-miRNA in imbalanced data. The proposed H-DNN framework integrates Deep Artificial Neural Networks (Deep ANN) and Deep Decision Tree Classifiers. The Deep ANN in the proposed H-DNN extracts meaningful features, and the Deep Decision Tree Classifier classifies the pre-miRNA accurately. H-DNN was evaluated on genomes of animals, plants, humans, and Arabidopsis with imbalance ratios up to 1:5000, and on virus genomes with a ratio of 1:400. Experimental results showed an accuracy of more than 99% in all cases, and the time complexity of the proposed H-DNN is also much lower than that of existing approaches.


2021 ◽  
Author(s):  
Yana Hrytsenko ◽  
Noah M. Daniels ◽  
Rachel S. Schwartz

Abstract Background: Phylogenies enrich our understanding of how genes, genomes, and species evolve. Traditionally, alignment-based methods are used to construct phylogenies from genetic sequence data; however, this process can be time-consuming when analyzing the large amounts of genomic data available today. Additionally, these analyses face challenges due to differences in genome structure, synteny, and the need to identify similarities in the face of repeated substitutions, which result in loss of the phylogenetic information contained in the sequence. Alignment-free (AF) approaches using k-mers (short subsequences) can be an efficient alternative due to their indifference to positional rearrangements in a sequence. However, these approaches may be sensitive to k-mer length and the distance between samples. Results: In this paper, we analyzed the sensitivity of an AF approach based on k-mer frequencies to these challenges using cosine and Euclidean distance metrics for both assembled genomes and unassembled sequencing reads. Quantification of the sensitivity of this AF approach for phylogeny reconstruction to branch length and k-mer length provides a better understanding of the necessary parameter ranges for accurate phylogeny reconstruction. Our results show that a frequency-based AF approach can result in accurate phylogeny reconstruction when using whole genomes, but not stochastically sequenced reads, so long as longer k-mers are used. Conclusions: In this study, we have shown that an AF approach for phylogeny reconstruction is robust in analyzing assembled genome data for a range of numbers of substitutions using longer k-mers. Using simulated reads randomly selected from the genome by the Illumina sequencer had a detrimental effect on phylogeny estimation. Additionally, filtering out infrequent k-mers improved the computational efficiency of the method while preserving the accuracy of the results, suggesting the feasibility of using only a subset of data to improve computational efficiency when large sets of genome-scale data are analyzed.
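The core of a frequency-based alignment-free comparison can be sketched in a few lines: count k-mers in each sequence, normalize the counts to frequencies, and compare the resulting vectors with cosine or Euclidean distance. This is a minimal illustration of the general technique, not the authors' implementation; the function names and toy sequences are hypothetical, and skipping k-mers that contain non-ACGT characters is an assumption made here.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(seq: str, k: int) -> dict[str, float]:
    """Relative frequency of each k-mer in seq (overlapping windows)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Assumption: ignore k-mers containing ambiguous bases such as N.
    counts = {kmer: c for kmer, c in counts.items() if set(kmer) <= set("ACGT")}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence has no valid k-mers of this length")
    return {kmer: c / total for kmer, c in counts.items()}


def euclidean_distance(p: dict[str, float], q: dict[str, float]) -> float:
    """Euclidean distance between two sparse frequency vectors."""
    keys = set(p) | set(q)
    return sqrt(sum((p.get(kmer, 0.0) - q.get(kmer, 0.0)) ** 2 for kmer in keys))


def cosine_distance(p: dict[str, float], q: dict[str, float]) -> float:
    """One minus the cosine similarity of two sparse frequency vectors."""
    dot = sum(p[kmer] * q[kmer] for kmer in p.keys() & q.keys())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    return 1.0 - dot / (norm_p * norm_q)


if __name__ == "__main__":
    # Toy sequences for illustration only.
    a = kmer_frequencies("ACGTACGTACGA", k=3)
    b = kmer_frequencies("ACGTTCGTACGA", k=3)
    print(cosine_distance(a, b), euclidean_distance(a, b))
```

A pairwise distance matrix built this way can be passed to a distance-based tree method such as neighbor joining. The choice of k matters: short k-mers saturate quickly, which is consistent with the abstract's finding that longer k-mers are needed for accurate reconstruction.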


2021 ◽  
Vol 12 ◽  
Author(s):  
Gabriel J. Odom ◽  
Antonio Colaprico ◽  
Tiago C. Silva ◽  
X. Steven Chen ◽  
Lily Wang

Recent advances in technology have made multi-omics datasets increasingly available to researchers. To leverage the wealth of information in multi-omics data, a number of integrative analysis strategies have been proposed recently. However, effectively extracting biological insights from these large, complex datasets remains challenging. In particular, multi-omics analysis tools often require matched samples, with multiple types of omics data measured on each sample, which can significantly reduce the sample size. Another challenge is that analysis techniques such as dimension reduction, which extract association signals in high-dimensional datasets by estimating a few variables that explain most of the variation in the samples, are typically applied to whole-genome data, which can be computationally demanding. Here we present pathwayMultiomics, a pathway-based approach for integrative analysis of multi-omics data with categorical, continuous, or survival outcome variables. The input of pathwayMultiomics is pathway p-values for individual omics data types, which are then integrated using a novel statistic, the MiniMax statistic, to prioritize pathways dysregulated in multiple types of omics datasets. Importantly, pathwayMultiomics is computationally efficient and does not require matched samples in multi-omics data. We performed a comprehensive simulation study to show that pathwayMultiomics significantly outperformed currently available multi-omics tools with improved power and well-controlled false-positive rates. In addition, we also analyzed real multi-omics datasets to show that pathwayMultiomics was able to recover known biology by nominating biologically meaningful pathways in complex diseases such as Alzheimer’s disease.
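The abstract does not define the MiniMax statistic. One plausible reading, assumed here and not confirmed by the text, is the minimum over all pairs of omics types of the larger p-value in each pair, so that a pathway scores well only when at least two data types support it. The sketch below illustrates that reading; the function name and p-values are hypothetical, and this is not the pathwayMultiomics package API.

```python
from itertools import combinations


def minimax_statistic(p_values: list[float]) -> float:
    """Minimum over all pairs of the pairwise maximum p-value.

    Assumption: this definition is an interpretation of the name
    "MiniMax"; the abstract itself gives no formula.
    """
    if len(p_values) < 2:
        raise ValueError("need p-values from at least two omics types")
    if any(not 0.0 <= p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in [0, 1]")
    return min(max(a, b) for a, b in combinations(p_values, 2))


if __name__ == "__main__":
    # Hypothetical pathway p-values from three omics data types.
    print(minimax_statistic([0.001, 0.04, 0.6]))
```

Under this reading the statistic equals the second-smallest p-value, so a single very small p-value from one data type is not enough to prioritize a pathway. Because it needs only per-type pathway p-values, it would not require matched samples, in line with the abstract's claim.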


2021 ◽  
Author(s):  
Matej Lexa ◽  
Monika Cechova ◽  
Son Hoang Nguyen ◽  
Pavel Jedlicka ◽  
Viktor Tokan ◽  
...  

The role of repetitive DNA in the 3D organization of the interphase nucleus in plant cells is a subject of intensive study. High-throughput chromosome conformation capture (Hi-C) is a sequencing-based method that detects the proximity of DNA segments in nuclei. We combined Hi-C data, plant reference genome data and tools for the characterization of genomic repeats to build a Nextflow pipeline that identifies and quantifies the contacts of specific repeats, revealing preferential homotypic interactions of ribosomal DNA, DNA transposons and some LTR retrotransposon families. We provide a novel way to analyze the organization of repetitive elements in the 3D nucleus.

