Increasing the power of interpretation for soil metaproteomics data

Abstract

Background: Soil and sediment microorganisms are phylogenetically highly diverse but remain largely under-represented in public molecular databases. Their functional characterization by metaproteomics is usually performed using metagenomic sequences acquired from the same sample. However, such hugely diverse metagenomic datasets are difficult to assemble, whereas the theoretical proteomes of isolates available in generic databases are of high quality. Both factors argue for the use of theoretical proteomes in metaproteomics interpretation pipelines. Here, we examined a number of database construction strategies with a view to increasing the output of metaproteomics studies performed on soil samples.

Results: The number of peptide-spectrum matches was of comparable magnitude whether public or sample-specific metagenomics-derived databases were used. However, the number increased significantly when both types of information were combined in a two-step cascaded search. Our data also indicate that the functional annotation of the metaproteomics dataset can be maximized by combining both types of databases.

Conclusions: A two-step strategy combining a sample-specific metagenome database with public databases, such as the non-redundant NCBI database and a massive soil gene catalog, maximizes metaproteomic interpretation in terms of both the ratio of assigned spectra and the retrieval of function-derived information.
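The two-step cascaded search described in this abstract can be illustrated with a minimal Python sketch. It assumes that spectra left unassigned by the first database are the only ones passed to the second; the abstract does not state which database is searched first or how false discoveries are controlled. The names `cascaded_search` and `toy_search` are hypothetical, and `toy_search` is a toy stand-in for a real search engine, not an actual scoring method.

```python
from typing import Callable, Dict, Iterable, Optional, Set, Tuple


def cascaded_search(
    spectra: Iterable[str],
    search: Callable[[str, Set[str]], Optional[str]],
    first_db: Set[str],
    second_db: Set[str],
) -> Dict[str, Tuple[str, int]]:
    """Assign spectra in two steps; step 2 only sees spectra unassigned in step 1."""
    assignments: Dict[str, Tuple[str, int]] = {}
    unassigned = []

    # Step 1: search every spectrum against the first database.
    for spectrum in spectra:
        hit = search(spectrum, first_db)
        if hit is not None:
            assignments[spectrum] = (hit, 1)
        else:
            unassigned.append(spectrum)

    # Step 2: search only the leftover spectra against the second database.
    for spectrum in unassigned:
        hit = search(spectrum, second_db)
        if hit is not None:
            assignments[spectrum] = (hit, 2)

    return assignments


def toy_search(spectrum: str, database: Set[str]) -> Optional[str]:
    """Hypothetical matcher: a spectrum 'matches' if its label is in the database."""
    return spectrum if spectrum in database else None


# Illustrative labels only; real input would be MS/MS spectra and protein sequences.
spectra = ["PEPA", "PEPB", "PEPC"]
result = cascaded_search(spectra, toy_search, {"PEPA"}, {"PEPB"})
assigned_ratio = len(result) / len(spectra)
print(result, assigned_ratio)
```

Recording the step number alongside each hit makes it possible to report how many assignments each database contributed, which is the kind of comparison the abstract summarizes.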

What do class comments tell us? An investigation of comment evolution and practices in Pharo Smalltalk

Empirical Software Engineering ◽

10.1007/s10664-021-09981-5 ◽

2021 ◽

Vol 26 (6) ◽

Cited By ~ 1

Author(s):

Pooja Rani ◽

Sebastiano Panichella ◽

Manuel Leuenberger ◽

Mohammad Ghafari ◽

Oscar Nierstrasz

Keyword(s):

Empirical Study ◽

Programming Languages ◽

Program Comprehension ◽

Assessment Tools ◽

Support Program ◽

Writing Style ◽

High Quality ◽

Types Of Information ◽

Similar Code

Abstract

Context: Previous studies have characterized code comments in various programming languages, showing that high-quality code comments are crucial for supporting program comprehension activities and for improving the effectiveness of maintenance tasks. However, very few studies have focused on understanding how developers write comments. None of them has compared developer practices with the standard comment guidelines to study the extent to which developers follow those guidelines.

Objective: Our goal is therefore to investigate developer commenting practices and compare them with the comment guidelines.

Method: This paper reports the first empirical study investigating commenting practices in Pharo Smalltalk. First, we analyze class comment evolution over seven Pharo versions. Then, we quantitatively and qualitatively investigate the information types embedded in class comments. Finally, we study the adherence of developer commenting practices to the official class comment template over Pharo versions.

Results: Our results show a rapid increase in class comments in the initial three Pharo versions, while in subsequent versions developers added comments to both new and old classes, thus maintaining a similar code-to-comment ratio. We furthermore found three times as many information types in class comments as those suggested by the template. However, the information types suggested by the template tend to be present more often than other types of information. Additionally, we find that a substantial proportion of comments follow the writing style of the template when describing these information types, but they are written and formatted in a non-uniform way.

Conclusion: The results suggest the need to standardize the commenting guidelines for formatting the text, and to provide headers for the different information types to ensure a consistent style and make the information easy to identify. Given the importance of high-quality code comments, we draw numerous implications for developers and researchers to improve the support for comment quality assessment tools.

Functional annotation of human long noncoding RNAs using chromatin conformation data

10.1101/2021.01.13.426305 ◽

2021 ◽

Author(s):

Saumya Agrawal ◽

Tanvir Alam ◽

Masaru Koido ◽

Ivan V. Kulakovskiy ◽

Jessica Severin ◽

...

Keyword(s):

Functional Annotation ◽

Rna Binding ◽

Functional Characterization ◽

Cell Types ◽

Chromatin Interaction ◽

Spatial Proximity ◽

Chromatin Conformation ◽

Cell Type ◽

Cell Type Specific ◽

Rna Domains

Abstract Transcription of the human genome yields mostly long non-coding RNAs (lncRNAs). Systematic functional annotation of lncRNAs is challenging due to their low expression level, cell type-specific occurrence, poor sequence conservation between orthologs, and lack of information about RNA domains. Currently, 95% of human lncRNAs have no functional characterization. Using chromatin conformation and Cap Analysis of Gene Expression (CAGE) data in 18 human cell types, we systematically located genomic regions in spatial proximity to lncRNA genes and identified functional clusters of interacting protein-coding genes, lncRNAs and enhancers. Using these clusters we provide a cell type-specific functional annotation for 7,651 out of 14,198 (53.88%) lncRNAs. A lncRNA tends to have specialized roles in the cell type in which it is first expressed, and to incorporate more general functions as its expression is acquired by multiple cell types during evolution. By analyzing RNA-binding protein and RNA-chromatin interaction data in the context of the spatial genomic interaction map, we explored mechanisms by which these lncRNAs can act.

Functional characterization of mutations and their interaction using the novel functional annotation for cancer treatment (FACT) platform

Annals of Oncology ◽

10.1093/annonc/mdw380.10 ◽

2016 ◽

Vol 27 ◽

pp. vi404

Author(s):

B. Miron ◽

N. Peled ◽

Z. Barbash ◽

O. Edelheit ◽

M. Vidne ◽

...

Keyword(s):

Cancer Treatment ◽

Functional Annotation ◽

Functional Characterization ◽

The Novel

Isolation, characterization and functional annotation of the salt tolerance genes through screening the high-quality cDNA library of the halophytic green alga Dunaliella salina (Chlorophyta)

Annals of Microbiology ◽

10.1007/s13213-014-0967-z ◽

2014 ◽

Vol 65 (3) ◽

pp. 1293-1302 ◽

Cited By ~ 6

Author(s):

Junli Liu ◽

Dongxin Zhang ◽

Ling Hong

Keyword(s):

Salt Tolerance ◽

Cdna Library ◽

Green Alga ◽

Functional Annotation ◽

Dunaliella Salina ◽

High Quality

Draft Genome Sequence of Streptomyces ahygroscopicus subsp. wuyiensis CK-15, Isolated from Soil in Fujian Province, China

Genome Announcements ◽

10.1128/genomea.01125-15 ◽

2015 ◽

Vol 3 (5) ◽

Cited By ~ 3

Author(s):

Beibei Ge ◽

Yan Liu ◽

Binghua Liu ◽

Kecheng Zhang

Keyword(s):

Genome Sequence ◽

Draft Genome ◽

Soil Samples ◽

Draft Genome Sequence ◽

High Quality ◽

Fujian Province ◽

Protein Coding ◽

Coding Sequences

We report the first high-quality draft genome sequence of an antibiotic (wuyiencin)-producing strain, Streptomyces ahygroscopicus subsp. wuyiensis CK-15, isolated from soil samples collected from Fujian Province, China. The 9.41-Mb genome comprises 8,311 protein-coding sequences, encodes 89 structural RNAs, and shows a G+C content of 72.25%.

Diversity and Evolution of Clostridium beijerinckii and Complete Genome of the Type Strain DSM 791T

Processes ◽

10.3390/pr9071196 ◽

2021 ◽

Vol 9 (7) ◽

pp. 1196

Author(s):

Karel Sedlar ◽

Marketa Nykrynova ◽

Matej Bezdicek ◽

Barbora Branska ◽

Martina Lengerova ◽

...

Keyword(s):

Complete Genome ◽

Functional Annotation ◽

Type Strain ◽

Core Genome ◽

Clostridium Beijerinckii ◽

High Quality ◽

Circular Chromosome ◽

The Core ◽

Different Strains ◽

Genome Assemblies

Clostridium beijerinckii is a relatively widely studied, yet non-model, bacterium. While 246 genome assemblies of its various strains are currently available, the diversity of the whole species has not been studied; it has been analyzed only in part because the genome of the type strain was missing. Here, we sequenced and assembled the complete genome of the type strain Clostridium beijerinckii DSM 791T, composed of a circular chromosome and a circular megaplasmid, and used it for a comparison with other genomes to evaluate diversity and capture the evolution of the whole species. We found that strains WB53 and HUN142 were misidentified and did not belong to the Clostridium beijerinckii species. Additionally, we filtered out possibly misassembled genomes and used the remaining 237 high-quality genomes to define the pangenome of the whole species. Through its functional annotation, we showed that the core genome contains genes responsible for basic metabolism, while the accessory genome contains genes affecting the final phenotype, which may vary among strains. We used the core genome to reconstruct the phylogeny of the species and showed its great diversity, which complicates the identification of particular strains yet may conceal hitherto unreported phenotypic features and processes usable in biotechnology.
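The core/accessory split mentioned in this abstract can be sketched in a few lines of Python. This is a minimal illustration that assumes the strictest definition, where a gene family is "core" only if it occurs in every genome; the abstract does not state the threshold the authors used, and some pangenome tools relax it to a high fraction of genomes. The function name `split_pangenome`, the strain names, and the gene labels are all hypothetical.

```python
from typing import Dict, Set, Tuple


def split_pangenome(gene_sets: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, accessory) gene families for a mapping of strain -> gene families.

    Assumption: core = present in every strain; accessory = everything else
    in the pangenome (the union of all gene families).
    """
    if not gene_sets:
        raise ValueError("at least one genome is required")

    all_sets = list(gene_sets.values())
    pangenome = set().union(*all_sets)     # every gene family seen in any strain
    core = set.intersection(*all_sets)     # gene families shared by all strains
    accessory = pangenome - core
    return core, accessory


# Hypothetical strains and gene-family labels, for illustration only.
genomes = {
    "strain_a": {"geneA", "geneB", "geneC"},
    "strain_b": {"geneA", "geneB", "geneD"},
    "strain_c": {"geneA", "geneB"},
}
core, accessory = split_pangenome(genomes)
print(sorted(core), sorted(accessory))
```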

42 An Improved, High-quality Ovine Reference Genome Assembly

Journal of Animal Science ◽

10.1093/jas/skab235.039 ◽

2021 ◽

Vol 99 (Supplement_3) ◽

pp. 23-24

Author(s):

Kimberly M Davenport ◽

Derek M Bickhart ◽

Kim Worley ◽

Shwetha C Murali ◽

Noelle Cockett ◽

...

Keyword(s):

Genome Assembly ◽

Functional Annotation ◽

Reference Genome ◽

De Novo ◽

The United States ◽

Read Length ◽

Chromosome 11 ◽

High Quality ◽

Oxford Nanopore ◽

Long Read

Abstract Sheep are an important agricultural species used for both food and fiber in the United States and globally. A high-quality reference genome enhances the ability to discover genetic and biological mechanisms influencing important traits, such as meat and wool quality. The rapid advances in genome assembly algorithms and the emergence of increasingly long sequence read lengths provide the opportunity for an improved de novo assembly of the sheep reference genome. Tissue was collected postmortem from an adult Rambouillet ewe selected by USDA-ARS for the Ovine Functional Annotation of Animal Genomes project. Short-read (55x coverage), long-read PacBio (75x coverage), and Hi-C data from this ewe were retrieved from public databases. We generated an additional 50x coverage of Oxford Nanopore data and assembled the combined long-read data with canu v1.9. The assembled contigs were polished with Nanopolish v0.12.5 and scaffolded using Hi-C data with Salsa v2.2. Gaps were filled with PBsuite v15.8.24 and polished with Nanopolish v0.12.5, followed by removal of duplicate contigs with PurgeDups v1.0.1. Chromosomes were oriented by identifying centromeres and telomeres with RepeatMasker v4.1.1, indicating a need to reverse the orientation of chromosome 11 relative to Oar_rambouillet_v1.0. Final polishing was performed with two rounds of a pipeline consisting of freebayes v1.3.1 to call variants, Merfin to validate them, and BCFtools to generate the consensus fasta. The ARS-UI_Ramb_v2.0 assembly has improved continuity (contig N50 of 43.19 Mb), with a 19-fold and 38-fold decrease in the number of scaffolds compared with Oar_rambouillet_v1.0 and Oar_v4.0, respectively. ARS-UI_Ramb_v2.0 has greater per-base accuracy and fewer insertions and deletions identified from mapped RNA sequence than previous assemblies. This significantly improved reference assembly, publicly available at NCBI GenBank under accession number GCA_016772045, will optimize the functional annotation of the sheep genome and facilitate improved mapping accuracy of genetic variant and expression data for traits relevant to the sheep industry.
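The contig N50 quoted in this abstract is a standard continuity metric: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly length. The Python sketch below shows that calculation on made-up lengths; the function name `n50` and the example values are illustrative assumptions and are not taken from the assembly itself.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contig lengths given")

    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length

    # Unreachable for non-empty input with positive lengths.
    raise ValueError("contig lengths must be positive")


# Made-up contig lengths, for illustration only.
example_lengths = [2, 3, 4, 5, 6]
print(n50(example_lengths))
```

A larger N50 for the same total length means the assembly is broken into fewer, longer pieces, which is why it is reported alongside the reduction in scaffold count.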

Effect of cementation on the compressibility of Singapore upper marine clay

Canadian Geotechnical Journal ◽

10.1139/t08-030 ◽

2008 ◽

Vol 45 (7) ◽

pp. 1018-1024 ◽

Cited By ~ 2

Author(s):

Han-Eng Low ◽

Kok-Kwang Phoon

Keyword(s):

Ethylene Diamine Tetraacetic Acid ◽

Amorphous Materials ◽

Soil Samples ◽

Ethylene Diamine ◽

Clay Layer ◽

Tetraacetic Acid ◽

Marine Clay ◽

High Quality ◽

One Dimensional ◽

Soil Microstructure

A series of one-dimensional consolidation tests was performed under varying pretreatments on high-quality soil samples collected from a Singapore upper marine clay layer to evaluate the effect of cementation by amorphous materials on its compressibility. The findings suggest that cementation by amorphous materials removable with ethylene-diamine tetraacetic acid (EDTA) may contribute only partially to the development of soil microstructure and overconsolidation in Singapore upper marine clay.

Ten steps to get started in Genome Assembly and Annotation

F1000Research ◽

10.12688/f1000research.13598.1 ◽

2018 ◽

Vol 7 ◽

pp. 148 ◽

Cited By ~ 32

Author(s):

Victoria Dominguez Del Angel ◽

Erik Hjerde ◽

Lieven Sterck ◽

Salvadors Capella-Gutierrez ◽

Cederic Notredame ◽

...

Keyword(s):

Transposable Elements ◽

Genome Assembly ◽

Genome Annotation ◽

Functional Annotation ◽

General Assembly ◽

Intrinsic Properties ◽

High Quality ◽

Sequencing Technologies ◽

Annotation Project ◽

Over Time

As part of the ELIXIR-EXCELERATE efforts in capacity building, we present here 10 steps to help researchers get started in genome assembly and genome annotation. The guidelines given are broadly applicable, intended to be stable over time, and cover all aspects from start to finish of a general assembly and annotation project. Intrinsic properties of genomes are discussed, as is the importance of using high-quality DNA. Different sequencing technologies and generally applicable workflows for genome assembly are also detailed. We cover structural and functional annotation and encourage readers to also annotate transposable elements, something that is often omitted from annotation workflows. The importance of data management is stressed, and we give advice on where to submit data and how to make your results Findable, Accessible, Interoperable, and Reusable (FAIR).

Research on Teaching Reform of Electronic Information Major Driven by Professional Skill Competition

Advances in Higher Education ◽

10.18686/ahe.v4i9.2641 ◽

2020 ◽

Vol 4 (9) ◽

Author(s):

Kun Dang ◽

Xiaolong Jiang

Keyword(s):

High Efficiency ◽

Electronic Systems ◽

Professional Skill ◽

Electronic Information ◽

High Quality ◽

Teaching Activities ◽

Teaching Reform ◽

Research On Teaching ◽

Set Up ◽

Types Of Information

In the context of the current rapid innovation in electronic information technology, many schools have set up majors related to electronic information in order to cultivate the high-quality talent that employers require. Electronic information courses require students to build on what they learn so that they can quickly master the integration and processing of electronic systems and various types of information; with stronger professional skills, they can also take part in vocational skill competitions and achieve better results. This article therefore discusses how the electronic information major can innovate against the background of the rapid growth of vocational skill competitions, in the hope that, while teaching activities continue to run efficiently, students are encouraged to take part in competitions and develop further.
