pfam database
Recently Published Documents


TOTAL DOCUMENTS

15
(FIVE YEARS 3)

H-INDEX

5
(FIVE YEARS 1)

2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Aya Satoh ◽  
Miwako Takasu ◽  
Kentaro Yano ◽  
Yohey Terai

Abstract. Objectives: The mangrove cricket, Apteronemobius asahinai, shows endogenous activity rhythms that synchronize with the tidal cycle (i.e., a free-running rhythm with a period of ~12.4 h [the circatidal rhythm]). Little is known about the molecular mechanisms underlying the circatidal rhythm. We present the draft genome of the mangrove cricket to facilitate future studies of the molecular mechanisms behind this rhythm. Data description: The draft genome contains 151,060 scaffolds with a total length of 1.68 Gb (N50: 27 kb) and 92% BUSCO completeness. We obtained 28,831 predicted genes, of which 19,896 (69%) were successfully annotated using at least one of two databases (the UniProtKB/SwissProt database and the Pfam database).
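The assembly statistics quoted above can be illustrated with a minimal Python sketch. It shows how an N50 value is conventionally computed from scaffold lengths and re-derives the annotated fraction from the two gene counts in the abstract. The scaffold lengths are toy values invented for illustration, not data from the mangrove cricket assembly, and the function name `n50` is a hypothetical helper.

```python
def n50(lengths):
    """Return the N50: the length L such that scaffolds of length >= L
    together cover at least half of the total assembly length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy scaffold lengths (hypothetical, for illustration only).
toy_scaffolds = [2, 3, 4, 5, 6]
print(n50(toy_scaffolds))  # 5, because 6 + 5 = 11 covers half of the total of 20

# Gene counts taken from the abstract above.
annotated, predicted = 19_896, 28_831
print(f"{100 * annotated / predicted:.0f}% annotated")  # 69% annotated
```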


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Briallen Lobb ◽  
Benjamin Jean-Marie Tremblay ◽  
Gabriel Moreno-Hagelsieb ◽  
Andrew C. Doxey

Abstract. Background: A substantial fraction of genes identified within bacterial genomes encode proteins of unknown function. Identifying which of these proteins represent potential virulence factors, and mapping their key virulence determinants, is a challenging but important goal. Results: To facilitate virulence factor discovery, we performed a comprehensive analysis of 17,929 protein domain families within the Pfam database, and scored them based on their overrepresentation in pathogenic versus non-pathogenic species, taxonomic distribution, relative abundance in metagenomic datasets, and other factors. Conclusions: We identify pathogen-associated domain families, candidate virulence factors in the human gut, and eukaryotic-like mimicry domains with likely roles in virulence. Furthermore, we provide an interactive database called PathFams to allow users to explore pathogen-associated domains as well as identify pathogen-associated domains and domain architectures in user-uploaded sequences of interest. PathFams is freely available at https://pathfams.uwaterloo.ca.
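One simple way to express "overrepresentation in pathogenic versus non-pathogenic species" is a log-ratio of the fraction of genomes carrying a domain in each group. The sketch below is only an illustration of that idea, not the scoring scheme used by PathFams, which the abstract does not specify. The function name, the pseudocount, and the example counts are all assumptions.

```python
import math


def enrichment_score(path_with, path_total, nonpath_with, nonpath_total,
                     pseudocount=0.5):
    """Log2 ratio of the fraction of pathogen genomes carrying a domain
    to the fraction of non-pathogen genomes carrying it.

    The pseudocount (an assumption of this sketch) avoids division by
    zero when a domain is absent from one group.
    """
    path_frac = (path_with + pseudocount) / (path_total + 2 * pseudocount)
    nonpath_frac = (nonpath_with + pseudocount) / (nonpath_total + 2 * pseudocount)
    return math.log2(path_frac / nonpath_frac)


# Hypothetical counts: a domain found in 40 of 100 pathogen genomes
# and in 5 of 100 non-pathogen genomes gives a positive score.
print(enrichment_score(40, 100, 5, 100))
```

A positive score indicates a domain seen more often in pathogens, a negative score the opposite, and equal fractions give zero.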


2020 ◽  
Vol 49 (D1) ◽  
pp. D412-D419 ◽  
Author(s):  
Jaina Mistry ◽  
Sara Chuguransky ◽  
Lowri Williams ◽  
Matloob Qureshi ◽  
Gustavo A Salazar ◽  
...  

Abstract The Pfam database is a widely used resource for classifying protein sequences into families and domains. Since Pfam was last described in this journal, over 350 new families have been added in Pfam 33.1 and numerous improvements have been made to existing entries. To facilitate research on COVID-19, we have revised the Pfam entries that cover the SARS-CoV-2 proteome, and built new entries for regions that were not covered by Pfam. We have reintroduced Pfam-B which provides an automatically generated supplement to Pfam and contains 136 730 novel clusters of sequences that are not yet matched by a Pfam family. The new Pfam-B is based on a clustering by the MMseqs2 software. We have compared all of the regions in the RepeatsDB to those in Pfam and have started to use the results to build and refine Pfam repeat families. Pfam is freely available for browsing and download at http://pfam.xfam.org/.


2018 ◽  
Author(s):  
Ahmed F. Roumia ◽  
Margarita C. Theodoropoulou ◽  
Konstantinos D. Tsirigos ◽  
Pantelis G. Bagos

Transmembrane β-barrel proteins perform multiple cellular functions such as passive transport of ions and allowing the flux of molecules. They also act as enzymes, transporters, receptors and virulence factors. Even though several families of eukaryotic β-barrel outer membrane proteins (OMPs) have been discovered in the last few years, the computational characterization of these families is far from complete. The Pfam database includes only very few characteristic profiles for these families and, in most cases, the profile Hidden Markov Models (pHMMs) were trained using both prokaryotic and eukaryotic proteins. Here, we present, for the first time, a comprehensive computational analysis of eukaryotic transmembrane β-barrels. Ten characteristic pHMMs were built that can discriminate eukaryotic β-barrels from other classes of β-barrel proteins (globular and bacterial) and are also capable of discriminating between mitochondrial and chloroplastic ones. Specifically, we built six new pHMMs for the chloroplastic β-barrel families not included in the Pfam database, updated the profile for the MDM10 family (PF12519), and divided the porin family (PF01459) into two separate families, VDAC and TOM40. We hope that all the pHMMs presented here will be used for the detection and characterization of eukaryotic OMPs in newly discovered proteomes.


2016 ◽  
pp. 53-58
Author(s):  
SM Sabbir Alam ◽  
M Ruhul Amin ◽  
M Anwar Hossain

Domains of unknown function (DUFs) are a large set of protein families within the Pfam database whose member proteins have no known function. In the absence of functional information, proteins are classified into families based on conserved amino acid sequences, and these families are potentially functionally important. In the Pfam database, the number of DUF families is increasing rapidly, and DUF families currently make up about 22% of all protein families. In this study we targeted proteins of the DUF2726 family, which are mainly present in bacterial species of the Gamma-proteobacteria and have a particular domain organization. We analyzed the protein sequences of the DUF2726 domain using different computational tools and databases. We found that this domain contains a nuclear localization signal peptide, which is conserved in Escherichia spp. and Shigella spp. It was also predicted to have nucleic acid binding properties. Analysis of protein-protein interactions revealed functional partners associated with DUF2726. Protein secondary structure and transmembrane helices were also predicted. We found that it shows gene neighbourhood and co-occurrence with the proteins RepA and RepB, which are functionally associated with replication: RepA is a replication protein and RepB is a replication regulatory protein. The presence of nucleic acid binding properties, a nuclear localization signal (NLS) peptide, and a possible interaction pattern with replication proteins suggests a possible role as an NLS-like signalling peptide.
Bangladesh J Microbiol, Volume 31, Number 1-2, June-Dec 2014, pp 53-58


Author(s):  
Yun-Rong Gao ◽  
Na Feng ◽  
Tao Chen ◽  
De-Feng Li ◽  
Li-Jun Bi

Rv0880 from the pathogen Mycobacterium tuberculosis is classified as a MarR family protein in the Pfam database. It consists of 143 amino acids and has an isoelectric point of 10.9. Crystals of Rv0880 belonged to space group P1, with unit-cell parameters a = 54.97, b = 69.60, c = 70.32 Å, α = 103.71, β = 111.06, γ = 105.83°. The structure of the MarR family transcription regulator Rv0880 was solved at a resolution of 2.0 Å with an Rcryst and Rfree of 21.2 and 24.9%, respectively. The dimeric structure resembles that of other MarR proteins, with each subunit comprising a winged helix–turn–helix domain connected to an α-helical dimerization domain.
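As a worked illustration of the unit-cell parameters reported above, the volume of a general (triclinic) cell follows from the standard formula V = abc·sqrt(1 − cos²α − cos²β − cos²γ + 2·cosα·cosβ·cosγ). The Python sketch below applies it to the reported values. The function name is a hypothetical helper, and the calculation is not part of the original study.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume of a general (triclinic) unit cell.

    Edge lengths a, b, c are in angstroms, angles are in degrees,
    and the result is in cubic angstroms.
    """
    ca = math.cos(math.radians(alpha_deg))
    cb = math.cos(math.radians(beta_deg))
    cg = math.cos(math.radians(gamma_deg))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


# Unit-cell parameters reported for the Rv0880 crystals (space group P1).
volume = unit_cell_volume(54.97, 69.60, 70.32, 103.71, 111.06, 105.83)
print(f"Unit-cell volume: {volume:.0f} cubic angstroms")
```

For a cell with all angles equal to 90° the square-root term is 1 and the formula reduces to the familiar product abc.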


2015 ◽  
Vol 2015 ◽  
pp. 1-12 ◽  
Author(s):  
José Esteban Muñoz-Medina ◽  
Carlos Javier Sánchez-Vallejo ◽  
Alfonso Méndez-Tenorio ◽  
Irma Eloísa Monroy-Muñoz ◽  
Javier Angeles-Martínez ◽  
...  

The unpredictable, evolutionary nature of the influenza A virus (IAV) is the primary problem when generating a vaccine and when designing diagnostic strategies; thus, it is necessary to determine the constant regions in viral proteins. In this study, we completed an in silico analysis of the reported epitopes of the 4 IAV proteins that are antigenically most significant (HA, NA, NP, and M2) in the 3 strains with the greatest world circulation in the last century (H1N1, H2N2, and H3N2) and in one of the main avian subtypes responsible for zoonosis (H5N1). For this purpose, the HMMER program was used to align 3,016 epitopes reported in the Immune Epitope Database and Analysis Resource (IEDB) and distributed in 34,294 stored sequences in the Pfam database. Eighteen epitopes were identified: 8 in HA, 5 in NA, 3 in NP, and 2 in M2. These epitopes have remained constant since they were first identified (~91 years) and are present in strains that have circulated on 5 continents. These sites could be targets for vaccination design strategies based on epitopes and/or as markers in the implementation of diagnostic techniques.
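The idea of an epitope that "remains constant" across sequences can be sketched in Python as the fraction of sequences containing it. This is a deliberately simplified stand-in: it uses exact substring matching, whereas the study aligned epitopes with HMMER, and the epitope and sequences below are made-up toy strings rather than IEDB or Pfam data. The function name is hypothetical.

```python
def epitope_conservation(epitope, sequences):
    """Fraction of sequences that contain the epitope as an exact substring.

    Exact matching is an assumption of this sketch; profile-based tools
    such as HMMER also tolerate substitutions and gaps.
    """
    if not sequences:
        raise ValueError("sequences must be non-empty")
    hits = sum(1 for seq in sequences if epitope in seq)
    return hits / len(sequences)


# Toy data (hypothetical): the epitope occurs in 3 of the 4 sequences.
toy_sequences = ["MKTLLVAGQ", "AATLLVAGP", "MKTLIVAGQ", "TLLVAG"]
print(epitope_conservation("TLLVAG", toy_sequences))  # 0.75
```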

