Background:
Proteins may contain no, one, two, or multiple domains, and a single
domain may appear in multiple proteins. These distribution patterns may affect bacterial
physiology and lifestyle.
Objective:
This study characterizes how domains are distributed and duplicated in bacterial proteomes
in order to better understand bacterial physiology and lifestyles.
Methods:
We used 16712 Hidden Markov Models to screen 944 bacterial reference proteomes
at an E-value threshold of < 0.001. For each species, we calculated the number of non-redundant
domains and the duplication rate of redundant domains. Unique domains, if any, were also
identified for each species. In addition, the physicochemical properties of no-domain proteins
were investigated.
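
The per-proteome tallies described above can be sketched as follows. This is a minimal illustration, not the study's pipeline. The input layout (tuples of protein ID, domain ID, and E-value), the function name `summarize_domains`, and in particular the duplication-rate formula (the fraction of retained domain occurrences that are repeats of an already-counted domain) are assumptions, since the abstract does not define them. Only the E-value threshold of 0.001 comes from the text.

```python
from collections import Counter

E_VALUE_THRESHOLD = 1e-3  # threshold stated in the Methods


def summarize_domains(hits, all_proteins, threshold=E_VALUE_THRESHOLD):
    """Summarize domain content for one proteome.

    hits: iterable of (protein_id, domain_id, e_value) tuples (assumed layout).
    all_proteins: iterable of every protein ID in the proteome, so that
        proteins without any retained hit are counted as no-domain proteins.
    """
    # Keep only hits that pass the E-value threshold.
    kept = [(p, d) for p, d, e in hits if e < threshold]

    domain_counts = Counter(d for _, d in kept)   # occurrences per domain
    per_protein = Counter(p for p, _ in kept)     # domains per protein

    n_nonredundant = len(domain_counts)
    total = sum(domain_counts.values())
    # ASSUMPTION: duplication rate = repeated occurrences / all occurrences.
    duplication_rate = (total - n_nonredundant) / total if total else 0.0

    # Classify proteins as no-, single-, double-, or multi-domain.
    classes = Counter()
    for p in all_proteins:
        n = per_protein.get(p, 0)
        if n == 0:
            classes["none"] += 1
        elif n == 1:
            classes["single"] += 1
        elif n == 2:
            classes["double"] += 1
        else:
            classes["multiple"] += 1

    return {
        "non_redundant_domains": n_nonredundant,
        "duplication_rate": duplication_rate,
        "protein_classes": dict(classes),
    }


# Toy data with hypothetical identifiers, for illustration only.
example_hits = [
    ("p1", "A", 1e-10), ("p1", "B", 1e-5),
    ("p2", "A", 1e-4),
    ("p3", "C", 0.5),  # fails the threshold, so p3 counts as no-domain
    ("p4", "A", 1e-6), ("p4", "A", 1e-8), ("p4", "D", 1e-9),
]
example_proteins = ["p1", "p2", "p3", "p4", "p5"]
summary = summarize_domains(example_hits, example_proteins)
```

Passing the full protein list separately matters here: a protein with no retained hit never appears among the filtered hits, so the no-domain group cannot be recovered from the hit table alone.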
Results:
The number of non-redundant domains in a bacterial proteome increases following the trend
of an asymptotic function. The domain duplication rate is positively correlated with proteome size and
increases more rapidly. A high percentage of single-domain proteins is associated with small
proteome size. Unique domains were also identified for each proteome. Moreover, no-domain proteins
differ from the other three groups (single-, double-, and multi-domain proteins) in several of the
physicochemical properties analysed in this study.
Conclusion:
The study confirmed that a low domain duplication rate and a high percentage of single-domain
proteins are more likely to be associated with host-dependent or restricted niche-adapted
bacterial lifestyles. In addition, species-specific features of lifestyle and physiology were
revealed by analysing species-specific domains and core domain interactions or co-occurrences.