A large-scale analysis reveals 4,539 candidate small protein families encoded by human-associated microbes, many of which have not been described before.
Until now, small open reading frames and the small proteins they encode (≤50 amino acids in length) have been overlooked in studies of the human microbiome because they are difficult to detect, both computationally and experimentally.
“It’s critically important to understand the interface between human cells and the microbiome,” said Ami Bhatt, senior author of the study. “How do they communicate? How do strains of bacteria protect themselves from other strains? These functions are likely to be found in very small proteins, which may be more likely than larger proteins to be secreted outside the cell.”
This study sought to characterise the small proteins encoded by a healthy human microbiome, using a reference-free approach to analyse the NIH Human Microbiome Project (HMP) dataset. Surprisingly, most of the 4,539 candidate small protein families identified (containing a total of 467,538 small proteins) are not represented in traditional reference genomes and/or do not contain a known protein domain.
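To make the ≤50 amino acid cutoff concrete, the sketch below scans a DNA sequence for small open reading frames. It is an illustration only and not the study's reference-free pipeline: the function name `find_small_orfs`, the restriction to ATG start codons, the use of the standard genetic code, and the forward-strand-only scan are all simplifying assumptions.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def find_small_orfs(dna, max_aa=50):
    """Return (start_index, protein) pairs for ORFs encoding <= max_aa residues.

    Hypothetical helper: forward strand only, ATG starts only.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] != "ATG":
                i += 3
                continue
            # Translate from the start codon until the first stop codon.
            protein = []
            j = i
            while j + 3 <= len(dna):
                aa = CODON_TABLE.get(dna[j:j + 3], "X")  # "X" for ambiguous codons
                if aa == "*":
                    break
                protein.append(aa)
                j += 3
            else:
                # No stop codon before the end of the sequence: no complete
                # ORF remains in this frame.
                break
            if len(protein) <= max_aa:
                orfs.append((i, "".join(protein)))
            i = j + 3  # resume scanning after the stop codon
    return orfs


print(find_small_orfs("ATGAAATTTTAG"))  # [(0, 'MKF')]
```

A real analysis would also need to handle the reverse strand, alternative start codons, and the much harder problem of separating genuine small genes from the many short ORFs that arise by chance.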
For each of the identified small protein families, the researchers further compiled information about taxonomic classification, prevalence across human body sites, predicted cellular localisation (secreted/transmembrane), and potential functions.
A focused analysis of the 14 most prevalent families, each encoded by ≥100 species, showed that 13 of them were identified in ≥3 body sites, suggesting that they play housekeeping roles that are not niche-specific. In addition, 39 small protein families were identified as potential antimicrobial peptides, and another 13 were located on “defense islands” and may be associated with bacterial defense systems against phage or other bacteria.
“Small proteins can be synthesized rapidly and could be used by the bacteria as biological switches to toggle between functional states or to trigger specific reactions in other cells,” Bhatt said. “They are also easier to study and manipulate than larger proteins, which could facilitate drug development. We anticipate this to be a valuable new area of biology for study.”