Metaphylogenetic analysis of global sewage reveals that bacterial strains associated with human disease show a lower degree of geographic clustering
Abstract
Knowledge about how the global distribution of pathogens differs from that of non-pathogens is limited. Here, we investigate this difference using a multi-sample metagenomic phylogeny approach based on short-read metagenomic sequencing of sewage from 79 sites around the world. For each metagenomic sample, bacterial template genomes were identified in a non-redundant database of whole genome sequences. Reads were then mapped to the templates identified in each sample, and a phylogenetic tree was constructed for each template identified in multiple samples. The countries from which the samples were taken were grouped according to different definitions of world regions, and the tendency for regional clustering was determined for each tree. Phylogenetic trees representing 95 unique bacterial templates were created, each covering 4 to 71 samples. Varying degrees of regional clustering were observed. The clustering was most pronounced for environmental bacterial species and human commensals, and less pronounced for colonizing opportunistic pathogens, opportunistic pathogens and pathogens. No pattern of significant difference in clustering was observed between any of the organism classifications when countries were grouped according to income. Our study suggests that while the same bacterial species might be found globally, there is a geographical regional selection or barrier to spread for individual clones of environmental and human commensal bacteria, whereas this is the case to a lesser degree for strains and clones of human pathogens and opportunistic pathogens.
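The abstract does not state how the tendency for regional clustering was quantified. As an illustration only, and not the method used in this study, one generic way to test whether samples from the same region sit closer together in a tree is a label-permutation test on pairwise distances. The sketch below assumes a symmetric matrix of pairwise tree distances between samples and one region label per sample; the function names, the within/between distance ratio, and the example data are all hypothetical.

```python
import random
from itertools import combinations


def within_between_ratio(dist, regions):
    """Mean within-region distance divided by mean between-region distance.

    Values below 1 indicate that samples from the same region tend to be
    closer to each other than to samples from other regions.
    """
    within, between = [], []
    for i, j in combinations(range(len(regions)), 2):
        (within if regions[i] == regions[j] else between).append(dist[i][j])
    if not within or not between:
        raise ValueError("need both within-region and between-region pairs")
    return (sum(within) / len(within)) / (sum(between) / len(between))


def clustering_p_value(dist, regions, n_perm=999, seed=0):
    """One-sided permutation test: shuffle region labels and count how often
    the shuffled ratio is at least as small as the observed one."""
    rng = random.Random(seed)
    observed = within_between_ratio(dist, regions)
    shuffled = list(regions)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # keeps region sizes, breaks sample-region links
        if within_between_ratio(dist, shuffled) <= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: six samples, two regions, tight within-region distances.
labels = ["Europe"] * 3 + ["Africa"] * 3
toy_dist = [
    [0.0 if i == j else (0.1 if labels[i] == labels[j] else 1.0) for j in range(6)]
    for i in range(6)
]
ratio, p_value = clustering_p_value(toy_dist, labels)
print(f"within/between ratio = {ratio:.2f}, permutation p = {p_value:.3f}")
```

Running such a test once per tree and per definition of world regions would give one clustering score for each template, which could then be compared across organism classifications.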