In Silico Genotyping of Escherichia coli Isolates for Extraintestinal Virulence Genes by Use of Whole-Genome Sequencing Data
Abstract
Extraintestinal pathogenic Escherichia coli (ExPEC) is the leading cause of urinary tract infection and bacteremia in humans. The previously published web tool VirulenceFinder (http://cge.cbs.dtu.dk/services/VirulenceFinder/) uses whole-genome sequencing (WGS) data for in silico characterization of E. coli isolates and enables researchers and clinical health personnel to quickly extract and interpret virulence-relevant information from WGS data. In this study, 38 ExPEC-associated virulence genes were added to the existing E. coli VirulenceFinder database. In total, 14,441 alleles were downloaded. After removal of redundant sequences and analysis of the remaining alleles for open reading frames (ORFs), 1,890 distinct alleles were added to the database. The database now contains 139 genes, of which 44 are related to ExPEC, and 2,826 corresponding alleles. Construction of the database included validation against 27 primer pairs from previous studies, a search for serotype-specific P fimbriae papA alleles, and a BLASTn confirmation of seven genes (etsC, iucC, kpsE, neuC, sitA, tcpC, and terC) not covered by the primers. The augmented database was evaluated using (i) a panel of nine control strains and (ii) 288 human-source E. coli strains classified by PCR as ExPEC and non-ExPEC. We observed very high concordance (average, 93.4%) between PCR and WGS findings, but WGS identified more alleles. In conclusion, the addition of 38 ExPEC-associated genes and the associated alleles to the E. coli VirulenceFinder database allows for a more complete characterization of E. coli isolates based on WGS data, which has become increasingly important considering the plasticity of the E. coli genome.
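The curation step summarized above (removing redundant sequences, then checking the remaining alleles for ORFs) can be illustrated with a minimal Python sketch. The abstract does not state the exact criteria used, so everything here is an assumption made for illustration: the function names `is_complete_orf` and `curate_alleles` are hypothetical, redundancy is taken to mean identical nucleotide sequence, and an ORF is taken to mean an in-frame sequence with a bacterial start codon, a terminal stop codon, and no internal stop codon.

```python
# Illustrative sketch only; the criteria below are assumptions, not the
# published curation pipeline.
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed set of bacterial start codons
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_complete_orf(seq):
    """Return True if seq is in frame, starts with a start codon, ends with
    a stop codon, and has no internal stop codon."""
    seq = seq.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] not in START_CODONS or codons[-1] not in STOP_CODONS:
        return False
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


def curate_alleles(alleles):
    """Drop exact duplicate sequences, then keep only complete ORFs.

    alleles is an iterable of (name, sequence) pairs; the first name seen
    for a given sequence is the one retained.
    """
    seen = set()
    kept = []
    for name, seq in alleles:
        seq = seq.upper()
        if seq in seen:
            continue  # redundant sequence, already processed
        seen.add(seq)
        if is_complete_orf(seq):
            kept.append((name, seq))
    return kept


# Toy input with made-up allele names and sequences.
example = [
    ("allele_1", "ATGAAATAA"),     # complete ORF
    ("allele_2", "atgaaataa"),     # duplicate of allele_1 (case-insensitive)
    ("allele_3", "ATGTAAAAATAA"),  # internal stop codon
    ("allele_4", "ATGAAATA"),      # length not a multiple of three
]
curated = curate_alleles(example)
```

With this toy input, only `allele_1` survives both filters. A production pipeline would more likely read FASTA files and might apply additional checks, such as a minimum length or similarity to a reference allele.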