Microbiology and Molecular Biology Reviews, June 2003, p. 238-276, Vol. 67, No. 2
1092-2172/03/$08.00+0 DOI: 10.1128/MMBR.67.2.238-276.2003
Copyright © 2003, American Society for Microbiology. All Rights Reserved.
Prophage Genomics
Carlos Canchaya, Caroline Proux, Ghislain Fournous, Anne Bruttin, and Harald Brüssow*
Nestlé Research Center, Vers-chez-les-Blanc, CH-1000 Lausanne 26, Switzerland
The majority of the bacterial genome sequences deposited in the National Center for Biotechnology Information database contain prophage sequences. Analysis of the prophages suggested that after being integrated into bacterial genomes, they undergo a complex decay process consisting of inactivating point mutations, genome rearrangements, modular exchanges, invasion by further mobile DNA elements, and massive DNA deletion. We review the technical difficulties in defining such altered prophage sequences in bacterial genomes and discuss theoretical frameworks for the phage-bacterium interaction at the genomic level. The published genome sequences from three groups of eubacteria (low- and high-G+C gram-positive bacteria and γ-proteobacteria) were screened for prophage sequences. The prophages from Streptococcus pyogenes served as a test case for theoretical predictions of the role of prophages in the evolution of pathogenic bacteria. The genomes from further human, animal, and plant pathogens, as well as commensal and free-living bacteria, were included in the analysis to see whether the same principles of prophage genomics apply for bacteria living in different ecological niches and coming from distinct phylogenetic affinities. The effect of selection pressure on the host bacterium is apparently an important force shaping the prophage genomes in low-G+C gram-positive bacteria and γ-proteobacteria.
Many bacterial genomes deposited in the public database contain phage DNA integrated into the bacterial chromosome. It is not rare for bacteria to contain multiple prophages in their chromosomes, which then constitute a sizable part of the total bacterial DNA (Fig. 1). The most extreme case is currently represented by the food pathogen Escherichia coli O157:H7 strain Sakai. It contains 18 prophage genome elements, which amount to 16% of its total genome content (Fig. 1C). Less extreme but still impressive cases are represented by Streptococcus pyogenes, with four to six prophages, amounting to 12% of the bacterial DNA content (Fig. 1A). These prophages do not represent exotic phage types: the E. coli O157 prophages resemble the well-known temperate E. coli phages λ, P2 (and its satellite phage P4), and Mu (160). The S. pyogenes prophages belong to the proposed Sfi11-, Sfi21-, and r1t-like Siphoviridae, which are also found in lactic acid bacteria (LAB) used in industrial milk fermentation. The taxonomy we use in this review for phages from low-G+C gram-positive bacteria is our own system based on comparative phage genomics (28, 171). Other authors have proposed partially overlapping and partially distinct phage taxonomy systems based on a phage proteomics tree (in these systems the Sfi11-like phages are called TP901-like phages after another type of phage from the same group) or a differentiation of phage genomes into a set of modi (modules) (114, 176).

FIG. 1. Prophage content of four human bacterial pathogens. The prophages are indicated as shaded boxes on the bacterial genome maps. The lengths of the boxes correspond to the relative sizes of the prophage DNA with respect to the bacterial chromosome. Note that the circumference of the bacterial genomes does not correspond to their relative length. Prophages with extensive DNA sequence identity are linked by dotted lines. (A) S. pyogenes genomes of the sequenced M1, M18, and M3 strains (from center to periphery). (B) S. aureus genomes of the sequenced Mu50 (center), N315, MW2, and 8325 strains. (C) E. coli genomes from the O157:H7 Sakai (center) and O157:H7 EDL933 strains, the laboratory strain K-12, and the uropathogenic strain CFT073. (D) S. enterica serovar Typhimurium LT2 (center) and serovar Typhi CT18.
Prophages are not only quantitatively important genetic elements of the bacterial chromosome. As mobile DNA elements, phage DNA is a vector for lateral gene transfer between bacteria (35). Indeed, numerous virulence factors from bacterial pathogens are phage encoded (22, 216, 215). It was postulated that this role of prophages is not limited to pathogenic bacteria but that some adaptations of nonpathogenic bacterial strains to their ecological niche might also be mediated by prophage genomes (30). Furthermore, prophages account for a substantial amount of interstrain genetic variability in several bacterial species (e.g., Staphylococcus aureus [7] and S. pyogenes [189]). When genomes from closely related bacteria were compared in a dot plot analysis, prophage sequences frequently accounted for a substantial, if not the major, proportion of the differences between the genomes (e.g., Listeria monocytogenes and L. innocua [79], Salmonella enterica serovars Typhi and Typhimurium [139, 164], and E. coli O157 and K-12 [166]). Microarray analysis (169) and PCR scanning (161) allowed researchers to explore the presence of specific prophages over a much larger set of related bacterial strains, and again prophages contributed a large part of strain-specific DNA, irrespective of whether pathogenic or nonpathogenic bacteria were investigated. Finally, when mRNA expression patterns were studied with microarrays in lysogenic bacteria that underwent physiologically relevant changes in growth conditions, prophage genes figured prominently in the mRNA species changing their expression pattern (190, 220). These data demonstrate that prophages are not a passive genetic cargo of the bacterial chromosome but are likely to be active players in cell physiology. Subtractive mRNA hybridization analysis demonstrated that prophage genes also make up a prominent share of the E. coli genes upregulated when the bacteria invaded the lungs of infected birds (60).
Apparently, prophage genomes are an important target for selection working on bacterial genomes. Indeed, in medical microbiology there are good indications that prophage acquisition actually shaped the epidemiology of some important bacterial pathogens (8). We summarize here some recently formulated ideas (22, 53, 115) on the coevolution of bacteria and phages, and we have screened the published bacterial genome sequences for prophage sequences. Specifically, we looked for the possible role of phage-encoded genes in the adaptation of the lysogenic bacterium to its specific environment, whether the bacterial host is an animal or plant pathogen or a commensal or a free-living bacterium. In addition, we asked where candidate lysogenic conversion genes (genes that could change the phenotype of the lysogenic bacterial host) are integrated into the prophage genomes.
Technical Difficulties
A review of prophage sequences in sequenced bacterial genomes has technical difficulties. On a very practical side, prophage sequences are currently not compiled in the National Center for Biotechnology Information (NCBI) phage database. Therefore, the interested scientist has to turn to the original publication and the annotations of the GenBank entry for the bacterial genomes to locate prophage sequences or has to reanalyze the genome sequence. However, no uniform criteria have been established for the diagnosis of prophages in bacterial genome sequences. Prophages can be present in many different forms ranging from inducible prophages to prophages showing deletions, insertions, and rearrangements to prophage remnants that have lost most of the phage genome. In addition, computer programs have difficulties in detecting prophage sequences. Only a few, if any, phage genes are sufficiently conserved and distinct from bacterial genes to serve as markers for prophage sequences. Computer programs efficiently detect integrase genes. However, it is not clear what qualifies an integrase as phage related. Several conjugative transposon-like elements contain lambda family integrases, as do integrons and pathogenicity islands. There are also chromosome-encoded integrases such as XerC/D. Given the presence of this gene family in several kinds of elements, it becomes problematic to use integrase as a prophage signature. In our own experience with one specific class of temperate phages (Siphoviridae), reasonably conserved phage proteins are the integrase (32), the portal protein, the terminase (52), and the tail tape measure protein. A further complication is that the current NCBI phage database is small (at the time of writing, it contained 136 complete phage genomes) and is dominated by a single phage group, Siphoviridae (1, 136) (contributing 53 complete genomes, followed by 18 Inoviridae, 17 Podoviridae, and 13 Myoviridae genomes). 
However, phages that are less well documented with respect to genome sequences can also integrate their DNA into the bacterial chromosome, e.g., P2- and Mu-like Myoviridae (147, 151), Inoviridae in Vibrio (217), Xanthomonas (48), Xylella, and Pseudomonas (194), and Plasmaviridae in Acholeplasma (137). In addition, psiM1-like Siphoviridae, lipothrixviruses, and fuselloviruses (167, 232) integrate their genomes into the chromosomes of Archaea. Still other forms of lysogeny exist that do not lead to the integration of phage DNA into the bacterial chromosome; e.g., prophages P1 and N15 are maintained as circular or linear plasmids (87, 174), respectively, and Borrelia prophages have a peculiar relationship to plasmids (65). We therefore anticipate an underreporting of prophage sequences in the published bacterial genomes.
On the other hand, Bacillus subtilis was reported to contain at least 10 prophage sequences (108). Two prophages corresponded to biologically well-defined incomplete prophages (197), and a third prophage was represented by the inducible prophage SPBc2. The diagnosis of the other prophage sequences was based on codon usage analysis (148). However, this analysis cannot easily differentiate prophages from other horizontally acquired DNA elements. For example, in B. subtilis prophage 2 (numbered in order of appearance of the prophage in the genome) a typical lysogeny module was detected but no other phage links were detected. Prophage 6 showed only few isolated links to SPBc2, while the annotated prophage 7 lacked phage links altogether, casting doubt on their prophage nature. Overreporting of prophage-like elements might thus also be a problem.
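The limitation noted above can be made concrete with a toy compositional scan. The sketch below is not the codon usage method cited in the text; it substitutes a simpler sliding-window G+C comparison, and the function names, window size, step, and deviation threshold are all illustrative assumptions. It flags windows whose G+C content departs from the genome-wide mean, which is also why such a scan cannot by itself distinguish a prophage from any other horizontally acquired element.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_atypical_windows(genome: str, window: int = 1000, step: int = 500,
                          max_deviation: float = 0.1):
    """Return (start, end, gc) for windows whose G+C deviates from the genome mean.

    window, step, and max_deviation are illustrative values, not published
    thresholds. A flagged window is only compositionally atypical; it may be
    a prophage, a pathogenicity island, a transposon, or none of these.
    """
    genome_gc = gc_fraction(genome)
    flagged = []
    last_start = max(len(genome) - window + 1, 1)
    for start in range(0, last_start, step):
        chunk = genome[start:start + window]
        gc = gc_fraction(chunk)
        if abs(gc - genome_gc) > max_deviation:
            flagged.append((start, start + len(chunk), gc))
    return flagged


# Synthetic example: a 1-kb A+T-only island inside a 50% G+C background.
toy_genome = "ACGT" * 1000 + "AATT" * 250 + "ACGT" * 1000
atypical = flag_atypical_windows(toy_genome)
```

In practice, flagged regions would still need to be checked for phage-specific genes (for example portal, terminase, or tail tape measure proteins) before being called prophages.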
The Phage Side
Before going into the analysis of prophage-containing bacterial genomes, it is appropriate to summarize the current ideas about phage-bacterial genome interaction from an evolutionary perspective. The peculiar life-style of temperate phages makes them model systems to address a number of fundamental questions in evolutionary biology. The viral DNA undergoes different selective pressures when replicated during lytic infection cycles compared to prophage DNA maintained in the bacterial genome during lysogeny. Darwinian considerations, along with the selfish-gene concept, lead to interesting conjectures (22, 30, 53, 115). One could anticipate that the prophage decreases the fitness of its lysogenic host by at least two processes: the metabolic burden to replicate extra DNA (Fig. 1) and the lysis of the host after prophage induction. To compensate for these disadvantages, one has to invoke the notion that temperate phages encode functions that increase the fitness of the lysogen. According to the selective value of these postulated phage genes, the lysogenic cell will be maintained or even be overrepresented in the bacterial population. An obvious selective advantage for the lysogenic host is the immunity (phage repressor) and superinfection exclusion genes of the prophage that protect the lysogen against phage infection. These genes also provide a direct advantage to the prophage since they exclude superinfecting phage DNA from competing with the resident prophage DNA for the same host. Where phages from the environment do not provide a sufficiently strong selection pressure, other phage genes have to increase the fitness of the lysogenic host, frequently in rather unanticipated ways (lysogenic conversion genes). Classic examples of such phage-located genes that increase host fitness include the nonessential phage λ genes bor and lom, which confer serum resistance and better survival in macrophages, respectively, to the Escherichia coli lysogen (9). In these cases, the reproductive success of the lysogenic bacterium carrying these new genes translates directly into an evolutionary success for the resident prophage. However, host-parasite relationships also constitute an arms race and therefore represent a highly dynamic genetic equilibrium. Gains from prophages carrying genes that increase host fitness are short-lived from a bacterial standpoint if the resident prophage ultimately destroys the bacterial lineage. In this way, prophages can be considered dangerous molecular time bombs that can kill the lysogenic cell on their eventual induction (115). One would therefore expect evolution to select lysogenic bacteria with mutations in the prophage DNA. Mutations that inactivate the prophage induction process avoid the loss of the lysogenic clone from the bacterial population. One would also expect that selection would lead to large-scale deletion of prophage DNA in order to decrease the metabolic burden of extra DNA synthesis and a littering of the bacterial genomes by selfish DNA elements. One would predict, furthermore, that useful prophage genes (e.g., lysogenic conversion genes) are preferentially spared from this deletion process, since their loss would actually decrease the fitness of the cell (116). It was proposed that a high genomic deletion rate is instrumental in removing dangerous genetic parasites from the bacterial genome (115). These deletion processes could explain why the bacterial genomes (in general) have not increased in size despite a constant bombardment with parasitic DNA over evolutionary time. The streamlined bacterial chromosome containing few pseudogenes might be the consequence of this deletion process of parasitic DNA.
The Bacterial Side
New data from comparative bacterial genomics highlight the importance of lateral DNA transfer in microbial evolution (35). Based on a variety of criteria such as sequence matches with other organisms, G+C content, codon usage, association with mobile DNA elements, and proximity to tRNA genes, it has been estimated that some bacteria capture and fix DNA at a rate of at least 16 kb per 10^6 years (116). An interesting case is provided by E. coli. Genomic comparison between the pathogenic E. coli O157 EDL933 strain and the laboratory E. coli strain K-12 revealed 4.1 Mb of common chromosome backbone sequence, 1.3 Mb of O157-specific DNA, and 0.5 Mb of K-12-specific DNA (160). Approximately half of the O157-specific DNA was clear-cut mobile DNA, mostly prophage DNA (166). These observations led to the distinction of a conserved core genome sequence, which is shaped by the mechanisms of vertical evolution, and a variable part of the genome, which is dominated by processes of horizontal evolution. The replacement of tree-like with web-like phylogenies is the visual expression of our current understanding of evolution in the microbial world (58). Phage transduction and prophage integration are major mechanisms of lateral DNA transfer in prokaryotes. Bacteria are therefore confronted with a dilemma: phages are a threat to their survival, against which they must mount defensive countermeasures (surface changes, restriction modification, and a variety of abortive phage infection mechanisms), and at the same time phages are an important tool for the acquisition of genes which could help them to defend their ecological place or gain new ones. Apparently even closely related bacteria addressed this dilemma differently. For example, virulent phages against yogurt strains of Lactobacillus delbrueckii (subsp. bulgaricus) are a rarity in the food industry, and at the same time it is very hard to transform this organism with foreign DNA.
Possibly this bacterium has opted for a strong barrier against intrusion of foreign DNA. In contrast, Lactococcus lactis and Streptococcus thermophilus phages are readily isolated from the dairy factory or raw milk, and many strains can be transformed in the laboratory, suggesting a greater permissiveness to DNA transfer. However, L. lactis has developed numerous antiphage strategies, and an intensive arms race exists between phage and host (44). In contrast, in S. thermophilus very few phage defense mechanisms have been identified so far. We even found molecular evidence for cooperation between the host and its phages; e.g., the bacterial attB site complements the 3' end of the phage integrase gene, which would otherwise lose its C terminus on integration (33). This observation suggests that the phage integrase is of selective value for this bacterial host.
PROPHAGES FROM LOW-G+C GRAM-POSITIVE BACTERIA
Streptococcus pyogenes
In 1927, long before lysogeny was described, it was demonstrated that a filterable agent from scarlet fever isolates could convert nonscarlatinal S. pyogenes to toxigenic strains (22). We know now that this conversion is mediated by bacteriophages and that 90% of S. pyogenes isolates are lysogenic (Fig. 1A). For several reasons, S. pyogenes provides a neat test case for the role of prophages. Transformation and conjugation appear to play no or only minor roles in lateral DNA transfer in this species, giving phages a special role in this process (68, 189). S. pyogenes belongs to the lactic acid bacteria (LAB) branch of low-G+C gram-positive bacteria, and its phages are good examples of phages from LAB of medical and economic importance and will therefore be discussed in more detail. Phylogenetic relatives are used in the dairy industry as starter organisms, and their viruses have become a focus for comparative phage genomics studies (28). The only known habitat of S. pyogenes is the human; it is normally found on the skin and in the oral cavity in humans. S. pyogenes comes in many M serotypes and causes an astonishing range of diseases including pharyngitis, scarlet fever, pyodermitis, fasciitis, rheumatic fever, and toxic shock syndrome. With respect to the three sequenced strains, M1 strains were associated with wound infections, M3 strains were identified in patients with severe invasive infections, and M18 strains caused rheumatic fever outbreaks (8). Despite the protean character of this pathogen, the sequenced S. pyogenes isolates are genetically closely related. For example, the M18 strain shared 1,532 of 1,696 open reading frames (ORFs) identified in the M1 strain, and sequence identity ranged from 83 to 100% at the base pair level. In fact, a dot plot analysis revealed essentially a straight line between the two serotypes, with 1.7 of the 1.9 Mb of chromosomal DNA shared. There were only five larger regions of difference; all were prophage sequences (189).
This observation leads to an interesting question: is the specific pathogenic potential of a given S. pyogenes strain influenced by its prophage content? Microarray analysis of 36 M18 strains demonstrated that prophages are not only a significant source of genomic divergence between strains of M1, M3, and M18 serotypes but also the predominant source of difference between M18 strains. The M18 strains differed by a maximum of 3% of the genes, and prophages were responsible for virtually all the variation in gene content. Variation in the prophages ranged from entire absence of the prophages from the test strain to small differences in the gene content of the prophages (189).
When the DNA sequences from the 15 available S. pyogenes prophages were compared against each other in a dot plot matrix, five clusters of phages consisting of two to four members could be distinguished, while two phages had only limited DNA sequence similarity to the rest of the phages (Fig. 2, green boxes). Next to E. coli prophages, this is the largest set of prophage sequences from a single species for comparative analysis. Recently, a proposal was made to base the taxonomy of Siphoviridae on the genetic organization of the structural gene cluster (excluding the tail fiber genes) (171). All currently known LAB prophages showed a conserved overall gene order: left attachment site (attL)-lysogeny-DNA replication-transcriptional regulation-DNA packaging-head-joining-tail-tail fiber-lysis modules-right attachment site (attR) (Fig. 3, 4, 5). Particularly well conserved is the order of the structural genes, allowing the distinction of three major forms of head gene clusters exemplified by the cos-site Streptococcus thermophilus phage Sfi21, the pac-site S. thermophilus phage Sfi11, and the cos-site Lactococcus lactis phage r1t as prototypes. The corresponding phages were also observed in S. pyogenes.
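The dot plot comparison used above can be sketched in a few lines. This is a minimal word-match illustration, not the software used for Fig. 2; the function names and the word length are assumptions chosen for readability. A dot is recorded at (i, j) whenever the word starting at position i of the first sequence is identical to the word starting at position j of the second, so shared regions appear as diagonal runs and unrelated regions stay empty.

```python
def dot_plot(seq_a: str, seq_b: str, word: int = 4):
    """Return (i, j) pairs where seq_a[i:i+word] equals seq_b[j:j+word].

    Pairs are ordered by i, then by j. The word length is an illustrative
    choice; real genome comparisons typically use longer words to
    suppress chance matches.
    """
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    # Index every word of seq_b by its start position.
    index = {}
    for j in range(len(seq_b) - word + 1):
        index.setdefault(seq_b[j:j + word], []).append(j)
    # Look up every word of seq_a in that index.
    dots = []
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], []):
            dots.append((i, j))
    return dots


def render(dots, len_a: int, len_b: int) -> str:
    """Draw the dots as a text grid with seq_a on rows and seq_b on columns."""
    grid = [["." for _ in range(len_b)] for _ in range(len_a)]
    for i, j in dots:
        grid[i][j] = "*"
    return "\n".join("".join(row) for row in grid)


# Two toy sequences sharing a region that is shifted by two positions.
shared = dot_plot("ACGTACGGTT", "TTACGTACGG")
print(render(shared, 10, 10))
```

Comparing every prophage against every other one in this way, and tiling the resulting plots, gives a matrix of the kind shown in Fig. 2.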

FIG. 2. Dot plot matrix for the currently available 15 S. pyogenes prophages identified by their prophage names on the x and y axes. According to their structural genes, the prophages were classified into distinct groups (green triangles) and annotated on the left ordinate. The prophage genomes were aligned with their integrase gene to the left (top). The extent of the conservation of the tail fiber and lysis genes is highlighted by the red box. The lack of conservation of the lysogeny and DNA replication genes is demonstrated by the yellow box. The largest group of prophages sharing early genes is indicated by the blue circles.
FIG. 3. r1t-like S. pyogenes prophages. (A) Alignment of the S. pyogenes prophage 370.3 with the L. lactis prophage r1t. (B) Alignment of r1t-like prophages from the three sequenced S. pyogenes genomes. Genes sharing sequence relationships are linked by shading. Boxes A to E mark features discussed in the text. The prophage modules, as identified by bioinformatic analysis, are color coded. Lysogeny, red; DNA replication, orange; transcriptional regulation?, yellow; DNA packaging and head, green; head-to-tail joining, brown; tail, blue; tail fiber, mauve; lysis, violet; lysogenic conversion, black; unattributed genes, grey. Selected genes were annotated: int, integrase; cI/cro, repressors; xis, excisionase; repl, replication; rec, recombination; ant, antirepressor; por, portal; terL, large subunit terminase; mhp/mtp, major head/tail protein; hya, hyaluronidase; hol, holin; lys, lysin.
FIG. 4. Sfi11-like S. pyogenes prophages. (A) Alignment of S. pyogenes prophage 370.2 with S. thermophilus prophage O1205. The arrows under the map indicate phage O1205 transcripts detected in the lysogen. (B) Alignment of S. pyogenes prophage 370.1 with S. pyogenes prophage NIH1.1 and S. pneumoniae prophage MM1. (C) Alignment of Sfi11-like prophages from the three sequenced S. pyogenes genomes. For annotations, see Fig. 3.
FIG. 5. Sfi21-like prophages. (A) Alignment of S. thermophilus prophage Sfi21, L. lactis prophage BK5-T, and S. aureus prophage PVL. Sfi21 and BK5-T genes transcribed in the lysogenic host are indicated by arrows under the gene map. Genes are annotated as in the original publications (103, 129, 134). (B and C) Alignment of the Sfi21-like S. pyogenes prophages of the A2-like (B) and Staphylococcus-like (C) subgroups.
The first group of S. pyogenes prophages are the r1t-like Siphoviridae (Fig. 2). The second group resembles Sfi11-like Siphoviridae. Sequence alignments allowed the distinction of two Sfi11 subgroups with S. pneumoniae phage MM1 and S. thermophilus phage O1205, respectively, as reference strains (Fig. 2). The third group of prophages are members of the Sfi21-like Siphoviridae. Database matches again permitted the differentiation of two subgroups: one had sequence similarity to Lactobacillus gasseri phage A2 (171), and the other had sequence similarity to staphylococcal phages (Fig. 2). Except for the tail fiber and lysis genes and part of the lysogeny genes, S. pyogenes prophage 315.5 lacks DNA sequence similarity to the other S. pyogenes prophages, while it has some protein sequence similarity to a Bacillus halodurans prophage (see below). Phage tail fibers and lysins have to interact with the cell surface and the cell wall of their bacterial host cell and should therefore be subjected to a strong adaptive selection pressure. Not surprisingly, all but one of the sequenced S. pyogenes prophages had a highly related tail fiber module (Fig. 2, red box). Central to this module is the phage hyaluronidase, an enzyme that splits the hyaluronic acid-containing capsule surrounding the bacterial cell. This lytic enzyme allows the phage to reach the cell surface, where it injects its DNA into the bacterial cell. Only prophage 315.6 lacks this tail fiber module typical for S. pyogenes prophages. Over these genes, prophage 315.6 had about 40% sequence identity at the amino acid level to S. agalactiae prophage SA1 (see below), suggesting a possible cross-species infection. In fact, these two phages had some DNA sequence similarity across their entire genomes.
One would expect competition and exclusion between prophages during the establishment of a polylysogenic cell. Exclusion could be mediated by three proteins encoded in the lysogeny module, namely, the phage integrase, the superinfection exclusion gene sie (140), and the phage repressor (immunity function). It is therefore not surprising that a substantial diversification was observed over this region between the S. pyogenes prophages (Fig. 2, yellow box; the largest group of phages sharing early genes is marked by blue circles). Eleven distinct prophage integration sites were identified in the three sequenced S. pyogenes strains. Three sites were occupied in more than one strain; in all cases, the corresponding phage integrases had at least 92% sequence identity at the amino acid level. Seven prophages used unique integration sites; however, not all possessed unique integrases. Prophages 8232.1 and 315.1 showed distinct chromosomal integration sites while sharing identical integrases. Seven- and 14-bp core sequences were deduced for the two integration sites, showing a 6-bp overlap. A similar observation was made for prophages 8232.5 and 315.5. Such low specificity of the phage integrase with respect to conserved nucleotides in the core sequence was also observed in other phages from LAB (S. thermophilus phage Sfi21 [33] and Lactobacillus phage mv4 [5]). It is therefore unlikely that the competition for integration sites is a limiting factor in the establishment of polylysogenic cells.
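Identity thresholds such as the 92% figure quoted above depend on how identity is counted. The sketch below shows one common convention, assuming the two protein sequences have already been aligned and that gaps are written as "-"; the function name and the choice to exclude gapped columns from the denominator are illustrative assumptions, not necessarily the procedure used for the integrase comparison.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Both inputs must be rows of the same alignment and therefore equal in
    length. Columns with a gap in either sequence are skipped; other
    conventions divide by the full alignment length or by the shorter
    sequence and give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment: five ungapped columns, four of them identical.
identity = percent_identity("MKT-LV", "MKSALV")
```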
For the DNA-packaging, head, and tail genes, S. pyogenes prophage 370.3 had sequence similarity to L. lactis phage r1t (Fig. 3A). Differences between r1t-like S. pyogenes prophages were seen in a group of three nonstructural genes preceding the terminase gene (Fig. 3B, box C). An endonuclease gene and a point mutation interrupt and truncate the tail tape gene from prophage 8232.4 (boxes A and B, respectively). Prophage 315.3 showed distinct lysis and lysogenic conversion genes (box E). All three prophages demonstrated variations in the early genes (boxes D) and a disrupted replisome organizer gene (e.g., orf7a and orf7b in 370.3 [Fig. 3A]). S. pyogenes prophage 370.2 could be aligned with S. thermophilus phage O1205 in the DNA-packaging, head, and tail genes (Fig. 4A). Differences between the two O1205-like S. pyogenes prophages 370.2 and 8232.3 were seen in the early-gene cluster and the lysis/lysogenic conversion genes. Prophage 8232.3 showed ORF-disrupting point mutations in the tail tape and a tail fiber gene, and a lysis gene is interrupted by an endonuclease, while prophage 370.2 contained a stop codon within the portal gene (Fig. 4A).
The four MM1-like S. pyogenes prophages had extensive DNA sequence similarity throughout the entire structural genes. Except for the tail fiber and two head genes, extensive protein sequence similarity to S. pneumoniae prophage MM1 was also detected (Fig. 4B). In the structural genes, differences between the MM1-like S. pyogenes prophages amounted to a few gene replacements (Fig. 4B and C, boxes B and C), the transfer of a holin gene to the opposite strand (box H), and a point mutation leading to an inactivating frameshift in a tail fiber gene (box A). In contrast, quite extensive differences were detected over the early and the lysogenic conversion genes.
The A2-related subgroup of Sfi21-like S. pyogenes prophages differed over the nonstructural early genes and the putative lysogenic conversion genes but showed closely related structural genes (Fig. 5B); those of the Staphylococcus-related subgroup differed only for the genes near the attR site (Fig. 5C).
A clear trend for prophage DNA loss was seen in the M1 S. pyogenes strain (Fig. 1A). A 13-kb prophage remnant, 370.4, encoded only lysogeny and DNA replication genes. A closely related prophage remnant was identified at a corresponding position in the Manfredo strain. The prophages shared both flanking att sites but differed by internal insertions and deletions and gene replacements (37). In addition, three prophage remnants of only 2 kb were identified; they consisted of the phage integrase accompanied by the phage repressor and a potential lysogenic conversion gene in the R-1092 remnant (37).
All prophages but one (315.1) encoded potential virulence factors between the lysin gene and the attR site (mitogenic factors, toxins/superantigens, enzymes) (8, 53, 68) (Fig. 3 to 5). The lysogenic conversion genes in the three prophages of the M1 strain differ in their G+C content from the surrounding prophage and bacterial DNA (68), suggesting a faulty phage excision process in an unusual bacterial host with a lower G+C content as the origin of this DNA (230). The horizontal spread of these genes is also suggested by the presence of sequence-identical genes in the horse pathogen Streptococcus equi (37). Notably, there is a short stretch of sequence conservation adjacent to the right attachment site between different S. pyogenes prophages (Fig. 3B and 4C, arrows). This conserved segment and the highly conserved region around the hyaluronidase gene in the tail fiber might allow an exchange of lysogenic conversion genes between different S. pyogenes prophages by homologous recombination.
The prophage-encoded hyaluronidase and DNase have been suspected of promoting bacterial spread through host tissue by their ability to hydrolyze glucosaminic bonds in hyaluronic acid, a major component of the extracellular matrix in the connective tissue, and the liquefaction of pus when degrading the DNA from decaying lymphocytes, respectively. Notably, antibodies against both phage enzymes are found in some post-streptococcal diseases (82). The virulence properties of the DNases are not entirely clear since they were also described in the literature as mitogenic factors (or streptodornases) (191). The prophage 315.4-encoded Sla protein showed phospholipase A2 activity. Sla has sequence homology to a potent snake venom toxin and might contribute to inflammation and coagulopathy seen in streptococcal toxic shock syndrome (13).
Many S. pyogenes prophages encoded streptococcal pyrogenic exotoxins (Spe) in the lysogenic conversion region (8). However, the specific combination of toxins differed between the sequenced S. pyogenes strains: the M1 strain showed speC, speH, and speI genes; the M3 strain demonstrated ssa, speK, and speA3 genes; while the sequenced M18 strain contained the speA1, speC, speL, and speM genes (8). These are all distinct members of a large family of superantigens, and they include the scarlet fever toxin. These proteins bind the T-cell receptor and the major histocompatibility complex protein outside of the usual peptide binding site and lead to a pathological activation of the immune system, possibly allowing the escape of S. pyogenes from immune surveillance (170). It is conceivable that the variable combination of superantigens and mitogenic factors (sda, sdn, mf2, mf3, and mf4) provided by multiple prophages influences the pathogenic potential of the polylysogenic host. This is a theoretically interesting possibility to explain the strikingly distinct symptoms associated with pathogens whose bacterial genome sequences are so similar.
For example, prophage NIH1.1 was identified in an M3 S. pyogenes strain from a toxic shock syndrome patient. It resembled prophage SF370.1 over the entire structural gene cluster but encoded a distinct superantigen (SpeL instead of SpeC) (96) (Fig. 4B). Notably, possession of prophage NIH1.1 was a genetic marker for newly emerging M3 S. pyogenes strains in Japan (97), which had replaced an otherwise genetically identical strain that just lacked the prophage NIH1.1 (96). Prophage acquisition might thus be a major mechanism of short-term evolution in this epidemiologically highly dynamic and clinically variable bacterial species (8, 13). In an appealing model, the emergence of new, unusually virulent subclones of M3 strains is explained by the sequential acquisition of prophages 315.5, 315.2, and 315.4 in approximately 1920, 1940, and 1985 (13), suggesting bacterial pathogenicity evolution by prophage-mediated lateral gene transfer in the fast lane.
The possession of phage-located toxin genes does not automatically lead to the expression of these genes. Clinical isolates containing toxin genes showed a variable pattern of toxin expression when grown in broth culture (105). However, growth of these strains in mice or coculture with human pharyngeal cells led to the production of the toxins (30, 105). A small heat-stable factor released from the pharyngeal cells was identified as an inducer of the prophages (26). This is a fascinating observation, since it means that streptococcal prophages respond via bacterial regulation systems to signals emitted from the eukaryotic host. Mobile DNA and prophages were also the most prominent group of genes that showed expression changes when mRNA from S. pyogenes cells grown at 29 or 37°C was assayed on an M1 strain-based microarray (190).
Streptococcus agalactiae and Streptococcus mitis
S. agalactiae is a commensal organism that colonizes the gastrointestinal or genital tract of up to 40% of healthy women. However, it has substantial residual pathogenic potential for neonates and adults with underlying chronic illnesses. Two strains have been sequenced (80, 198). One strain contained two prophages, while the other showed only a large number of isolated prophage-related genes. Prophage SA1 showed a composite structure. Over the DNA-packaging, head, and tail genes, SA1 is closely related to L. lactis prophage r1t (differences are one indel and one gene replacement), while the tail fiber gene cluster shows links to S. pyogenes prophage 315.6 and S. thermophilus prophage O1205. Near attL, SA1 has genes which match genes found at corresponding positions in several prophages from gram-positive bacteria. Next to attR, a gene related to a candidate lysogenic conversion gene from a Listeria prophage is flanked by a mobile S. thermophilus DNA element.
A similar hybrid character was observed with S. agalactiae prophage SA2. The closest relative to the DNA-packaging, head, and tail genes from SA2 is L. lactis phage bIL170. This is a surprising affinity for a prophage since bIL170 belongs to the most frequently isolated group of obligate virulent lactococcal phages from the dairy environment (sk1-like or 936 species [47]) (Fig. 6), which is not known to contain temperate phages. The putative tail tape measure protein from SA1 and the putative phage adsorption protein from SA2 have low-level sequence similarity to two surface proteins, PblB and PblA, expressed from the S. mitis prophage SM1. Platelet binding is thought to be essential for the pathogenesis of infective endocarditis and was mediated in part by these prophage proteins (11). In prophage SA2, the structural genes from a virulent siphovirus apparently cooperated with nonstructural genes found only in temperate Siphoviridae. Interestingly, in the transition zone from nonstructural to structural SA2 genes, a XerC-like recombinase gene which might have mediated this remarkable modular exchange reaction was located.

FIG. 6. Alignment of the gene maps from bacteriophages belonging to a postulated lambda-supergroup of Siphoviridae. The viruses represent prophages from Archaea (Methanobacterium virus psiM2), γ-proteobacteria (E. coli phages HK97 and lambda), low-G+C gram-positive bacteria (S. thermophilus phages Sfi21 and Sfi11; L. lactis phages TP901-1 and sk1) and high-G+C gram-positive bacteria (Streptomyces phage phiC31; Mycobacterium phages L5 and TM4). Virulent phages are underlined. To better visualize the similarity between the structural gene clusters from this diverse group of phages, the phage genome maps are aligned starting with the terminase genes at the left. Structural genes are identified by a color code, as indicated at the top. Selected ORFs are numbered to facilitate the orientation with the GenBank entry.
Streptococcus pneumoniae
S. pneumoniae is the major cause of acute bacterial pneumonia and otitis media. At the same time, it is also a transient commensal colonizing the throat and upper respiratory tract of 40% of humans. The great majority of clinical isolates carry prophages identified by hybridization experiments with phage lysin-specific probes or induction experiments (173, 181). Two types of prophages were induced: Siphoviridae, with a protein covalently linked to the 5' end of the 40-kb DNA genome represented by prophages HB-3 (177) and MM1 (78), and Myoviridae, with a 42-kb genome (55). Pioneering work was performed with the lysins of S. pneumoniae phages and defined a two-domain structure consisting of a lytic domain and a binding domain to the choline part of the pneumococcal cell wall (74, 183). Prophage MM1 from the multiple-antibiotic-resistant epidemic S. pneumoniae strain Spain 23F is currently the only completely sequenced temperate phage of this species (accession number AJ302074). Its head and tail genes are closely related to those of S. pyogenes prophage 370.1 (Fig. 4B), while the tail fiber genes resemble S. agalactiae prophage SA1. MM1 is one of the few prophages from LAB lacking genes between the lysin gene and the attR site. A candidate lysogenic conversion region consists of two overlapping ORFs in the replication module of MM1, which encode the two subunits of a cytosine methyltransferase of a mobile DNA element. However, this DNA element is not specific to MM1, since similar genes were also found in S. agalactiae prophage SA2.
Streptococcus thermophilus
S. thermophilus is naturally found in raw milk and represents a major starter bacterium in the dairy industry. Lysogeny is not widespread in this species (29). According to the mode of DNA packaging, two groups of temperate S. thermophilus phages have been characterized (122), represented by the pac-site Siphovirus O1205 (192) (Fig. 4A) and the cos-site Siphovirus Sfi21 (129) (Fig. 5A). Temperate and virulent S. thermophilus phages showed a peculiarly close genetic relationship. Virulent phages which are the predominant ecological isolates from both the factory and raw milk (31) are essentially the result of deletion, gene replacement and rearrangement events in the lysogeny module of temperate phages (127). In silico analysis demonstrated that prophages related to the two basic types of S. thermophilus prophages are found in many low-G+C gram-positive bacteria (Fig. 4A and 5A). Comparative genomics revealed distant relationships to lambdoid phages from gram-negative bacteria and even prophages from Archaea (51) (Fig. 6). In fact, over the structural gene cluster, the Sfi21-like phages shared a gene map with E. coli phage HK97. Sfi11 phage even showed protein sequence similarity to phage lambda, suggesting distant phylogenetic relationships between these phages (28). No protein sequence similarity linked HK97 and Sfi21 prophages. However, some features characteristic of this group of phages were identified in both: the major head protein was proteolytically cleaved at amino acids 104 and 105, respectively, releasing an N-terminal protein fragment with strong coiled-coil structure (51). In both prophages, a protease gene precedes the major head gene. In phage Sfi21, this protease belongs to the ClpP protein family, as in many other Sfi21-like phages from LAB and even prophages from γ-proteobacteria (54). This specific gene constellation is a diagnostic criterion for Sfi21-like phages. Sfi11-like phages showed a distinct head gene constellation consisting of three phage head genes and one scaffold gene.
Sfi21 belongs to the few temperate phages from LAB for which a transcription map was established both in the lytic mode of infection (210) and in the prophage state (209) (Fig. 5A). In the lytic mode, essentially the entire Sfi21 genome was transcribed, allowing a distinction of early (transcription regulation module), middle (DNA replication), and late (structural and lysis genes) transcripts. In the lysogenic state, only two Sfi21 genome regions were transcribed from the otherwise transcriptionally silent prophage (209). One transcript comprised the DNA segment from the cI-like repressor (34) to the superinfection exclusion (sie) genes located directly upstream of the phage integrase (32). The cloned cI repressor protected a cell against superinfection with temperate phages (34), while the cloned sie gene conferred protection against many virulent phages (32). Another transcript covered four genes located between the lysin gene and the attR site (209). These genes lacked database matches, preventing speculations about their possible functions. S. thermophilus prophage O1205 carries a different set of genes near attR, and they also belong to the few genes transcribed from the prophage (209) (Fig. 4A). A lysogenic conversion phenotype was observed for a S. thermophilus strain lysogenic with the prophage TP-J34: it showed distinct growth properties (planktonic versus aggregated growth) when lysogenic or when prophage cured (H. Neve, personal communication). TP-J34 displayed a distinct set of genes between the lysin gene and the attP site (158). A database search revealed that many temperate phages from low-G+C gram-positive bacteria showed extra genes between the phage lysin and attR (209). With the exception of a Bacillus halodurans prophage (see below), these prophage genes from free-living bacteria showed no informative database match, precluding any speculation with respect to their function (209). In accordance with theoretical predictions, a prophage remnant consisting of the phage integrase and a few transcribed phage genes was described for S. thermophilus (209).
Lactococcus lactis
L. lactis is the closest phylogenetic relative of the genus Streptococcus and is the major starter used in the cheese industry. Due to the economic impact of phage infections, lactococci and their phages became a focus of research in dairy microbiology. The completely sequenced L. lactis strain IL1403 contained six prophage elements (42). Two inducible and one noninducible prophages showed the genome organization of cos-site temperate Siphoviridae closely related to S. thermophilus phage Sfi21. Three 15-kb prophage remnants had maintained only lysogeny genes (integrase and repressor) and, in variable amounts, DNA replication and a few structural genes (42). In contrast to our interpretation, these authors viewed them as P4-like satellite phages.
The Sfi21-like lactococcal prophages are represented by prophage BK5-T (Fig. 5A). Over the structural gene cluster, BK5-T showed an interesting gradient of sequence similarity covering high and low DNA identity to prophages bIL286 (Lactococcus) and Sfi21 (Streptococcus) or moderate or low protein sequence identity to phages adh (Lactobacillus) and PVL (Staphylococcus). Since this gradient of prophage relatedness reflects the phylogenetic relationship of their host bacteria, a coevolution of prophages with their bacterial hosts was initially discussed (52). However, further analysis also demonstrated substantial sequence diversification within prophages from a single bacterial species (L. lactis), including DNA sequence (bIL286), protein sequence (bIL309), or only genome organization similarity to BK5-T (bIL285), creating a problem for phage taxonomy and models of phage evolution (171). Transcription in the BK5-T prophage was limited to two regions: three transcripts covered the phage integrase, the sie homologue and the cI repressor, and another transcript was derived from an anonymous large gene located between the phage lysin and the attR site (21) (Fig. 5A).
The best-characterized Sfi11-like prophage in L. lactis is prophage TP901-1 (24, 25) (Fig. 6). The structural proteins from TP901-1 have been characterized by protein sequencing (100), immunoelectron microscopy (100), and mutational analysis (165). The results confirmed that the prediction of gene functions by comparative genomics, and specifically the alignment of the structural gene map with phage lambda, is quite reliable (28). For example, as in lambda, the length of the tail structure is determined in TP901-1 by the length of the tail tape measure protein (165). Also, the prediction of a transcriptional regulation module between the DNA replication module and the structural gene module in prophages from LAB was confirmed experimentally (24). In fact, many of the in silico predictions of gene assignments in phages from LAB were confirmed by experiments with one or the other phage from LAB, instilling some confidence in the power of comparative phage genomics. Indeed, in some cases the experiments were actually guided by comparative genomics. The genome analysis of S. thermophilus phages differing in host range provided keys to the location of the phage antireceptor on the genome map and suggested a mechanism of diversification by the exchange of highly variable gene segments flanked by conserved gene segments encoding collagen-like peptides (127, 200). The model was subsequently confirmed by the construction of chimeric phages with S. thermophilus phage DT1 (61). However, two genes occupied different genome positions in dairy phages from their positions in phage lambda. First, in prophages from low-G+C gram-positive bacteria, the lysis cassette is invariably located downstream of the tail fiber genes, in contrast to lambda, where it is found upstream of the DNA packaging genes. Second, the excisionase from TP901-1 was identified (23) within the early genes downstream of the cro-like repressor gene (131). In lambda, the xis gene is found directly upstream of the phage integrase gene. This position is occupied in lactococcal and streptococcal prophages by the sie gene (140).
The third class of lactococcal prophages is represented by phage r1t (206). Not only did its genetic switch region show a comparable structure to that of phage lambda (155), but also molecular modeling of its repressor on the basis of the lambda cI-repressor allowed the design of a thermolabile repressor mutant as a genetic tool (154). The functions of two predicted DNA replication genes were confirmed by biochemical experiments, including a replisome organizer (235) and a RusA protein. The latter is an endonuclease that resolves Holliday junction intermediates formed during DNA replication, recombination, and repair (182). Interestingly, the RusA protein of E. coli is also encoded by the defective prophage DLP12 and rusA-like sequences are associated with prophage sequences in several bacteria. Since phages from LAB dedicate more genes from their genome to DNA replication functions than the similarly organized phage lambda does, one might ask to what extent some of these genes are of potential use to the bacterial host. Such dual functions could also explain why prophage remnants in LAB demonstrated a trend to maintain genes from the lysogeny and DNA replication modules. As in S. thermophilus phages, closely related virulent derivatives of temperate r1t-like phages were described. Their lysogeny module consisted only of the genetic switch structure, while the phage integrase has been eliminated (133).
Many lactococcal phages, including r1t, contain introns at various genome positions, demonstrating that selfish DNA elements such as prophages can also become the target for parasitic DNA elements. Intron homing is the process by which introns spread through a population of intronless alleles and is initiated by intron-encoded endonucleases. In dairy phages, these endonucleases are found relatively frequently (47, 71). The process of intron homing can be very efficient: an ecological survey in S. thermophilus phages revealed that all phage lysin genes possessing a 14-bp consensus sequence contained an intron. As with the prophage DNA, one would expect a selection pressure to remove the intron or to prevent its further spread. Indeed, large deletions within the homing endonuclease were detected in S. thermophilus phages (71).
Lactobacillus
The use of Lactobacillus, another LAB, in various food fermentation processes and as probiotics (health-promoting bacteria) has motivated research into their phages and prophages. L. delbrueckii prophage mv4, for which closely related virulent phages were also described (142, 208), became the focus for research into the site specificity of the integration system (4, 5). Lactobacillus casei phage A2, in comparison, is the best-characterized LAB phage with respect to its DNA-packaging mechanism (75). Also, the genetic switch structure of A2 was studied in more detail than in any other phage from LAB: three operators located between divergently transcribed repressor genes were bound with different affinities by the two repressors, resulting in a repositioning of the RNA polymerase (76, 110, 111).
When corresponding genome segments were studied in different phages from LAB, substantial biological variability was frequently observed. The genetic switch region can serve as an example. Lactobacillus casei phage A2 still follows the phage lambda paradigm relatively closely. Substantial deviations from this theme were found in other phages from LAB; e.g., Lactobacillus plantarum phage phig1e showed seven 15-bp operators with dyad symmetry in this region, which were bound differentially by the repressors encoded by the flanking genes (102); in the Lactococcus phage BK5-T, the divergently transcribed repressor genes are separated by one ORF which is normally found further downstream of the early lytic transcript (134); and in the lactococcal phage TP901-1 and the streptococcal phage Sfi21, the lytic (Cro) repressor lacked binding activity for the DNA of the genetic switch region and inhibited the lysogeny (cI) repressor binding to the genetic switch region possibly by protein-protein interaction between the two repressors (34, 132).
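Operators with dyad symmetry, such as the 15-bp sites described for phage phig1e, can be searched for computationally. The following is a minimal sketch and not the procedure of the cited work: the function names, the mismatch threshold, and the example site are hypothetical. For an odd-length site, the left arm is compared with the reverse complement of the right arm, and the central base is ignored.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def dyad_mismatches(site: str) -> int:
    """Count left-arm positions that do not pair with the right arm.
    For odd lengths the central base is not compared."""
    site = site.upper()
    half = len(site) // 2
    left = site[:half]
    right_rc = reverse_complement(site[len(site) - half:])
    return sum(a != b for a, b in zip(left, right_rc))


def find_dyad_sites(seq: str, length: int = 15, max_mismatches: int = 1):
    """Return (position, site) pairs for windows of the given length
    whose arms are near-perfect inverted repeats."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - length + 1):
        site = seq[i:i + length]
        if set(site) <= set("ACGT") and dyad_mismatches(site) <= max_mismatches:
            hits.append((i, site))
    return hits


# Hypothetical 15-bp site with perfect dyad symmetry around a central C.
example_site = "TTGACAACTTGTCAA"
hits = find_dyad_sites("GGGGG" + example_site + "GGGGG", max_mismatches=0)
```

A scan of this kind only proposes candidate operators; whether a repressor actually binds them has to be shown experimentally, as in the studies cited above.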
The holin-lysin system provides another example. The similarity with the phage lambda holin S and lysin R gene constellation was demonstrated in experiments where Lactobacillus phage holin (162) and lysin (19) could complement lambda prophages containing mutations in both genes. However, S. thermophilus phages showed two holin genes with distinct biological properties, suggesting a holin-antiholin system in the control of the lytic process (184). Apparently, there are many different solutions to a given problem for phages with a common overall genome organization. This is not a peculiar situation in phages from LAB; similar observations were made with lambdoid phages (218).
Most sequenced Lactobacillus species contain prophage sequences. The 2-Mb chromosome of the gut commensal Lactobacillus johnsonii NCC 533 (Nestlé) contained two prophages showing the genome organization of Sfi11-like pac-site Siphoviridae (54). The lysogeny module of these prophages contained more genes than are commonly found in temperate phages of LAB (128). In one prophage, two of these extra genes showed links to a genomic island from S. aureus. Northern blot analysis revealed that these genes are transcribed in the lysogen. Microarray analysis demonstrated that the two prophages Lj928 and Lj965 represented quantitatively the majority of the strain-specific DNA of the sequenced L. johnsonii strain. Another L. johnsonii prophage, Lj771, had extensive DNA sequence identity to a prophage in the sequenced L. gasseri strain (Joint Genome Institute). Differences over the late genes were limited to few genome regions (lysin and anti-receptor) but were more extensive over the early genes.
The sequenced L. plantarum strain WCFS1 (106) contained two closely related Sfi11-like prophages that had a nearly identical structural gene cluster. One prophage contained a disruptive mutation in the terminase gene. Candidate lysogenic conversion genes were identified by database searches and transcription analysis near both the attL and attR sites. The extra genes shared similarity to a mitogenic factor encoded by an S. pyogenes prophage. This observation is notable since the sequenced L. plantarum strain was isolated from the oral cavity of a human, which is also the habitat of S. pyogenes. A prophage remnant consisted of truncated lysogeny, DNA replication, and a few structural genes typical for an Sfi21-like phage. It abutted directly one of the Sfi11-like prophages.
Listeria
Although most if not all Listeria strains carry functional or cryptic prophages, the potential influence of lysogeny on the host phenotype is unknown. Only one Listeria prophage has been investigated in some molecular detail: A118 belongs to Sfi11-like Siphoviridae, but lacks a pac-site (125). The prophage integrates into comK, a putative transcriptional activator for various factors involved in competence for DNA uptake. However, Listeria is not easily transformable, and so a negative lysogenic conversion phenotype is not immediately obvious. A closely related prophage, EGDe, was identified in the sequenced Listeria monocytogenes strain (79). Over the structural gene cluster, differences from A118 were limited to the major head gene. In view of the intricate protein-protein interactions which occur during phage morphogenesis, it is surprising that a single protein can be exchanged without upsetting the other phage proteins participating in the head-building process. More substantial differences were detected over the nonstructural genes including the lysogeny module, which might explain why A118 can be propagated on a strain containing the EGDe prophage. From the Listeria strain ScottA, isolated during a large listeriosis epidemic in the United States, an Sfi21-like prophage, PSA, was induced and sequenced (accession number AJ312240). Like all sequenced Listeria prophages, PSA contained a cluster of genes without database matches near the attR site. Parts of these genes were shared between different Listeria prophages.
Listeria is ubiquitous in nature; it can be found in soil and the gut, and it represents an opportunistic pathogen in animals and to a lesser extent in humans. L. monocytogenes, the etiological agent of listeriosis, a severe food-borne disease, and the nonpathogenic species L. innocua shared a closely related genome and an unexpected synteny with B. subtilis and S. aureus (79). Remarkably, all major gaps in the alignment of the two bacterial genomes were represented by the prophages integrated into L. innocua. Except for prophage genes, less than 10 and 5% of the genes were L. monocytogenes and L. innocua specific, respectively. L. innocua contains five prophages; only A118-like prophage 1 is shared with L. monocytogenes, but the two prophages are integrated into two different chromosome locations. Over the structural genes, prophages 2, 3, and 5 resembled B. subtilis prophage PBSX, Xylella prophage XfP3, and Lactococcus prophage bIL285 (171), respectively. The closest relative of prophage 4 was the L. monocytogenes prophage EGDe, with which it had low to moderate sequence similarity in a patchwise fashion.
Staphylococcus aureus
Staphylococcal enterotoxins cause an acute food-poisoning syndrome that is the second most frequent food-borne disease in the United States. Like botulism, the illness results from ingesting preformed bacterial toxins. The gene for enterotoxin A is carried by several staphylococcal prophages near their attachment sites (14). In addition, S. aureus causes a range of diseases from skin infections to life-threatening conditions such as sepsis. The organism produces many toxins and is highly efficient at overcoming antibiotics. A number of prophages have been found in clinical isolates. Their sequencing revealed the carriage of several toxin genes. Prophage PVL, a typical Sfi21-like siphovirus (Fig. 5A), encoded the clinically important bicomponent cytotoxin leukocidin S and F between the phage lysin and the attR site (103). Leukocidin is an established staphylococcal virulence factor, which causes leukocytolysis and tissue necrosis. The same toxin was found on prophage SLT, showing a distinct morphology (an elongated instead of icosahedral head as in PVL), suggesting horizontal transmission of toxin genes between temperate phages (153) (Fig. 7). Despite its distinct head morphology, SLT also showed the genome organization of an Sfi21-like siphovirus, with the characteristic gene constellation portal protein-ClpP protease-major head gene (identical to the prophage phi12 head protein, see below). The noninducible prophage PV83 shared with PVL the entire structural gene cluster (>86% amino acid identity) but showed a variant leukocidin (LukM/ LukF pore-forming complex) next to a distinct lysis cassette. The defective nature of this prophage might be linked to the incorporation of a transposase-containing insertion sequence into a head-to-tail joining gene of PV83. A second insertion element is found near the attR site of PV83 (234).

FIG. 7. Alignment of the five major types of S. aureus prophages. phiMu50A and phiMu50B are prophages extracted from the sequenced Mu50 S. aureus strain, while phages ETA, SLT, and PVL were induced from unsequenced S. aureus strains. The phage genes are color coded as in Fig. 3. Sequence-related genes between the prophages are linked by shading. Sequence matches of structural genes to those in other than staphylococcal phages are indicated (Bacillus, 105, SPP1, SPBc; Listeria, PSA, A118; Lactobacillus, adh, g1e; Lactococcus, TP901-1; Streptococcus, Sfi21, Sfi11, 1205, MM1 Spy; E. coli, N15; Pseudomonas, D3). Selected ORFs are numbered to facilitate the orientation with the GenBank entry.
Exfoliative toxin is one of the extracellular staphylococcal proteins causing blistering skin disease. The exfoliative toxin A is encoded downstream of the lysin gene in prophage ETA (225) (Fig. 7). This prophage showed the genome structure of an Sfi11-like siphovirus, with many protein sequence links to the phages described in the preceding sections. Comparison with prophage SLT identified a possibly inserted group of genes between the tail fiber and lysis genes (Fig. 7). These genes encode a cell hydrolase and a protein related to a collagen-like surface protein, a virulence factor in S. pyogenes, thus representing further candidate lysogenic conversion genes.
The genome sequences from the methicillin-resistant strain N315 and the vancomycin-resistant strain Mu50, isolated 15 years apart from Japanese patients, were closely related (99% at the nucleotide level); most of the differences were due to the insertion of Mu50-specific DNA elements (109). Both strains had related prophages phiN315 and phiMu50A integrated at the same chromosomal locus (beta-hemolysin) next to the pathogenicity island SaPIn1 (Fig. 1B). The two prophages belonged to the Sfi21-like Siphoviridae and had sequence similarity to prophage PVL over large parts of the genome (Fig. 7). However, the DNA-packaging, head, and tail genes belonged to different modules, showing some sequence similarity to Listeria prophage PSA. Differences between phiN315 and phiMu50A included several gene replacements in the lysogeny module (e.g., a sugar transferase), a larger replacement in the putative transcription regulation module, and a single-gene indel near the attR site, providing an additional truncated lysin gene in phiN315. Both prophages contained several candidate lysogenic conversion genes (encoding enterotoxin P, staphylokinase, and the M-like protein fragment) near but not in the direct vicinity of the attR site. The vancomycin-resistant strain contains an additional prophage phiMu50B that shares with prophage ETA the integrase and integration site and sequence similarity over part of the early, tail fiber, and lysis genes (Fig. 7). PhiMu50B is a close relative of prophage phi11 from S. aureus strain 8325, with which it could be aligned at the DNA level over nearly the entire genome length including the head gene cluster, defining a second allele of structural genes in Sfi11-like S. aureus prophages (Fig. 8). PhiMu50B contained candidate lysogenic conversion genes in the vicinity of the genetic switch region (two genes with links to the pathogenicity island SaPIn1) and genes downstream of the phage lysin (one showed a match with an S. pyogenes prophage gene located next to a superantigen or toxin) (Fig. 7).

FIG. 8. Dot plot matrix for the currently available 12 S. aureus prophages. The prophages are identified by their names on the x and y axes. According to the structural genes, the prophages were classified into five distinct groups (red triangles) and annotated on the left ordinate. Two groups of phages sharing relatively highly conserved early genes are marked by blue and green circles.
In contrast to N315, which was isolated from a hospital infection, MW2 is a community-acquired methicillin-resistant S. aureus strain which is otherwise susceptible to many antibiotic classes. This strain had 95% identity to N315 and Mu50 at the nucleotide level. MW2 contains two prophages: phiSa3 and phiSa2. The first is found at a position occupied by prophages in four of the five currently sequenced S. aureus strains (Fig. 1B). Comparison of the corresponding prophage maps revealed patchwise relatedness. This mosaic structure was interpreted as evidence for multiple crossovers between the phages (7). PhiSa3 encodes two new enterotoxins in the vicinity of attL (enterotoxin G and K homologues, nearly identical to the corresponding genes in the SaPIn3 pathogenicity island [226]) and the sea toxin gene located between the tail fiber and lysis module. PhiSa3 differed from the prophage PVL essentially only in the associated virulence factors (Fig. 8). PhiSa2 has DNA sequence similarity to prophage phi12 essentially over the entire genome (Fig. 8). Differences included a few indels and some gene replacements. Notable was the possession of the lukF and lukS genes between the lysin gene and attR in phiSa2, where phi12 lacked ORFs.
Strain 8325 was used for the construction of the first physical maps of S. aureus. It harbors three prophages, phi11, phi12, and phi13 (95) (Fig. 1B). phi11 and phi13 have been studied in some detail. phi11 DNA is 5% terminally redundant and 40% circularly permutated (126). phi11 is one of the few prophages from low-G+C gram-positive bacteria that showed the attP-int-xis gene constellation familiar from phage lambda (227, 228). This is, however, not the common situation even in staphylococcal phages (38). phi13 was the first staphylococcal phage associated with positive (staphylokinase) and negative (beta-toxin) phage conversion (222). The negative phage conversion occurred because phi13 integrated into the beta-toxin gene, leading to gene inactivation (46). This is not an isolated case. S. aureus phage L54a integration confers a lipase-negative phenotype due to insertional inactivation of a lipase gene (119). The positive phage conversion is conferred by the staphylokinase gene located between the phi13 lysin gene and attR.
A dot plot matrix of the available S. aureus prophages demonstrated five distinct groups of structural modules (Fig. 8). Three distinct groups of Sfi21-like cos-site Siphoviridae were identified: PVL-PV83-phi13-phiSa3 comprise the first group, the second is represented by SLT-phiSa2-phi12, and the third is provided by phiMu50A-phiN315. In addition, two different groups of Sfi11-like pac-site Siphoviridae were revealed by the dot plot: prophages phiMu50B-phi11 on the one hand and phiETA on the other. With respect to the early genes, the distinction of DNA homology groups was less obvious. Two loosely defined groups could be distinguished (Mu50B-PVL-Sa3 versus all the others), but extensive mosaicism prevented a sharper distinction of modules (Fig. 8).
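The principle behind such a dot plot comparison can be illustrated with a minimal sketch. It is not the procedure used for Fig. 8; the function names, the word length, and the two toy sequences are hypothetical and serve only to show how shared modules appear as diagonal runs of dots while unrelated regions stay empty.

```python
# Minimal word-match dot plot (illustrative sketch, not the method behind Fig. 8).
# A dot at (i, j) means the word starting at position i of seq_a is identical
# to the word starting at position j of seq_b.

def dot_plot(seq_a, seq_b, word=4):
    """Return the set of (i, j) start positions of identical words."""
    # Index every word of seq_b by its start positions.
    index = {}
    for j in range(len(seq_b) - word + 1):
        index.setdefault(seq_b[j:j + word], []).append(j)
    # Look up every word of seq_a in that index.
    hits = set()
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], ()):
            hits.add((i, j))
    return hits


def render(hits, len_a, len_b, word=4):
    """Draw the matrix as text: one row per seq_a position, '*' for a hit."""
    rows = []
    for i in range(len_a - word + 1):
        rows.append("".join("*" if (i, j) in hits else "."
                            for j in range(len_b - word + 1)))
    return "\n".join(rows)


if __name__ == "__main__":
    # Hypothetical toy sequences sharing one short "module" (ACGTACG).
    a = "ACGTACGTTT"
    b = "GGACGTACGG"
    hits = dot_plot(a, b, word=4)
    print(render(hits, len(a), len(b), word=4))
```

In a real comparison the sequences would be whole prophage genomes and the word length would be much larger; a mosaic relationship then shows up as several diagonal segments interrupted by gaps.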
In contrast to the prophage-containing S. aureus strains, the sequenced Staphylococcus epidermidis strain ATCC 12228 (accession number AE015929) lacked prophage sequences.
Bacillus
Comprehensive research was conducted on two virulent phages of the soil bacterium Bacillus subtilis: phi29, a podovirus, and SPP1, which despite its virulent life-style is a typical Sfi11-like siphovirus (54) and by far the best-characterized phage of this proposed phage genus (see references 6 and 130 for recent examples and references therein). Much less is known about temperate B. subtilis phages, which have been classified into five groups (231). Only three groups are represented by sequenced prophages. The group I phage phi105 shows the genome organization of a typical Sfi21-like siphovirus (52) and was investigated mainly for repressor binding to the genetic switch region (205). The group III phage SPBc2 represents a 134-kb siphovirus consisting of 187 predicted ORFs, 70% of which lacked matches to the database (118). A mere 14 ORFs shared links with other phages. In contrast, about 30 ORFs had links with bacterial genes, mostly from B. subtilis. According to the orientation of the ORFs, three clusters could be distinguished. Cluster I contains the integrase/recombinase. Cluster II starts with the lysis cassette and continues with the structural gene module. The tail fiber genes had up to 50% amino acid sequence similarity to proteins from defective B. subtilis prophages. Cluster III contained genes involved in transcriptional regulation, DNA replication, and nucleotide metabolism.
Group V prophages are represented in the sequenced B. subtilis strain by the defective prophages PBSX and skin. Upon UV or mitomycin C induction, the cell releases phage-like PBSX particles consisting of small heads and large, complex tails that adsorb to and kill related bacilli, thus acting like bacteriocins. The head contains randomly selected 13-kb fragments of the bacterial chromosome. In that respect, PBSX resembles a small bacteriophage-like particle discovered in the purple nonsulfur bacterium Rhodobacter capsulatus, which transfers random 4.5-kb segments of the genome of the producing cell to recipient cells, where allelic replacements occur. This particle was called a gene transfer agent; it mediates a genetic exchange process controlled by the bacterial cell (113). However, the DNA packaged into the PBSX head is not injected into the cell (223). The widespread occurrence of the PBSX-like defective phages throughout the Bacillus species and the failure to isolate strains cured of PBSX nevertheless suggested that their continued maintenance is advantageous, if not essential, for the host strain (223). The 28-kb PBSX prophage remnant consists of a shortened lysogeny and DNA replication module and a structural gene cluster whose organization resembles that of the Sfi11-like pac-site Siphoviridae (Fig. 9). In comparison with the standard genome map of Sfi11-like phages, PBSX lacks a large head protein gene normally located between the portal gene and the scaffold gene. In addition, there are fewer head-to-tail joining genes than usual, possibly explaining the small head morphology. The siphovirus-like tail fiber genes are followed by sequence links to putative tail genes from the myovirus prophage SPBc2 and end in a lysis cassette consisting of a holin and an amidase-type lysin gene (Fig. 9).
During sporulation, the ca. 50-kb skin prophage element is excised from the B. subtilis chromosome by a DNA rearrangement event (197). The prophage remnant contains a seemingly complete set of structural genes characteristic of Sfi11-like pac-site Siphoviridae. Over these genes, it had sequence similarity to many structural genes from the PBSX prophage and Listeria innocua prophage 2 (79). The structural gene cluster is preceded by a DNA replication module. The lysogeny region is reduced to the genetic switch structure, while an integrase was not detected. An integrase was found downstream of the prophage lysis cassette, separated by a group of bacterial genes including an arsenic resistance operon. In contrast to PBSX, the skin element contains no genes essential for B. subtilis viability.

FIG. 9. Clostridium prophages. The genome of C. tetani prophage 3 (center) is aligned with B. subtilis prophage PBSX (top) and C. perfringens prophage phi3626 (bottom). Genes related at the protein sequence level are connected by shading. Putative gene functions are indicated, and the modular structure of the prophage genomes is indicated by a color code, as in Fig. 3. Two likely deletions in the head gene cluster of PBSX are marked by a delta symbol.
The sequencing of B. halodurans, an industrial source of enzymes used at alkaline pH, revealed 112 ORFs encoding transposases or recombinases, suggesting an important role of these enzymes in horizontal gene transfer in this species; however, no prophage was reported (196). A reanalysis of the sequence revealed a complete prophage showing the typical genome organization of an Sfi11-like siphovirus, with sequence matches over the head and tail genes to S. pyogenes prophage 315.5. As in a number of other prophages, an isolated adenine methyltransferase gene was detected between the DNA replication module and the DNA-packaging module. More interesting, however, was the presence of a type II restriction endonuclease and an associated cytosine-specific methyltransferase located between the phage lysin and the attR site. Possession of the prophage thus confers a potentially new restriction modification system to the lysogenic cell.
Clostridium
The spore-forming clostridia are widely disseminated in soil and lakes but are also found in the intestinal flora. Clostridium botulinum is defined as any clostridial isolate that produces botulinum toxin, which causes an often fatal form of food poisoning. Biological experiments conducted 30 years ago established that lysogenization by some bacteriophages with contractile tails converted nontoxigenic into toxigenic isolates. Curing of the prophage leads to concomitant loss of the toxigenicity (66, 67). However, the first temperate Clostridium phage was sequenced only recently (233). Despite its relatively small genome size of 33.5 kb, the Clostridium perfringens phage phi3626 showed the typical genome organization of Sfi21-like Siphoviridae (Fig. 9). The presence of two genes related to sporulation-dep