
Microbiology and Molecular Biology Reviews, June 2001, p. 261-287, Vol. 65, No. 2
1092-2172/01/$04.00+0   DOI: 10.1128/MMBR.65.2.261-287.2001
Copyright © 2001, American Society for Microbiology. All rights reserved.

phi 29 Family of Phages

Wilfried J. J. Meijer, José A. Horcajadas, and Margarita Salas*

Centro de Biología Molecular "Severo Ochoa" (CSIC-UAM), Universidad Autónoma, Canto Blanco, 28049 Madrid, Spain

SUMMARY
INTRODUCTION
GENERAL FEATURES OF PHAGES phi 29, B103, AND GA-1
SEQUENCE ANALYSIS OF THE GA-1 GENOME
GENETIC AND TRANSCRIPTIONAL ORGANIZATION
TRANSCRIPTIONAL REGULATION
    Early Promoters A2b and A2c and Late Promoter A3: Transcriptional Regulation by Proteins p4 and p6
    Early Promoter C2: Transcriptional Regulation by Protein p6
    Early Promoters C1, C1a, and C1b Present in phi 29, B103, and GA-1, Respectively
    Promoter A1, Driving Synthesis of the pRNA
    Other Promoters in the phi 29 Genome
    Other Promoters in the GA-1 Genome
TRANSCRIPTIONAL TERMINATION
PROTEIN-PRIMED MECHANISM OF DNA REPLICATION
INITIATION OF DNA REPLICATION
    DNA Polymerase-TP Heterodimer Formation
    Sliding-Back Mechanism
    Transition from Protein-Primed to DNA-Primed Replication
THE FOUR MAIN PROTEINS REQUIRED FOR IN VITRO DNA REPLICATION
    DNA Polymerase
        C-terminal domain of phi 29 DNA polymerase.
        N-terminal domain of phi 29 DNA polymerase.
        (i) Proofreading.
        (ii) Strand displacement.
        Coordination between synthesis and degradation.
    Terminal Protein p3
    DBP Protein p6
    SSB Protein p5
OTHER GENES AND OPEN READING FRAMES DOWNSTREAM OF GENE 2 IN phi 29 AND B103
    Gene 1 of phi 29
GA-1 OPERONS CONTAINING OPEN READING FRAMES M-O AND P-T
EARLY OPERON LOCATED AT THE RIGHT SIDE OF THE PHAGE GENOMES
    Gene 17
    Gene 16.7
LATE OPERON
    Gene 8.5, Encoding the Head Fiber Protein
    Structural Phage Proteins and phi 29 Phage Morphogenesis
        Prohead formation.
        DNA translocating/packaging machine.
        (i) Connector.
        (ii) pRNA ring.
        (iii) ATPase protein p16.
        Putative mechanism of phi 29 DNA packaging.
        Phage maturation.
        Lysis cassette.
        (i) Holin-encoding genes of phi 29, B103, and GA-1.
        (ii) Peptidoglycan hydrolase-encoding genes of phi 29, B103, and GA-1.
CONCLUSIONS AND FUTURE PERSPECTIVES
ACKNOWLEDGMENTS
REFERENCES


SUMMARY

Continuous research spanning more than three decades has made the Bacillus bacteriophage phi 29 a paradigm for several molecular mechanisms of general biological processes, such as DNA replication, regulation of transcription, phage morphogenesis, and phage DNA packaging. The genome of bacteriophage phi 29 consists of a linear double-stranded DNA (dsDNA), which has a terminal protein (TP) covalently linked to its 5' ends. Initiation of DNA replication, carried out by a protein-primed mechanism, has been studied in detail and is considered to be a model system for the protein-primed DNA replication that is also used by most other linear genomes with a TP linked to their DNA ends, such as other phages, linear plasmids, and adenoviruses. In addition to continuing progress in unraveling the mechanism of initiation of DNA replication and the roles of the various proteins involved in this process, major advances have been made during the last few years, especially in our understanding of transcription regulation, the head-tail connector protein, and DNA packaging. Recent progress in all these topics is reviewed. In addition to phi 29, the genomes of several other Bacillus phages consist of a linear dsDNA with a TP molecule attached to their 5' ends. These phi 29-like phages can be divided into three groups. The first group includes, in addition to phi 29, phages PZA, phi 15, and BS32. The second group comprises B103, Nf, and M2Y, and the third group contains GA-1 as its sole member. Whereas the DNA sequences of the complete genomes of phi 29 (group I) and B103 (group II) are known, only parts of the genome of GA-1 (group III) had been sequenced previously. We have determined the complete DNA sequence of the GA-1 genome, which allowed an analysis of the differences and homologies between the three groups of phi 29-like phages; this analysis is included in this review.


INTRODUCTION

The genus Bacillus incorporates many species of gram-positive, aerobic, endospore-forming bacteria that normally inhabit the soil or decaying plant material. A large variety of phages that infect bacilli have been isolated from these habitats. All of the phages isolated so far have some common features. First, they all contain double-stranded DNA (dsDNA), and second, the virions have prolate icosahedral heads and are tailed. Modern phage taxonomy is based on properties of the virion and its nucleic acid (see references 74 and 131). The order of tailed phages, named Caudovirales, is classified into three families: Myoviridae (phages with contractile tails), Podoviridae (phages with short tails), and Siphoviridae (phages with long noncontractile tails). For a general review on tailed bacteriophages, see reference 4. In addition to taxonomy based on properties of the virion and its nucleic acid, phages can be divided into three groups based on their infection cycle. The first group contains lytic phages that complete their life cycle within a well-defined period after infection and are unable to lysogenize their host. The second group is formed by the so-called pseudo-temperate phages. These are virulent phages with an extended and irregular latent period. Although this stage mimics lysogeny, it does not involve a stable prophage. The third group contains the temperate phages. The genomes of these phages are able to integrate into the host genome and can be maintained in this lysogenic stage for many generations. Generally, during this stage, the cells are immune to infection with the same phage.

This review specifically focuses on the phi 29-like genus of phages, which includes, in addition to phi 29, phages PZA, phi 15, BS32, B103, M2Y (M2), Nf, and GA-1. They are all lytic phages that belong to the Podoviridae family. Most of these phages infect Bacillus subtilis, but often they also infect other related species such as Bacillus pumilus, Bacillus amyloliquefaciens, and Bacillus licheniformis. Phages of this genus have been subclassified into three groups based on serological properties, DNA physical maps, peptide maps and partial or complete DNA sequences (164, 220, 222). The first group includes phages phi 29, PZA, phi 15, and BS32; the second group includes B103, Nf, and M2Y; and the third group contains GA-1. Interestingly, the classification of these phages coincides with their geographical distribution. Thus, the phages belonging to group I were isolated in the United States (169), those belonging to group II were isolated in Japan (91, 113, 191), and GA-1 (group III) was isolated in Europe (39).

The genomes of the phi 29-like phages consist of a linear dsDNA molecule of about 20 kb that has a phage-encoded protein, named terminal protein (TP), covalently attached at each 5' DNA end. The DNA sequences of the complete genomes of phi 29 and PZA (83, 84, 161, 216, 221, 224) belonging to group I and that of B103 (163) belonging to group II have been determined. However, only parts of the sequence of GA-1, which belongs to group III, had been determined so far (78, 86, 111, 114, 222). To gain a comprehensive understanding of the relatedness of the three groups of phages, we have determined the complete DNA sequence of GA-1. The genomes of phi 29 and B103 are 19,285 and 18,630 bp, respectively (163, 216). However, the GA-1 genome was reported to be approximately 21.5 kb (220, 223). Thus, an additional incentive to determine the complete DNA sequence of GA-1 was to gain insight into possible additional coding sequences present on the GA-1 genome.

Phage phi 29 has been subject to extensive studies, and the results have led to the understanding of several molecular mechanisms of general biological processes, such as DNA replication, regulation of transcription, phage morphogenesis, and phage DNA packaging. These various topics will be discussed in this review, and attention is focused specifically on progress made during the last few years. In general, the views presented are based on results obtained with phi 29, since most studies concerned analysis of this phage. In addition, an integrated overview of homologies and differences between the three groups of the phi 29 genus based on the complete DNA sequences of phi 29 (group I), B103 (group II), and GA-1 (group III) is presented.


GENERAL FEATURES OF PHAGES phi 29, B103, AND GA-1

The phi 29-like phages are the smallest Bacillus phages isolated so far and are among the smallest known phages containing dsDNA. The sizes of the phage particle of each of the three groups of phi 29-like phages are shown in Table 1. Phage phi 29 was first isolated by Reilly (169) from garden soil. Phage B103 was first isolated from a nonspecified lysed Bacillus culture (91), and phage GA-1 was first isolated by Bradley (39) from rotting lawn mowings. Electron microscopy analysis showed that the phage particles of phi 29, B103, and GA-1 have a sixfold radial symmetry and a short noncontractile tail tube. A schematic representation of a phi 29 phage particle in which each protein is indicated is shown in Fig. 1. Analysis of the host range showed that phi 29 was able to infect B. subtilis strains 168, 110NA, and Marburg, B. amyloliquefaciens H, and several strains of B. licheniformis and B. pumilus (reviewed in reference 177). The host range of B103 has not been studied, but it is known to infect B. subtilis 9/3 (163). Finally, GA-1 was shown to infect lytically Bacillus species strain G1R (39). Arwert and Venema (10) showed that GA-1 is unable to infect the standard B. subtilis strain 168. Although sequence analysis of G1R 16S rRNA showed that Bacillus strain G1R is most closely related to B. pumilus, GA-1 is unable to infect B. pumilus strains BP1 or B205-L (J. A. Horcajadas, unpublished results). Therefore, the species identification of Bacillus strain G1R, the specific host of GA-1, remains unclear.

                              
TABLE 1.   Structure of the phi 29-like phages



FIG. 1.   Schematic representation of a phi 29 phage particle. The various proteins are indicated.


SEQUENCE ANALYSIS OF THE GA-1 GENOME

The DNA sequences of the complete genomes of phi 29 (group I) (83, 84, 216, 221, 224) and B103 (group II) (163) are known, and they consist of 19,285 and 18,630 bp, respectively. However, only noncontiguous parts of the GA-1 genome had been sequenced before. These include (i) the left (168 bp; GenBank accession number M19512) and right (168 bp; M19519) terminal nucleotide sequences (222); (ii) the central region containing the early promoters A2b and A2c and the late promoter A3 (354 bp; AJ133524) (111); (iii) gene 6 encoding the dsDNA binding protein (DBP) p6 (342 bp; AF148209) (78); (iv) gene 5 encoding the single-stranded DNA (ssDNA) binding protein (SSB) p5 (513 bp; AJ244026) (86); (v) gene 4 encoding the transcriptional regulatory protein p4G (405 bp; AJ133525) (111); (vi) the region spanning genes 3 and 2 encoding the TP (p3) and DNA polymerase (p2), respectively (2,668 bp; X96987) (114); and (vii) a region downstream of gene 2 (549 bp; AJ294726) (A. Bravo and M. Salas, unpublished data). Genes 6 through 2 lie in the same order as the corresponding genes of phages phi 29 and B103. Where possible, the individual sequences were integrated in larger contigs and gaps were filled using primers based on the published sequences and purified GA-1 DNA as the template. Next, the sequence of the remaining part (~17 kbp) of the GA-1 genome was determined on both strands by a primer-walking strategy using purified phage GA-1 DNA as the template.

The genome of GA-1 was shown to have a total size of 21,129 bp. The complete nucleotide sequence has been deposited in the EMBL/GenBank/DDBJ nucleotide sequence database and was assigned accession number X96987. Whereas the G+C content of GA-1 is 34.7%, those of phi 29 and B103 are 40.0 and 37.7%, respectively. Next, computer-assisted and manual analyses of the DNA sequence were used to identify open reading frames (ORFs), direct and inverted repeats, and putative promoters, ribosomal binding sites, and Rho-independent transcriptional terminators. The deduced amino acid sequences of the various ORFs were compared with protein sequences encoded by the phi 29 and B103 genomes as well as with those present in available databases. In cases where the deduced amino acid sequences of the identified ORFs or genes showed significant homology to those of the phi 29 and B103 genes, they were given numbers according to the nomenclature used for these phages. The remaining ORFs were identified with letters. The data obtained were used to construct a putative genetic and transcriptional map of GA-1, which is shown, together with those of phi 29 and B103, in Fig. 2. This figure shows that genes 2 to 6, 7 to 16 (with the exception of gene 8.5, which is lacking in GA-1), and 17 and 16.7 are conserved in all three genomes. Characteristics of the proteins synthesized by these GA-1 genes and their levels of similarity to corresponding proteins of phi 29 and B103 are given in Table 2, which shows that for all the homologous genes shared by phi 29, B103, and GA-1, those of GA-1 are less conserved than those of phi 29 and B103. This confirms that within the family of phi 29-related phages, GA-1 is the most distantly related one, as suggested previously (164, 220, 222). Features of the putative proteins synthesized by the GA-1 ORFs are given in Table 3.
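The G+C percentages quoted above are simple base-composition ratios. A minimal sketch of that calculation follows, assuming the sequence is available as a plain string; the function name `gc_content` and the toy sequence are hypothetical illustrations and are not taken from the original analysis.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a DNA sequence (case-insensitive).

    Only unambiguous A/C/G/T bases are counted in the denominator.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Hypothetical toy sequence, not a phage sequence.
toy = "ATGCATTTAA"
print(gc_content(toy))
```

Applied to a complete genome sequence, the same ratio would give the kind of genome-wide value reported in the text.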


FIG. 2.   Genetic and transcriptional maps of GA-1 (group III, 21,129 bp), phi 29 (group I, 19,285 bp), and B103 (group II, 18,630 bp). The maps are aligned according to the A2b, A2c, A3 promoter region. The direction of transcription and lengths of the transcripts are indicated by wavy arrows. The transcripts of late- and early-expressed operons and the late and early promoters (boxed) are shown below and above the map, respectively. The positions of the various genes and ORFs are indicated between the two DNA strands. Genes are indicated by numbers, ORFs are indicated by letters (capital for GA-1 and phi 29, and lowercase for B103). The positions of genes 17 and 16.7 that are conserved in all three phage genomes, located in the right early operon, are indicated. The positions of phi 29 ORFs 16.9, 16.8, 16.6, and 16.5 and B103 ORF 16.5, located at the right side of their genomes, are indicated by the numbers .9, .8, .6, and .5. Transcriptional terminators are indicated by hairpin structures. A light grey box indicates the DNA region encoding the pRNA, and a black box indicates the region spanning the early A2b, A2c, and late A3 promoters.

                              
TABLE 2.   Characteristics of GA-1 genes


                              
TABLE 3.   Features of GA-1 ORFs


GENETIC AND TRANSCRIPTIONAL ORGANIZATION

Generally, genes with related functions are clustered in phage genomes (4), and Fig. 2 shows that phi 29, B103, and GA-1 are no exception to this rule. In addition, Fig. 2 shows that in most aspects, the genomes of phi 29, B103, and GA-1 are similarly organized. In all three genomes the genes and ORFs are organized in operons. Depending on the time when they are first expressed during the infection cycle, these can be divided into early and late operons. In all three genomes the early-expressed operons are transcribed leftward and the single late-expressed operon is transcribed rightward. The genes present in the late operon (genes 7 through 16), which is located in the central part of the genome, encode phage structural proteins, proteins involved in phage morphogenesis, and proteins required for lysis of the host. All three genomes contain an early-expressed operon that is divergently transcribed with respect to the late operon (Fig. 2). Genes 6, 5, 3, and 2 of this operon encode the four main proteins required for phage DNA replication. The operon also contains gene 4, which encodes the transcriptional regulator protein. In addition to its role in phage DNA replication, protein p6 also has a role in transcriptional regulation (14, 69, 219). Note that this operon of GA-1 is smaller than the corresponding ones of phi 29 and B103. Another early-expressed operon is located at the right side of the phage genomes. However, as described in more detail later, only two genes of this operon, 17 and 16.7, are conserved in all three phage genomes. Finally, another feature shared by all three phages is the presence of a region located in the left part of the genome that encodes an RNA (pRNA) which is required for packaging of phage DNA.

The genome of GA-1 is about 1.8 and 2.5 kb larger than those of phi 29 and B103, respectively. Although the structural organization of GA-1 genome is similar to that of phi 29 and B103, it contains additional sequences, located at both genome ends, that may encode several proteins, counterparts of which are not present in the genomes of phi 29 and B103 (see Fig. 2).
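The size comparison above follows directly from the genome lengths reported in this review (19,285, 18,630, and 21,129 bp). A short worked check, with the dictionary and function names chosen here purely for illustration:

```python
# Genome lengths in base pairs, as stated in the text.
GENOME_BP = {"phi29": 19_285, "B103": 18_630, "GA-1": 21_129}


def size_difference_kb(larger: str, smaller: str) -> float:
    """Difference in genome length between two phages, in kilobases."""
    return (GENOME_BP[larger] - GENOME_BP[smaller]) / 1000


for other in ("phi29", "B103"):
    print(other, round(size_difference_kb("GA-1", other), 1))
```

The differences are 1,844 and 2,499 bp, which round to the 1.8 and 2.5 kb quoted above.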


TRANSCRIPTIONAL REGULATION

The (putative) promoters and, for those cases in which they have been determined, the transcriptional start sites are listed in Table 4. When appropriate, the nomenclature of the GA-1 promoters was adapted to that of phi 29 and B103. Expression of most phi 29 and GA-1 promoters has been studied. As indicated in Table 4, most of the promoters contain the sequence TG positioned 1 bp upstream of the -10 sequence. This additional sequence is characteristic of the so-called -10 extended promoters first described for Escherichia coli promoters (123, 165). At least in E. coli, the extension of the -10 region is able to compensate for the absence of a good -35 box, helping the sigma 70 RNA polymerase to recognize and bind such promoters (123, 128, 165). The additional TG sequence is also frequently found in sigma A-dependent B. subtilis promoters (106, 152). Possible involvement of the TG motif in promoter strength has been recently studied for the phi 29 promoters A1, A2c, and A3 (46). In all three promoters, mutation of the TG motif impaired the binding of the sigma A-RNA polymerase to the promoter. These and additional results support the view that the TG motif provides contact sites for B. subtilis sigma A RNA polymerase that are important for a specific role in the first steps of transcription (46).

                              
TABLE 4.   Promoters of phi 29, B103, and GA-1

The B. subtilis alpha -amylase promoters amyP and amyP2 contain a TGTG sequence, called the -16 region, located 1 bp upstream of their -10 regions. Mutational analysis of the -16 region of these promoters showed that it significantly affects the in vitro promoter strength (217). In addition, a large portion of known gram-positive bacterial promoters contain the -16 TRTG motif (in which R is a purine), suggesting that not only the -10 extended TG motif but also the -16 region is important for promoter strength (217). The -16 region is present in the following phage promoters: A1 and A2b of phi 29, A1 of B103, and A1c, A3 and C2 of GA-1 (Table 4). Possible involvement of the -16 region in the activity of these phage promoters has not been studied yet.
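The two upstream motifs described above (the TG dinucleotide and the TRTG tetranucleotide, each ending 1 bp upstream of the -10 sequence) can be checked mechanically once the position of the -10 hexamer is known. The following is a minimal sketch under that assumption; the function name `upstream_motifs`, the 0-based coordinate convention, and the toy sequences are hypothetical and are not taken from Table 4.

```python
import re


def upstream_motifs(seq: str, minus10_start: int) -> dict:
    """Report upstream motifs relative to a known -10 hexamer.

    minus10_start is the 0-based index of the first base of the -10
    hexamer (assumed to be supplied by the caller). Both motifs are
    required to end exactly 1 bp upstream of the hexamer.
    """
    seq = seq.upper()
    end = minus10_start - 1  # skip the single spacer base
    dinuc = seq[end - 2:end] if end >= 2 else ""
    tetra = seq[end - 4:end] if end >= 4 else ""
    return {
        "extended_minus10_TG": dinuc == "TG",
        # R is a purine (A or G)
        "minus16_TRTG": re.fullmatch("T[AG]TG", tetra) is not None,
    }


# Hypothetical toy promoter: TGTG, one spacer base, then TATAAT.
toy = "AAATGTGCTATAATGG"
print(upstream_motifs(toy, toy.index("TATAAT")))
```

A promoter with TRTG necessarily also carries the TG dinucleotide, so the second flag is the more restrictive of the two.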

Early Promoters A2b and A2c and Late Promoter A3: Transcriptional Regulation by Proteins p4 and p6

As described above, the structural organization of the centrally located late operon and the divergently oriented early operon is conserved in the genomes of phi 29, B103, and GA-1. In all three phage genomes the promoters that drive the expression of these early and late genes are localized in a short intergenic region between these two operons. The transcriptional regulation of these promoters has been studied extensively for phi 29 (for reviews, see references 171 and 182). Two strong promoters named A2c and A2b drive the expression of the early operon of phi 29 containing genes 6 to 1. The late phi 29 operon is transcribed from a single promoter named A3 (16, 136, 137, 149, 197). The transition from early to late phi 29 transcription is controlled by phi 29 protein p4, the product of the early gene 4. Protein p4, which is a dimer in solution, binds to its cognate DNA binding sites as a tetramer (142), contacting only one side of the DNA helix (172). The intergenic region comprising promoters A2c, A2b, and A3 contains two p4 binding sites. The center of one of these is located at position -82 relative to the transcription start site of the late promoter A3 (15). Whereas this promoter contains a good consensus sequence at the -10 region for the vegetative B. subtilis sigma A RNA polymerase, it lacks a typical -35 box (Table 4). Therefore, the RNA polymerase alone does not bind efficiently to the A3 promoter, which explains why the downstream operon is not expressed during early infection times. Activation of the A3 promoter requires binding of protein p4 to the p4 binding site upstream of the A3 promoter. The main role of protein p4 is to stabilize the binding of RNA polymerase to the A3 promoter as a closed complex, and the protein has little effect on the rest of the steps of the initiation process (157).

The phi 29 promoters A2c and A2b drive the expression of the early operon containing genes 6 to 1. Of these, promoter A2b is the one located closest to the oppositely oriented late promoter A3; promoter A2c is located proximal to gene 6. Both early promoters are repressed by protein p4. The p4 binding site that is located upstream of the late A3 promoter and is required for activation of this promoter, as described above, partially overlaps the early A2b promoter. Binding of protein p4 to this site occupies the -35 region of the A2b promoter, preventing the expression of this promoter. Thus, protein p4 activation of the late promoter A3 is accompanied by an efficient repression of the A2b promoter (172). Expression of the other early promoter, A2c, is also repressed by protein p4, but this occurs through a totally different mechanism. In addition to the p4 binding site upstream of the late promoter A3, another p4 binding site is located upstream of promoter A2c (centered at position -72 relative to the transcription start site of A2c). Protein p4 binding to this site is stabilized in the presence of RNA polymerase, indicating that the proteins bind cooperatively to the DNA. In this situation, the RNA polymerase can generate abortive initiation transcripts but is unable to escape from the A2c promoter (150). Thus, repression of the A2c promoter occurs by overstabilization of the RNA polymerase to this promoter (148). Interestingly, both repression at the A2c promoter and activation of the A3 promoter involve interaction between a region of protein p4 containing Arg120 and the C-terminal domain of the RNA polymerase alpha  subunit (140-143, 150, 151, 171).

Recently it was demonstrated that expression of the phi 29 A2c, A2b, and A3 promoters is regulated by the viral protein p6 in addition to protein p4 (69). Protein p6 is an abundantly early-expressed dsDNA binding protein that was shown previously to play an important role in initiation of phage DNA replication (see below). Elías-Arnanz and Salas (69) showed that protein p6 promotes p4-mediated repression of the A2b promoter and activation of the A3 promoter by enhancing binding of p4 to its recognition site at promoter A3. In addition, protein p4 promotes p6-mediated repression of the A2c promoter by favoring the formation of a stable p6-nucleoprotein complex that interferes with RNA polymerase binding to promoter A2c.

Although transcriptional regulation of the equivalent promoters of B103 has not been studied, conservation of the main characteristics of this region regarding the A3 and A2b promoters suggests that transcription of these promoters may be regulated in a similar way to those of phi 29. Results that at least partially support this assumption come from the analysis of the corresponding region of phage Nf (147, 158), which belongs to the same group of phages as B103. First, it was shown that activation of the late A3 promoter of Nf requires the Nf-encoded protein gpF (homologue of the phi 29 protein p4) (147). Second, Nuez and Salas (158) showed that activation of the Nf A3 promoter is responsive to the phi 29 protein p4 in a similar way to that observed for the phi 29 A3 promoter.

A first in vivo and in vitro analysis of the transcriptional regulation of the equivalent promoters of GA-1 has been reported recently (111). The in vivo activity of the GA-1 A2b and A2c promoters was shown to diminish 10 min after infection, whereas at this time the expression of the late A3 promoter increased significantly. The GA-1-encoded protein p4 (named p4G, 53% similar to phi 29 p4) was purified and used to study its involvement in regulation of these promoters in vitro. As in phi 29, a p4G binding site is located upstream of the late A3 promoter that overlaps with the early A2b promoter. As in phi 29, binding of p4G to this site prevented the binding of RNA polymerase to the GA-1 early A2b promoter. Surprisingly, however, binding of p4G to this site had no effect on the in vitro expression of the late A3 promoter of GA-1. Both in the absence and in the presence of p4G, promoter A3 was expressed efficiently in vitro. Thus, in contrast to the situation in phi 29, p4G is not required in vitro to activate the expression of the GA-1 A3 promoter. Moreover, in contrast to the phi 29 protein p4, the GA-1 protein p4G was shown not to interact with the RNA polymerase alpha  subunit (111). Although the A3 promoter of GA-1 was active in the absence of p4G in in vitro assays, it was not active at early infection times in vivo. In addition, in vivo activation of the A3 promoter was completely blocked when protein synthesis was prevented just before infection. Together, these results suggested that the A3 promoter may be repressed in vivo by a host-encoded protein and that protein p4G may function as an antirepressor, permitting A3 expression at late infection times. Finally, it is intriguing that the GA-1 A3 promoter, which, like the A3 promoters of phi 29 and B103, lacks a good -35 box, is expressed efficiently in vitro. Studies are under way to unravel the mechanisms that underlie the observed differences in regulation of the phi 29 and GA-1 A3 promoters.

At present, it is unknown whether a p4-dependent repression of the A2c promoter, as described for phi 29, also applies for the equivalent A2c promoters of Nf/B103 or GA-1. The fact that a typical p4 binding site is lacking upstream of the A2c promoters of B103 (163), Nf (158), and GA-1 (111) may be an indication that p4 is not involved in the repression of these promoters, at least not in a similar way to that in phi 29. It is also unknown whether protein p6 of B103/Nf and/or GA-1 plays a role in the regulation of the A2c, A2b, and A3 promoters of these phages.

Early Promoter C2: Transcriptional Regulation by Protein p6

All three phage genomes contain an early-expressed operon located at the right end of their genome, whose expression is under the control of the C2 promoter (Fig. 2). For phi 29 it has been demonstrated that the activity of the early promoter C2 decreases rapidly 10 min after infection (110, 122, 149). Protein p6 was shown to be responsible for in vivo and in vitro repression of promoter C2 (14, 219). Thus, the phi 29 p6 protein not only plays a role in the regulation of the A3, A2b, and A2c promoters (see above) but also regulates the expression of the C2 promoter. In addition, as described below, it plays an important role in the initiation of phi 29 DNA replication. Most probably, binding of p6 to the DNA ends prevents the RNA polymerase from recognizing the C2 promoter (A. Camacho and M. Salas, unpublished results). The phi 29 mutant sus6(626) contains a suppressible mutation in gene 6, and therefore protein p6 is not synthesized in nonsuppressor cells infected with this mutant phage. When phi 29 sus6(626) mutant phage was used for infection, phage DNA replication did occur in suppressor cells but not in nonsuppressor cells (219). However, under these conditions the C2 promoter was not repressed in either nonsuppressor or suppressor cells. It appeared that whereas the amount of p6 protein synthesized under permissive conditions was sufficient to permit in vivo phi 29 DNA replication, it was too small to repress the C2 promoter in vivo (47, 219). The observation that a fairly large amount of p6 is required for repression of the C2 promoter in vitro (14, 219) supports this view.

Equivalent C2 promoters are also present in the genomes of B103 and GA-1. Like the C2 promoter of phi 29, the GA-1 C2 promoter is expressed almost exclusively during the first 10 min after infection (Horcajadas, unpublished). In vitro expression of the C2 promoter of GA-1 is inhibited in the presence of purified GA-1-encoded protein p6, as well as, although somewhat less efficiently, by protein p6 of phi 29. DNase I footprint analysis indicated that DNA binding of protein p6 prevents the RNA polymerase from recognizing the C2 promoter of GA-1 (Horcajadas, unpublished results). Thus, due to protein p6-mediated repression, the phi 29 and GA-1 C2 promoters are expressed only during the initial 10 min after infection. Obviously, this repression will limit the amount of proteins encoded by the downstream genes and ORFs.

Early Promoters C1, C1a, and C1b Present in phi 29, B103, and GA-1, Respectively

All three phage genomes contain a promoter within the early operon located at the right side of their genome (Fig. 2 and Table 4). The absence of potential transcriptional terminators upstream of these promoters suggests that the last genes or ORFs of these operons may be expressed from two promoters. In phi 29 this additional promoter was named C1. It is located within gene 16.7 and may drive the expression of ORFs 16.6 and 16.5. In B103, the promoter is located within ORF d and may drive the expression of gene 16.7 and ORF 16.5. According to the phi 29 nomenclature, this promoter of B103 was named C1 (163). Finally, in GA-1 the promoter is located within ORF G and may drive the expression of ORFs H to L. Since these promoters drive the expression of different genes and ORFs, they are not equivalent. Therefore, we named these promoters of B103 and GA-1 C1a and C1b, respectively.

In vitro transcription analysis showed that expression of the phi 29 C1 promoter is repressed by protein p6 (14). Although p6 repressed the C2 promoter in the presence of low and high salt concentration, p6 affected C1 expression only at low salt concentrations. This difference may be due to the higher affinity of p6 for the terminal phi 29 DNA fragment containing the C2 promoter than for the more internal DNA sequences containing the C1 promoter (14).

Promoter A1, Driving Synthesis of the pRNA

For phi 29 it has been demonstrated that packaging of TP DNA into the phage prohead requires a 174-base phi 29-encoded RNA (pRNA) (5, 93, 94). This pRNA is produced from promoter A1 (136, 137, 197), which is active throughout the infection cycle (149). Although substantial levels of pRNA were detected at early infection times, a rapid increase in the number of pRNA molecules was detected starting about 15 min after infection, which approximately coincided with the onset of phi 29 DNA replication. Therefore, the additional phage DNA templates produced explain this increase of pRNA and suggest a constant transcription rate (149).

Equivalent A1 promoters driving pRNA synthesis of the corresponding phages are present in B103 and GA-1 (Table 4). The pRNA coding sequences of phi 29 and B103 are located at the far-left ends of their genomes. Figure 2 shows that the situation is different for GA-1. This genome contains an additional operon downstream of the pRNA-coding region, as well as another operon located between gene 2 and its pRNA-coding region. A promoter is located upstream of each of these two unique operons. Thus, whereas the leftmost region of the phi 29 and B103 genomes contains only one promoter, this region of GA-1 contains three promoters. To maintain a consistent nomenclature, the GA-1 promoter upstream of ORF M was named A1a, the one driving the expression of the GA-1 pRNA was named A1b, and the one upstream of ORF P was named A1c.

The expression patterns of GA-1 promoter A1b and B103 promoter A1 during the infection cycle have not been studied. Table 4 shows that the -35 and -10 sequences of the A1 promoters of phi 29 and B103 and the equivalent A1b promoter of GA-1 are almost identical and very close to the consensus sequence recognized by sigma A-containing RNA polymerase. Therefore, it is likely that the A1b promoter of GA-1 and the A1 promoter of B103 behave similarly to the equivalent A1 promoter of phi 29.
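The comparison of -35 and -10 hexamers with a consensus can be illustrated with a minimal sketch. The consensus hexamers used here (TTGACA and TATAAT) are the commonly cited sigma A consensus and are an assumption of this example, since the text does not spell them out; the function names and the test hexamers are hypothetical and are not taken from Table 4.

```python
# Assumed consensus hexamers for sigma A-dependent promoters (not quoted from the article).
SIGMA_A_CONSENSUS = {"-35": "TTGACA", "-10": "TATAAT"}


def mismatches(hexamer, consensus):
    """Count positions at which a promoter element differs from the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("element and consensus must have the same length")
    return sum(1 for a, b in zip(hexamer.upper(), consensus) if a != b)


def score_promoter(minus35, minus10):
    """Return the number of mismatches to the consensus for each element."""
    return {
        "-35": mismatches(minus35, SIGMA_A_CONSENSUS["-35"]),
        "-10": mismatches(minus10, SIGMA_A_CONSENSUS["-10"]),
    }


# Hypothetical hexamers, for illustration only.
print(score_promoter("TTGACT", "TACAAT"))
```

A promoter whose elements score zero or one mismatch in each hexamer would be described as "very close to the consensus" in the sense used above; spacing between the elements is not modeled in this sketch.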

Other Promoters in the phi 29 Genome

In vivo and in vitro experiments revealed two promoters, named B1 and B2, that are located in the phi 29 DNA region encoding the late genes (16, 197) (Fig. 2). Transcription from these promoters proceeds leftward. Compared to other phi 29 promoters, only minor amounts of RNA were synthesized by the B1 and B2 promoters in vivo (149). No ORF with a reasonable ribosome binding site was found downstream of either of these promoters. Although it has been suggested that the products synthesized by these promoters may function as antisense RNA to modulate the expression of some late genes (16, 136), such a function has not been proven experimentally. The phi 29 promoter A1IV, located in the DNA polymerase coding region (Fig. 2), was shown to be weakly expressed in vivo (16) and to contribute to the synthesis of protein p1 (40). The B1, B2, and A1IV promoters are shown in Table 4.

Other Promoters in the GA-1 Genome

The promoters A1c and A1a are unique for GA-1. Primer extension analysis using total RNA isolated at different times after infection showed that these two promoters are active early after infection and that they are progressively downregulated at later infection times (J. A. Horcajadas, unpublished). Therefore, it is likely that promoters A1c and A1a drive the expression of the GA-1 regions containing ORFs P to T and M to O, respectively. At present, the mechanism underlying the in vivo repression of these promoters is unknown. Since the pattern of repression of these promoters is different from that of the abruptly repressed C2 promoter, it is unlikely that these promoters are repressed by protein p6 in a similar way to the C2 promoter.


TRANSCRIPTIONAL TERMINATION

The main early and late in vivo transcription termination sites of phi 29 have been determined by S1 nuclease mapping (17). Transcription from the late A3 promoter and from the early promoters C2 and C1 terminated in the short intergenic region between gene 16 and ORF 16.5 (Fig. 2). This DNA region contains an inverted repeat, and stem-loop structures with calculated free energies of -14.8 and -16.8 kcal could be drawn for the early and late transcripts, respectively. In both directions, a uridine-rich tail follows the stem-loop, indicating that it functions as a Rho-independent bidirectional transcription terminator. This terminator was named TD1. Inverted repeats are located at similar positions in the genomes of B103 and GA-1. As in phi 29, uridine-rich tails on either strand follow the stem-loops of B103 and GA-1, indicating that these also constitute bidirectional Rho-independent transcriptional terminators. According to the phi 29 nomenclature, these terminators were named TD1. The DNA sequences of the TD1 terminators are shown in Table 5.
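The two features used above to recognize a Rho-independent terminator (an inverted repeat that can fold into a stem-loop, followed by a uridine-rich tail) can be sketched as a simple sequence scan. This is an illustration only: it searches the DNA sense strand, so the tail appears as T rather than U; it requires a perfect inverted repeat; it does not compute free energies; and the stem length, loop range, and tail thresholds are arbitrary assumptions. The toy sequence is hypothetical and is not one of the TD1 sequences of Table 5.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[b] for b in reversed(seq))


def find_terminator_candidates(dna, stem=6, loop_range=(3, 8), tail=8, min_t=5):
    """Find perfect inverted repeats followed by a T-rich tail.

    Returns a list of (start, left_stem_arm, loop_length) tuples.
    All thresholds are illustrative assumptions, not published criteria.
    """
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - 2 * stem - loop_range[0] - tail + 1):
        left = dna[i:i + stem]
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem + loop                      # start of the right stem arm
            right = dna[j:j + stem]
            tail_seq = dna[j + stem:j + stem + tail]
            if len(right) < stem or len(tail_seq) < tail:
                break                                # longer loops would run off the end
            if right == reverse_complement(left) and tail_seq.count("T") >= min_t:
                hits.append((i, left, loop))
    return hits


# Hypothetical sequence: 6-bp stem, 4-nt loop, T-rich tail.
toy = "AAGC" + "GGCCGC" + "TTAA" + "GCGGCC" + "TTTTTTAT" + "GCA"
print(find_terminator_candidates(toy))
```

A bidirectional terminator such as TD1 would be expected to give a hit both on a sequence and on its reverse complement, because a T-rich stretch follows the stem-loop on either strand.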

                              
 
TABLE 5.   Transcriptional terminators

Another Rho-independent transcriptional terminator, named TA1, was found to be present within gene 4 of phi 29 (17). It has been suggested that part of the transcripts initiated at the A2b and A2c promoters terminate at this terminator. This would result in the synthesis of high levels of mRNA coding for proteins p6 (DBP) and p5 (SSB) and lower levels of longer mRNA coding for proteins p6 to p1 (17). Apart from possible differences in translation initiation efficiencies, this explains why p6 and p5 are synthesized in far larger quantities than are proteins p4, p3, p2, and p1 (2, 86, 139). Equivalent TA1 transcriptional terminators are present in the genomes of B103 and GA-1, indicating that a regulatory mechanism similar to that proposed for phi 29 exists in B103 and GA-1. In all three genomes, the TA1 transcriptional terminator is located at very similar positions within gene 4. Thus, the mRNAs synthesized up to the TA1 terminators may allow the synthesis of the N-terminal 28 to 30 amino acids of protein p4. Interestingly, this region of the three p4 proteins is far more conserved than the downstream p4 region (Fig. 3), which might imply that the N-terminal 30 amino acids of p4 could have a function on its own.


 
FIG. 3.   Alignment of the deduced transcriptional regulatory protein sequences encoded by phi 29 (p4phi29), B103 (p4B103), and GA-1 (p4GA-1). Black and grey boxes enclose residues that are conserved in all three or two of the three sequences, respectively. The following amino acids were considered conservative: L, I, V, A, and M; F, Y, and W; K and R; D and E; Q and N; and S and T. The position up to which the N-terminal part of the respective gene 4 can be translated, considering the mRNA length terminated at the transcriptional terminator TA1, is indicated by a vertical arrow.
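The shading rule described in the legend can be expressed as a short sketch. The residue groups are those listed in the legend; the function names, the return labels, and the treatment of gaps as nonconserved are assumptions made for this illustration.

```python
# Conservative residue groups as listed in the legend of Fig. 3.
CONSERVATIVE_GROUPS = ["LIVAM", "FYW", "KR", "DE", "QN", "ST"]


def same_class(a, b):
    """True if two residues are identical or belong to the same conservative group."""
    if a == "-" or b == "-":          # assumption: a gap is never counted as conserved
        return False
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_column(column):
    """Classify one alignment column of three residues (one per phage)."""
    a, b, c = column
    pairs = [same_class(a, b), same_class(a, c), same_class(b, c)]
    if all(pairs):
        return "all three"            # black box in the figure
    if any(pairs):
        return "two of three"         # grey box in the figure
    return "not conserved"


# Hypothetical columns, for illustration only.
for col in ("LIV", "KRD", "GPW"):
    print(col, classify_column(col))
```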

No potential Rho-independent transcriptional terminator is present downstream of the pRNA coding region of phi 29, which constitutes the most leftward-reading region of this genome (Fig. 2). This could imply that transcription, starting from the A2b, A2c, and A1 promoters, continues until the left end of the genome is reached. It has indeed been shown that in vivo transcription initiating at these phi 29 promoters reaches the very left end of the phi 29 DNA molecule, as if the RNA polymerase ran off the template (16, 17). The same organization and the absence of a potential Rho-independent terminator downstream of the B103 pRNA-coding region suggest a similar situation for B103.

The situation is different, however, for GA-1. As shown in Fig. 2, three potential Rho-independent terminators are present in the left part of the GA-1 genome. The one located closest to the left DNA end (downstream of ORF T), named TA4, would terminate transcription initiating at the A1c promoter of GA-1. The middle one, named TA3, located downstream of the pRNA coding region, would terminate transcription initiating from the GA-1 promoter A1b and possibly A1a. The third one, named TA2, would terminate transcription initiating from the GA-1 A2c and A2b promoters. Note that in contrast to the situation in phi 29 and B103, the GA-1 terminator TA2 is located directly downstream of gene 2. The -35 sequence of the GA-1 promoter A1a is located within this terminator.


PROTEIN-PRIMED MECHANISM OF DNA REPLICATION

The genomes of the phi 29-like phages consist of a linear dsDNA molecule of about 20 kb with a phage-encoded protein, TP, covalently attached at each 5' end. Genomes consisting of a linear dsDNA molecule with a TP covalently linked to their 5' ends have also been found for (i) other bacteriophages (e.g., the Streptococcus pneumoniae and Escherichia coli phages Cp-1 and PRD1, respectively), (ii) animal viruses (e.g., adenoviruses), (iii) plasmids (e.g., S1 and Kalilo), and (iv) bacteria (e.g., Streptomyces). In most of these cases, initiation of DNA replication occurs via a so-called protein-priming mechanism (for reviews, see references 176, 178, and 181).

The in vitro mechanism of protein-primed DNA replication has been studied in most detail for phi 29. The basic features of the protein-primed mechanism of DNA replication, based on the phi 29 system, are outlined here. More detailed descriptions of the different steps and the function of the proteins involved are given below. In addition, it should be mentioned that although the main characteristics of protein-primed DNA replication are conserved, some minor differences with respect to the phi 29 mechanism have been observed in some cases, especially regarding the sliding-back step (see below). Figure 4 shows a schematic representation of in vitro phi 29 DNA replication. Initiation of phi 29 DNA replication starts with recognition of the origin of replication, i.e., the TP-containing DNA ends, by a TP-DNA polymerase heterodimer. The virus-encoded protein p6 forms a nucleoprotein complex that would help to open the DNA ends (187), facilitating the formation of a covalent linkage between the first inserted nucleotide (dAMP) and TP, which is catalyzed by the phi 29 DNA polymerase (29, 109). The formation of this first TP-dAMP covalent complex is directed by the second nucleotide at the 3' end of the template; then the TP-dAMP complex slides back 1 nucleotide to recover the information of the terminal nucleotide (144). Next, the phi 29 DNA polymerase synthesizes a short elongation product before dissociating from the TP (146). Replication, which starts at both DNA ends, is coupled to strand displacement. This results in the generation of so-called type I replication intermediates consisting of full-length phi 29 dsDNA molecules with one or more ssDNA branches of different lengths. The ssDNA stretches generated are bound by the SSB protein (p5). When the two converging DNA polymerases merge, a type I replication intermediate becomes physically separated into two type II replication intermediates. 
Each of these consists of a full-length phi 29 DNA molecule in which a portion of the DNA, starting from one end, is double stranded and the portion spanning to the other end is single stranded (102, 117). Continuous elongation by the DNA polymerase completes replication of the parental strand.


 
FIG. 4.   Schematic representation of the mechanism of phi 29 protein-primed DNA replication. Primer and parental TP are shown in black and grey, respectively. p6, double-stranded DNA binding protein; p5, single-stranded DNA binding protein (SSB). Only the left DNA end has been drawn, except for type I and type II molecules, where both DNA ends are shown. See the text for details. Adapted with permission from reference 182.


INITIATION OF DNA REPLICATION

DNA Polymerase-TP Heterodimer Formation

DNA polymerases are unable to initiate de novo DNA synthesis on a DNA template but require the existence of a primer containing a free hydroxyl group to start DNA elongation (126). Generally, RNA primers provide the 3'-hydroxyl (3'-OH) group needed by the DNA polymerase to elongate the DNA chain. However, in most linear genomes containing a TP covalently linked to their 5' DNA ends, the hydroxyl group of a specific serine, threonine, or tyrosine residue of the TP is used for DNA elongation (reviewed in reference 181). The TP deoxynucleotidylation activity of the phi 29 DNA polymerase is responsible for the covalent linkage of 5'-dAMP, via a phosphoester bond, to the hydroxyl group of Ser232 of the TP (24, 29, 109). This reaction requires the formation of a stable heterodimer complex between the TP and the phi 29 DNA polymerase (28). Most probably, the active site used for polymerization is also used for the TP deoxynucleotidylation reaction (reviewed in references 31 and 32). This implies that the TP present in the heterodimer complex has to be specifically positioned in order for the DNA polymerase to perform the TP deoxynucleotidylation reaction. Several mutations located in different regions of the phi 29 DNA polymerase affect its interaction with the TP (37, 61, 145, 206). In addition, interaction of TP with the purified C-terminal portion of the phi 29 DNA polymerase is severely impaired (209). Together, these results suggest that interaction of the TP involves many contacts with different regions of the DNA polymerase.

Interestingly, a multiple sequence alignment of DNA polymerases belonging to the B-type family showed that DNA polymerases involved in protein-primed DNA replication contain two regions of amino acids, denoted TPR-1 and TPR-2 (Fig. 5), which are not present in other B-type DNA polymerases (33). Analysis of the phi 29 mutant DNA polymerase in which the conserved Asp332 residue of the TPR-1 region was changed into Tyr showed that it was able to form a stable heterodimer with TP and that it had essentially wild-type levels of synthetic activities in DNA-primed reactions. However, its activity was drastically affected in phi 29 TP-DNA replication, indicating that the mutant DNA polymerase forms a nonfunctional interaction with the TP and hence supporting the view that at least TPR-1 is involved in proper positioning of the TP in the TP-DNA polymerase heterodimer complex (68).


 
FIG. 5.   Structural and functional map of the DNA polymerase. The N-terminal domain, required for proofreading and strand displacement, and the C-terminal initiation and polymerization domain are indicated by white and black rectangular boxes, respectively (32). The ExoI, ExoII, and ExoIII motifs, as well as the motifs of the C-terminal domain, are indicated: motif 1 (also called motif A), motif 2a (also called motif B), motif 2b, motif 3 (also called motif C), and motif 4. In addition, the position of the YxG(G/A) motif, important for the coordination between 3'-5' exonuclease and 5'-3' polymerization, is indicated by CT (cross talk). All these motifs are conserved in B-type proofreading-proficient DNA polymerases. The amino acid sequences of each of these various motifs present in the DNA polymerases of phi 29, B103 and GA-1, together with each consensus sequence, are indicated below the map. The three Asp residues that form a metal binding triad required for catalysis at the polymerization active site are indicated by asterisks. Finally, the positions of the TPR-1 and TPR-2 motifs, characteristic of DNA polymerases involved in protein-primed DNA replication, are indicated. See the text for further details.

Sliding-Back Mechanism

Although the TP deoxynucleotidylation reaction can occur in the absence of a DNA template, it is strongly stimulated in the presence of phi 29 TP-DNA (24). In the latter case, TP-dAMP is preferentially formed. The DNA ends of phi 29 have a short inverted terminal repeat of 6 nucleotides (3'-TTTCAT-5'). The first TP-dAMP is not directed by the terminal nucleotide but by the penultimate nucleotide of the phi 29 template strand. Subsequently, the complex slides back 1 nucleotide to recover the information of the 3'-terminal nucleotide (144). Terminal repeats are also present in the genomes of B103 (3'-TTTCAT-5'), GA-1 (3'-TTTATCTT-5'), and all other phi 29-related phages analyzed so far. Moreover, this feature is also conserved in other linear genomes containing a TP covalently linked to their DNA ends, such as the E. coli and S. pneumoniae phages PRD1 and Cp-1, respectively, linear plasmids, and the eukaryotic adenovirus. Terminal reiteration is a prerequisite for the sliding-back mechanism. Indeed, the replication initiation site in GA-1 (114), PRD1 (43), Cp-1 (132), and adenovirus (125) corresponds to an internal nucleotide close to the 3'-terminal end, and a sliding-back or similar mechanism has been shown to occur in these cases to recover the information of the terminal nucleotide(s). Probably, the sliding-back mechanism applies to all genomes that replicate via a protein-primed mechanism. Since proofreading does not apply to the TP-dNMP product (72), the sliding-back mechanism would be an alternative way to ensure that the replication origin-containing DNA ends are replicated with high fidelity.
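The logic of the sliding-back step can be made concrete with a minimal sketch based on the phi 29 terminal repeat (3'-TTTCAT-5'). The function names are hypothetical, the model covers only a slide of 1 nucleotide as described for phi 29, and it treats the template as a plain string written 3' to 5'; it is an illustration of why terminal reiteration is required, not a model of the enzymology.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def initiate_with_sliding_back(template_3to5, init_pos=2):
    """Nascent strand (5'->3') when initiation at init_pos is followed by a 1-nt slide back.

    template_3to5: template strand written 3'->5'.
    init_pos: 1-based position, counted from the 3' end, that directs the first nucleotide.
    """
    if init_pos < 2:
        raise ValueError("sliding back requires an internal initiation site")
    i = init_pos - 1                                  # 0-based index of the directing nucleotide
    if template_3to5[i - 1] != template_3to5[i]:
        raise ValueError("no terminal reiteration: sliding back would create a mismatch")
    primer_nt = COMPLEMENT[template_3to5[i]]          # nucleotide linked to the TP
    # After sliding back, the TP-linked nucleotide pairs with position i-1,
    # so position i directs the next incorporation.
    return primer_nt + "".join(COMPLEMENT[b] for b in template_3to5[i:])


def initiate_without_sliding_back(template_3to5, init_pos=2):
    """Nascent strand if elongation simply continued from the internal initiation site."""
    return "".join(COMPLEMENT[b] for b in template_3to5[init_pos - 1:])


phi29_end = "TTTCAT"                                  # 3'-TTTCAT-5', from the text
print(initiate_with_sliding_back(phi29_end))          # full-length copy of the end
print(initiate_without_sliding_back(phi29_end))       # one nucleotide shorter
```

With sliding back, the product is complementary to the complete terminal repeat; without it, the copy of the 3'-terminal nucleotide would be lost, which is the information the mechanism is said to recover.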

Transition from Protein-Primed to DNA-Primed Replication

After the sliding-back step, the phi 29 DNA polymerase and the primer TP do not dissociate immediately. Rather, there is a transition stage in which the DNA polymerase synthesizes a DNA molecule of 5 nucleotides while complexed with the primer TP (initiation mode). During the synthesis of nucleotides 6 to 9 the complex undergoes some structural change (transition mode), and the DNA polymerase finally dissociates from the primer TP when nucleotide 10 is inserted into the nascent DNA chain (elongation mode) (146). This behavior probably reflects a requirement of the phi 29 DNA polymerase for a DNA primer of a minimum length to efficiently carry out DNA-primed elongation. This view is supported by the following data. First, Méndez et al. (146) demonstrated that primer molecules of 6 nucleotides or fewer are not elongated. This fits well with the observation that phi 29 DNA polymerase synthesizes a DNA chain of 5 nucleotides before it changes from the initiation mode to the elongation mode in TP-DNA-primed reactions. Second, abortive replication products consisting of the primer TP linked up to 8 nucleotides were particularly observed under conditions that decrease the strand displacement capacity of phi 29 DNA polymerase (146). Finally, de Vega et al. (62) demonstrated that phi 29 DNA polymerase covers a DNA region of 10 nucleotides, which may be indicative of the optimum length to carry out polymerization. Interestingly, the phi 29 DNA polymerase mutant in which Asp456, belonging to the conserved "YxDTDS" motif at the polymerization domain (see below), has been changed into Gly is unable to proceed further than 5 nucleotides from the initiation complex. This suggested that the phi 29 DNA polymerase residue Asp456 is crucial to entry into the transition stage of phi 29 DNA replication (185).
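The three modes described above can be summarized as a lookup on the length of the nascent chain. The function name is hypothetical; the boundaries (nucleotides 1 to 5, 6 to 9, and 10 onward) are those given in the text.

```python
def replication_mode(nascent_length):
    """Return the mode of the phi 29 DNA polymerase for a nascent chain of the given length."""
    if nascent_length < 1:
        raise ValueError("nascent chain length must be at least 1")
    if nascent_length <= 5:
        return "initiation"      # polymerase still complexed with the primer TP
    if nascent_length <= 9:
        return "transition"      # complex undergoing structural change
    return "elongation"          # polymerase has dissociated from the primer TP


for n in (1, 5, 6, 9, 10):
    print(n, replication_mode(n))
```

On this scheme, the Asp456-to-Gly mutant described above would stall at the boundary between the initiation and transition modes.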

A similar transition step has also been demonstrated in replication of adenovirus (124) and probably is a general feature of protein-primed DNA replication.


THE FOUR MAIN PROTEINS REQUIRED FOR IN VITRO DNA REPLICATION

In the phi 29, B103, and GA-1 genomes, genes 6, 5, 3 and 2 are located in a single early-expressed operon (Fig. 2). In phi 29, these genes are indispensable for in vivo phage DNA replication. Gene 2 encodes the DNA polymerase, gene 3 encodes the TP, gene 5 encodes SSB, and gene 6 encodes DBP. An in vitro phi 29 DNA replication system, based on these four purified proteins, has been established (27). The availability of this system has allowed a detailed analysis of the in vitro phi 29 DNA replication mechanism and functional analysis of these four main replication proteins. Characteristics of these four proteins are given below.

DNA Polymerase

Gene 2 of phi 29, B103, and GA-1 encodes a DNA polymerase. In phi 29 and GA-1 the DNA polymerase has been shown to be required for replication of its phage DNA (29, 114). The DNA polymerases encoded by phi 29, B103, and GA-1 belong to the B-type superfamily of DNA-dependent DNA polymerases (also referred to as eukaryotic or alpha-like polymerases). This family includes a large number of prokaryotic and eukaryotic enzymes that are sensitive to certain drugs (aphidicolin and phosphonoacetic acid) and nucleotide analogs (butylanilino-dATP and butylphenyl-dGTP). The DNA polymerase of phi 29 has been analyzed in detail (for reviews, see references 31 and 32). The monomeric phi 29 DNA polymerase, which has a size of only about 66 kDa, catalyzes both the initiation and elongation stages of DNA synthesis (29, 30). To accomplish this, it is able to carry out two distinguishable synthetic reactions: TP deoxynucleotidylation and DNA polymerization. In addition, it has two degradative activities: pyrophosphorolysis and 3'-5' exonucleolysis. Moreover, it has two intrinsic properties: high processivity and strand displacement ability (25). Due to these properties of the phi 29 DNA polymerase, in vitro phi 29 DNA replication does not require accessory proteins or DNA helicases (25).

The enzymatic activities of the phi 29 DNA polymerase have been mapped by site-directed mutagenesis. A structural map, given in Fig. 5, shows that the phi 29 DNA polymerase has a bimodular organization, with the N-terminal portion constituting the 3'-5' proofreading domain and the C-terminal portion constituting the domain responsible for its 5'-3' synthetic activities. The bimodular organization of the phi 29 DNA polymerase has been proven experimentally. Analysis of a purified C-terminal deletion derivative of phi 29 DNA polymerase containing the 188 N-terminal amino acids showed that it was devoid of any synthetic activity but retained 3'-5' exonuclease activity (31). Reciprocally, a purified N-terminal deletion derivative containing the C-terminal 388 amino acids had neither 3'-5' exonuclease nor strand displacement activity but did have synthetic activities (209). Available three-dimensional structures of other DNA polymerases show that the bimodular organization is characteristic of proofreading proficient DNA polymerases (reviewed in reference 121).

C-terminal domain of phi 29 DNA polymerase. The polymerization activity of the phi 29 DNA polymerase is confined to the C-terminal domain of the enzyme. This part of the phi 29 DNA polymerase has three regions containing motifs that are conserved in other DNA polymerases belonging to family B. These three motifs are Dx2SLYP (motif A, also named motif 1), Kx3NSxYG (motif B, also named motif 2a), and YxDTDS (motif C, also named motif 3). The positions of these and other conserved motifs described below are indicated in Fig. 5, together with the amino acid sequence corresponding to each motif present in the DNA polymerase of phi 29, B103, and GA-1. Site-directed mutagenesis at motifs A, B, and C of phi 29 DNA polymerase (21, 34-36) showed that these three regions form an evolutionarily conserved polymerization-active site.
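The motif shorthand used here (x for any residue, x followed by a number for that many residues, and alternatives such as (G/A)) maps directly onto a regular expression, which can be sketched as follows. The function name is hypothetical, and the peptide strings are invented test fragments, not sequences from Fig. 5.

```python
import re


def motif_to_regex(motif):
    """Convert motif shorthand such as 'Dx2SLYP' or 'YxG(G/A)' to a compiled regex.

    Rules assumed here: 'x' = any one residue, 'xN' = N residues,
    '(A/B)' = either residue, any other letter = that residue.
    """
    parts = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "x":
            j = i + 1
            while j < len(motif) and motif[j].isdigit():
                j += 1
            count = int(motif[i + 1:j]) if j > i + 1 else 1
            parts.append("[A-Z]{%d}" % count)
            i = j
        elif ch == "(":
            j = motif.index(")", i)
            parts.append("[" + motif[i + 1:j].replace("/", "") + "]")
            i = j + 1
        else:
            parts.append(ch)
            i += 1
    return re.compile("".join(parts))


# Motifs A, B, and C as written in the text; peptides are hypothetical.
for motif, peptide in [("Dx2SLYP", "MKDAASLYPQ"),
                       ("Kx3NSxYG", "AKLMNNSAYGT"),
                       ("YxDTDS", "GGYCDTDSLL")]:
    match = motif_to_regex(motif).search(peptide)
    print(motif, match.start() if match else None)
```

Applied to full-length polymerase sequences, a scan of this kind would flag candidate positions only; whether a hit corresponds to a functional motif is what the site-directed mutagenesis cited above addresses.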

Figure 5 shows that of these three motifs, only motif C is fully conserved in the DNA polymerases of B103 and GA-1. Whereas motif A is conserved in the DNA polymerase of B103, a Met residue in the GA-1 polymerase occupies the corresponding position of phi 29 Leu253. Analysis of a phi 29 DNA polymerase mutant in which Leu253 had been replaced by a Val residue (L253V) showed that whereas it was not affected in template-primer DNA binding, it was strongly affected in reactions involving the use of TP as primer (35). With this result in mind, it would be interesting to study the effects of a phi 29 L253M DNA polymerase mutant and relate it to the reciprocal mutation in the GA-1 DNA polymerase (M253L). For motif B, the residue corresponding to Asn387 of phi 29 DNA polymerase is occupied by an Asp in the B103 polymerase (Fig. 5). The involvement of phi 29 DNA polymerase Asn387 in the correct binding of the primer terminus at the polymerization active site was demonstrated by the analysis of the N387Y mutant (36). Taking into account the protein sequence of the B103 DNA polymerase, it would be interesting to study possible effects of replacing Asn387 by Asp (N387D).

In addition to motifs A, B and C, two other motifs, Tx2GR (motif 2b) and KxY (motif 4), were identified in the C-terminal portion of phi 29 DNA polymerase and analyzed by site-directed mutagenesis (37, 145). These two motifs, which are also conserved in the C-terminal portion of B103, GA-1 (Fig. 5), and other B-type DNA polymerases, are involved in primer stabilization at the active site. In addition, motif 2b is involved in TP and metal binding (145). For several DNA polymerases, including the phi 29 DNA polymerase, it has been demonstrated that three Asp residues form a metal binding triad required for catalysis at the polymerization active site (reviewed in reference 32). In the phi 29 DNA polymerase, the three Asp residues implicated are Asp249, belonging to motif A, and Asp456 and Asp458, both belonging to motif C (21, 35, 185). These three Asp residues are conserved in the DNA polymerases of B103 and GA-1 and in all other known members of the B-type DNA polymerases. Also, Arg438 of motif 2b of phi 29 DNA polymerase plays a role in catalysis of the polymerization reaction (145). Moreover, three highly conserved Tyr residues were shown to be involved, directly or indirectly, in interaction with deoxynucleoside triphosphates (dNTPs). These residues, also conserved in the B103 and GA-1 DNA polymerases (Fig. 5), are Tyr254 of motif A (34, 35), Tyr390 of motif B (34, 36), and Tyr454 of motif C (21). Since the phi 29 residues Tyr254 (motif A) and Tyr390 (motif B) are also involved in selection of dNTP binding, they play an important role in the fidelity of DNA replication (184). In addition, a single and specific replacement of Tyr254 (motif A) by a Val residue enables the mutant phi 29 DNA polymerase to incorporate ribonucleotides without affecting its wild-type affinity for dNTPs (38). This indicates that phi 29 Tyr254 is responsible for the discrimination against the 2'-OH group of an incoming ribonucleotide.

In addition, seven residues that are invariant or highly conserved in the C-terminal domain of B-type DNA polymerases were shown to be involved in binding template-primer structures. These residues are Ser252 of motif A (35), Asn387 (see above) and Gly391 of motif B (36), Thr434 and Arg438 of motif 2b (145), and Lys498 and Tyr500 of motif 4 (37).

N-terminal domain of phi 29 DNA polymerase.

(i) Proofreading. The insertion discrimination values of the phi 29 DNA polymerase range from 10^4 to 10^6, and the efficiency of mismatch elongation is 10^5- to 10^6-fold lower than that of a properly paired terminus (72). These values illustrate the high fidelity with which the phi 29 DNA polymerase replicates DNA. As with other proofreading-proficient DNA polymerases, the phi 29 DNA polymerase owes its high fidelity to its 3'-5' exonuclease activity (81), which is confined to the N-terminal part of the enzyme. Bernad et al. (20) proposed that three N-terminally located regions, ExoI, ExoII, and ExoIII, form the 3'-5' exonuclease active site (Fig. 5) and are evolutionarily conserved in prokaryotic and eukaryotic DNA polymerases. This proposal has been proven valid for various DNA polymerases of eukaryotic and prokaryotic origin (for a review, see reference 60). The three Exo domains contain five invariant residues that are involved in metal binding and 3'-5' exonuclease catalysis. In phi 29 DNA polymerase, these residues are Asp12 and Glu14 in ExoI, Asp66 in ExoII, and Tyr165 and Asp169 in ExoIII (20).

The two-metal-ion mechanism, first proposed for the 3'-5' exonuclease active site of polymerase I (18), was shown also to apply to phi 29 DNA polymerase and can be extrapolated to other proofreading-proficient DNA polymerases (73). Another invariant residue, Lys143 of phi 29 DNA polymerase, was analyzed and shown to be important for the catalytic efficiency of the 3'-5' exonuclease activity (63). In addition, other residues in the Exo motifs that are conserved in B103, GA-1, and most other prokaryotic and eukaryotic DNA polymerases were functionally analyzed. Two of these, Thr15 and Asn62, located at the ExoI and ExoII motifs, respectively, were shown to act as single-stranded DNA ligands playing a critical role in the stabilization of the frayed primer terminus at the 3'-5' exonuclease active site (64). Also, Phe65 of the ExoII motif and residues Ser122 and Leu123, which are part of a newly identified motif [S/T]Lx2h, were shown to be important for (i) stable interaction with ssDNA, (ii) 3'-5' exonucleolysis of ssDNA substrates, and (iii) proofreading of DNA polymerization errors (65). In addition, these studies showed that the aromatic ring of Phe65 appeared to be critical to orient the ssDNA substrate in a stable conformation to allow 3'-5' exonucleolytic catalysis. These three residues, Phe65, Ser122, and Leu123, are also conserved in the B103 and GA-1 DNA polymerases.

(ii) Strand displacement. After the initiation, sliding-back, and transition steps, continuous polymerization, carried out by a single phi 29 DNA polymerase molecule, completes the replication of the almost 20-kb DNA strand (30). Using primed M13 DNA as the template, the phi 29 DNA polymerase is able to synthesize DNA chains of more than 70 kb (25). This demonstrates the high processivity and strand displacement activity of the phi 29 DNA polymerase. Replication of phi 29 DNA starts nonsimultaneously from either end of the linear DNA molecule (117), generating so-called type I replication intermediates (Fig. 4). Until the two converging DNA polymerases collide, DNA polymerization is coupled to strand displacement, which makes a helicase unnecessary (25). Various DNA polymerases, but not the one encoded by phi 29, are prone to replication slippage. This particular type of error, which results in deletions, is caused when a polymerizing DNA polymerase slips between two short sequence duplications. Recently, evidence has been presented that the high strand displacement activity of the phi 29 DNA polymerase prevents replication slippage (48).

Surprisingly, functional analysis of the phi 29 DNA polymerases containing mutations in one of the five invariant residues in the Exo motifs critical for 3'-5' exonuclease activity, Asp12, Glu14, Asp66, Tyr165, or Asp169, showed that they were also strongly affected in their strand displacement activity (73, 194). In addition, mutants corresponding to Lys143, the residue which is conserved in GA-1 and B103 DNA polymerases and was shown to play an auxiliary role in catalysis of the exonuclease reaction, were affected in strand displacement activity (62). These results indicated that the strand displacement activity of phi 29 DNA polymerase is located in its N-terminal domain, somehow overlapping with the 3'-5' exonuclease activity.

Mutations of residues Thr15 and Asn62, shown to act as ssDNA ligands but not playing a direct role in the phi 29 DNA polymerase 3'-5' exonuclease catalysis reaction, displayed wild-type levels of strand displacement activity (64). Therefore, it seems that impaired strand displacement activity is restric