Microbiology and Molecular Biology Reviews, June 2000, p. 412-434, Vol. 64, No. 2
1092-2172/00/$04.00+0
Copyright © 2000, American Society for Microbiology. All rights reserved.

Type I Restriction Systems: Sophisticated Molecular Machines (a Legacy of Bertani and Weigle)

Noreen E. Murray*

Institute of Cell and Molecular Biology, University of Edinburgh, Edinburgh EH9 3JR, United Kingdom

SUMMARY
GENERAL INTRODUCTION
    Host-Controlled Modification---A Historical Perspective
    Nomenclature and Classification of R-M Systems
        Nomenclature.
        Classification.
INTRODUCTION TO TYPE I R-M SYSTEMS
    General Characteristics
    The Family Concept
ENZYMES
    Introduction
    Specificity Subunit---HsdS
        Variable sequences.
        Conserved sequences.
        Imposition of symmetry.
        Association of subunits.
        DNA specificity.
        Molecular model.
        HsdM needed for DNA binding.
    Modification Subunit---HsdM
        Active site.
    Modification Enzyme
        Discrimination of methylation state.
        DNA binding.
        Base flipping.
    Restriction Subunit---HsdR
    R-M Complex
        DNA binding.
        Recognition of methylation state.
        Communication.
        Subunit interactions.
        Functional analyses of motifs.
        DNA translocation.
    Assembly and Its Implications
BIOLOGY OF TYPE I R-M SYSTEMS
    Host-Controlled Modulation of Restriction Activity
        Self-protection.
        Proteolytic control.
        Significance.
    Mechanisms by Which Plasmids and Phages Avoid Restriction
    Detection, Distribution, and Diversity
    Evolution
    Relevance to Bacteria
CONCLUSION AND FUTURE DIRECTIONS
ACKNOWLEDGMENTS
REFERENCES


SUMMARY

Restriction enzymes are well known as reagents widely used by molecular biologists for genetic manipulation and analysis, but these reagents represent only one class (type II) of a wider range of enzymes that recognize specific nucleotide sequences in DNA molecules and detect the provenance of the DNA on the basis of specific modifications to their target sequence. Type I restriction and modification (R-M) systems are complex; a single multifunctional enzyme can respond to the modification state of its target sequence with the alternative activities of modification or restriction. In the absence of DNA modification, a type I R-M enzyme behaves like a molecular motor, translocating vast stretches of DNA towards itself before eventually breaking the DNA molecule. These sophisticated enzymes are the focus of this review, which will emphasize those aspects that give insights into more general problems of molecular and microbial biology. Current molecular experiments explore target recognition, intramolecular communication, and enzyme activities, including DNA translocation. Type I R-M systems are notable for their ability to evolve new specificities, even in laboratory cultures. This observation raises the important question of how bacteria protect their chromosomes from destruction by newly acquired restriction specificities. Recent experiments demonstrate proteolytic mechanisms by which cells avoid DNA breakage by a type I R-M system whenever their chromosomal DNA acquires unmodified target sequences. Finally, the review will reflect the present impact of genomic sequences on a field that has previously derived information almost exclusively from the analysis of bacteria commonly studied in the laboratory.


GENERAL INTRODUCTION

Host-Controlled Modification---A Historical Perspective

Awareness of the phenomenon of restriction and modification and the consequent revolution in molecular biology grew from the observations of microbiologists in the early 1950s that the host range of a bacterial virus (phage) could be influenced by the bacterial strain in which the phage was last propagated. While phages produced in one strain of a bacterial species would readily infect a culture of the same strain, they might only rarely achieve successful infection of cells from a different strain of the same species. This finding implied that the phages carried an "imprint" that identified their immediate provenance. The occasional successful infection of a different strain resulted in the production of phages that had lost their previous imprint and acquired a new one, i.e., they acquired a new host range, hence the term host-controlled modification (see reference 5 for an early review).

In the 1960s, elegant molecular experiments showed the imprint to be a DNA modification that was lost when the phage DNA replicated within a different bacterial strain; phages that conserved one of their original DNA strands retained the modification, whereas phages containing two strands of newly synthesized DNA did not (6). The modification was shown to protect the DNA against an endonuclease, the barrier that prevented, or "restricted," the successful propagation of an incoming phage genome. Later it was proven that the modification and restriction enzymes both recognize the same target, a specific nucleotide sequence. The modification enzyme is a DNA methyltransferase (161) that methylates specific bases within the target sequence, and in the absence of the specific methylation the target sequence renders the DNA sensitive to the restriction endonuclease. When DNA lacking the appropriate modification imprint enters a restriction-proficient cell, it is therefore recognized as foreign and degraded by the endonuclease. For each unmodified target sequence, there is only a low probability that it will become modified and escape attack by the endonuclease. Since the host-controlled barrier to successful infection by phages that lack the correct modification was referred to as restriction, the relevant endonucleases have acquired the colloquial name of restriction enzymes. Similarly, the methyltransferases are more commonly termed modification enzymes. Classically, a restriction enzyme is accompanied by its cognate modification enzyme, and the two comprise a restriction and modification (R-M) system. Most restriction systems conform to this classical pattern. There are, however, some restriction endonucleases that attack DNA only when their target sequence is modified; such modification-dependent restriction enzymes do not, therefore, coexist with a cognate modification enzyme.

The classical R-M systems and the modification-dependent restriction enzymes share the potential to attack DNA derived from different strains and thereby restrict DNA transfer. They differ in that in one case an associated modification enzyme is required to protect DNA from attack by the cognate restriction enzyme, while in the other a modification enzyme specified by one strain imparts a signal that provokes the degradative activity of a restriction endonuclease in another.

Early examples of host-controlled modification, though they were not always recognized as such, were reviewed by Luria (107). Two papers in particular stimulated interest in the phenomenon. In one, Bertani and Weigle (17), using temperate phages (λ and P2), identified the classical R-M systems of Escherichia coli K-12 and E. coli B. In the other, Luria and Human (108) identified a restriction system of the second, nonclassical kind.

T-even phages were used by Luria and Human as test phages, and after their growth in a mutant E. coli host, these phages were restricted by wild-type E. coli K-12 but not by Shigella dysenteriae. An understanding of this restriction phenomenon requires knowledge of the special nature of the DNA of T-even phages. When the DNA of T-even phages is replicated, the unusual base 5-hydroxymethylcytosine (HMC) substitutes for cytosine, and the hydroxymethyl group is then glucosylated in a phage-specific pattern. In the mutant strain of E. coli used by Luria and Human, glucosylation fails. The vulnerable nucleotide sequences of T4, normally protected by glucosylation, are recognized by an endonuclease in E. coli K-12 because they include the modified base HMC rather than cytosine residues. S. dysenteriae does not restrict the nonglucosylated phage because it lacks the relevant endonuclease. Later it was discovered that methylated cytosine residues in the context of the correct nucleotide sequence also evoke restriction by these modification-dependent endonucleases (145).

The biological tests for a classical R-M system are illustrated by the pioneering experiments of Bertani and Weigle (17). Phage λ grown on E. coli strain C (λ.C or λ.0), where E. coli C is a strain that apparently lacks an R-M system, forms plaques with poor efficiency (efficiency of plating [EOP] of 2 × 10⁻⁴) on E. coli K-12 because the phage DNA is attacked by a restriction endonuclease. Phage λ grown on E. coli K-12 (λ.K) forms plaques with equal efficiency on E. coli K-12 and E. coli C, since it has the modification required to protect against the restriction system of E. coli K-12 and E. coli C has no restriction system. In contrast, λ.K will form plaques with very low efficiency on a third strain, E. coli B, since E. coli B has an R-M system with different sequence specificity from that of E. coli K-12.
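The EOP is a ratio of two phage titers: the titer on the restricting host divided by the titer on a nonrestricting host. The following is a minimal sketch of that arithmetic. The plaque counts, dilutions, and the 0.1-ml plated volume are hypothetical values chosen only to reproduce an EOP of about 2 × 10⁻⁴; they are not data from Bertani and Weigle.

```python
def titer(plaques, dilution, volume_ml=0.1):
    """Plaque-forming units per ml estimated from one countable plate."""
    return plaques / (dilution * volume_ml)


def efficiency_of_plating(test_titer, reference_titer):
    """Titer on the restricting host relative to the nonrestricting host."""
    return test_titer / reference_titer


# Hypothetical counts for phage grown on E. coli C (illustrative only).
on_c = titer(plaques=100, dilution=1e-7)    # plated on E. coli C (no R-M system)
on_k12 = titer(plaques=20, dilution=1e-4)   # plated on E. coli K-12 (restricting)
eop = efficiency_of_plating(on_k12, on_c)
print(f"EOP = {eop:.1e}")
```

Modified phage (grown on E. coli K-12) would give similar titers on both hosts in the same calculation, that is, an EOP close to 1.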

The classical restriction endonucleases of E. coli K-12 and B were not only the first to be detected but the first to be purified (104, 117). The demonstration that the restriction endonuclease from E. coli K-12 produced a digest of phage λ DNA comprising large DNA fragments (117) was exciting, for it implied a highly specific target for the endonuclease, a view supported by genetic studies which showed that target sequences in phage genomes could be mutated and mapped (54, 93, 124). The rational expectation was that type I restriction enzymes would cut DNA close to their target sequences, but Horiuchi and Zinder (68) showed otherwise. The enzyme from E. coli B cut the DNA of phage f1 nonspecifically at considerable distances from the unmodified target sequences. Type I restriction enzymes therefore failed to provide the anticipated analytical reagents, but they raised the alert so that Ham Smith immediately appreciated the significance of his observation that Haemophilus influenzae strain Rd degraded P22 phage DNA, and as a consequence, he purified HindII, the first type II restriction enzyme (160). The phenomenon of restriction, identified for type I and methylation-dependent systems, laid the foundations for modern molecular biology, yet the molecular complexity and biological importance of these systems remain to be fully understood.

Nomenclature and Classification of R-M Systems

Nomenclature. Classical R-M systems are designated by a three-letter acronym derived from the name of the organism in which they occur. The first letter comes from the genus, and the second and third letters come from the species. The strain designation, if any, follows the acronym. Different systems in the same organism are distinguished by roman numerals. Thus, EcoKI and EcoBI are the first classical restriction enzymes identified in E. coli K-12 and E. coli B, respectively (they are referred to as EcoK and EcoB in early papers). Restriction endonucleases and modification methyltransferases are sometimes distinguished by the prefixes R and M, respectively, but the prefix is commonly omitted, particularly for type I R-M systems.
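The naming rule just described is mechanical enough to sketch in a few lines. The function names below (`to_roman`, `rm_system_name`) are hypothetical helpers for illustration, and the strain designation is assumed to be supplied by the caller in its already abbreviated form (for example "K" for E. coli K-12).

```python
def to_roman(n):
    """Convert a positive integer to a roman numeral."""
    if n < 1:
        raise ValueError("ordinal must be a positive integer")
    numerals = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
                (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
                (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in numerals:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def rm_system_name(genus, species, strain="", ordinal=1):
    """Genus initial + first two letters of species + strain + roman numeral."""
    acronym = genus[0].upper() + species[:2].lower()
    return f"{acronym}{strain}{to_roman(ordinal)}"


print(rm_system_name("Escherichia", "coli", "K", 1))
print(rm_system_name("Escherichia", "coli", "B", 1))
print(rm_system_name("Haemophilus", "influenzae", "d", 2))
```

The optional R and M prefixes mentioned above are deliberately left out of this sketch, since they are commonly omitted for type I systems.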

Classification. R-M systems are classified on the basis of their composition and cofactor requirements, the nature of their target sequence, and the position of the site of DNA cleavage with respect to the target sequence. Currently three distinct, well-characterized types of classical R-M systems are known (I, II, and III), although a few do not fit well into any of these. The first R-M systems to be identified, those characteristic of E. coli K-12 and E. coli B, were designated type I, but the type II systems became better known, since they, unlike type I restriction enzymes, cut DNA into discrete fragments. A summary of the properties of the principal types of classical R-M systems is given in Fig. 1, and the reviews selected (5, 10, 18, 65, 83, 92, 118, 144, 149, 170, 192) include some of particular historical significance.


FIG. 1.   Distinguishing characteristics and organization of the genetic determinants and subunits of the different types of R-M systems. ENase, restriction endonuclease; Mtase, methyltransferase. Modified with permission from a figure in reference 83.


INTRODUCTION TO TYPE I R-M SYSTEMS

General Characteristics

Type I R-M systems are multifunctional enzymes that can catalyze both restriction and modification. S-Adenosylmethionine (AdoMet) is the cofactor and methyl donor for the methyltransferase activity; the endonucleolytic activity requires ATP, AdoMet, and Mg²⁺. The nucleotide sequences recognized by type I enzymes are asymmetric and comprise two components, one of 3 or 4 bp and the other of 4 or 5 bp, separated by a nonspecific spacer of 6 to 8 bp. All known type I enzymes methylate adenine residues, one in each component of the target sequence, but on opposite strands. The type I R-M enzyme binds to its target sequence, and its activity as an endonuclease or a methyltransferase is determined by the methylation state of the target sequence. If the target sequence is unmodified, the enzyme, while bound to its target site, is believed to translocate, or pull, the DNA towards itself simultaneously in both directions in an ATP-dependent manner. This translocation process brings together enzymes bound to different sites on the same molecule, and it is thought that DNA cleavage occurs when translocation is impeded, either by collision with another translocating complex or by the topology of the DNA substrate.
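The bipartite layout of a type I target sequence can be expressed as a small data structure. The sketch below is illustrative only: the class and function names are hypothetical, the target is written in the usual shorthand (for example TCA(N7)RTTC, the EcoDXXI sequence quoted later in this review), and the size limits are simply the ranges stated in the paragraph above.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class BipartiteTarget:
    five_prime: str   # first specific component
    spacer: int       # number of nonspecific base pairs
    three_prime: str  # second specific component

    def fits_type_i_pattern(self):
        """One component of 3 or 4 bp, the other of 4 or 5 bp, spacer of 6 to 8 bp."""
        small, large = sorted((len(self.five_prime), len(self.three_prime)))
        return small in (3, 4) and large in (4, 5) and 6 <= self.spacer <= 8


def parse_target(text):
    """Parse shorthand such as 'TCA(N7)RTTC' into its three parts."""
    match = re.fullmatch(r"([A-Z]+)\(N(\d+)\)([A-Z]+)", text)
    if match is None:
        raise ValueError(f"not a bipartite target: {text!r}")
    return BipartiteTarget(match.group(1), int(match.group(2)), match.group(3))


ecodxxi = parse_target("TCA(N7)RTTC")
print(ecodxxi, ecodxxi.fits_type_i_pattern())
```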

A type I restriction enzyme comprises three subunits encoded by three closely linked genes, hsdR, hsdM, and hsdS. The acronym hsd was chosen at a time when R-M systems were referred to as host specificity systems and hsd denotes "host specificity of DNA." hsdM and hsdS are transcribed from the same promoter, but hsdR is transcribed from its own promoter. The two subunits encoded by hsdM and hsdS, HsdM and HsdS (often referred to as M and S), are both necessary and sufficient for methyltransferase activity. The third subunit (HsdR or R) is required for restriction. The S (specificity) subunit includes two target recognition domains (TRDs) that impart target sequence specificity to both the restriction and modification activities of the complex. HsdM includes the binding site for AdoMet and the active site for DNA methylation; HsdR includes the active site for ATP hydrolysis and other sequences essential for DNA translocation and endonuclease activity.

The Family Concept

The finding that type I R-M systems exist as closely related members of a family has been of fundamental value to their analysis (10). Evidence for related systems was first indicated by the demonstration that mutants with defects in the allelic genes encoding EcoKI and EcoBI could complement each other. On the basis of such tests, it was inferred that each enzyme comprised three subunits, that the subunits of EcoKI and EcoBI were interchangeable, and that the subunit encoded by one gene, hsdS, confers target sequence specificity on the multimeric complex (22, 58, 69). EcoKI and EcoBI became the founder members of a family of type I systems, type IA. The essential difference between two members of one family resides in the regions of the HsdS subunit that confer sequence specificity.

Hybridization screens of bacterial DNAs and serological screens of bacterial extracts first suggested that allelic genes might also encode sufficiently dissimilar type I R-M systems to warrant their separation into different families (122). As expected, the nucleotide sequences of the hsd genes for EcoKI and EcoBI would hybridize to each other and antibodies raised against EcoKI reacted with EcoBI, but in contrast, DNA probes comprising the EcoKI genes failed to hybridize with those of E. coli 15T⁻, which encoded EcoAI; similarly, antibodies against EcoKI did not cross-react with EcoAI. The hsd genes in these two strains behave as alleles in genetic tests (7) but have very different nucleotide sequences (34, 80). EcoAI defines a second family of type I systems, type IB.

A third family, type IC, headed by EcoR124I, includes plasmid-encoded members (141) and a chromosomally encoded relative (EcoprrI) identified in a natural isolate of E. coli. The first representative of a fourth family, ID (180), is the R-M system of Salmonella enterica serovar Blegdam, identified initially on the basis of biological tests (26). The genes encoding the ID systems, but not those for EcoprrI (183), map to the same region of the E. coli chromosome as those for type IA and type IB. Currently, each type I R-M system identified in E. coli, or in a close relative, has been allocated to one of the four families (the members are listed later in Table 2).


ENZYMES

Introduction

The restriction endonucleases from three families of type I systems (IA, IB, and IC) have been purified and characterized. Each is a large oligomeric complex of relative molecular weight ~400,000 to 500,000, which in the presence of Mg²⁺, ATP, and AdoMet functions as an endonuclease on a DNA substrate that includes unmethylated target sequences but catalyzes the transfer of methyl groups from AdoMet to DNA substrates that include hemimethylated target sequences (187). Early evidence indicated that EcoKI comprises subunits of three sizes, with approximate molecular weights of 135,000, 62,000, and 52,000. Estimates of the relative amounts of each subunit in EcoKI indicated two of each of the larger polypeptides and one of the smallest (118). An analysis of the polypeptides specified by the cloned hsd genes of E. coli K-12 permitted the correlation of polypeptides with genes (151) and the consequent suggestion that the stoichiometry of the subunits within the complex is R2M2S1.

Convincing evidence that endonuclease activity is associated only with the R2M2S1 complex was obtained much later (45). For a time it appeared that EcoR124I and -II, members of the IC family, have one rather than two R subunits and that they retain endonuclease activity in the absence of AdoMet (75). However, it has now been shown that the EcoR124I complex has a tendency to lose one HsdR polypeptide (76) and that AdoMet copurifies with EcoR124I and other type IC enzymes (41, 133; P. Janscak and T. A. Bickle, personal communication). Type IB complexes readily lose both HsdR subunits (167), but it seems probable that the predominant active complex for any type I R-M system includes two HsdR subunits.

A modification enzyme without endonuclease activity was demonstrated in extracts from E. coli B, in addition to the larger complex with both activities (53, 100). Strains of bacteria encoding EcoKI (169), EcoAI (167), and EcoR124I (174) are now known to possess a complex with only methyltransferase activity, as well as the large R-M complex. In all cases, the stoichiometry of the enzyme lacking endonuclease activity is M2S1.
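As a quick consistency check, the approximate subunit sizes quoted above for EcoKI can be summed for the two stoichiometries. The sketch assumes that the 135,000, 62,000, and 52,000 polypeptides correspond to HsdR, HsdM, and HsdS, respectively, which is the assignment implied by "two of each of the larger polypeptides and one of the smallest"; the function names are hypothetical.

```python
import re

# Approximate EcoKI subunit molecular weights; the R/M/S assignment is inferred.
SUBUNIT_MW = {"R": 135_000, "M": 62_000, "S": 52_000}


def parse_stoichiometry(formula):
    """Turn a formula such as 'R2M2S1' into {'R': 2, 'M': 2, 'S': 1}."""
    parts = re.findall(r"([RMS])(\d+)", formula)
    if "".join(subunit + count for subunit, count in parts) != formula:
        raise ValueError(f"unrecognized stoichiometry: {formula!r}")
    return {subunit: int(count) for subunit, count in parts}


def complex_mw(formula):
    """Sum of subunit molecular weights for the given stoichiometry."""
    counts = parse_stoichiometry(formula)
    return sum(SUBUNIT_MW[subunit] * n for subunit, n in counts.items())


for formula in ("R2M2S1", "M2S1"):
    print(formula, complex_mw(formula))
```

With these values the R2M2S1 complex sums to 446,000, within the ~400,000 to 500,000 range given for the purified endonucleases, and the M2S1 methyltransferase sums to 176,000.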

In the following sections, evidence for the functional domains within each subunit is summarized before the activities of the enzyme complexes are considered.

Specificity Subunit---HsdS

Variable sequences. When the sequences of specificity genes (hsdS) of members of the type IA family were searched for differences that would correlate with the recognition of different target sequences, each pairwise comparison revealed two long regions (~450 to 500 bp) of apparently unrelated sequence (referred to as variable regions), in addition to some minor differences in the intervening conserved sequences. Together, the two variable regions comprise the major portion of the gene, but given the bipartite nature of the target sequences, it was inescapable that each variable region might encode polypeptide sequences responsible for the recognition of one component of the target sequence (62). Subsequently, comparisons of the nucleotide sequences of hsdS genes within the IB (80), IC (64), and ID (A. J. B. Titheradge and N. E. Murray, unpublished data) families also revealed two long variable regions flanking a conserved region, as shown for a IC member in Fig. 2a.


FIG. 2.   Specificity polypeptide of EcoR124I. (a) Organization indicating the two variable regions (TRDs) and the regions conserved in sequence for all members of one family. Repeats of similar sequence (below the arrows) are identified by their conserved marking. The number of the amino acid residue at the beginning and end of each region is given below the diagram. (b) Model of Kneale (86), in which the repeated sequences form linkers joining the TRDs in a rotationally symmetrical configuration. The nucleotide sequences identified by each TRD are shown. (c) Consequent model of the methyltransferase, in which the two HsdM subunits bind to the linker region to generate an enzyme with pseudodyad symmetry.

Analyses of hybrid hsdS genes provide support for the notion that the variable regions encode TRDs. It was demonstrated for the IA family that the specificity conferred by a chimeric HsdS polypeptide has one half of its recognition sequence in common with each parental target sequence (55, 127), a finding confirmed for members of the IB (179) and IC (64) families. It was demonstrated for the first hybrid hsdS gene that any amino acid differences in the central conserved region are irrelevant to sequence specificity and that when two TRDs from different families recognize the same nucleotide sequence, they have significant similarity, e.g., 50% identity, despite the absence of much similarity between the sequences of the remainder of HsdS or of HsdM and HsdR (33).

Conserved sequences. Two structural roles are anticipated for the conserved regions of HsdS polypeptides. These are the maintenance of the relative positions of the two TRDs and the specific associations of HsdS with other components of the R-M complex. Evidence that the conserved peptide sequence that links the two TRDs serves to position the TRDs on the target sequence came from the chance occurrence of a derivative of EcoR124I with a new specificity. The sequence of the hsdS gene of this R-M system, EcoR124II (originally referred to as EcoR124/3), is consistent with a recombination event in which unequal crossing-over between a misaligned 12-bp duplication in the central conserved sequence of the hsdS gene, or slipped mispairing during DNA replication, has led to a triplication of that sequence. As a result, the two components of the target sequence of EcoR124II are separated by 7 bp rather than the 6 bp in the target sequence of EcoR124I (63, 142). The region in which the four additional amino acids occur is predicted to be strongly helical, and the increase in length is likely to be sufficient to accommodate a change in spacing of 1 bp (142).

Imposition of symmetry. In addition to the regions conserved in HsdS subunits of the same family, there are regions of similarity within each HsdS subunit (Fig. 2a). For EcoKI (type IA), this was noted by Argos (8) for regions in the central and carboxy-terminal conserved sequences. In the IB family, a segment of amino acids is common to both the amino-terminal and central conserved sequences (Fig. 3c), in addition to regions similar to those in EcoKI (80). The analysis of sequence similarities in the specificity subunit of a type IC member indicated that a sequence in the central conserved region was incompletely repeated at the carboxy terminus, with the remainder of the repeat located at the amino terminus (Fig. 2a). This "split repeat" led Kneale (86) to propose a model in which the amino and carboxy termini of HsdS are in close proximity so that they associate to form a linker domain similar to that provided by the central conserved sequence (Fig. 2b). These two domains of similar sequence would then dictate the symmetrical association of the two HsdM subunits (Fig. 2c). The repeated sequences identified within the HsdS subunits of the type I systems of the gram-positive bacterium Lactococcus lactis (153) are remarkably reminiscent of those identified in the type IC family, and a close analysis of members of the type IA family finds some evidence for a short sequence at the amino terminus of HsdS that has similarity with part of the repeat in the central conserved region (86).


FIG. 3.   Experimental support for the symmetrical arrangement of the TRDs (large open circles or arrows) within specificity subunits. (a) Predicted arrangement of two truncated subunits of EcoDXXI and the consequent recognition of a hyphenated symmetrical target sequence. (b) The HsdS subunits of StySKI (type IB) and EcoR124I (type IC). The IB family of enzymes have a long N-terminal conserved sequence, while the IC family has a long C-terminal conserved tail. The amino-terminal TRD of StySKI and the carboxy-terminal TRD of EcoR124I (indicated by arrows) have 36% amino acid identity. The DNA targets recognized are indicated. The complement of the EcoR124I 3' target sequence (CGAY) is a degenerate version of the StySKI 5' target sequence (CGAT), consistent with a symmetrical organization of two similar TRDs. (c) Circularly permuted variants of the specificity gene of EcoAI (type IB) can retain activity. The normal organization of regions within an HsdS polypeptide is given in the first diagram, followed by an active but circularly permuted variant. The black segments identify the well-conserved repeats.

The symmetrical organization of the HsdM subunits was implicit in an earlier model (27). The present model (Fig. 2c) and that of Willcock et al. (191) indicate a methyltransferase structure with twofold rotational symmetry in which inversely oriented TRDs will each make contact with one HsdM subunit (113).

This symmetrical configuration of the domains within HsdS is supported by the following observations. (i) Truncated forms of the HsdS subunit of either EcoDXXI or EcoR124I, in which the carboxy half of the polypeptide is missing but the central conserved sequence is retained, associate to form an active enzyme that recognizes a bipartite target sequence that is a palindromic version of the trinucleotide specified by the amino-terminal TRD (1, 113). Hence, EcoDXXI recognizes TCA(N7)RTTC, while the derivative with a truncated HsdS polypeptide recognizes TCA(N8)TGA (Fig. 3a). (ii) The amino-terminal TRD of StySKI and the carboxy-terminal TRD of EcoR124I are very similar (36% identity) despite being members of different families. This similarity correlates with a similarity of target sequence evident in the complementary strands (Fig. 3b), a finding consistent with the amino- and carboxy-terminal TRDs being inversely oriented when they bind their target sequences (179). (iii) Permutations of the hsdS gene of EcoAI have been made in which HsdS remained active when a sequence from the N-terminal conserved region was transferred to the carboxy terminus (Fig. 3c) (74).
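The palindromic nature of the truncated-subunit target in observation (i) can be checked directly: TCA(N8)TGA reads the same on both strands, whereas the wild-type EcoDXXI target TCA(N7)RTTC does not. The sketch below uses hypothetical helper names and an invented test sequence; it handles only the ambiguity codes R, Y, and N, and `find_sites` scans only the strand it is given.

```python
import re

IUPAC_REGEX = {"A": "A", "C": "C", "G": "G", "T": "T",
               "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTRYN", "TGCAYRN")


def expand(target):
    """Write 'TCA(N7)RTTC' out in full as 'TCANNNNNNNRTTC'."""
    return re.sub(r"\(N(\d+)\)", lambda m: "N" * int(m.group(1)), target)


def reverse_complement(seq):
    """Reverse complement, keeping R/Y/N ambiguity codes consistent."""
    return seq.translate(COMPLEMENT)[::-1]


def is_symmetric(target):
    """True if the target is identical to its own reverse complement."""
    full = expand(target)
    return full == reverse_complement(full)


def find_sites(dna, target):
    """0-based start positions of the target on the given strand."""
    pattern = "".join(IUPAC_REGEX[base] for base in expand(target))
    # A lookahead is used so that overlapping sites would also be reported.
    return [m.start() for m in re.finditer(f"(?={pattern})", dna.upper())]


dna = "GGTCAACGTACGATTCCC"  # invented sequence containing one EcoDXXI site
print(find_sites(dna, "TCA(N7)RTTC"))
print(is_symmetric("TCA(N7)RTTC"), is_symmetric("TCA(N8)TGA"))
```

For an asymmetric target, sites on the opposite strand can be found by passing `reverse_complement(dna)` to `find_sites`.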

Association of subunits. Interaction with other subunits was an anticipated role for the conserved sequences of HsdS polypeptides (62). Convincing evidence in support of this prediction derives from the truncated derivatives of HsdS polypeptides of EcoDXXI (Fig. 3a) and EcoR124I. Two truncated polypeptides can substitute for one normal HsdS subunit if the subunits retain their ability to interact with HsdM. Active complexes, i.e., those in which HsdS retains the ability to bind HsdM, are recognized by their new specificities (1, 109, 113). Analyses of truncated polypeptides implicate the three conserved regions of HsdS (Fig. 2a) in binding HsdM (109). As predicted, a deletion within the central conserved region of an HsdS subunit prevents binding to HsdM (189).

Point mutations have given some indication of the regions of HsdS that are involved in protein-protein interaction. One mutation in the hsdS gene of EcoKI destabilized the methyltransferase at high temperature, consistent with a defect in the binding of HsdM to HsdS (197; D. T. F. Dryden, V. Zinkevich, and K. Firman, personal communication). Rare mutations in the hsdS genes of both EcoKI (131) and EcoR124I (189) lead to a restriction-deficient, modification-proficient (r⁻m⁺) phenotype, the phenotype predicted if HsdR can no longer associate with the methyltransferase. These mutations may occur preferentially at the borders of conserved and variable regions. Weiserova and Firman (189) suggested that the mutations identify regions of importance for the assembly of the R-M complex, though not necessarily sites of interaction between HsdS and HsdR.

DNA specificity. The HsdS subunits of type I R-M enzymes do not possess obvious DNA-binding motifs within their TRDs of 150 to 180 amino acids. No three-dimensional structural analysis of a type I R-M enzyme or any component of a type I R-M enzyme has been achieved. There is, therefore, no direct evidence to identify which amino acids within a TRD interact with DNA. Two approaches to understanding the mechanism of DNA recognition have been made: one relies on modeling, and the other relies on random mutagenesis of a TRD. Fortunately, the strongest predictions from modeling were obtained for the amino-terminal TRD of EcoKI, the TRD subjected to mutagenesis.

Molecular model. Sturrock and Dryden (166) supplemented sequence data from known type I R-M systems with data for putative systems identified in genomic sequences to derive a molecular model for the recognition region of a TRD. These authors used amino acid sequences combined with secondary-structure prediction to align 51 TRDs. The inclusion of secondary-structure prediction enhances the strength of the amino acid alignments, making distant similarities more apparent. This is particularly helpful because the amino acid identities between TRDs that recognize different target sequences are usually less than 25%. The alignments suggest a common tertiary structure, and the secondary-structure predictions show strong similarity to the known structure of the TRD of the type II HhaI methyltransferase. Of the 51 sequences compared, that of the amino-terminal TRD of HsdS from EcoKI shows the closest similarity to the TRD of M·HhaI, sufficiently so to suggest that EcoKI, like M·HhaI, might interact with DNA via two short polypeptide loops flanking a β-strand.

The experimental approach of O'Neill et al. (131) aimed to localize the protein-DNA interface by random mutagenesis. It was anticipated that amino acids that could be changed without loss of R-M activity were unlikely to be involved in target recognition, while substitutions that resulted in an r⁻m⁻ phenotype would include amino acids involved in a specific interaction with DNA. Most of 101 substitutions affecting 79 of the 150 residues, including quite severe changes, had no detectable effect on phenotype; changes at only seven positions conferred an r⁻m⁻ phenotype. Five of the seven residues identified are in an interval between residues 80 and 110 which includes the predicted loop-β-strand-loop: the model places two of these (residues 91 and 107) close to the protein-DNA interface (Fig. 4a and b). Three further residues (92, 95, and 103), all close to the DNA in the model, were changed by site-directed mutagenesis, and substitutions for each impaired both modification and restriction. Additional residues within the predicted β-strand and second loop have been changed (M. O'Neill and N. E. Murray, unpublished data); a mutant with a substitution for residue 105 is r⁻m⁻, while those with substitutions for residues 94 and 106 retain only modification activity (r⁻m⁺).


FIG. 4.   Model of the amino acid segment comprising residues 43 to 157 of the amino terminus of EcoKI interacting with DNA as proposed by Sturrock and Dryden (166). (a) Side view. (b) Bird's eye view. Residues in red are substitutions that cause the loss of restriction and modification; those in yellow have no detectable effect on activity. Residue 141, in gray, was previously thought to lead to a loss of both activities but is now known to impair rather than inactivate. The DNA structure is from the complex of M · HhaI with its DNA target sequence (85) and therefore shows an extrahelical cytosine rather than adenine. The figure is modified with permission from that published in O'Neill et al. (131).

Two substitutions (residues 57 and 141) outside the predicted loop-β-strand-loop region have been reported to confer an r-m- phenotype (131), but neither residue appears to be relevant to specificity. Only one of three substitutions identified for residue 57 conferred an r-m- phenotype, and it is now known that this phenotype results from a mutation in hsdM, not the change at position 57 (M. O'Neill and N. E. Murray, unpublished data). For residue 141, additional tests indicate that the G-to-A substitution severely impairs but does not destroy the restriction and modification activities. A tyrosine residue previously shown by cross-linking to be in close proximity to the DNA (28) is also outside the loop-β-strand-loop region. This tyrosine residue has been replaced by cysteine, and the mutant retains modification activity (M. O'Neill and N. E. Murray, unpublished data). Currently, the genetic data and the analysis of mutant EcoKI enzymes correlate well with the predictions of the structural model for the TRD.

Two lines of evidence from other systems are also consistent with the structural model. First, Taylor et al. (176) used chemical modification to identify accessible lysine residues in EcoR124I. Three lysine residues in the carboxy-terminal TRD were especially susceptible to modification in the absence of bound DNA. The most strongly modified residue lies within the second of the proposed loops (166). Second, the three residues identified by chemical modification of EcoR124I are conserved in the amino-terminal TRD of the type IB system, StySKI (179), which recognizes the same target sequence as the carboxy-terminal TRD of EcoR124I.

HsdM needed for DNA binding. The HsdS subunits of EcoKI (193) and EcoR124I (132) are insoluble in the absence of HsdM, but the HsdS subunit of EcoR124I has been produced in soluble form as a glutathione-S-transferase (GST) fusion product (97). The resulting fusion protein is unable to make specific complexes with DNA. The GST moiety at the N terminus interferes with the recruitment of the second HsdM subunit, preventing assembly of the active methylase; its removal allows HsdS to assemble with HsdM units to form an active modification enzyme (115). These findings are consistent with the model (Fig. 2c) in which the conserved sequence at the N terminus forms part of the region that contacts HsdM. The HsdS subunit of EcoAI, a type IB enzyme, is soluble in the absence of HsdM, but it, too, requires HsdM to make sequence-specific complexes (74). For EcoKI, even the association with one HsdM subunit is sufficient to promote a sequence-specific interaction, though with a higher Kd than the full M2S1 complex (136). It would appear that HsdM is important for the positioning of the domains of HsdS, at least in part by maintaining its rotationally symmetrical configuration (74, 86).

Modification Subunit---HsdM

Like HsdS, HsdM is essential for restriction as well as modification. AdoMet, the methyl donor for modification, is an essential cofactor for restriction (117), and early experiments showed that AdoMet binds to the HsdM subunit of EcoKI (24). Consistent with this, the amino acid sequence predicted from the nucleotide sequence of the hsdM gene of E. coli K-12 (106) includes a version of a motif, N/DPPF/Y/W, that is characteristic of both N-6 adenine and C-4 cytosine methyltransferases and is now generally referred to as motif IV (for reviews of methyltransferases, see references 29, 42, and 129).

Active site. The HsdM subunit is well conserved (~90%) within a family (156, 183), but comparisons between members of different families generally indicate only 25 to 30% identity (156). Site-directed mutagenesis of the hsdM gene of E. coli K-12 demonstrated the relevance of two conserved sequences to methyltransferase activity (191). A change in the sequence predicted to be motif I (D/E/SXFXGXG) abolished the binding of AdoMet, while changes in motif IV (N/DPPF/Y/W) prevented catalysis but did not affect binding of the cofactor. Nevertheless, a tryptophan residue substituted for phenylalanine in motif IV of EcoKI is sufficiently close to the AdoMet-binding site to enhance cross-linking with the methyl donor (191). Subsequently, molecular modeling of type I HsdM subunits, based on sequence alignment and predicted secondary structures, suggested a domain in HsdM subunits which resembles that of the γ class of type II N-6 adenine methyltransferases (44) and includes the six motifs found in the catalytic domain of M · TaqI (99). The model of the catalytic domain for the HsdM subunit of EcoKI is consistent with the location of proteolysis-sensitive sites (31) and the mutational analysis of Willcock et al. (191). In this model, the glycine at amino acid residue 177 (motif I) is located close to the cofactor, and the phenylalanine at residue 269 (motif IV) is positioned at the edge of the active site, where it can interact with the target adenine should this be flipped out of the DNA helix, as has been shown for the cytosine residue during methylation by M · HhaI (85). Base flipping allows access to the target base and is therefore predicted to be common to all DNA methyltransferases as well as some DNA repair enzymes (150).
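The two motif patterns quoted above can be read as simple sequence patterns. The following is a minimal sketch, assuming that X stands for any amino acid and that each slash-separated group lists the alternatives allowed at one position. The function name and the toy sequence are hypothetical illustrations and are not taken from any real HsdM sequence or published tool.

```python
import re

# Motif patterns as written in the text; "X" is read as "any residue".
MOTIF_I = re.compile(r"[DES].F.G.G")   # D/E/SXFXGXG (AdoMet binding)
MOTIF_IV = re.compile(r"[ND]PP[FYW]")  # N/DPPF/Y/W (catalysis)


def find_motifs(protein_seq):
    """Return {motif name: [(1-based start, matched text), ...]}.

    Matches are non-overlapping, as returned by re.finditer.
    """
    seq = protein_seq.upper()
    hits = {}
    for name, pattern in (("I", MOTIF_I), ("IV", MOTIF_IV)):
        hits[name] = [(m.start() + 1, m.group()) for m in pattern.finditer(seq)]
    return hits


# Invented toy sequence for illustration only (not a real HsdM sequence).
toy = "MKTDPFCGSGAAALLNPPFGKK"
print(find_motifs(toy))
```

A pattern match of this kind only flags candidate motifs; the assignments discussed in the text rest on alignment, modeling, and mutagenesis rather than on pattern matching alone.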

Modification Enzyme

A complex with only methyltransferase activity has been purified from representatives of the type IA, IB, and IC families. The stoichiometry of this active methyltransferase is M2S1 (167, 174). EcoKI and EcoBI (type IA) dissociate to M1S1 and M1 (43, 45, 100, 136), but for EcoKI the active form has been shown to be M2S1 (43). The type I modification enzymes catalyze the transfer of the methyl group from AdoMet to the N-6 position of specific adenine residues in their respective target sequences, probably utilizing the cationic-π interactions proposed for those type II enzymes that methylate adenine at the N-6 position or cytosine at the N-4 position (152).

Discrimination of methylation state. The three type I families differ significantly in their relative responses to hemimethylated versus unmethylated target sequences. EcoAI, the type IB representative, like most known methyltransferases, shows little or no preference for hemimethylated DNA (167). In contrast, in vitro experiments indicate that members of the IA and IC families have a very strong preference for hemimethylated DNA (approximately 100-fold compared to unmethylated DNA); the data for the type IA system have been obtained from experiments with both plasmid DNA (169) and oligonucleotide duplexes (43) as substrates, and those for IC were obtained from oligonucleotide duplexes (175). Consistent with the in vitro evidence, unmodified phage DNA is a poor substrate in vivo for methylation by a type IA modification system (82, 105), but in apparent contradiction to the in vitro evidence, unmodified phage DNA was found to be a good substrate in vivo for methylation by an r-m+ type IC system (S. Makovets and N. E. Murray, unpublished data).

Mutants of potential relevance to the discrimination between hemimethylated and unmethylated target sequences (m* mutants) were selected for EcoKI by their enhanced ability to modify unmethylated target sequences in phage λ. The restriction proficiency of all the mutants was impaired, though usually only slightly. Mutants selected by their r-m+ phenotype were also found to have an m* phenotype. Analysis of 22 m* mutants identified nine residues (14 substitutions), all within the N-terminal third of HsdM (82). Five amino acid substitutions, affecting three of these nine residues, resulted in the absence of any detectable restriction activity in vivo. Methyltransferase has been purified from each of four m* mutants, two r- and two r+/- (D. T. F. Dryden and N. E. Murray, unpublished data). The rate of methylation of unmethylated DNA was enhanced for all the enzymes, though to very different degrees (from 2- to 240-fold). The enzyme with the greatly enhanced rate of activity was shown to have a marked reduction in preference for hemimethylated DNA. An enhanced ability to modify unmodified target sequences could lead to competition between the two activities of the R-M complex and a consequent reduction in restriction. Competition does not entirely explain the present data, in which no direct correlation was found between the enhanced rate of methylation and the deficiency in restriction. The complete absence of restriction activity may require a more germane explanation than merely the inability to compete with the enhanced modification activity. It could be, as suggested by Kelleher et al. (82), that in these mutants an unmethylated target sequence does not trigger the enzyme to adopt the restrictive mode.

DNA binding. Enzymes of the IA and IC families have been investigated by gel retardation and footprinting experiments. The studies were aimed not only at understanding how the enzymes discriminate their target sequences from other DNA sequences, but also how they distinguish the methylation state of their target sequence.

DNA-binding studies for the EcoKI methyltransferase showed that differences in binding affinity contribute to the distinction between specific and nonspecific DNA sequences (135, 136). However, the methylation state of the recognition sequence had no effect on the binding affinity, suggesting that the preference for a hemimethylated rather than an unmethylated DNA substrate is effected mainly at the level of catalysis. Similarly for the type IC EcoR124I methyltransferase, an increase in kcat rather than a decrease in Km was suggested to be the reason for the faster methylation of hemimethylated DNA (175). Both M · EcoKI and M · EcoR124I cover approximately 25 to 30 bp of DNA, as determined by exonuclease III and DNase I footprinting experiments (114, 135, 136, 175). The efficiency of binding by M · EcoKI is enhanced by AdoMet (135), and the inactive, partially assembled form (M1S1) covers the same length of DNA as the active methyltransferase (M2S1), indicating that the two subunits of HsdM are located on either side of HsdS, away from the helical axis of the DNA (136).

Interference footprinting experiments show that M · EcoKI (134) and M · EcoR124I (116) make contacts in the major groove of the DNA helix. Only in the case of the IA family has it been possible to examine the effect of the cofactor AdoMet in any detail, given that AdoMet copurifies with IB and IC enzymes. The presence of AdoMet has a striking effect on the interference pattern for M · EcoKI bound to unmodified DNA but not when bound to either hemimethylated or fully methylated DNA. For M · EcoKI, the methylation state of the target sequence therefore affects the conformation of the protein at the DNA interface, and it would appear that AdoMet could play an important role in the discrimination between unmodified and modified DNA (134).

Base flipping. The footprinting of M · EcoR124I provides the following circumstantial evidence that the adenine residues flip out to provide access for methylation (114). Two sites that are hypersensitive to hydroxyl radical cleavage have been identified within the target sequence, one per strand, each associated with the adenine that is the substrate for methylation. The authors argue that it is unlikely to be fortuitous that the hypersensitive site on each strand coincides with the adenine that is the target for methylation. A plausible explanation is that binding of the enzyme induces a marked conformational change in the structure of the sugar-phosphate backbone of the DNA in the region around those bases that are the targets for methylation. Mernagh et al. (116) also showed that M · EcoR124I binds more strongly when either uracil or an abasic site is substituted for one of the target adenines. Again, this would be consistent with the idea that the adenine residues, like the cytosine residues in the HhaI target sequence (85, 150), are flipped out during the methylation reaction. Base analog experiments with the HhaI methyltransferase have shown that this enzyme binds more strongly to substrates in which the target base is mismatched (84).

Restriction Subunit---HsdR

The polypeptide sequences of HsdR subunits representative of the four families of type I systems have only 20 to 30% identity (180). All these sequences, however, include motifs characteristic of ATP-binding proteins, consistent with the ATP dependence of restriction. In addition, there are other conserved sequences indicating the presence of motifs previously identified in ATP-dependent helicases and putative helicases (59, 123, 180). There are several superfamilies of ATP-dependent helicases. Members of superfamilies 1 and 2 include the motif DEAD or a variant of this motif and are often referred to as DEAD-box proteins (103). The motifs in HsdR subunits are indicative of superfamily 2 (59). The structures of DEAD-box helicases indicate that the motifs form a nucleoside triphosphate (NTP)-binding pocket and a portion of a nucleic acid-binding site (88, 186; see reference 64a for a recent review). It is suggested that the conserved motifs define an "engine" that powers translocation on single-stranded DNA and unwinding of duplex DNA (64a). Velankar et al. (186) present an elegant model for the coupling of the energy to DNA translocation in which the enzyme "inchworms" along a single DNA strand using unpaired bases. Nearly all reported mutations affecting DEAD-box motifs impair the hydrolysis of NTP or the coupling of NTP hydrolysis to nucleotide unwinding (64a). Mutagenesis of the hsdR gene of EcoKI showed that each of the seven DEAD-box motifs is essential for restriction in vivo (35, 188).

The number of bacterial genomes for which sequences are available has increased significantly since the alignment of HsdR sequences reported by Titheradge et al. (180). The additional sequences have improved the reliability of alignments and the prediction of the secondary structure. Comparisons of HsdR sequences with Rep and PcrA, DNA helicases of known structure (88, 186), suggest that the HsdR subunits have the same secondary structure as the helicases in the region that includes the DEAD-box motifs (36). In Rep and PcrA, the motifs reside in two domains that couple ATP hydrolysis to DNA helicase activity (20). The fragmentation patterns produced by limited proteolysis of HsdR are consistent with the location of the DEAD-box motifs in two domains similar to those first observed in Rep and PcrA, in which the DEAD-box motifs cluster around a cleft between two domains (36).

The earlier alignments of HsdR sequences (180) detected an additional conserved sequence in the N-terminal part of the polypeptide which proteolysis experiments indicate to be in a separate domain from those including the DEAD-box motifs (36). This additional conserved sequence (Fig. 5) has similarities with motifs associated with DNA nicking in both type II restriction enzymes and the RecB family of nucleases. Site-directed mutagenesis proved the relevance of this motif to the endonuclease activities of EcoAI (78) and EcoKI (36, 37).


FIG. 5.   Domains and motifs of HsdR of EcoKI. The N- and C-terminal regions are omitted, since their roles are not known. The two domains that include the DEAD-box motifs correlate with 1A and 2A, as determined for the structures of DNA helicases (36). Substitutions for the underlined amino acids confer a restriction-deficient phenotype.

R-M Complex

The earliest biochemical interest in restriction enzymes was as proteins that made specific interactions with DNA by a recognition process assumed to be intolerant of errors. The question of specificity for type I R-M systems extends to include the mechanism by which each enzyme not only recognizes the methylation state of its target sequence, but then reacts as a methyltransferase if the target sequence is already hemimethylated or as an endonuclease if it is unmethylated. Finally, of special interest is the mechanism by which type I R-M enzymes translocate DNA for considerable distances before breaking phosphodiester bonds in both strands of the DNA duplex. Associated with this mechanism is the problem of what halts translocation and triggers DNA cleavage. Complete answers to all of these complex molecular problems are not yet available (see reference 170 for a recent review).

DNA binding. The restriction pathway is presumed to be initiated by the ATP-dependent conformational change originally reported for EcoKI (19) but analyzed only recently by footprinting (137). EcoKI remains the best-studied R-M complex in terms of sequence-specific binding, the effect of the cofactor AdoMet on representatives of the type IB and IC enzymes being less easy to assess because it is difficult to separate AdoMet from these enzymes (41, 133; P. Janscak and T. A. Bickle, personal communication). EcoKI binds to DNA in the absence of ATP. A strong footprint of 42 to 46 bp is detected only if the DNA includes a target sequence. On the addition of ATP in the presence of AdoMet, unmethylated or hemimethylated target sequences remain protected, and the footprint shrinks to 30 bp; both fully modified target sequences and nonspecific DNA lose all protection. ATP and AdoMet are both needed for the conformational change in response to target sequences, though S-adenosyl homocysteine may be substituted for AdoMet and a nonhydrolyzable analog may be substituted for ATP (137).

Recognition of methylation state. In the presence of the functional cofactors, EcoKI methylates hemimethylated DNA but initiates ATP-dependent translocation if the target sequence is unmethylated. AdoMet has long been implicated in determination of the methylation state of the target sequence. Burkhardt et al. (27) suggested that the HsdM subunits would use the methyl group of the AdoMet as a probe for the presence of the methylated base in the major groove. For modified DNA, steric hindrance from the methyl groups on the adenine residues could prevent a conformational change. Only if both adenines were unmethylated would the HsdM subunits enter the major groove to give the "closed conformation," in which the HsdR subunits are appropriately positioned to initiate DNA translocation. Recent data indicate that the TRDs of HsdS enter the major groove (28), and by analogy with type II modification enzymes, the bases that are to be modified are anticipated to flip out of the helix (116). Nevertheless, AdoMet could still serve as a probe for the methylation state of the target DNA (134). Steric hindrance by AdoMet could block the positioning of a methylated base within the active site after the adenine residues become exposed. This refinement of the model maintains a critical role for AdoMet in restriction as well as modification, and a simple prediction is that loss of AdoMet binding to HsdM should cause an r-m- phenotype. Consistent with this, a mutant of EcoKI in which AdoMet binding to the HsdM subunit is blocked as the result of a single-amino-acid substitution in motif I (D/E/SXFXGXG) (191) is deficient in restriction as well as modification (V. Doronina and N. E. Murray, unpublished data).

Communication. The means by which the enzyme communicates the methylation state of its target sequence has been probed by looking for mutations in hsdM or hsdS that prevent access to the restriction pathway. Analysis of an hsdM or hsdS mutant with an r-m+ phenotype could identify an enzyme locked into the modification mode, or it could indicate that assembly of the R-M complex is prevented. The r-m+ mutants of the EcoKI system resulted from substitutions in the amino-terminal third of HsdM and also showed enhanced modification, i.e., they were m* mutants, more suggestive of defects in communication than assembly (82). Indeed, one of these mutants has been shown to make an R-M complex that is defective in restriction (D. T. F. Dryden and N. E. Murray, unpublished data).

Subunit interactions. Mutations conferring an r-m+ phenotype have been identified in the hsdS genes of type IA and IC systems. In EcoR124I (189) these mutations are at the junction between a conserved region and a TRD, while in EcoKI they can be close to the junction (197) or at various positions within the TRD (131). A cautious interpretation of r- phenotypes for EcoKI, however, is required by the recent discovery that mutations in hsdM that result in inadequate modification induce host-mediated alleviation of restriction (111) (see section on modulation of restriction activity). A mutation in hsdS may prevent binding of HsdS to HsdM or, as suggested by Weiserova and Firman (189), the TRD may influence the precise positioning of HsdR, perhaps in response to the methylation state of the target sequence. HsdM interacts and communicates with HsdR, but for HsdS the evidence is still unclear.

Functional analyses of motifs. The hsdR gene for EcoKI is now well charted, with mutations in regions that encode sequences common to all type I R-M systems (Fig. 5). These mutations indicate that each conserved region is important for restriction. Amino acid substitutions in the DEAD-box motifs do not prevent the conformational change associated with tight binding to the target sequences in the presence of AdoMet and ATP. These proteins, however, are all deficient in ATPase activity and DNA translocation, consistent with a role for the DEAD-box motifs in the coupling of ATP hydrolysis to DNA translocation (35, 37). The failure to translocate DNA was demonstrated by an in vivo assay in which wild-type EcoKI translocates the T7 genome from the phage particle into the bacterial cell (57).

Conservative substitutions within the amino acid sequence characteristic of endonucleases do not block either the ATPase or DNA translocase activities of EcoAI (78) and EcoKI (37). These mutations are believed to be within a separate domain from those of the DEAD-box motifs (37), and, as expected from their sequence identity with the active sites of type II restriction endonucleases, they block the nicking and cutting activity of the R-M complex. This implies a common mechanism for the hydrolysis of phosphodiester bonds by type I and type II systems. Early reports that the ends of the DNA fragments generated by type I endonucleases are refractory to terminal labeling by the transfer of phosphate groups to the 5' ends (53) may reflect the technical problem imposed by the absence of nucleotide specificity at the site of cleavage.

DNA translocation. The type I R-M enzymes cleave DNA at variable positions remote from their recognition sequences (68). Electron microscopy has been used to identify possible intermediates in the reaction leading to cleavage of both linear and covalently closed plasmid DNA (52, 195). The ATP-dependent formation of DNA loops, both twisted and untwisted, has been detected. A model was proposed in which the enzyme binds to its recognition sequence, makes a second nonspecific contact with DNA, and subsequently moves the DNA past the bound complex, generating loops in an ATP-dependent process (195). It was also suggested that if type I R-M enzymes were topoisomerases, they would overcome the conformational problems encountered should the enzyme, while anchored to its target sites, maintain contact with the major groove as the DNA is pulled towards the complex. Elegant experiments using catenated plasmid DNA (171) provided direct confirmation that communication between recognition and cleavage sites stems from a process in which the enzyme follows the contour of the DNA substrate. Meanwhile, using linear DNA, Studier and Bandyopadhyay (165) provided evidence for a model in which DNA is pulled past the bound protein on both sides of the recognition sequences, and endonuclease activity results when two translocating complexes meet (Fig. 6). This model was derived from the analysis of products obtained from the digestion of T7 DNA with EcoKI. When the restriction reaction was synchronized by the addition of ATP to protein-DNA complexes in the presence of AdoMet, diffuse bands of DNA were detected on gels, consistent with the production of relatively discrete DNA fragments. The ends of the fragments, according to the model, would focus around the midpoints between the adjacent recognition sequences. Earlier, though less discriminating, in vivo experiments suggested the cutting of DNA between recognition sites (23). 

According to the Studier model, a common stimulus for cleavage of the DNA could be the hindrance of translocation (Fig. 6). Recent experiments demonstrate the stimulation of DNA cutting when two translocating enzymes from different families collide and even when a translocating enzyme encounters a Holliday junction that is unable to migrate (77).
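The prediction that fragment ends focus around the midpoints between adjacent recognition sequences can be illustrated with a short calculation. This is a minimal sketch, assuming that complexes at neighboring sites start translocating at the same moment and at the same rate, so that they meet halfway. The function name and the site coordinates are hypothetical and do not correspond to the EcoKI sites in T7 DNA.

```python
def predicted_cleavage_midpoints(target_positions):
    """Return the midpoint (in bp) between each pair of adjacent target sites.

    Assumes equal, synchronized translocation from both sites, which is a
    simplification of the collision model described in the text.
    """
    sites = sorted(target_positions)
    return [(left + right) / 2 for left, right in zip(sites, sites[1:])]


# Hypothetical target-site coordinates on a linear DNA molecule (bp).
example_sites = [12000, 1000, 5000]
print(predicted_cleavage_midpoints(example_sites))
```

Because real reactions are not perfectly synchronized, the model predicts a distribution of ends centered near these positions, which fits the diffuse bands observed on gels.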


FIG. 6.   ATP-dependent DNA translocation. (a) Model of Studier and Bandyopadhyay (165). EcoKI bound to adjacent target sequences translocates DNA towards itself. Collision blocks translocation and stimulates the nicking of each DNA strand. Two domains of HsdR flanking the DNA are indicated. (b) In this variant, the EcoKI complexes have dimerized prior to translocation (51). (c) When translocation is impeded by some other protein or structure, endonuclease activity is stimulated (77).

Recently, atomic force microscopy has been used to observe the DNA translocation and cleavage process by EcoKI (51). In these experiments, EcoKI bound to two DNA target sites was seen to dimerize prior to the addition of ATP and the initiation of translocation. DNA-dependent dimerization cannot be an absolute requirement for DNA translocation, since a single target is sufficient for DNA translocation (57, 77). Dimerization of bound enzymes, however, could facilitate cooperation between two complexes, thereby enhancing restriction.

The translocation activity of EcoKI assayed by transfer of the T7 genome from the phage particle into the bacterial cell was estimated as ~100 bp of DNA per s (37, 57). The DEAD-box motifs are required for DNA translocation and appear to be organized within domain structures like those deduced for DNA helicases (36). These findings tally with a translocation mechanism that is dependent on helicase activity. If the enzyme follows the helical contour of the DNA while remaining bound to its target sequence, the DNA ahead of the translocating complex will be overwound (+ve supercoils) and that behind will be underwound (-ve supercoils) (36, 171). Recent experiments support this expectation and demonstrate that negative supercoils are generated by EcoAI in the presence of ATP and E. coli topoisomerase I (78a). These topological changes would impede translocation on covalently closed circular DNA in the absence of either a nicking or a topoisomerase activity. DNA nicks dependent upon the endonuclease motif of HsdR have been shown to be irrelevant to translocation in vivo (37) and in vitro (78a). Although no conventional topoisomerase activity has been observed (78a), relief by topoisomerase activity has not been ruled out, and conserved tyrosine residues have been identified within HsdR (37). Should DNA nicking and rejoining not be associated with translocation, it may be necessary to resort to a model in which the HsdR subunits are free to rotate around or detach from the methylase core of the enzyme, which remains bound to the target sequence (37, 78a).
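The scale of the topological problem can be estimated from the figures above. The sketch below assumes the ~100 bp per s rate quoted for EcoKI and a helical repeat of about 10.5 bp per turn for B-form DNA (a textbook value, not given in the text). It further assumes that an enzyme tracking the helix without any swivel displaces one helical turn per repeat. The function name and the example distance are hypothetical.

```python
HELICAL_REPEAT_BP = 10.5  # approximate bp per turn of B-form DNA (assumption)


def translocation_estimate(distance_bp, rate_bp_per_s=100.0):
    """Return (seconds needed, helical turns displaced) for distance_bp.

    The turn count is an upper bound on the supercoils generated ahead of
    (and behind) the complex if the enzyme follows the helix exactly.
    """
    seconds = distance_bp / rate_bp_per_s
    turns = distance_bp / HELICAL_REPEAT_BP
    return seconds, turns


# Hypothetical example: translocating 4,200 bp of DNA.
seconds, turns = translocation_estimate(4200)
print(f"{seconds:.0f} s, about {turns:.0f} turns")
```

Even a few kilobases of translocation would therefore accumulate hundreds of turns on a closed circular substrate, which is why nicking, topoisomerase activity, or rotation of HsdR is considered in the text.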

Currently there is no direct evidence to indicate that type I R-M systems are helicases, and preliminary attempts to demonstrate helicase activity by conventional strand displacement assays have failed (78a; G. P. Davies, personal communication). Earlier experiments, cited in reference 52 but done in 1973, approached this question by using psoralen to introduce cross-links between pyrimidine residues in the strands of T7 DNA. Low levels of psoralen blocked the activity of the RecBCD nuclease but had no detectable effect on EcoBI. These observations were taken to support a translocation mechanism in which EcoBI utilizes only the exterior of the helix, rather than strand separation (52). The effect of psoralen has not been reinvestigated using the refined techniques and substrates currently available; cross-links might trigger cutting if they impede the translocating complex.

The footprints obtained with EcoKI in the presence of AdoMet shorten following the addition of ATP and become similar in length to those found with M · EcoKI. This could be taken as evidence for the loss of HsdR subunits, but Powell et al. (137) have shown that HsdR remains in the complexes formed between EcoKI and the oligonucleotide substrate of 45 bp. These authors suggested that EcoKI has three DNA-binding regions: a "core" region, which recognizes one target sequence, and a region on each HsdR subunit. The HsdR subunits would make tight contact with flanking DNA in the absence of cofactors, but this contact would be weakened in the presence of cofactors to allow the conformational change required for DNA translocation. Each complex would have an HsdR subunit at either side of the symmetrically arranged core of M · EcoKI (Fig. 6a), and these flanking HsdR subunits would pull the DNA towards the complex from either side of the enzyme, meeting the requirement for DNA translocation in both directions.

Assembly and Its Implications

The assembly of EcoKI has been analyzed in vitro (45). An assembly pathway relevant to the bacterial cell was proposed on the basis of experiments quantifying the interactions between intermediate complexes and subunits. The methyltransferase (M2S1) is formed from the reversible association of M with M1S1. HsdR binds very tightly to both M1S1 and M2S1, but the only complex with endonuclease activity is R2M2S1.

The relative strengths of protein-protein interactions determined in vitro can be used, at least in part, to explain regulation of the R-M activities in vivo. It is obvious that regulation is essential when hsd genes are transferred to a modification-deficient recipient. Experiments fail to find evidence for transcriptional regulation of hsdR (106, 139). The assembly pathway could, in the following way, form the basis of posttranslational control after expression of the hsd genes. Initially, HsdM and HsdS will exist mainly as HsdM and M1S1 with very little M2S1 until the concentrations of the subunits reach a critical level. HsdR will bind M1S1 and consequently will be unavailable for binding to M2S1, thereby imposing a further delay in the production of R2M2S1. Recent experiments for EcoKI indicate that there is a lag of many generations (~11) before cells become fully modification proficient, as assessed by their ability to modify infecting λ.0 (S. Makovets and N. E. Murray, unpublished data) and longer, ~15 generations, before restriction proficiency is established (138). It was proposed that any of the intermediate complexes (M1S1 and R1M1S1 or HsdR) could be targets for cellular proteases (45), thereby delaying the appearance of functional restriction enzyme. However, recent studies on the host-controlled modulation of the restriction activities (see next section) question a major role for assembly pathways in controlling the restriction activities of either EcoKI or EcoAI.
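The dependence of M2S1 formation on subunit concentration can be sketched with a simple binding calculation. This is an illustration only: it treats the reversible step M + M1S1 = M2S1 as a single-site equilibrium, ignores competition from HsdR, and uses an arbitrary dissociation constant. The function name, the Kd value, and the concentrations are hypothetical and are not measurements from the cited work.

```python
def fraction_m2s1(free_hsdm, kd):
    """Fraction of M1S1 that has bound a second HsdM at equilibrium.

    free_hsdm and kd must be in the same concentration units.
    """
    if free_hsdm < 0 or kd <= 0:
        raise ValueError("free_hsdm must be >= 0 and kd must be > 0")
    return free_hsdm / (free_hsdm + kd)


HYPOTHETICAL_KD = 1.0  # arbitrary units, chosen only for illustration

for conc in (0.01, 0.1, 1.0, 10.0):
    print(conc, round(fraction_m2s1(conc, HYPOTHETICAL_KD), 3))
```

Under these assumptions, little M2S1 forms while free HsdM is well below the dissociation constant, which is one way to picture the proposed delay in producing R2M2S1.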

All the known type I R-M enzymes are likely to assemble by a similar pathway, although the affinities of analogous protein-protein interactions may differ. For members of the type IB and IC families, the M1S1 intermediate appears to be less prevalent than for the type IA family, but for EcoAI (type IB), R2M2S1 readily dissociates to yield the methylase and free HsdR, while for type IC enzymes, the relative affinities of HsdR for M2S1 and R1M2S1 are quite different. The preferential stability of the R1M2S1 intermediate for EcoR124I (76) is currently the only explanation for the easy establishment of type IC hsd genes in a new host.


BIOLOGY OF TYPE I R-M SYSTEMS

Host-Controlled Modulation of Restriction Activity

Self-protection. Bacteria are assumed to tolerate the presence of a classical restriction endonuclease because the cognate modification enzyme maintains the methylation of target sequences within chromosomal DNA. A maintenance methylase does not, however, explain the long-established fact that genes encoding R-M systems are readily transferred to recipient cells in which the chromosomal DNA is unmodified. Under these circumstances, a delay in the appearance of restriction activity in the recipient cell is necessary to allow time for methylation of the unmodified chromosome (138). Other experiments demonstrate that the maintenance methyltransferase activity of an r+m+ cell is sometimes unable to cope with the protection of unmethylated targets that arise in response to DNA damage (111). In both these cases, it is now known that there is an additional level of control over the endonuclease activity of some type I R-M systems which enables the bacteria to survive in the absence of complete modification of chromosomal target sequences. This section of the review will trace the development of our current understanding of the mechanisms by which restriction activity is modulated.

Host DNA would be protected against the endonucleolytic activity of a newly acquired restriction system if the functional modification enzyme is produced before the restriction enzyme. Representatives of all three types of classical R-M systems have been shown to be equipped with promoters that could permit transcriptional regulation of the two activities. Transcriptional regulation of some of the genes encoding type II systems has been demonstrated. Genes encoding repressor-like proteins, referred to as C-proteins for control, have been identified in some instances (72, 172, 173). The C-protein for the BamHI system has been shown to activate efficient expression of the restriction gene and modulate the expression of the modification gene (73). Consequently, when the R-M genes are transferred to a new environment in which there is no C-protein, there will be preferential expression of the modification gene, and only after production of the C-protein will transcription of the restriction gene be activated. For type I R-M systems, despite the presence of two promoters, there is no evidence for transcriptional regulation of gene expression (95, 106, 139).

Proteolytic control. The heterooligomeric nature of type I R-M systems provides opportunity for the regulation of the R-M activities purely on the basis of the affinities with which different subunits bind to intermediates in the assembly pathway (45). Nevertheless, efficient transmission of the genes encoding EcoKI was shown to depend on a host function (139). A number of energy-dependent proteases are now known to play important regulatory roles in bacteria, and these were obvious candidates for this host function. The energy-dependent proteases identified in bacteria are often large oligomeric assemblies within which substrate recognition is imposed by one component and protease activity is imposed by another. The unfolded protein substrate is translocated to a chamber within the oligomeric complex and is then degraded processively (for reviews, see references 60 and 61). The protease specified by clpX and clpP was implicated in the regulation of restriction activity; in the absence of either ClpX or ClpP, acquisition of the hsd genes for EcoKI or for EcoAI led to the death of m- recipients (110). While ClpXP is a protease, ClpX itself can function as a substrate-specific chaperone. Loss of ClpX imposed a bigger barrier to gene transfer than loss of ClpP, suggesting a dual role for ClpX. This could imply a requirement for ClpX as a chaperone in addition to its role as a component of the ClpXP protease (110).

The delay in the appearance of restriction activity following the acquisition of R-M genes by a recipient cell lacking the cognate modification activity is only one of a number of situations in which a temporary loss of restriction proficiency is required. A temporary loss of restriction proficiency, referred to as restriction alleviation (RA), also occurs in response to treatments that damage DNA (17, 38, 50, 67, 177, 178). UV light, nalidixic acid, 2-aminopurine (2-AP), and 5-bromouracil have all been shown to induce RA. The alleviation of restriction in response to these treatments, like the temporary loss of restriction proficiency associated with the establishment of a new specificity, is dependent on ClpXP (111).

DNA damage can lead to the generation of unmodified target sequences. These may arise either as the result of DNA repair via homologous recombination or as a consequence of mutations that create new target sequences. Double-strand breaks induced by external agents or by genetic lesions that lead to stalling of DNA replication are repaired by RecA-dependent homologous recombination. If this recombination involves two segments of hemimethylated DNA, either the annealing of unmethylated strands or DNA synthesis may generate localized regions of unmethylated DNA (Fig. 7). In contrast, increasing the frequency of mismatches either by treatment with a mutagenic agent such as 2-AP, an analog of adenine, or by a mutator gene such as mutD will occasionally generate new but unmodified target sequences.


FIG. 7.   Generation of unmodified target sequences following UV irradiation. Methylated strands of DNA are shown as thick lines, and unmethylated strands are shown as thin lines. Homologous recombination, involved in the repair of double-strand breaks or postreplicative repair, can generate regions of unmethylated, double-stranded DNA via annealing of two unmethylated strands (regions within boxes). In addition, the SOS mutagenesis pathway leads to new (unmodified) target sequences as the result of base changes.

Restriction is found to be permanently (constitutively) alleviated in dam (49), topA, and mutD strains, and in each strain restriction is restored by a mutation in clpX (111). Unmodified targets in the chromosomal DNA of these mutant strains may arise continuously as the result of enhanced rates of mutation and DNA repair, and they could provide the signal for cells to protect their DNA by the alleviation of restriction activity.

The recent analysis of the role of ClpXP in RA has led to the identification of a molecular pathway that protects the bacterial chromosome against the potentially lethal effects of either an established type I R-M system or a newly acquired one. In cells encoding either EcoKI or EcoAI, RA in response to treatment with 2-AP correlates with the ClpXP-dependent loss of the HsdR subunit (111). The degradation of HsdR is prevented by either a missense mutation in hsdR or the absence of HsdM and HsdS, as if HsdR is degraded only when it is part of a functional complex. If the stimulus for RA is DNA breakage, then those breaks made by another system should serve as a stimulus for the degradation of the HsdR subunit of EcoKI. There is no evidence for degradation of the HsdR subunit of a nonfunctional EcoKI complex when treatment with 2-AP induces degradation of the HsdR subunit of functional EcoAI within the same bacterium. It appears that the signal for RA lies within EcoKI itself. This finding and the dependence on unmodified DNA targets suggest that HsdR is only recognized by ClpXP after the R-M complex has embarked on its restriction pathway (Fig. 8). Some step in the ATP-dependent DNA translocation could expose the target in HsdR to ClpX (111). Strong support for this hypothesis is provided by the finding that mutations in each of the seven DEAD-box motifs that impair translocation and ATPase activity (37) prevent ClpXP-dependent degradation of HsdR (111; V. Doronina and N. E. Murray, unpublished data), while mutations in the endonuclease motif that do not affect translocation (37) leave a complex in which the HsdR subunit remains accessible to ClpXP (V. Doronina and N. E. Murray, unpublished data). The "functional restriction complex" required as a substrate for proteolysis is one that can translocate DNA, not necessarily one that can break DNA. 
A remarkably specific control mechanism is inferred, effective only once the relevant restriction pathway has been initiated and able to act before any damage is inflicted on unmodified chromosomal DNA (111). The model meets an earlier suggestion of Heitman (