
Department of Biochemistry and Molecular Genetics, College of Medicine, University of Illinois at Chicago, Chicago, Illinois 60607
SUMMARY INTRODUCTION NATURAL IMPEDIMENTS TO DNA REPLICATION DNA Binding Proteins Termination of replication in prokaryotes. Eukaryotic ribosomal barriers. Epstein-Barr virus protein EBNA-1. Nonhistone protein-DNA complexes in budding yeast. Replication stalling in the mating-type switch locus of fission yeast. Artificially constructed, tightly bound protein-DNA complexes in E. coli. Transcription Transcription-replication collisions. Transcription-replication collisions and organization of genomes. Strong RNA-DNA hybrids. Unusual DNA Structures Brief overview of unusual DNA structures. Inhibition of replication by unusual DNA structures. Replication Slow Zones in Budding Yeast GENOMIC INSTABILITY CAUSED BY REPLICATION STALLING AT NATURAL IMPEDIMENTS Genomic Instability Caused by Protein Binding Genomic Instability Caused by Transcription Genomic Instability Caused by Unusual DNA Structures Chromosomal Fragility CONCLUSION ACKNOWLEDGMENTS REFERENCES
| SUMMARY |
|---|
| INTRODUCTION |
|---|
Exogenous factors that affect DNA replication do so either by damaging the DNA template (for example, UV light, gamma irradiation, DNA-modifying agents, and topoisomerase poisons) or by depleting nucleotide pools (for example, hydroxyurea and methotrexate) (71). In the first case, replication is blocked at the sites of damage because of the inability of the replication fork to pass through a corrupted DNA template. In the second case, replication is inhibited throughout the genome because of the lack of all or some of the deoxynucleoside triphosphates (dNTPs) necessary for DNA synthesis. Technically speaking, chemically induced DNA lesions can also be considered an intrinsic impediment to replication, since they abound under physiological conditions in the absence of any external damaging agents. For example, an average Escherichia coli cell has about 100 such lesions in the DNA, each with a 30-min half-life. When unrepaired, these lesions represent a challenge to replication (276).
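The lesion figures quoted above imply a turnover rate. The sketch below is a back-of-the-envelope calculation, not a result from the cited work: it assumes that lesions are removed by first-order kinetics and that the pool of about 100 lesions is at steady state, so that formation balances removal. The function name `steady_state_formation_rate` is hypothetical.

```python
import math


def steady_state_formation_rate(lesions: float, half_life_min: float) -> float:
    """Lesions formed per minute needed to hold `lesions` constant.

    Assumption (not from the source): removal is first order, so the
    removal rate is lesions * ln(2) / half-life, and at steady state
    formation equals removal.
    """
    decay_constant = math.log(2) / half_life_min  # per minute
    return lesions * decay_constant


# Figures quoted in the text: about 100 lesions, 30-min half-life.
rate = steady_state_formation_rate(lesions=100, half_life_min=30)
print(f"{rate:.2f} lesions formed per minute")
```

Under these assumptions, a cell would acquire roughly two new lesions per minute, which illustrates why unrepaired spontaneous damage is a constant challenge to replication.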
Genetic determinants of replication inhibition are mutations in genes that affect the accuracy and speed of DNA synthesis (for example, components of the replication apparatus and nucleotide pool control).
Intrinsic, or natural, impediments of replication include DNA binding proteins, transcription units, unusual DNA structures, and replication slow zones (recent reviews include references 109 and 251). A growing body of evidence accumulated over the last decade indicates that replication inhibition by natural impediments can lead to genomic instability, drawing additional attention to this issue (for example, see references 15, 32, 33, 150, 201, 211, 242, 279, and 289 and below). In this review, evidence of replication inhibition at natural impediments is summarized and discussed, with emphasis on the genome's function and stability.
| NATURAL IMPEDIMENTS TO DNA REPLICATION |
|---|
Stalling of replication forks at natural impediments can either occur "on purpose" or be "accidental" (if inhibition of replication is a side effect of another biological process). Sometimes, the distinction between the two scenarios is not obvious. The first instance includes termination of replication at bacterial termini and in eukaryotic ribosomal operons; replication slow zones in budding yeast; both the replication termination site and the replication pause site in the mating-type locus in fission yeast; and, probably, EBNA-1 of Epstein-Barr virus (EBV). The latter instance probably includes unusual DNA structures, collisions with transcription, and certain proteins (such as TraY in E. coli and certain nonhistone-DNA complexes in yeast). Typically, replication arrest in the first instance is nearly complete; hence, such sites are usually called RFBs, for "replication fork barriers" (the exceptions are replication slow zones in budding yeast and the replication pause site in the fission yeast mating-type locus). In the second instance, forks are usually capable of bypassing the obstacle after temporarily pausing. Such sites are therefore called RFPs, for "replication fork pausing" sites. Pausing can last from several seconds, as it does in yeast centromeres, to almost half an hour, as in EBV.
α-helical segments on the fork-blocking side of Tus (Fig. 1B, left). Therefore, it was suggested that when the DnaB helicase approaches the Ter-Tus complex from its blocking end, it encounters these segments and cannot reach the DNA binding domain to disrupt the protein-DNA interactions. In contrast, when the helicase approaches from the passage end, it can reach the DNA binding domain, disrupt Tus-DNA contacts, and proceed further. Thus, the presence of these protruding α-helical segments on one side of Tus could determine the orientation specificity of fork arrest. (ii) It was further suggested that the contrahelicase activity of Tus could depend on specific protein-protein contacts between Tus and DnaB helicase rather than resulting from "passive roadblock" (210). The region of this protein-protein interaction was mapped to the L1 loop of Tus, located on its helicase-blocking side. It was therefore proposed that the localization of the L1 loop could be the determinant of polarity as well. (iii) Recently, it was suggested that when DnaB approaches the Ter-Tus complex from the blocking side, it unwinds the corresponding part of the Ter site, exposing a particular cytosine residue. The contact between this flipped cytosine and Tus stabilizes the complex to such an extent that it dissociates 40 times more slowly than the regular Ter-Tus complex. This locking behavior can occur only when DnaB approaches from the blocking side of Tus (209). Note that model i implies simple steric hindrance, model ii implies specific protein-protein contacts, and model iii implies asymmetrical DNA unwinding but not protein-protein interactions.
The mechanism of replication termination in Bacillus subtilis is generally similar to that in E. coli in terms of the existence of a trap opposite from the origin of replication in the chromosome, consisting of two clusters of sequences that arrest replication forks of opposite polarities (Fig. 1A, right). Yet both protein and DNA components differ in sequence as well as architecture, and the explanation of the polar nature of arrest seems to be different (for reviews, see references 5, 27, and 97). There are at least six termination sites: TerI through TerVI, with TerI being the most frequently used (TerI and TerII were discovered first; until recently, they were referred to in the literature as IRI and IRII, for inverted repeat I and inverted repeat II, respectively).
Each Ter site is bipartite, i.e., it consists of two elements, core and auxiliary sites, and each of those elements binds one homodimer of the replication termination protein, RTP (the molecular weight of a monomer is 14.5 kDa). RTP was purified (158) and shown to arrest replication in vitro in an orientation-dependent manner when bound to the TerI (IRI) and TerII (IRII) sites (124, 271). RTP and Tus share very little sequence similarity, and the crystal structure of RTP revealed a lack of structural similarity and confirmed that RTP exists as a symmetrical dimer (28).
Two symmetrical RTP homodimers bind one Ter site. How can a symmetrical protein structure arrest replication in an orientation-dependent manner? The answer comes from the asymmetry of protein-DNA contacts between RTP dimers and core and auxiliary sites (Fig. 1B, right) (27, 151, 178, 252). RTP binds the core site much more strongly than the auxiliary site. It can bind the core site in the absence of the auxiliary site but not vice versa. The replication fork is arrested only if it approaches the core site first, and yet the presence of the auxiliary site is required. These data, taken together, suggest the following explanation. When the replication fork approaches the auxiliary site first, it displaces the RTP dimer bound to it, and the RTP dimer at the core site can be displaced as well. However, if the replication fork approaches the core site first, the RTP dimer at the core site cannot be displaced, owing to the cooperative action of the dimer bound to the auxiliary site. This explanation is supported by the fact that bidirectional Ter sites from B. subtilis plasmids consist of two core sites (191).
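The logic of this explanation can be restated as a small truth table. The sketch below is a deliberately simplified toy model, not a description of the biochemistry: it assumes that a fork is arrested only when the first RTP-bound element it meets is a core site and a second RTP-bound element lies behind it to act cooperatively. The function `fork_arrested` and the labels `"core"` and `"aux"` are hypothetical names introduced for illustration.

```python
from typing import Sequence


def fork_arrested(elements_in_encounter_order: Sequence[str]) -> bool:
    """Toy rule for RTP-mediated arrest (an illustrative assumption).

    The fork is arrested only if:
      * the first element it encounters is a core site, and
      * a second RTP-bound element sits behind it (cooperative partner).
    """
    if len(elements_in_encounter_order) < 2:
        return False  # a lone element has no cooperative partner
    return elements_in_encounter_order[0] == "core"


chromosomal_ter = ("core", "aux")

print(fork_arrested(chromosomal_ter))                   # fork meets core first
print(fork_arrested(tuple(reversed(chromosomal_ter))))  # fork meets auxiliary first
print(fork_arrested(("core",)))                         # core site alone
print(fork_arrested(("core", "core")))                  # plasmid-type site, either direction
```

With this rule, a core-auxiliary site arrests forks from one side only, a lone core site arrests none, and a site built from two core sites arrests forks from both sides, matching the behavior of the bidirectional plasmid Ter sites mentioned above.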
The RTP dimer has not been crystallized together with its binding sites. Thus, it is theoretically possible that the structure of the RTP dimer bound to the core site might differ from that of the RTP dimer bound to the auxiliary site, which could add to the asymmetry of the complex. Like Tus, RTP is probably not a simple roadblock either, since specific interactions between the contrahelicase surface of RTP and the helicase appear to be important for replication arrest (178).
Why is it important to block the progression of the forks at the terminus rather than allowing them to meet and terminate at random? One possible explanation comes from the observation that in vitro, Ter sites positioned in the oriC-replicated plasmid the same way they are positioned in the chromosome prevent overreplication of the plasmid DNA (94). In the presence of Tus, the replication fork that reached a Ter site in the blocking orientation first was arrested, and the second replication fork stopped when it reached the first one. In contrast, in the absence of Tus, replication failed to cease when the two forks met each other, leading to overreplication. Therefore, the authors suggested that the role of the Ter/Tus system might be in preventing chromosomal overreplication. Although it is not obvious why the replication fork arrested at the Ter-Tus site would stop the oppositely moving fork whereas a simple collision between the two forks would not, this explanation seems very appealing. Supporting it is the finding that the absence of Tus in vivo leads to the switch to rolling circle replication instead of termination and unstable maintenance of the R1 plasmid (138). It would be interesting to know how eukaryotes get around this problem, since it is believed that termination of replication in eukaryotic chromosomes occurs by random meeting of the two converging forks (or by their reaching the end of the chromosome). Perhaps eukaryotic forks are different in the sense that they properly terminate upon meeting each other. Replication slow zones in yeast (33; also see below), which lead to periodic pausing of forks in the genome, might also provide a clue.
Another explanation could be that the existence of the terminus as a spatial chromosomal macrodomain helps coordinate the completion of replication, resolution of accidental chromosome dimers (by the XerCD/dif system), chromosomal segregation, and cell division in bacteria. Unlike eukaryotes, bacteria do not separate these processes temporally and therefore require tight coordination between them (see, for example, reference 216 and references therein). In E. coli, coordination between cell division and chromosome segregation is facilitated by FtsK, a septum-localized DNA translocase, which assists the resolution of chromosome dimers by the XerCD recombinase, and decatenation of sister chromatids by interaction with the DNA topoisomerase IV (232).
In both gram-positive and gram-negative bacteria, environmental stresses (such as amino acid starvation) induce accumulation of a signaling molecule, ppGpp (guanosine tetraphosphate), which activates a stringent response (a pleiotropic reaction aimed at energy saving) that includes shutting down transcription of rRNAs and replication (31). During the stringent response in E. coli, replication is blocked at the level of initiation. In B. subtilis, DNA synthesis starts but both forks become arrested at around 200 kb from the origin (Fig. 1A, right) (157). This phenomenon was called "replication checkpoint of B. subtilis," and its functioning was shown to depend on the presence of RTP (156); however, the mechanism of regulation of the replication arrest is not known. The RTP binding site was located around one of the two arrest sites (4). It was further shown that this site was bipartite, like Ter, and bound two RTP dimers (although much more weakly than Ter sites from the terminus). Surprisingly, however, when placed in a plasmid, it was capable of arresting replication in vivo regardless of orientation and the stringent response (75). Therefore, it is not clear whether this site is actually involved in the checkpoint fork arrest and, if it is, what makes it active only during the stringent response. One elegant answer to the last question could be that RTP binding to the checkpoint sites is disrupted by transcription through them, which is shut down by ppGpp during the stringent response (27). It is also not known whether replication arrest during the checkpoint possesses directionality (because the sites of arrest are located close to the origin, there might be no need for it).
Interestingly, elevation of ppGpp levels dramatically improves the survival of UV-irradiated E. coli cells deficient in the Holliday junction resolvase RuvABC. This is likely owing to ppGpp-mediated destabilization of transcription complexes stalled at UV-induced lesions in DNA (184; also see below). Therefore, although the stringent response inhibits replication initiation in E. coli, it also helps in the completion of already-started rounds of replication.
Studies of the bacterial replication termination proteins Tus and RTP showed that the strength of binding to DNA does not necessarily determine whether the protein blocks the fork. Instead, the architecture of the protein and/or its interaction with the components of the replication machinery matters (27, 75, 120, 210).
Eukaryotic ribosomal barriers. The eukaryotic rRNA RFB was first discovered in Saccharomyces cerevisiae, in the nontranscribed spacer of the rRNA genes (21, 162). The rRNA locus in S. cerevisiae is arranged as a cluster of tandem repeats of the following composition: 35S RNA (precursor of the 18S, 5.8S, and 25S rRNAs), transcribed by RNA polymerase I, followed by the first nontranscribed spacer and 5S rRNA, transcribed by RNA polymerase III in the opposite direction, followed by the second nontranscribed spacer; the latter contains a bidirectional origin of replication (Fig. 2A). It was demonstrated that the first nontranscribed spacer contains a replication barrier (21, 162). The RFB function does not depend on transcription per se (22) but is caused by the DNA binding protein Fob1 (132), which wraps the RFB DNA around itself by binding two sites in the target sequence (129). One site, RFB1, is the major site of replication arrest, and the other site, RFB2, is the minor site (22, 129). Fob1 possesses potent polar contrahelicase activity (208), arresting only the forks that are about to enter the transcription units in the 3'-to-5' direction. It was therefore suggested that a possible role of ribosomal barriers was to protect the rRNA genes from head-on transcription-replication collisions (see below). Deletion of the fob1 gene in S. cerevisiae (along with a reduction in the copy number of rRNA gene repeats) leads to transcription-mediated replication inhibition along the whole transcribed area (279). Two proteins which are known to act upon stalled replication forks, Tof1 and Csm3, have been shown to affect forks arrested at the Fob1-caused barrier: in their absence, the replication blockage becomes less prominent (29). Tof1 is the budding yeast homologue of the fission yeast Swi1, and Csm3 is the budding yeast homologue of the fission yeast Swi3 (see below).
Schizosaccharomyces pombe seems to combine both aforementioned scenarios. There are three pause sites in the S. pombe rRNA gene that are caused by protein binding (the fourth site, RFP4, is minor and becomes prominent in the absence of RFB1 to RFB3; it is most likely caused by collisions with transcription [145]). RFB1 is the strongest arrest site. It is located distal to the rRNA transcription unit and is therefore met by the replication fork first. Inhibition of replication at RFB1 is mediated by the Sap1 protein (144, 192). Like Fob1, Sap1 is not involved in the termination of transcription of rRNA, but unlike Fob1, it possesses another important function, most likely in chromatin organization, which makes it essential for yeast viability. Sap1 is also involved in another event associated with replication stalling, fork pausing in the mating-type switch locus mat1, although in that case, Sap1 binding is not required for the fork stalling but rather is involved in the imprinting (see below). Replication fork stalling at RFB2 and RFB3 (located proximal to the rRNA transcription unit) is caused by the transcription termination protein Reb1, as in the mammalian scenario (256). Replication stalling at all three sites is dependent on the products of two mating-type switching genes, swi1 and swi3 (145), just as is the stalling in the mating-type switch locus mat1 (see below). Swi1 (homolog of the budding yeast Tof1) was shown to stabilize stalled forks genome-wide and to activate the checkpoint in response to genotoxic stress (217). Interestingly, forks stalled at RFP4 did not depend on Swi1 and Swi3 (145). A comparison of RFBs from budding yeast, mammals, and fission yeast is shown in Fig. 2B.
Ribosomal RFBs have also been observed in Pisum sativum (171), Tetrahymena thermophila (174, 310), Xenopus laevis (181, 300), and humans (165). In Tetrahymena (310) and Xenopus (181), they appear to be developmentally regulated, being most evident at the stages of maximal amplification and transcription, respectively, of the rRNA genes. Unlike those of all of the other organisms examined so far, the ribosomal RFB in the human rRNA locus was found to be nonpolar (bidirectional), while still acting to make transcription and replication proceed in the same direction (165). Interestingly, in T. thermophila, the RFB was shown to predominantly inhibit forks moving in the direction of transcription (174).
Epstein-Barr virus protein EBNA-1. EBNA-1 (for "Epstein-Barr nuclear antigen 1") is the only viral protein essential for latent replication of the EBV circular genome from the latent origin of replication, OriP. (In contrast, lytic EBV replication uses two other origins, orilytL and orilytR, and virus-encoded replication machinery [reviewed in reference 292].) OriP consists of two functional elements: FR (for "family of repeats"), which has 20 strong EBNA-1 binding sites, and DS (for "dyad symmetry"), which has 4 weaker EBNA-1 binding sites. EBNA-1 binding to the DS element facilitates initiation of replication (presumably by recruitment of ORC [for "origin recognition complex"]). Two replication forks assemble at the DS element and proceed in opposite directions. Subsequently, one of them is arrested in the FR element. Thus, initially bidirectional replication is converted essentially into a unidirectional mode (Fig. 3). The replication barrier in the EBV DNA (74) was shown to be EBNA-1 dependent: EBNA-1 significantly enhances replication pausing at this barrier, but, interestingly, replication pausing occurs even in the absence of EBNA-1 (although to a much lesser extent) (59). In vitro, EBNA-1 binding to the DNA substrate inhibits the helicase activity of both SV40 large T antigen and the E. coli main replicative helicase DnaB in an orientation-independent manner (unlike Tus, RTP, and Fob1) (64).
The biological meaning of the EBNA-1-mediated replication barrier and the resulting asymmetrical replication of the viral genome could be the prevention of the head-on transcription-replication collisions at the EBERs. EBERs (EBER-1 and EBER-2, two nontranslated EBV-encoded RNAs) are latent phase-specific small RNAs transcribed by RNA polymerase III. They are located close to the FR element and are replicated codirectionally with transcription as a result of replication arrest (Fig. 3). Alternatively, replication inhibition at the sites of EBNA-1 binding to FR might be a mere consequence of its strong binding. The FR element is required for the proper segregation of viral DNA in the host cells after mitosis, and strong binding of EBNA-1 to FR ensures the segregation.
Nonhistone protein-DNA complexes in budding yeast. Extensive studies of chromosomal replication in the budding yeast S. cerevisiae detected replication stalling at a number of genomic loci, in addition to the ribosomal barrier, where nonhistone proteins were bound to DNA. The first example of inhibition of replication by tightly bound protein-DNA complexes was described when replication forks were shown to pause at centromeres in the S. cerevisiae genome (82). By altering the direction of replication through the centromeric region of the genome, the authors were able to conclude that the pausing was orientation independent. Using yeast plasmids, they confirmed the orientation independence of the pause site and showed that this pausing strongly correlated with the ability of the centromeric DNA to bind nonhistone proteins that form a tightly packed nuclease-resistant structure of centromeres. Comparison of the amount of stalled versus normal replication intermediates allowed them to estimate the time of pausing as 0.1 to 0.2 min.
Replication stalling was detected at inactive origins of replication located near one of the transcriptionally silent mating-type loci, the HML locus (293). The authors suggested that it is most likely caused by the presence of origin-specific proteins, such as ORC (for "origin recognition complex").
Replication stalling was also observed in telomeric and subtelomeric regions (115). It was shown to be orientation independent, and the only candidate protein responsible for it was the telomeric DNA binding protein Rap1 (the Reb1, Tbf1, Rif1 and Rif2, and Sir2, Sir3, and Sir4 proteins were not required) (177).
The Rrm3 helicase was identified because its absence increased recombination in the rRNA locus in yeast (126). When replication in the rrm3-deficient yeast strain was studied meticulously (114-116), it appeared that replication forks pause at about 1,400 discrete sites in the S. cerevisiae genome. Breakage of forks and high levels of recombination were also observed at these sites. These sites include tRNA genes, centromeres, telomeres, silent mating-type loci, and inactive origins of replication as well as the rRNA locus (specifically, the 5S rRNA genes, the beginnings and ends of the 35S rRNA genes, and inactive origins of replication), i.e., positions of tightly bound nonhistone protein-DNA complexes. Confirming the involvement of the protein-DNA complexes, disruption of the binding sites and/or elimination of the corresponding proteins abolished replication stalling. The fact that some replication stalling at the same loci was observed in the wild-type yeast but was greatly increased in the absence of Rrm3 suggests that the function of Rrm3 is to remove such tightly bound protein conglomerates from the DNA to assist replication of the genome. The existence of at least one helicase, which helps the replication forks to pass tightly bound protein-DNA complexes without stalling and to avoid subsequent fork breakage at numerous sites in the genome, indicates the importance and the genome-wide scale of the problem of protein-caused impediments of replication (114). In vertebrates, the Williams syndrome transcription factor-imitation-switch protein (WSTF-ISWI) chromatin remodeling complex (ISWI is a nucleosome-dependent ATPase) might be involved in assisting the replication of heterochromatin (18).
Replication stalling in the mating-type switch locus of fission yeast. The fission yeast S. pombe has two mating types: plus and minus. The current type of a cell is encoded in the mat1 locus. The switch occurs by means of transferring the genetic information from one of the silent donor loci, mat2-P or mat3-M, into the active mat1 locus (Fig. 4). The switch is regulated in a very interesting way: it occurs after two consecutive divisions, with only one of the four cells undergoing it. When a newly switched cell divides, it generates a "nonswitchable" cell and a "switchable" cell; in the next round of division, the switchable cell gives rise to a nonswitched cell and a switched cell. Since all of the cells are identical in terms of genetic information, epigenetic regulation, i.e., a strand-specific imprinting that marks the DNA strand that is segregated to the switchable cell, has been proposed to explain the asymmetry (reviewed in reference 46).
Orientation-specific pausing of the replication fork was observed at the site of imprinting (called MPS1, for "mat1 pause site 1" [Fig. 4]). It was suggested that this pausing facilitates placement of the primer at a specific position of the lagging strand (48).
The Swi1 and Swi3 proteins, which are necessary for the imprinting, were shown to also be necessary for replication stalling at both sites (48). Swi1 (a homologue of budding yeast Tof1) is involved in the stabilization of stalled forks and activation of checkpoint in response to genotoxic stress (217), and both Swi1 and Swi3 are involved in replication stalling at the ribosomal locus in S. pombe (see above). Interestingly, the Sap1 protein, which is responsible for replication stalling in the ribosomal locus, is involved here as well. However, Sap1 is not necessary here for the fork pausing itself but rather is involved in the establishment and/or maintenance of the imprinted state (48). Another two proteins, Rtf1 and Rtf2 (replication termination factors 1 and 2, respectively), contribute to the replication stalling at RTS1 (41). Rtf1 has sequence similarity with Reb1 of budding yeast and TTF-1 of mammals, which cause replication stalling in rRNA genes (see above).
Although it is presumed that protein-DNA complexes are responsible for the replication stalling at MPS1 and replication termination at RTS1, the cause of neither has been identified. The Swi1 and Swi3 proteins are not known to possess sequence specificity; thus, although they play a major role in replication stalling, there should be another factor that is responsible for the sequence specificity (Rtf1 and Rtf2 are good candidates).
Note that the mechanism used by fission yeast is different from that used by the budding yeast S. cerevisiae, which can switch mating types as often as every generation by a mechanism that involves the introduction of a double-stranded break by a site-specific endonuclease (reviewed in reference 83).
Artificially constructed, tightly bound protein-DNA complexes in E. coli. Controlled, site-specific replication stalling was achieved by using tight binding of multiple molecules of TetR-YFP (tetracycline repressor fused to yellow fluorescent protein) to an array of 240 copies of its operator, tetO, inserted into the E. coli chromosome (239). Induction of the expression of TetR-YFP caused replication stalling independent of the fork's polarity. The stalling was observed on two-dimensional gels and prevented cell proliferation, which indicated that even after a prolonged period of time, this barrier could not be overcome by the replication fork. Addition of anhydrotetracycline, which abolishes the binding of TetR to tetO, resulted in resumption of replication and disappearance of the stalled intermediates in just 5 min. This, together with the fact that the SSB protein was associated with the tetO array for several hours, suggests that the replication fork was paused in a physiological state and resumed replication immediately after the obstacle was removed. Supporting this explanation is the finding that replication restart did not require recombination proteins.
"Infinite" replication stalling at the TetR-tetO array resulted in unviability of cells, so why did the cells not deal with this situation by homologous recombination? One explanation could be that the bulky, 240-copy protein-DNA complex prevented the accessibility of DNA for the recombination machinery.
Another example of replication stalling at artificially constructed, tightly bound, bulky protein-DNA complexes in E. coli comes from a study of replication of the d(GA)n/d(TC)n minisatellite. When cloned in a plasmid, the d(GA)n/d(TC)n repeats caused a length-dependent and orientation-independent replication inhibition in E. coli (142). This turned out to be owing to the binding of multiple molecules of the TraY protein. TraY belongs to the family of ribbon-helix-helix DNA-binding proteins (246) and is essential for F factor conjugal transfer (152). Binding of TraY to DNA introduces substantial bending (173); thus, binding of multiple TraY protomers to d(GA)n/d(TC)n repeats results in the formation of nucleosome-like structures wherein the DNA is wrapped around the core of TraY molecules. The existence of such a complex is supported by the observation of TraY-induced topological changes in circular DNA molecules carrying the d(GA)n/d(TC)n repeats.
800 nucleotides [nt]/s versus 20 to 50 nt/s [reviewed in reference 136]). Therefore, both in the head-on and in the codirectional scenarios, there would be a collision. However, in eukaryotes, the speed of replication is comparable to that of transcription, making codirectional collisions unlikely.
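The speed argument can be made concrete with simple arithmetic. The sketch below uses the bacterial figures quoted above (a fork moving at about 800 nt/s and RNA polymerase at 20 to 50 nt/s) and computes how quickly the gap between the two machineries closes. The 5,000-nt starting gap, the equal-speed "eukaryote-like" case, and the function names are illustrative assumptions, not values from the cited literature.

```python
from typing import Optional


def closing_speed(fork_nt_s: float, rnap_nt_s: float, head_on: bool) -> float:
    """Rate (nt/s) at which the gap between the fork and RNA polymerase shrinks."""
    if head_on:
        return fork_nt_s + rnap_nt_s  # the two machineries approach each other
    return fork_nt_s - rnap_nt_s      # the fork chases the polymerase


def seconds_to_collision(gap_nt: float, fork_nt_s: float,
                         rnap_nt_s: float, head_on: bool) -> Optional[float]:
    """Time until the fork reaches the polymerase, or None if it never does."""
    speed = closing_speed(fork_nt_s, rnap_nt_s, head_on)
    if speed <= 0:
        return None  # fork is no faster than the polymerase ahead of it
    return gap_nt / speed


GAP_NT = 5_000  # arbitrary illustrative distance

for label, fork, rnap in [("bacterial", 800, 50), ("equal speeds", 50, 50)]:
    for head_on in (True, False):
        t = seconds_to_collision(GAP_NT, fork, rnap, head_on)
        orientation = "head-on" if head_on else "codirectional"
        print(label, orientation, "no collision" if t is None else f"{t:.1f} s")
```

In the bacterial case, the gap closes at 850 nt/s head-on and at 750 nt/s codirectionally, so a collision occurs in both orientations; when the two speeds are equal, the codirectional gap never closes.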
φ29 replisome halted upon encountering stalled B. subtilis RNA polymerase in both orientations (62, 63). Once RNA polymerase movement was resumed, DNA synthesis continued at its normal speed in the head-on case but was much slower for the codirectional alignment. The first evidence of transcription-replication collisions in vivo came from studies of an inducible replication origin placed on either side of the rrnB ribosomal operon in the E. coli chromosome (67). Replication codirectional with the ribosomal operon proceeded at its typical high speed but was significantly slower when faced with head-on transcription. In eukaryotes, transcription-replication collisions were detected at two genomic loci: tRNA (58) and rRNA (279). In the first case, polar RFPs were observed at tRNA genes in S. cerevisiae (58). The replication fork stalled when it encountered tRNA genes transcribed head-on but not those transcribed codirectionally. The dependence of RFP activity on the functionality of both the gene promoter and RNA polymerase III argued for the direct involvement of transcription. It was proposed that the transcription elongation complex was responsible for the RFP sites at the tRNA genes. Recently, however, it was suggested that the transcription initiation complex, rather than elongating RNA polymerase, could be responsible for replication slowing at these genes (114). In the second case, transcription-replication collisions at the ribosomal locus in S. cerevisiae were detected when the fob1 gene (which encodes the Fob1 protein, responsible for the replication fork barrier, RFB, at the 3' end of the rRNA genes; see above) was deleted and the number of rRNA gene repeats was reduced to increase the ratio of transcribed to nontranscribed rRNA (279).
Although the general consensus was that head-on collisions with the transcription machinery are much more detrimental for replication fork progression than are the codirectional collisions, the mechanism responsible for replication slowing in the head-on versus codirectional scenarios was under debate. Two possible scenarios of replication inhibition in the case of head-on collision with the transcription machinery have been proposed: physical interaction with the transcription machinery or excessive positive superhelicity generated by the two head-on processes. Both elongating RNA polymerase (168, 303) and the replication fork (237) generate positive supercoils in the downstream DNA. Consequently, frontal movement of the two machineries could generate a highly positively supercoiled DNA domain restraining both processes prior to direct encounter (19, 58, 67). Supporting the formation of such positively supercoiled domains, the fraction of DNA knots appeared to be greater in those plasmids where replication collided with transcription head-on than codirectionally (222). This was explained by the migration of positive supercoils, accumulated between the replisome and RNA polymerase, to the newly synthesized DNA behind the fork. Recently, a study was set up to distinguish between the two scenarios (199). It confirmed that codirectional transcription had no (or little) effect on replication fork progression in E. coli cells compared with that of head-on transcription, which severely impeded replication fork progression in ColE1-based plasmids. It also proved that in the head-on scenario, the replication fork was slowed as the result of direct physical interaction with the transcription machinery rather than by propagation of superhelical stress, since replication pausing zones were, in fact, strictly limited to the transcribed DNA areas. 
Note that these data do not refute the accumulation of positive supercoils upon the head-on transcription-replication collisions (222) but exclude these topological constraints as the cause of replication inhibition.
There is more to transcription than elongation. Recently, replication stalling caused by the transcription initiation complex as well as by the RNA polymerase present at the transcription terminator was observed in E. coli (198). These findings might be important given that the majority of genes are not actively transcribed during DNA replication. The replication fork stalls upon head-on encounters with the transcription initiation complex and upon codirectional encounters with the RNA polymerase present at the transcription terminator. (The latter is likely due to a trapped form of RNA polymerase that, unlike the normal transcription elongation complex in a codirectional collision, cannot be readily displaced from DNA by the replication fork.)
Notably, in both instances, the replication fork stalled after passing the transcribed region from either direction (Fig. 6). It is therefore plausible to speculate that transcription initiation and termination elements could serve as polar "punctuation marks" for DNA replication, i.e., attenuating replication fork progression as it traverses the transcribed areas. One possible role of such "punctuation marks" could be to provide extra time for the mismatch repair or gene conversion machineries to clear the coding areas of any newly acquired mutations, thereby helping to maintain the integrity of transcribed regions.
Transcription-replication collisions and organization of genomes. The data on the organization of bacterial, plasmid, and bacteriophage genomes point to selection against head-on collisions (19, 218). Sequencing of the E. coli genome confirmed that there was a bias toward codirectional alignment of transcription units with the direction of replication (16). Most strikingly, all seven ribosomal operons face the direction of their replichores. For other genes, however, this bias is much less pronounced: approximately 62% of tRNA genes and only approximately 55% of protein-coding genes are aligned codirectionally with replication. Similar principles of gene arrangement were observed for other bacteria, such as B. subtilis, Borrelia burgdorferi, Treponema pallidum, Haemophilus influenzae, Helicobacter pylori, Mycoplasma genitalium, and Mycoplasma pneumoniae (187), as well as for bacteriophages T7 and lambda (19).
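The orientation bias described above can be illustrated with a minimal sketch. It assumes a circular chromosome with a single origin and terminus and a list of gene coordinates with strand annotations. The `Gene` class, the function names, and the toy coordinates are hypothetical and serve only to show how a codirectional fraction such as the percentages quoted above could be tallied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    position: int  # coordinate of the gene on the circular chromosome
    strand: str    # '+' if transcribed toward increasing coordinates, else '-'


def on_clockwise_replichore(position, ori, ter, genome_len):
    """Return True if position lies on the arc running from ori to ter
    in the direction of increasing coordinates (with wraparound)."""
    return (position - ori) % genome_len < (ter - ori) % genome_len


def is_codirectional(gene, ori, ter, genome_len):
    """A gene is codirectional when its strand matches the direction in
    which the fork travels through its replichore."""
    clockwise = on_clockwise_replichore(gene.position, ori, ter, genome_len)
    return (gene.strand == "+") == clockwise


def codirectional_fraction(genes, ori, ter, genome_len):
    """Fraction of genes transcribed in the direction of fork movement."""
    genes = list(genes)
    if not genes:
        raise ValueError("no genes supplied")
    hits = sum(is_codirectional(g, ori, ter, genome_len) for g in genes)
    return hits / len(genes)


# Toy data (made up for illustration, not taken from any real genome).
toy_genes = [
    Gene("a", 950, "+"),  # clockwise replichore, codirectional
    Gene("b", 100, "+"),  # clockwise replichore, codirectional
    Gene("c", 600, "-"),  # counterclockwise replichore, codirectional
    Gene("d", 700, "+"),  # counterclockwise replichore, head-on
]
fraction = codirectional_fraction(toy_genes, ori=900, ter=400, genome_len=1000)
print(f"{fraction:.0%} codirectional")
```

Applied separately to rRNA, tRNA, and protein-coding gene sets, the same tally would give the per-class figures that the comparison above relies on. A real analysis would also need to handle multiple termination sites and genes that span the origin or terminus.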
While both DNA replication and transcription in bacteria continue throughout the life cycle, the situation is different in eukaryotes, owing to the existence of the phases in the cell cycle and temporal separation of DNA replication from most of transcription. It is well known, however, that at least some genes, including histone genes and genes coding for components of the protein synthesis machinery, such as tRNA and rRNA genes, are being transcribed during S phase (37, 235, 273). In the case of rRNA genes, the RFB at their 3' end assures codirectional alignment of transcription and replication (see above). The existence of such barriers suggests a requirement to protect important genes from head-on transcription-replication collisions.
Particularly highly expressed protein-coding genes, which represent only a small fraction of all genes, are spaced along the chromosome, oriented codirectionally with replication, and form their own topological domains (53, 54). At the same time, genome-wide analysis of gene distribution in bacteria suggested that it is the "essentiality" rather than the "expressiveness" of a gene that likely determines its orientation relative to replication (250). About 90% of essential genes in B. subtilis and 70% of essential genes in E. coli are transcribed codirectionally with replication. The authors believe that the deleterious consequences of the head-on transcription-replication collisions could come from the displacement of the RNA polymerase, formation of truncated transcripts, and consequently, truncated proteins that would serve as dominant-negative forms of essential proteins. Note, however, that the amounts of such truncated peptides should be negligible given the short time that the replication fork spends at any given gene. Also, this model implies that RNA polymerase is displaced during head-on collisions but not during codirectional collisions, which might not be the case.
The issue of RNA polymerase displacement is somewhat contradictory. In vitro, RNA polymerases from both E. coli and B. subtilis were shown to stay bound to the DNA template and to continue elongation after the passage of the replication fork in both orientations (62, 63, 166, 167), but in an in vivo study carried out with E. coli, RNA polymerase was shown to be dislodged from DNA when it encountered the replication fork in both orientations (67). An alternative explanation for the preferential alignment of essential genes with the direction of replication could therefore be that the head-on transcription-replication collisions may be avoided at essential genes to prevent their inhibitory effects on replication. Blockage of DNA replication and subsequent restart occasionally might lead to genomic instability. In the case of head-on transcription-replication collisions, there is a fair chance that the restarted fork would encounter another elongating RNA polymerase, increasing the chances of inaccurate restart. These multiple replication restarts could thus be avoided within essential genes to protect their loci from instability.
One study (184) strongly supports the idea that stalled RNA polymerase could be a major challenge for replication in vivo and could require replication restart. The authors observed that an increase in the amount of the stringent response messenger ppGpp dramatically improves the survival rate of UV-irradiated E. coli cells deficient in the Holliday junction resolvase RuvABC and that this survival depends on the RecG helicase. These data are explained in the following way. UV irradiation leads to the appearance of lesions in DNA, which cause RNA polymerase stalling; the subsequent collisions between the replication fork and stalled RNA polymerase molecules require replication restart. In wild-type cells, replication restart can be carried out by the action of RuvABC on stalled replication forks. Other pathways of restart are probably less prominent, since in the absence of RuvABC the cells die, unable to cope with the number of restart events required. At high levels of ppGpp, which destabilizes transcription complexes stalled at UV-induced lesions, the number of stalled RNA polymerases decreases and RuvABC-deficient cells can survive UV irradiation, because the other pathways become sufficient to deal with the small number of stalled forks. This observation indicates that the presence of stable stalled transcription complexes creates a requirement for replication restart and, if this requirement is not met, leads to loss of viability. The viability of UV-irradiated RuvABC-deficient cells at high levels of ppGpp depends on the RecG helicase, which implicates RecG in the other pathways of restart. RecG can facilitate regression of the fork, formation of the four-way junction, and, upon repair or bypass of the lesion, reannealing of the nascent strands.
Direct evidence of genomic instability caused by head-on transcription-replication collisions is described below.
Strong RNA-DNA hybrids. Two other transcription-related examples of replication inhibition have been described in vivo: R loops (259, 260, 288) and transcription-dependent replication inhibition at the poly(G)/poly(C) repeat (141).
An R loop is a structure formed when cRNA hybridizes with one of the strands in the DNA duplex, displacing the other DNA strand and making it form a loop. Short, transient R loops are normal intermediates of transcription. Long and unusually stable R loops form when transcribed RNA is meant to be a primer for DNA replication; i.e., it needs to stay attached to DNA after the RNA polymerase is gone. This is the case for certain bacterial plasmids, for example, ColE1-based plasmids (113), and for the mitochondrial genomes of eukaryotic cells from yeasts to humans (7, 35, 223). (In ColE1-based plasmids, the transcript that forms the R loop and primes replication is called RNA II).
R loops have been shown to interfere with the movement of the replication fork in the study of a peculiar derivative of the pBR322 plasmid with two ColE1 origins of replication facing each other. (This is different from the majority of spontaneously formed plasmid dimers that are always organized "head-to-tail" so the two origins face the same direction.) As with other plasmid dimers, both origins in this one were potentially functional, but only one was active in any replicating molecule. The silent origin was capable of stalling the replication fork moving from the active origin, and the mutual orientation of the origins was crucial for this effect (288). This effect was completely dependent on the presence of the promoter within the silent origin (259). Given that the main replicative helicase, DnaB, is not capable of dissociating RNA-DNA hybrids in vitro, it was suggested that DnaB is unable to unwind the RNA-DNA hybrid at the silent origin. Since DnaB moves in the 5'-to-3' direction along the lagging-strand template, it encounters hybrids transcribed from the head-on-oriented promoters but not from the codirectional ones. In accordance with this explanation, the site of the replication blockage mapped exactly to the 3' end of RNA II (260).
The ColE1 promoter transcript RNA II is different from regular transcripts. It forms a long hybrid with the template DNA, thus allowing for the priming of initiation of replication. Since it is this hybrid that arrests replication, there is strong doubt that this situation could be similar to the conventional collision of replication and transcription from regular promoters because RNA normally does not form such long, stable hybrids with DNA. Long R loops at highly transcribed regions become evident only in the absence of one of three cellular enzymes: topoisomerase I, RNase H, or RecG. Thus, these enzymes normally take care of long R loops by preventing their formation, hydrolyzing them, or unwinding them, respectively (61, 103, 290).
Topoisomerase I relaxes negative superhelicity formed behind the elongating RNA polymerase (60). In its absence, R-loop formation becomes favorable, owing to the increased negative superhelicity that promotes DNA unwinding and thus the formation of the RNA-DNA hybrid. Growth of the topA strain is impaired, especially at low temperatures (when the RNA-DNA hybrids are particularly stable) (182). It is completely abolished in the absence of RNase H (which degrades RNA in the RNA-DNA hybrids) and is partially rescued when it is overexpressed (61). Inactivation of RecG, a structure-specific helicase that promotes branch migration of Holliday junctions, is also synthetically lethal with topoisomerase I deficiency (103). In vitro, RecG can efficiently dissociate R loops (290). It is therefore believed that the deleterious effects of R loops in the topA strain are responsible for its phenotype. RNase H overproduction was also shown to correct the main known defect of the topA strain, a reduced rate of transcription elongation (106). Therefore, it is believed that transcriptional blocks at R loops (when RNA polymerase encounters an R loop formed by a preceding RNA polymerase) result in the low rate of transcription and, ultimately, poor growth. It is feasible, however, that since R loops affect replication progression, replication stalling at R loops could account for their toxic effect as well. Alternatively, R loops can first lead to the formation of stalled ternary transcription complexes, and those in turn can block replication progression. As discussed above, stalled ternary transcription complexes were shown to affect replication progression of the bacteriophage DNA polymerases in vitro to various extents (62, 63, 166, 167).
Transcription-dependent replication inhibition at the poly(G)/poly(C) repeats in DNA, particularly when the poly(G) strand was in the nascent RNA, was observed in E. coli cells (141). The strength of this block did not depend on the direction of replication through the complex. Although the exact nature of this phenomenon is unknown, the following explanations are plausible. The replication fork could be inhibited by the stable RNA-DNA hybrid (which is unusually strong, owing to its 100% GC content). Alternatively, it could be owing to the formation of unusual structures composed either of the duplex portion of DNA and the displaced DNA strand or of the RNA-DNA hybrid and the displaced DNA strand (the so-called "collapsed R loop" [206, 247, 248]). Yet another scenario could be that these structures first trap the RNA polymerase, leading to the formation of the stalled ternary transcription complex and in turn arresting the replication fork.
Natural DNAs, particularly in eukaryotes, are greatly enriched in simple DNA repeats (44, 262). During the last decade, these repeats attracted very broad attention owing to two major scientific developments. First, it was discovered that more than two dozen human hereditary disorders are caused by progressive expansions of simple microsatellites (see below) (229, 298). Second, a drastic increase in the length polymorphism of mono- and dinucleotide repeats was observed in certain human cancers, such as hereditary nonpolyposis colorectal cancer (207) (although different from the "expansion diseases," the microsatellite instability in this case is likely the consequence, not the cause, of the disease's progression). Consequently, the replication and stability of various DNA repeats have attracted very broad attention, revitalizing interest in unusual DNA structures.
Inhibition of replication by unusual DNA structures. There is a growing body of evidence of replication inhibition due to the formation of hairpins, triplexes, and G quartets both in vitro (with both single-stranded and double-stranded templates) and in vivo. In vitro, a variety of DNA polymerases have been shown to be inhibited by IRs (most likely owing to the formation of hairpins) (10, 34, 107, 119, 267, 295), G-quartet-forming repeats in the presence of potassium ions (strongly indicating G-quartet formation) (285, 296, 302), and homopurine-homopyrimidine MRs (probably owing to the formation of triplexes) (8, 50, 139, 153, 196, 228, 254). In the latter case, it turned out that the mechanisms for replication inhibition by triplex-forming sequences were different for the single-stranded and double-stranded templates. In the single-stranded DNA templates, DNA polymerization was shown to stop in the middle of H motifs, presumably because when the newly synthesized DNA chain reached the center of a repeat, its remaining single-stranded segment folded back, forming a triplex behind the polymerase and in turn trapping the latter (Fig. 8A) (8, 153). DNA polymerization through H motifs in double-stranded nicked circular DNAs (designed in such a way that the DNA supercoiling would be irrelevant) progressed smoothly when the purine strand served as a template and the pyrimidine strand was displaced. In contrast, when the pyrimidine strand served as a template and the purine strand was displaced, DNA polymerization was completely blocked at the middle of an H motif (Fig. 8B) (254). The proposed mechanism was that during DNA synthesis, DNA polymerase ran into the H motif, displacing its nontemplate DNA strand; the displaced homopurine strand could then fold back, forming a stable triplex downstream from the polymerase that in turn blocked further polymerization. 
If the displaced strand was homopyrimidine, a stable triplex was not formed, since cytosines were not protonated, and polymerization proceeded normally.
In vivo, replication inhibition due to the formation of unusual DNA structures was first proposed for the (GA)n/(TC)n repeat (244, 245) and the GA-rich repeat from the hamster dhfr gene (23) in mammalian cells. Later on, this issue was mostly studied for trinucleotide repeats that are involved in a variety of expansion diseases (see below). Note that there is not a single instance wherein all alternative explanations, such as involvement of a protein's binding in the replication blockage by a structure-forming DNA sequence, have been ruled out.
The best-studied trinucleotide repeats are (CAG)n/(CTG)n, (CGG)n/(CCG)n, and (GAA)n/(TTC)n. Early studies of expandable repeats suggested their unusual structural potential (133, 205, 230, 231, 238, 307, 308). Single-stranded d(CGG)n, d(CCG)n, d(CTG)n, and d(CAG)n stretches can fold into hairpin-like structures stabilized by both WC and non-WC base pairs (36, 72, 180, 311). Hairpins formed by the above four repeats differ in the nature of non-WC base pairs. Thus, their stability varies owing to a differential contribution from different mismatches in the following way: CGG > CCG ≈ CTG > CAG (72). Recently, formation of folded structures by extended d(GAA)n and d(TTC)n repeats has also been postulated (88). Individual strands of expandable repeats can fold into other DNA conformations as well. For example, single-stranded d(CGG)n repeats fold into a stable quadruplex structure (283). The (GAA)n/(TTC)n repeat can form triple-helical H-DNA. Different labs proposed H-y (73), H-r (253), or a composite triplex structure called sticky DNA (253, 287). Finally, an AT-rich repeat implicated in SCA10, (ATTCT)n/(AGAAT)n, apparently forms a peculiar unwound structure while under superhelical stress (241). Formation of unusual structures by expandable repeats was shown to stall DNA polymerization in vitro (73, 122, 285), and there is a growing body of evidence showing that the same is true in vivo.
The (CAG)n/(CTG)n and (CGG)n/(CCG)n repeats were shown to inhibit the replication fork in E. coli (255), S. cerevisiae (234), and mammalian cells (I. Voineagu, unpublished observations), presumably due to the formation of imperfect hairpins [and/or G quartets in the case of (CGG)n/(CCG)n]. In both prokaryotes and eukaryotes, repeat-caused replication inhibition was length dependent. Although all of the systems produced qualitatively similar results, substantial quantitative differences were observed between different cell types. For example, a relatively short (CGG)10/(CCG)10 repeat inhibited DNA replication in yeast whereas a four-times-longer repeat was required for similar inhibition in bacterial and mammalian cells. One possible explanation could be that for the very AT-rich S. cerevisiae genome, GC-rich sequences such as the (CGG)n/(CCG)n repeat are extremely foreign. Furthermore, replication stalling by the (CGG)n/(CCG)n repeat in yeast did not depend on its orientation in the replicon (234), which is different from the orientation-dependent replication blockage in bacteria (255) and mammals (I. Voineagu, unpublished). The (GAA)n/(TTC)n repeat was found to inhibit replication in S. cerevisiae in an orientation-dependent manner, specifically when the repeat's homopurine strand serves as the lagging-strand template, strongly implicating the formation of a triplex (140). There are indications that the (GAA)n/(TTC)n repeat also inhibits replication in mammalian cells (221; also M. Krasilnikova, unpublished observations). The strength of inhibition in all systems depended on the repeat's base composition in the following order: (CGG)n/(CCG)n > (GAA)n/(TTC)n > (CAG)n/(CTG)n, correlating with the repeat's propensity to form unusual DNA structures.
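Because the inhibition described above depends on repeat length, a first step in any sequence-based survey is to locate uninterrupted repeat tracts and count their units. The following is a minimal sketch of such a scan. The function name `find_repeat_tracts` and the toy sequence are hypothetical, only the given strand is searched, and interrupted or imperfect repeats are ignored.

```python
import re


def find_repeat_tracts(sequence, unit, min_units):
    """Return (start, n_units) for each uninterrupted run of `unit`
    repeated at least `min_units` times; start is a 0-based index."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    if min_units < 1:
        raise ValueError("min_units must be at least 1")
    unit = unit.upper()
    # Builds a pattern such as (?:CGG){10,} for unit="CGG", min_units=10.
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_units},}}")
    return [
        (match.start(), len(match.group()) // len(unit))
        for match in pattern.finditer(sequence.upper())
    ]


# Toy sequence (made up): a (CGG)10 tract followed by a short (CGG)3 tract.
toy = "AT" + "CGG" * 10 + "TTA" + "CGG" * 3
print(find_repeat_tracts(toy, "CGG", min_units=10))
```

To cover the complementary strand, the same function can be called with the reverse-complement unit, for example `CCG` for `CGG`. The `min_units` threshold is a free parameter; the text above indicates only that a 10-unit (CGG)n/(CCG)n tract sufficed in yeast, whereas a roughly four-times-longer tract was needed in bacterial and mammalian cells.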
Studies of replication inhibition by trinucleotide repeats are of special importance, since this inhibition is most likely involved in the mechanism of the repeats' expansion, a phenomenon that causes a number of human diseases (200; also see below).
| GENOMIC INSTABILITY CAUSED BY REPLICATION STALLING AT NATURAL IMPEDIMENTS |
|---|
In eukaryotes, replication forks are stabilized in the competent state and protected from the regression and subsequent restart pathways, which could lead to genomic instability. In budding yeast, Mrc1 and Tof1 stabilize stalled forks (40, 123, 169), probably through their interaction with the Cdc45-MCM2-7 complex (213). Mrc1 and Tof1 also activate the intra-S-phase checkpoint kinases Mec1 (homologue of mammalian ATR) and Rad53 (homologue of mammalian CHK2) (2, 66, 123). Activation of the intra-S-phase checkpoint helps further stabilize the forks and finish replication of the genome by allowing conserved forks to wait until replication conditions improve without disassembling (33, 57, 149, 161, 272, 280). In the absence of the intra-S-phase checkpoint, spontaneous accumulations of double-stranded breaks and chromosomal rearrangements occur without external DNA damage, which suggests that DNA replication per se is the source of genomic instability (32, 33, 211).
In fission yeast, genome-wide fork stabilization is carried out by Swi1 (homologue of Tof1), Cds1 (homologue of Rad53), and Rad3 (homologue of Mec1) (217).
In bacteria, stabilization of stalled forks has not been detected. In eukaryotes, if one fork stalls, there is a chance that another fork from the opposite direction will reach the same place in a reasonable time. There is no such chance in bacteria, because of the large size of the replicon and because terminators prevent or at least enormously delay the fork's escape from the terminus (see above). Since replication timing is the rate-limiting step in the bacterial cell cycle, it seems beneficial to disassemble and restart stalled forks right away instead of passively waiting (although "infinite" replication stalling, which resulted in loss of cell viability, has been reported [239]; see above).
It is interesting to consider whether the cell can distinguish between a temporarily stalled replication fork (which is worth preserving in the anticipation of