Microbiology and Molecular Biology Reviews, June 1999, p. 405-445, Vol. 63, No. 2
1092-2172/99/$04.00+0
Copyright © 1999, American Society for Microbiology. All rights reserved.
Department of Molecular Biology and Microbiology, School of Medicine, Tufts University, Boston, Massachusetts,1 and Department of Biochemistry, Tulane University School of Medicine, New Orleans, Louisiana2
SUMMARY
INTRODUCTION
CLEAVAGE/POLYADENYLATION PATHWAY
RNA Sequences Which Specify Cleavage and Polyadenylation
Mammalian polyadenylation signals.
(i) AAUAAA motif.
(ii) Downstream elements.
(iii) Poly(A) site.
(iv) Auxiliary sequences.
Yeast polyadenylation signals.
(i) Efficiency element.
(ii) Positioning element.
(iii) Poly(A) site.
(iv) Additional properties of yeast polyadenylation signals.
Plant polyadenylation signals.
Comparison of polyadenylation signals in mammals, yeast, and plants.
Cleavage/Polyadenylation Machinery
Mammalian cells.
(i) Cleavage/polyadenylation specificity factor.
(ii) Cleavage stimulation factor.
(iii) Cleavage factors Im and IIm.
(iv) RNA polymerase II.
(v) Poly(A) polymerase.
(vi) Poly(A)-binding protein II.
Yeast cells.
(i) Cleavage/polyadenylation factor IA.
(ii) Cleavage/polyadenylation factor IB.
(iii) Cleavage factor II.
(iv) Polyadenylation factor I.
(v) Poly(A) polymerase.
(vi) Poly(A)-binding protein I and poly(A) nuclease.
(vii) Proteins with possible auxiliary or regulatory roles in yeast polyadenylation.
Steps in Processing: Assembly, Cleavage, and Polyadenylation
Mammals.
Yeast.
Polyadenylation in Other Organisms
FORMATION OF THE 3' ENDS OF HISTONE mRNAs
INTERRELATIONSHIP OF mRNA 3'-END FORMATION AND OTHER NUCLEAR PROCESSES
Transcription by RNA Polymerase II
Transcription initiation.
Transcription termination downstream of poly(A) sites.
(i) Role of cleavage/polyadenylation factors.
(ii) Other factors with possible roles in termination.
Discussion.
Splicing
Activation of polyadenylation by splicing factors.
(i) Recognition of constitutive 3'-terminal exons.
(ii) Recognition of alternative 3'-terminal exons.
Inhibition of polyadenylation by splicing factors.
(i) Inhibition by a downstream splice donor site.
(ii) Inhibition by an upstream splice donor site.
Role of hnRNP and cap-binding proteins.
Summary.
Transport
mRNA 3'-end formation facilitates efficient transport.
Possible mechanisms for coupling 3'-end formation and transport.
REGULATION OF mRNA 3'-END FORMATION
General Themes
Regulatory Mechanisms in the Selection of Alternate 3'-Terminal Exons
Alternative polyadenylation of calcitonin/CGRP transcripts.
Alternative polyadenylation of immunoglobulin transcripts.
Utilization of duplicate poly(A) sites in HIV-1 transcripts.
Cell Cycle Regulation of Factors Involved in 3'-End Formation
Histone pre-mRNA cleavage factors.
Cleavage/polyadenylation factors.
Alterations of the Cleavage/Polyadenylation Machinery during Viral Infection
Alternative mRNA Polyadenylation in Yeast
CONCLUSIONS AND PERSPECTIVES
ACKNOWLEDGMENTS
REFERENCES
SUMMARY
Formation of mRNA 3' ends in eukaryotes requires the interaction of trans-acting factors with cis-acting signal elements on the RNA precursor by two distinct mechanisms, one for the cleavage of most replication-dependent histone transcripts and the other for cleavage and polyadenylation of the majority of eukaryotic mRNAs. Most of the basic factors have now been identified, as well as some of the key protein-protein and RNA-protein interactions. This processing can be regulated by changing the levels or activity of basic factors or by using activators and repressors, many of which are components of the splicing machinery. These regulatory mechanisms act during differentiation, progression through the cell cycle, or viral infections. Recent findings suggest that the association of cleavage/polyadenylation factors with the transcriptional complex via the carboxyl-terminal domain of the RNA polymerase II (Pol II) large subunit is the means by which the cell restricts polyadenylation to Pol II transcripts. The processing of 3' ends is also important for transcription termination downstream of cleavage sites and for assembly of an export-competent mRNA. The progress of the last few years points to a remarkable coordination and cooperativity in the steps leading to the appearance of translatable mRNA in the cytoplasm.
INTRODUCTION
Posttranscriptional cleavage of mRNA precursor is an essential step in mRNA maturation. Following cleavage, most eukaryotic mRNAs, with the exception of replication-dependent histone transcripts in some organisms, acquire a poly(A) tract at their 3' ends. The process of 3'-end formation promotes transcription termination (101) and transport of the mRNA from the nucleus (215). The poly(A) tail, most probably by providing a binding site for poly(A) binding protein (105), also enhances the translation and stability of mRNA (149, 368, 399, 488).
Defects in mRNA 3'-end formation can profoundly alter cell viability, growth, and development. The essential nature of the yeast genes encoding components of the polyadenylation pathway emphasizes the importance of this process. In metazoan cells, in vivo depletion of one of the cleavage proteins, CstF-64, causes cell cycle arrest and ultimately apoptotic cell death (451). A failure to correctly modify the metazoan poly(A) polymerase during the cell cycle is thought to cause a lower growth rate and cell accumulation in the G0-G1 phase (516). The appearance of short GCG repeats in the gene encoding the PAB II polyadenylation factor is associated with oculopharyngeal muscular dystrophy (59). The formation of mRNA 3' ends is a key regulatory step in the expression of many genes, and in some cases aberrant polyadenylation leads to disease. In humans, such defects cause thalassemias (203, 345) and a lysosomal storage disorder (161). Inappropriate polyadenylation may also contribute to the abnormal processing of the EAAT2 glutamate transporter transcripts observed in the brains of patients with sporadic amyotrophic lateral sclerosis (262). In this disease, the loss of functional EAAT2 correlates with motor neuron degeneration. Research into the fundamental mechanism of mRNA 3'-end formation and its regulation should lead to a better understanding of its crucial role in normal cell growth and development.
The past few years have brought astounding progress in our understanding of the biochemistry of mRNA 3'-end formation, its regulation, and its interaction with other aspects of mRNA synthesis. The factors which comprise the basic polyadenylation machinery have been identified, and the coding sequence of many, if not most, of the protein subunits has become available. The molecular mechanism by which several regulatory elements stimulate or inhibit polyadenylation has been dissected in exquisite detail, and the intimate involvement of splicing factors at these sites has been made clear. In addition, much information has accumulated on how the basic polyadenylation machinery is regulated to control the choice of poly(A) site or activity of the poly(A) polymerase. Finally, the coupling of transcription and mRNA 3'-end formation has been convincingly demonstrated in a variety of ways.
We have tried to provide sufficient background information that the reader can evaluate the developments in our understanding of mRNA 3'-end formation, primarily over the last 5 years. Due to space constraints, we are not able to give a more thorough historical account, and so we have focused on a limited number of examples to illustrate the paradigms emerging in the field. We ask the readers to refer to several recent reviews on constitutive and regulated polyadenylation for additional details (101, 139, 238, 473, 475a). Cytoplasmic polyadenylation and the role of the poly(A) tail in translation are covered in a review by Richter elsewhere in this issue (384a).
CLEAVAGE/POLYADENYLATION PATHWAY
RNA Sequences Which Specify Cleavage and Polyadenylation
Sequences on the RNA precursor ultimately determine the processing efficiency in a given cellular environment. The cis-acting elements specifying cleavage and polyadenylation and the cleavage of replication-dependent histone pre-mRNA in animal cells are well defined. Research in the last few years has led to a much better understanding of the polyadenylation signals in yeasts.
Mammalian polyadenylation signals. In mammalian cells, three elements define the core polyadenylation signal: the highly conserved hexanucleotide AAUAAA found 10 to 30 nucleotides upstream of the cleavage site, a less highly conserved U-rich or GU-rich element located downstream of the cleavage site, and the cleavage site itself, which becomes the point of poly(A) addition and is thus generally referred to as the poly(A) site (Fig. 1). Additional sequences outside of this core recruit regulatory factors or maintain the core signal in an open and accessible structure.
(i) AAUAAA motif. The consensus sequence AAUAAA was initially revealed by a comparison of nucleotide sequences preceding the poly(A) sites in several mRNAs (377) and has since been found in almost all polyadenylated mRNAs of animal cells (473). Extensive mutagenesis studies and the analysis of naturally occurring mutations have conclusively established that this hexanucleotide is essential for both cleavage and poly(A) addition (reviewed in references 286, 472, and 487). The sequence AAUAAA is one of the most highly conserved sequence elements known (374). The most frequent variant is AUUAAA, whose activity is comparable to that of the canonical sequence. Mutations of any other nucleotide strongly inhibit processing (422, 489, 496), although in some genes, these forms are the only hexanucleotide-like sequences upstream of the poly(A) site. Poly(A) sites with these variants, as well as those that have no discernible upstream AAUAAA element, i.e., no sequence differing by fewer than 2 nucleotides from the consensus, are usually associated with alternative (208, 421, 508) or tissue-specific (76, 476) polyadenylation.
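The categories described in this paragraph can be illustrated with a short Python sketch. It is not drawn from any of the cited studies; the function names and category labels are hypothetical, and the only rules encoded are those stated above: AAUAAA is the consensus, AUUAAA is the most frequent variant, and a hexamer that differs from the consensus at two or more positions is treated as no discernible AAUAAA element.

```python
CONSENSUS = "AAUAAA"
COMMON_VARIANT = "AUUAAA"  # most frequent variant, activity comparable to consensus


def mismatches(hexamer, consensus=CONSENSUS):
    """Count the positions at which a hexamer differs from the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("hexamer must be 6 nucleotides long")
    return sum(a != b for a, b in zip(hexamer, consensus))


def classify_hexamer(hexamer):
    """Assign a hexamer to one of four illustrative (hypothetical) categories."""
    hexamer = hexamer.upper().replace("T", "U")  # accept DNA-style input
    n = mismatches(hexamer)
    if n == 0:
        return "canonical"
    if hexamer == COMMON_VARIANT:
        return "common variant"
    if n == 1:
        return "single-nucleotide variant"
    return "no AAUAAA-like element"
```

For example, `classify_hexamer("AAGAAA")` falls in the single-nucleotide variant category, the class that the text associates with strongly reduced processing and with alternative or tissue-specific polyadenylation.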
(ii) Downstream elements. The second element of the core polyadenylation signal is within approximately 30 nucleotides downstream of the poly(A) site. This downstream element (DSE) is more diffuse and poorly conserved, and two main types have been described, a U-rich element and a GU-rich element. The U-rich element is a short run of U residues (94, 162). The GU-rich type has the consensus YGUGUUYY (Y = pyrimidine) and has been found downstream of the poly(A) site in two-thirds of the 70 genes surveyed in a 1985 study (302). A polyadenylation signal may have only one DSE (94, 300) or may have both a U-rich element and a GU-rich element working together synergistically (162). However, some DSEs contain no matches to either the U- or GU-rich motif (431). Point mutations or small deletions in the DSE have only weak effects, and larger deletions are required to abolish function, which is in agreement with the idea that the DSE is poorly defined and possibly redundant (510). Nevertheless, the proximity of the DSE to the poly(A) site can affect the cleavage site position (280, 294) and the efficiency of cleavage (162, 300).
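As a rough illustration of how the two DSE types might be searched for in a sequence, the following Python sketch translates the GU-rich consensus YGUGUUYY into a regular expression and looks for both element types within 30 nucleotides downstream of a given cleavage position. The function name is hypothetical, and the threshold of four consecutive U residues for a "short run" is an assumption made only for this example, since the text gives no minimum length. Because some functional DSEs match neither motif, an empty result does not imply the absence of a DSE.

```python
import re

GU_RICH = re.compile(r"[CU]GUGUU[CU][CU]")  # YGUGUUYY, with Y = C or U
U_RICH = re.compile(r"U{4,}")               # assumed: a run of at least 4 U residues


def find_downstream_elements(rna, cleavage_index, window=30):
    """Return (type, offset, sequence) for motif matches downstream of the site.

    cleavage_index is the 0-based index of the cleavage-site nucleotide;
    offset 1 is the first nucleotide after it.
    """
    rna = rna.upper().replace("T", "U")
    start = cleavage_index + 1
    downstream = rna[start:start + window]
    hits = []
    for kind, pattern in (("GU-rich", GU_RICH), ("U-rich", U_RICH)):
        for m in pattern.finditer(downstream):
            hits.append((kind, m.start() + 1, m.group()))
    return sorted(hits, key=lambda h: h[1])
```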
(iii) Poly(A) site. The selection of the cleavage site is determined mainly by the distance between the upstream AAUAAA sequence and the DSE(s) (80). The local sequence surrounding the cleavage site is not conserved, although adenosine is found at the cleavage site of 70% of vertebrate mRNAs (422). Thus, the first nucleotide of the poly(A) tail in most mRNAs is probably template encoded, although this has been proven experimentally in vitro for only two poly(A) sites (317, 422). A study involving saturation mutagenesis demonstrated that the order of preference for the cleavage site nucleotide is A > U > C ≫ G (80). The penultimate nucleotide is most often a C residue (in 59% of all genes analyzed) (422). Thus, a CA dinucleotide defines the poly(A) site for most genes.
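The preferences described in this paragraph can be turned into a toy ranking of candidate cleavage positions, sketched below in Python. This is only an illustration under stated assumptions and not a validated predictor: the numerical ranks encode nothing more than the order A, U, C, G; the extra point for an A preceded by C reflects the CA dinucleotide; the 10 to 30 nucleotide window is assumed to be measured from the 3' end of the hexanucleotide; and the influence of the DSE position, which the text identifies as a main determinant, is not modeled. The function name and return format are hypothetical.

```python
RANK = {"A": 3, "U": 2, "C": 1, "G": 0}  # order only; the spacing is arbitrary


def rank_cleavage_sites(rna, hexamer_start, min_dist=10, max_dist=30):
    """Rank positions downstream of the hexamer as candidate cleavage sites.

    Returns (index, nucleotide, score) tuples, best first. Distances are
    counted from the first nucleotide after the hexamer (an assumption).
    """
    rna = rna.upper().replace("T", "U")
    first_after_hexamer = hexamer_start + 6
    candidates = []
    for dist in range(min_dist, max_dist + 1):
        i = first_after_hexamer + dist - 1
        if i >= len(rna):
            break
        score = RANK.get(rna[i], 0)
        if rna[i] == "A" and rna[i - 1] == "C":
            score += 1  # CA dinucleotide at the candidate site
        candidates.append((i, rna[i], score))
    # Highest score first; ties resolved in favor of the more upstream position.
    return sorted(candidates, key=lambda c: (-c[2], c[0]))
```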
(iv) Auxiliary sequences. Other sequence elements can modulate the efficiency of 3' processing in a positive or negative fashion. The molecular mechanisms by which some of these operate are described in later sections, and only the auxiliary sequences are discussed here. One class of enhancer sequence is located upstream of the AAUAAA element (USE) (Fig. 1), and has been found primarily in viral poly(A) sites, such as adenovirus E3 (43), adenovirus L1 (126), adenovirus L3 (372, 373), adenovirus L4 (431), Epstein-Barr virus DNA polymerase (427), simian virus 40 (SV40) late (74, 277, 410), ground squirrel hepatitis virus (392), and retroviruses such as human immunodeficiency virus type 1 (HIV-1) (65, 87, 127, 128, 466). These USEs are often U rich, but a consensus sequence has not emerged (Table 1).
Yeast polyadenylation signals. Signals which direct mRNA 3'-end formation in the yeast Saccharomyces cerevisiae are somewhat different from those used in higher eukaryotes in both sequence and organization. Yeast polyadenylation signals are less highly conserved than are poly(A) signals in higher eukaryotes and are unexpectedly complicated. At least three elements are needed to make up a minimal yeast mRNA 3'-end region: (i) the UA-rich efficiency element and related sequences, functioning to activate the positioning element; (ii) the A-rich positioning element, which directs the position of the cleavage site; and (iii) the actual site of polyadenylation, PyAn (Fig. 1) (reviewed in reference 182). A recent analysis of 1,352 pre-mRNA 3'-end processing sites, corresponding to 861 different genes, has confirmed this organization of signal sequences (170).
(i) Efficiency element. Efficiency elements are found at a variable distance upstream of the cleavage site and often contain alternating UA dinucleotides or U-rich stretches. Early comparisons of a number of yeast mRNA 3' ends led to the proposal of a bipartite motif UAG ··· UAUGUA (395). Another sequence, UUUUUAUA, was also identified as an efficiency element in several yeast genes including GCN4, PHO5, and ADH1 (141, 181, 198, 199, 224). A subsequent study reduced these putative signals to a hexanucleotide, UAYRUA, with the sequence UAUAUA working best for mRNA 3'-end formation (223). The U residues at the first and fifth positions are the most critical nucleotides in this sequence. Furthermore, computer analysis has shown that more than half of the approximately 1,000 yeast nuclear genes examined contain UAUAUA sequences in their 3' region (181). Thus, most yeast genes, e.g., GAL7 and MRP2 (1, 356), use UAUAUA as the efficiency element, while other genes, e.g., CYC1 and GCN4, appear to use related sequences, such as UAUUUA, UAUGUA, and UUUUUAUA (141, 181).
(ii) Positioning element. The second element, the positioning element, directs cleavage to a position approximately 20 nucleotides downstream of this sequence. Deletion or mutation of the sequence UUAAGAAC in the 3' region of CYC1 changes the location of the poly(A) site but not the overall mRNA processing level, indicating that it represents a signal distinct from the efficiency element (395). In addition to the sequence UUAAGAAC, several other A-rich sequences have been characterized as positioning elements, with AAUAAA and AAAAAA being the most efficient (183). Related motifs are also functional, except that sequences such as GAUAAA, GAAGAA, and GAAUAA are either inefficient or completely inactive, suggesting that a guanosine residue at the first position has an inhibitory effect on the signal function.
The positioning element can also contribute to the efficiency of processing, as seen from the fact that deletion of the AAUAAA motif in the 3'-end region of the ADH2 gene, the GAL7 gene, or a heterologous cauliflower mosaic virus gene resulted in a reduction in the use of the wild-type poly(A) site (1, 222, 225). A single point mutation in the TRP4 gene altered the efficiency of processing as well as the selection of the poly(A) site (133). The positioning element normally resides between the efficiency element and the poly(A) site, but for the FBP1 gene, it is found upstream of a series of efficiency motifs (18).
(iii) Poly(A) site. Mapping of the poly(A) sites of several yeast genes has shown that polyadenylation occurs most frequently at a Py(A)n sequence (Py = pyrimidine) (36, 195). In contrast to animal genes, in which a single poly(A) site is found downstream of AAUAAA, many yeast genes use a cluster of poly(A) sites downstream of the efficiency and positioning elements. When positioning elements are mutated, the poly(A) sites become scattered over a much broader region (17, 194, 396). A recent survey indicates that T-rich motifs are frequently found immediately before and after the poly(A) site, especially in genes with suboptimal efficiency or positioning elements (170). The importance of these flanking sequences has not been tested experimentally.
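The three yeast elements described above lend themselves to a simple pattern search, sketched here in Python. The sketch makes several assumptions that go beyond the text: only the two most efficient positioning motifs (AAUAAA and AAAAAA) are searched for, the distance of about 20 nucleotides is measured from the 3' end of the positioning element, a tolerance of 10 nucleotides on either side is chosen arbitrarily, and Py(A)n is read as a pyrimidine followed by one or more A residues. The function name and the layout of the returned dictionary are hypothetical. Because many yeast genes use clusters of sites, all candidate positions in the window are reported.

```python
import re

# Lookaheads allow overlapping matches (e.g., in UAUAUAUA).
EFFICIENCY = re.compile(r"(?=(UA[CU][AG]UA))")    # UAYRUA
POSITIONING = re.compile(r"(?=(AAUAAA|AAAAAA))")  # two most efficient motifs
POLY_A_SITE = re.compile(r"[CU]A+")               # Py(A)n, read as Py then >= 1 A


def scan_yeast_signal(rna, expected_offset=20, tolerance=10):
    """Locate efficiency and positioning elements and nearby Py(A)n sites."""
    rna = rna.upper().replace("T", "U")
    efficiency = [(m.start(), m.group(1)) for m in EFFICIENCY.finditer(rna)]
    positioning = [(m.start(), m.group(1)) for m in POSITIONING.finditer(rna)]
    sites = set()
    for start, motif in positioning:
        # Centre of the search window, counted from the end of the motif.
        centre = start + len(motif) + expected_offset
        lo, hi = max(0, centre - tolerance), centre + tolerance
        for m in POLY_A_SITE.finditer(rna, lo, hi):
            sites.add(m.start() + 1)  # index of the first A after the pyrimidine
    return {
        "efficiency": efficiency,
        "positioning": positioning,
        "poly_a_sites": sorted(sites),
    }
```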
(iv) Additional properties of yeast polyadenylation signals. Some yeast polyadenylation signals function in both orientations (17, 224, 356). This is in part due to the convergent transcription of closely packed genes. When this is not the case, it is likely that the sequence variability of the yeast signal motifs, the general TA richness of the 3' ends of yeast genes (170), and the primary dependence on the UA-rich element, and not the positioning element, for efficient processing will increase the probability that an adequate signal is fortuitously found in the reverse orientation.
The efficiency and positioning elements are not only degenerate but also redundant, and most yeast polyadenylation signals are more complex than the minimal one presented in Fig. 1. The sequence redundancy explains why deletion or mutation of such motifs in several yeast genes has only a slight effect or none at all. In the GCN4 gene, mutation of both copies of the UUUUAU sequence was necessary to reduce processing activity in vitro (141). The CYC1 gene employs multiple weaker elements (UUUAUA, UAUGUU, and UAUUUA) which act additively to constitute a strong signal (181). Deletion of an AAUAAA sequence upstream of the poly(A) site of the ADE8 transcript had no apparent effect, which may be due to the presence of two copies of the AAAAA sequence adjacent to AAUAAA (199).
The necessity for a specific sequence downstream of the cleavage site in yeast is not clear. A downstream sequence is required for efficient in vivo 3'-end processing of the ADH2 transcript (222). In many cases, deletion of all or nearly all downstream sequences has little or no effect on 3'-end formation (18, 141, 210, 356, 401), and in vitro substrates with as few as 7 to 10 nucleotides beyond the poly(A) site are efficiently cleaved (82, 401). The presence of mRNA secondary structure around signal sequences may be important in some yeast genes (222, 401), and a long-range interaction of 5' and 3' untranslated regions has been demonstrated for MFA2 mRNA (130). While the contribution of RNA conformation has not been rigorously investigated for any site, circular RNA substrates are not cleaved in vitro (443).
The polyadenylation signals of the fission yeast, Schizosaccharomyces pombe, have been characterized only for the ura4 gene (219). In this case, three elements were important: two site-determining elements upstream of the poly(A) sites and an efficiency element downstream of the poly(A) sites.
Plant polyadenylation signals. The process of mRNA 3'-end formation in plants is poorly understood, but some information on important cis-acting elements is available (reviewed in references 220 and 387). At least three signals are required: the near-upstream element (NUE), the far-upstream element (FUE), and the cleavage site itself (Fig. 1). NUE is located about 10 to 30 nucleotides upstream of the cleavage site and occurs in variant forms, from AAUAAA-like motifs to other related or unrelated sequences. FUE is usually U rich and is found approximately 100 nucleotides upstream of the cleavage site. As in other organisms, cleavage often occurs at a PyA dinucleotide. There are multiple cleavage sites in many genes, and use of a particular site is determined predominantly by the position of the NUE.
Comparison of polyadenylation signals in mammals, yeast, and plants. A tripartite signal, composed of an A-rich sequence, a U-rich element, and a PyA cleavage site forms a common minimal polyadenylation signal in all eukaryotes (Fig. 1). In mammals, the hexanucleotide AAUAAA is highly conserved, present in a single copy, and absolutely necessary for 3'-end processing. In yeast, the A-rich motif sometimes serves only to position the poly(A) site and is often duplicated. The second set of sequence elements are U rich or UA rich and work in conjunction with the A-rich sequence. In mammals, these are most often present in single copy downstream of the cleavage site and are essential. When located upstream of the A-rich signal, they are stimulatory. In yeast and plants, this type of signal is often redundant and is usually found upstream of the cleavage site and most often upstream of the A-rich sequence as well. Cleavage occurs preferentially at CA in mammals and PyA in yeast and plants. No downstream signals have been clearly identified in yeast or plants. For the most part, the polyadenylation signal appears to be recognized as a one-dimensional string of nucleotides.
Cleavage/Polyadenylation Machinery
Multiple protein factors are required for the formation of mRNA 3' ends and are generally conserved between yeast and mammals. The in vitro processing assays developed in mammalian cells by Moore and Sharp (315) and in yeast cells by Butler and Platt (68) have provided a successful approach for isolating these activities via fractionation of cell extracts and cloning the relevant genes and cDNAs from protein sequence. In mammals, cleavage/polyadenylation specificity factor (CPSF), cleavage-stimulatory factor (CstF), cleavage factors Im and IIm (CF Im and CF IIm), RNA polymerase II (Pol II), and poly(A) polymerase (PAP) are involved in the cleavage step, and CPSF, PAP, and poly(A)-binding protein II (PAB II) are involved in polyadenylation (Fig. 2). In yeast, cleavage requires CF IA, CF IB, and CF II, and polyadenylation uses CF IA, CF IB, polyadenylation factor I (PF I), Pab1, and Pap1 (Fig. 2). Yeast genetics and application of the two-hybrid screen in yeast cells have led to the discovery of some of the important yeast genes, and the recent completion of the S. cerevisiae genomic sequence has allowed the quick identification of several yeast proteins as homologues of mammalian CPSF subunits. In this section, we describe what is known about these factors (summarized in Tables 2 and 3 for mammalian and yeast cell factors, respectively). At the end, we present a model for the complexes involved in each step of the reaction.
Mammalian cells.
(i) Cleavage/polyadenylation specificity factor. In the late 1980s, it was discovered that AAUAAA-dependent processing required a factor termed cleavage and polyadenylation factor (CPF), specificity factor (SF), or polyadenylation factor 2 (PF2), and now called CPSF (96, 168, 453, 454). CPSF is required for both the cleavage and poly(A) addition reactions and, consistent with this function, recognizes AAUAAA, a signal also essential for both reactions. Gel retardation experiments showed that CPSF specifically binds AAUAAA-containing RNAs (32, 45, 237). RNA modification interference assays indicated that all six nucleotides of AAUAAA are necessary for binding (237) and that RNAs as short as 10 nucleotides can be bound specifically (491). CPSF thus appears to recognize only the AAUAAA sequence, independent of any secondary structure. The binding of purified CPSF is very weak but can be greatly enhanced by a cooperative interaction with CstF bound to the downstream signal sequence (167, 280, 485, 499). CPSF purified from calf thymus or HeLa cells is a large protein complex containing subunits of 160, 100, 73, and 30 kDa, referred to as CPSF-160, CPSF-100, CPSF-73, and CPSF-30, respectively (45, 328).
The sequence of the largest subunit (160 kDa) contains a possible bipartite nuclear localization signal (NLS) and sequences roughly similar to the RNP1 and RNP2 motifs found in many RNA-binding proteins (228, 329). The carboxyl end of CPSF-160 also has homology to the C terminus of Rse1, a yeast pre-mRNA-splicing protein (79). The cross-linking of CPSF-160 to RNA in the processing extract depends on AAUAAA (314), and recombinant CPSF-160 (rCPSF-160) alone binds preferentially to AAUAAA-containing RNAs (329), supporting the idea that this subunit is crucial for AAUAAA recognition. However, the specific binding of rCPSF-160 is less efficient than that observed with intact CPSF, suggesting that the participation of other CPSF subunits facilitates the recognition of AAUAAA. With HIV-1 pre-mRNAs, CPSF-160, as part of CPSF, can be cross-linked to the RNA in two places, at the AAUAAA and at the USE (164). CPSF-160 interacts specifically with the 77-kDa subunit of CstF and with PAP (329), which is consistent with the cooperative interactions of CPSF with CstF or PAP in forming stable complexes on the RNA precursor (reviewed in references 101, 238, and 473). Interestingly, rCPSF-160 inhibits the activity of PAP in nonspecific assays, implying that CPSF may facilitate both poly(A) synthesis and termination (329). The Drosophila CPSF-160 is essential for viability (402).
The 100- and 73-kDa subunits of CPSF are closely related, with 23% identity and 49% similarity (227, 229). Antibodies raised against CPSF-100 coimmunoprecipitate all four subunits of CPSF, confirming their association as a complex (227). The functions of both CPSF-100 and CPSF-73 are unknown, but CPSF-100, as part of CPSF in extract, can be cross-linked to RNA by UV light (136), suggesting close contact with the precursor and perhaps a role in RNA binding.
The fourth subunit, CPSF-30, contains five CCCH zinc finger repeats followed by a CCHC zinc knuckle. Both types of motifs have been implicated in binding nucleic acid. In agreement with this sequence feature, CPSF-30 binds RNA polymers, with a distinct preference for poly(U). It has not always been detected in active CPSF preparations (164, 328) and may be less strongly associated with the other CPSF subunits under some conditions. However, it is coimmunoprecipitated with the other CPSF subunits (31, 227), and immunodepletion of this protein from extract or partially purified CPSF fractions inhibits cleavage and polyadenylation (31). The role of CPSF-30 is most probably to cooperate with CPSF-160 in the recognition of RNA substrates and, through an interaction with PAB II (85), to stabilize the polyadenylation complex. The Drosophila clipper (clp) gene encodes a homolog of CPSF-30 which has five CCCH zinc finger motifs and two CCHC zinc knuckles. Members of this highly conserved family of proteins have been found in mouse, zebrafish, Caenorhabditis elegans, and S. cerevisiae (29). CLP is a nuclear protein that is posttranscriptionally regulated during development, and the zebrafish homolog, no arches, is essential for normal pharyngeal arch development (155). The yeast homolog, Yth1, is part of the PF I polyadenylation factor (31). The region containing the zinc finger motifs of CLP has endoribonucleolytic activity specific for RNA hairpins (28), and the C-terminal zinc knuckles confer to CLP a binding preference for RNA with G- and/or C-rich clusters (29). While these properties have not been reported for CPSF-30 and an association of CLP with other CPSF subunits has not been demonstrated, the remarkable conservation of the two proteins certainly raises the possibility that CPSF-30 is directly involved in cleavage at the poly(A) site.
(ii) Cleavage stimulation factor. CstF is necessary for cleavage but not for poly(A) addition (168, 453), although it can stimulate poly(A) addition on substrates with a CstF binding site upstream of the AAUAAA hexanucleotide (319).
Purification of CstF from HeLa cells showed that it consists of three polypeptides of 77, 64, and 50 kDa (167, 452). cDNAs for all three have been sequenced (447, 448, 450). CstF-77 and its yeast and Drosophila homologs have eight repeats very similar to the tetratricopeptide repeat (TPR)-like motifs found in the yeast Prp39 and Prp42 U1 small nuclear ribonucleoproteins (snRNP) (303, 438). The repeats in the CstF-77 family of proteins lack the highly conserved alanine and glycine residues of a TPR, and this new motif has been termed HAT (half a TPR) (370). Like TPR repeats, HAT repeats may mediate protein-protein interactions (101, 370). Consistent with this sequence feature, CstF-77 was shown to be the middle subunit bridging CstF-64 and CstF-50, with the three subunits arranged in a linear fashion (448). Its direct interaction with CPSF-160 probably contributes to the mutual stabilization of the CPSF-CstF-RNA complex (329). CstF-77 is homologous to the Drosophila suppressor of forked [su(f)] protein (448). Mutations in su(f) can enhance or suppress the effects of transposon insertion, probably through changes in polyadenylation.
CstF-64 contains a classical RNA-binding domain (RBD) close to its amino terminus. It is connected by a hinge region to an unusual proline- and glycine-rich region (40%) in the carboxy terminus (447). Embedded in the Pro-Gly-rich region is a domain of 12 repeats of MEAR(A/G) in the mouse and human genes and 11 repeats of LEPRG in the chicken gene (455). By a UV cross-linking assay with whole-cell extracts, CstF-64 was first implicated in AAUAAA-dependent binding (314, 497). However, the binding site of CstF-64 was subsequently mapped to the U-rich DSE of pre-mRNA (280). The AAUAAA dependence reflects the strong cooperative binding of CPSF to AAUAAA and of CstF to the DSE. RNA ligands selected by CstF resemble DSEs in sequence and function (42). Selection-amplification (SELEX) using the CstF-64 RBD confirmed that this region is sufficient to recognize GU- and U-rich sequences (449). CstF-50 contains seven transducin, or WD-40, repeats, a motif which can mediate protein-protein interactions in other proteins (450).
(iii) Cleavage factors Im and IIm. CF Im and CF IIm are required only for cleavage. (The designations CF I and CF II have been used for both the mammalian and yeast systems, but the factors are not homologous. The subscript "m" is included to differentiate the mammalian factor from the yeast one.) CF Im has been purified to near homogeneity. Three polypeptides of 25, 59, and 68 kDa and possibly a fourth one of 72 kDa copurify with CF Im activity (388, 389). The three smaller subunits can be UV cross-linked to RNA substrates (388). By gel retardation assays, purified CF Im has higher affinity for RNAs containing polyadenylation signal sequences than for unrelated RNAs. Furthermore, CF Im increases the stability of the CPSF-RNA complex, suggesting that this factor may also interact with CPSF and contribute to the overall stability of the 3'-end-processing complex (388). cDNAs encoding the 25- and 68-kDa subunits have been recently isolated (389).
While CF Im-25 has no known motifs, CF Im-68 contains three distinct domains: an amino-terminal RNP-type RBD, a proline-rich region in the middle, and a carboxyl-terminal section consisting of alternating residues of opposite charge, with arginine residues alternating with glutamate, aspartate, and serine residues. This domain organization is strongly reminiscent of that found in the superfamily of RS-rich splicing factors. Most interestingly, recombinant CF Im-25 and CF Im-68 can be assembled in vitro and can replace purified CF Im in cleavage assays. Preliminary sequence data from studies of the CF Im-59 polypeptide reveal that it is similar to CF Im-68, suggesting either that it is a degradation product of CF Im-68 or that CF Im exists as heterodimers of CF Im-25-68 or CF Im-25-59 (389). Analysis of the kinetics of the cleavage reaction indicates that interaction of CF Im with RNA substrate may be an early step in the assembly of the 3'-end-processing complex, which facilitates the recruitment of other processing factors (389).
CF IIm has not been purified to homogeneity, and its function is not known.
(iv) RNA polymerase II. Pol II, through the conserved carboxyl-terminal domain (CTD) of its largest subunit, has properties which make it an authentic cleavage factor. The CTD is found in a hyperphosphorylated form in elongating transcriptional complexes (186). An understanding of a role for the CTD in the cleavage of pre-mRNAs evolved from a study of the function of creatine phosphate (CP) in this reaction (204). In the early in vitro work on cleavage and polyadenylation, ATP was shown to be necessary for cleavage of precursor mRNAs (316), and therefore CP was included to regenerate ATP through creatine phosphokinase activity in the extracts. However, other phosphocompounds can substitute for CP in the cleavage reaction, leading to the proposal that CP was acting as a mimic for a phosphoprotein (204, 205). This hypothesis was confirmed when a synthetic CTD, even without phosphorylation, or purified Pol II was shown to direct the cleavage of polyadenylation precursor in the absence of CP or ATP (205). Pol II does not appear to be involved in the poly(A) addition step. The interaction of the CTD with CPSF and CstF (299) may stabilize the cleavage complex or, as proposed by Hirose and Manley (205), have allosteric effects, such as those found for CTD and the capping enzyme (443). This involvement of Pol II in cleavage is consistent with earlier studies showing that protein-encoding mRNAs transcribed by Pol I and Pol III were for the most part not polyadenylated (257, 429, 432) and with the recent demonstration of the colocalization of CstF and phosphorylated Pol II in vivo (439). Other consequences of the coupling of transcription and mRNA 3'-end formation are explored in a later section.
(v) Poly(A) polymerase. An activity which added adenosine residues to the 3' ends of RNAs was discovered in the early 1960s, and at the time it was a reaction of unknown significance (reviewed in reference 135). It is now well established that this activity, PAP, plays a key role in the 3'-end formation of mRNA in eukaryotic cells. The first PAP, purified to homogeneity from calf thymus, was a degradation product of 57 to 60 kDa (475). Cloning and expression of the bovine PAP cDNAs have identified at least two isoforms of PAP which are generated by alternative splicing (380, 475). The longest forms, PAP I (77 kDa) and PAP II (82 kDa), differ only at their C termini and are enzymatically active (380, 515). PAP II may be the predominant full-length species, since most cDNAs isolated from other animal cells encode PAP II-related isoforms (30, 460, 475). Several other short forms of PAP (PAPs III, V, and VI) encode truncated proteins that are enzymatically inactive, and their function is unknown (380, 515).
Analysis of PAP has indicated an organization of functional domains as illustrated in Fig. 3 (290, 291, 380, 381). The amino-terminal two-thirds of PAP is highly conserved in eukaryotes and contains a catalytic domain with homology to a family of nucleotidyltransferases including many DNA and RNA polymerases (207, 291, 507). The catalytic core of this family is characterized by a triad of conserved aspartate residues that are essential for activity. These three aspartate residues are located at positions 113, 115, and 167 in bovine PAP (291). A primer-binding domain (C-RBS in Fig. 3) is located between amino acids 488 and 508, in the carboxyl-terminal portion of PAP (291). It overlaps with a region (amino acids 493 to 538) needed for AAUAAA-dependent activity (460), which may interact with CPSF-160. This region also encompasses NLS-1, which is required, together with a second NLS (NLS-2) about 140 residues downstream of the first, for efficient localization of PAP to the nucleus (129). Besides the bipartite NLS, the carboxyl-terminal region is also rich in serine and threonine residues, which are the targets for multiple phosphorylations which regulate PAP activity (2, 102). The last 20 amino acids of PAP (amino acids 720 to 739) are involved in autoregulation of U1A transcripts and in the coupling of splicing and polyadenylation (177).
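The overlap between the primer-binding domain and the region needed for AAUAAA-dependent activity can be checked with simple interval arithmetic. The following Python sketch is illustrative only: the residue coordinates are those quoted above for bovine PAP, whereas the names `Region` and `overlap` are hypothetical helpers introduced here and do not come from any of the cited studies.

```python
from typing import NamedTuple, Optional, Tuple


class Region(NamedTuple):
    """A stretch of residues, 1-based and inclusive at both ends."""
    name: str
    start: int
    end: int


def overlap(a: Region, b: Region) -> Optional[Tuple[int, int]]:
    """Return the shared residue range of two regions, or None if disjoint."""
    lo, hi = max(a.start, b.start), min(a.end, b.end)
    return (lo, hi) if lo <= hi else None


# Coordinates as quoted in the text for bovine PAP.
CATALYTIC_ASPARTATES = (113, 115, 167)
C_RBS = Region("C-RBS (primer binding)", 488, 508)
SPECIFICITY = Region("AAUAAA-dependent activity", 493, 538)
C_TERMINAL_TAIL = Region("last 20 amino acids", 720, 739)

shared = overlap(C_RBS, SPECIFICITY)
if shared is not None:
    length = shared[1] - shared[0] + 1
    print(f"C-RBS and specificity region share residues "
          f"{shared[0]}-{shared[1]} ({length} residues)")
```

With these coordinates the two regions share residues 493 to 508, which is the overlap referred to in the text; the carboxyl-terminal 20 residues lie well outside both.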
(vi) Poly(A)-binding protein II. CPSF and PAP suffice for poly(A) addition to a precleaved RNA substrate. However, rapid elongation and control of poly(A) tail length require an additional factor, PAB II (44). The 33-kDa PAB II contains a very acidic amino-terminal domain, a very basic carboxyl-terminal domain, and a single RNP domain in its middle region. The protein tends to form oligomers and binds specifically to poly(A) and poly(G) (331, 474). In vitro assays indicate that PAB II binds directly to CPSF-30 (85). A Drosophila homolog is the product of the rox2 gene (60).
Yeast cells. Fractionation of whole-yeast-cell extracts has identified five functionally distinct activities involved in cleavage and polyadenylation (82, 241). CF IA, IB, and II are sufficient for the cleavage reaction, while specific poly(A) addition requires CF IA and CF IB, Pap1, Pab1, and PF I (Fig. 1). All these factors have been purified to near homogeneity, and genes encoding most of the components have been cloned (Table 3). All the genes that have been cloned are essential. While the polyadenylation signals used by mammals and yeast are rather different in consensus sequence and organization, the factors which comprise the cleavage/polyadenylation apparatus in these two organisms exhibit surprising conservation.
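The factor requirements for the two yeast reactions can be restated as set membership. The sketch below is a deliberately simplified bookkeeping model, assuming only the requirements given in the paragraph above; the function name `supported_reactions` and the treatment of each factor as simply present or absent are assumptions made for illustration, and the model says nothing about concentrations or partial activities.

```python
# Factor requirements as stated in the text for the reconstituted yeast reactions.
CLEAVAGE_REQUIRES = frozenset({"CF IA", "CF IB", "CF II"})
POLYA_REQUIRES = frozenset({"CF IA", "CF IB", "Pap1", "Pab1", "PF I"})


def supported_reactions(fractions):
    """Report which reactions a given mix of purified fractions could support.

    `fractions` is any iterable of factor names. This is a presence/absence
    model only (an assumption of this sketch).
    """
    present = set(fractions)
    return {
        "cleavage": CLEAVAGE_REQUIRES <= present,
        "poly(A) addition": POLYA_REQUIRES <= present,
    }


# Factors needed by both reactions.
SHARED = CLEAVAGE_REQUIRES & POLYA_REQUIRES

print(supported_reactions(["CF IA", "CF IB", "CF II"]))
print(sorted(SHARED))
```

In this model a mixture of CF IA, CF IB, and CF II supports cleavage but not poly(A) addition, and CF IA and CF IB are the only factors common to both lists.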
(i) Cleavage/polyadenylation factor IA. CF I was originally identified as an activity needed for both the cleavage and poly(A) addition reactions (82), and further purification separated it into two components, CF IA and CF IB (241). CF IA consists of four polypeptides, Rna14 and Rna15 (241, 311), Pcf11 (13), and a 50-kDa polypeptide (241, 369). The first indication of the involvement of Rna14 and Rna15 in poly(A) mRNA metabolism came from the dramatic poly(A) tail shortening seen in strains harboring temperature-sensitive mutations in the RNA14 and RNA15 genes (312). These mutations are synergistically lethal with mutations in the PAP1 gene (311). Extracts from rna14 and rna15 mutants are defective in both cleavage and poly(A) addition, and fraction complementation assays suggested that Rna14 and Rna15 were components of CF I (311). Purification of CF I showed that Rna14 and Rna15 are indeed CF IA subunits (241). These two polypeptides are tightly associated, as indicated by a two-hybrid assay (239) and by coimmunoprecipitation as a heterodimer from whole-cell extract or CF I-containing fractions by antibodies against Rna15 (241). The 76-kDa Rna14 has sequence homology to mammalian CstF-77 (24% identity) (448).
Rna15, the 38-kDa subunit, is the yeast homologue of CstF-64 (448). It contains an RNA recognition motif (RRM)-type RBD in its amino-terminal region and can be UV cross-linked to substrate RNA (241, 312, 448). Its RNA-binding site in the processing complex is not known, and the existing clues are somewhat conflicting. Alone or as part of CF IA, Rna15 cross-links equally well to wild-type RNA and mutant RNA lacking the AU-rich efficiency element (308). Interestingly, the RBD found in Rna15 closely resembles that of mammalian CstF-64, and both show higher affinity for U-rich sequences (449). However, RNAs selected by Rna15-RBD affinity do not interact with CstF-64, and they bear some similarity to the upstream AU-rich efficiency element (449). The quite distinct RNA-binding preferences of these two closely related RBDs are consistent with the divergence in polyadenylation signal sequences between yeast and mammals. Downstream of the RBD, Rna15 has a stretch of glutamines and asparagines, similar to the opa sequences of Drosophila developmental genes and to those of several transcriptional regulators (312). Pcf11 was identified as a 70-kDa protein which interacts with both Rna14 and Rna15 in a two-hybrid screen (13). The N terminus has some similarity to CTD-binding regions in other proteins such as Nrd1, a yeast heterogeneous nuclear RNP (hnRNP)-like protein which interacts with the mouse Pol II CTD in a two-hybrid assay and affects the elongation of Pol II transcripts containing an Nrd1-binding site (438, 439). Pcf11 also contains a striking stretch of 20 consecutive glutamine residues followed by the region responsible for the interaction with Rna14 and Rna15 (13). Extracts from pcf11 temperature-sensitive mutant strains are defective in both cleavage and poly(A) addition (13). Moreover, Pcf11-specific antibodies recognize the 70-kDa polypeptide of purified CF IA (240). Homology to Pcf11 has not been found in the mammalian system.
Microsequencing of internal peptides from the 50-kDa subunit revealed that it corresponded to elongation factor 1α (EF-1α), an essential GTP-binding protein functioning in translation elongation (240). Antibodies against Pcf11 bring down only the four subunits of CF IA from CF I-containing fractions, and the 50-kDa protein in the immunoprecipitate was recognized by EF-1α-specific antibodies (235, 514). However, a 50-kDa subunit has also been identified as a new protein called Clp1, which, like EF-1α, has a P-loop motif indicative of ATP-GTP binding (369). Thus, the identity of the 50-kDa subunit of CF IA needs to be further clarified.
(ii) Cleavage/polyadenylation factor IB. Purified CF IB is a single polypeptide of 73 kDa (241) encoded by the HRP1/NAB4 gene (239). This gene was previously identified as a suppressor of a temperature-sensitive npl3 allele, a gene encoding a protein which is involved in mRNA export (201) and can be cross-linked to nuclear poly(A)+ RNA in yeast (308). Hrp1 is structurally related to the mammalian hnRNP A, B, and D proteins (239) and has two RRMs in its middle region, both containing RNP1 and RNP2 sequences (201). The last 50 amino acids of Hrp1 are rich in arginine and glycine, with potential RGG methylation sites. A similar domain of hnRNP A1 can mediate protein-protein interactions (97). Experimental data indicate that the UA-rich polyadenylation signal is the likely binding site for Hrp1. The UV-induced RNA cross-linking of recombinant Hrp1 and the endogenous protein in yeast extracts is greatly enhanced by the presence of this sequence (83, 239). A recent SELEX analysis has also shown that UAUAUA is a high-affinity binding site for Hrp1 (464). The closest counterpart to Hrp1 in the mammalian system, at least in terms of having an amino-terminal RBD and a function in cleavage, is CF Im-68.
Recombinant Hrp1 can fully replace the yeast CF IB in both the reconstituted cleavage and poly(A) addition assays (239). Synergistic-lethal-interaction assays and two-hybrid analysis indicated that Hrp1 interacts in vivo with Rna14 and Rna15 but not with Pap1 (239), consistent with its copurification with CF IA. Hrp1 shuttles between the nucleus and the cytoplasm (239), a property facilitated by Hmt1-catalyzed arginine methylation (202, 423). A recent study has reported that Hrp1 is not essential for cleavage of pre-mRNAs and instead may regulate cleavage site utilization (308). In this study, pre-mRNA substrates were cleaved at additional sites as well as the normal cleavage site in the absence of Hrp1, and Hrp1 acted in a concentration-dependent manner to suppress the use of the alternate sites. This discrepancy may be attributable to relative concentrations of factors in the reconstituted assays, to differences in the composition of CF II used in the two studies, or to other experimental procedures.
(iii) Cleavage factor II. CF II has been purified by taking advantage of its ability to reconstitute the cleavage reaction in the presence of purified CF IA and CF IB (512). It contains four polypeptides, Cft1/Yhh1 (150 kDa) (442, 512), Cft2/Ydh1 (105 kDa), Brr5/Ysh1 (100 kDa), and Pta1 (90 kDa) (513). Cft1/Yhh1 was first identified by its sequence homology to mammalian CPSF-160 (24% identity and 51% similarity) (442). Depletion of extract with antibodies to Cft1/Yhh1 abolished both cleavage and poly(A) addition, consistent with CF II also being part of PF I (see below). However, addition of a CF II-containing fraction restored cleavage activity but not poly(A) addition (442). Cft1/Yhh1-specific antibodies recognize the 150-kDa component of purified CF II (512) and precipitate only four subunits (Cft1/Yhh1, Cft2/Ydh1, Brr5/Ysh1, and Pta1) from partially purified preparations (513).
Cft2/Ydh1, the 105-kDa subunit of CF II, has significant homology to CPSF-100 (24% identity and 43% similarity) (512). Cft2/Ydh1, as part of CF II, can be UV cross-linked to wild-type full-length pre-mRNA substrate but not to wild-type precleaved RNA or mutated substrate that lacks a (UA)6 efficiency element, suggesting that Cft2/Ydh1 may recognize the efficiency element and/or poly(A) site (512). The cross-linking of Cft2/Ydh1 to RNA is also ATP dependent. The third subunit (100 kDa) of CF II is Brr5/Ysh1 (512), a yeast homologue of mammalian CPSF-73 with 23% identity and 48% similarity through its entire length and 53% identity in the first 500 amino acids (78, 229). Brr5/Ysh1 was identified by this homology and as a mutation which gave a cold-sensitive defect in the in vivo splicing of mRNA. The smallest subunit of CF II has been recently identified as a protein encoded by PTA1, an essential gene affecting pre-tRNA processing (342, 513). In one report, extracts from brr5/ysh1 and pta1 mutant strains were shown to be deficient in poly(A) addition but not cleavage (78, 369), whereas a recent study found that these extracts were defective in both steps (513). The cleavage defect could be rescued by CF II. These discrepancies may be due to differences in extract preparation or culture conditions, which could influence the concentration or stability of proteins in the mutant extracts. A protein corresponding to Yth1, the yeast CPSF-30 homolog, was not detected by silver staining in the purified preparation of Zhao et al. (512). It has been reported that Yth1 can be detected by immunoblotting in partially purified CF II (unpublished results cited in reference 308) and that point mutations in the second zinc finger cause reduced cleavage activity in vitro (309). Interestingly, the entire CF II complex copurified with PF I (see below) (369), suggesting that CF II also plays a role in poly(A) addition. 
The involvement of CF II in both cleavage and poly(A) addition and the sequence homology of three of its subunits to those of CPSF support the idea that CF II is the functional homolog of this mammalian factor.
(iv) Polyadenylation factor I. PF I was originally identified as an activity which supported poly(A) addition but not cleavage (82). A multiprotein complex from yeast containing PF I activity has been recently purified by restoration of polyadenylation activity to extracts with mutated Fip1 subunit of PF I (369). In addition to Fip1, this complex contained Pap1 (see below), Yth1, all four subunits of CF II, and two uncharacterized proteins, Pfs1 (58 kDa) and Pfs2 (53 kDa). The PF I-Pap1-CF II association is also indicated by coimmunoprecipitation experiments with yeast whole-cell extracts, since all CF II components as well as Pap1 and Fip1 were precipitated specifically by antibodies against Pap1 or Fip1 (513). It is not clear whether CF II in extract and cells exists in two forms (free and associated with PF I-specific subunits) or whether it is separated from PF I activity only by chromatography.
Fip1 (for "Factor interacting with Pap1") was identified as a protein that interacts with Pap1 in a two-hybrid screen (371). It has a predicted molecular mass of 35 kDa but migrates as 55 kDa on sodium dodecyl sulfate-polyacrylamide gels. Fip1 has some sequence homology to CPSF-160 (329) and contains a very acidic amino terminus and a proline-rich region at the carboxyl terminus. This proline-rich domain is also found in several other proteins, including CstF-77 (448), the 70K U1 snRNP (379), and Pab1 (398). This domain in Fip1 is not required for cell viability (371), although truncation of Fip1 beyond the proline-rich region causes temperature-sensitive growth. Extract from this strain is defective in poly(A) addition but efficient in cleavage. Biochemical analysis with in vitro translation products demonstrated that Fip1 interacts directly with Pap1, Yth1, and Rna14, although the binding to Rna14 is weak (31, 371). Fip1 alone can alter the processivity of Pap1 (517). These properties suggest that it is the functional homolog of CPSF-160 in the specific poly(A) addition reaction and plays a central role in the assembly of the yeast polyadenylation machinery. Like Cft1/Yhh1 and Brr5/Ysh1, Yth1 (for "yeast thirty-kilodalton homolog") was also isolated by sequence similarity to a CPSF subunit. It has 40% identity and 60.5% similarity to CPSF-30 (31). However, Yth1 lacks the C-terminal zinc knuckle motifs found in the metazoan proteins (31). Yeast strains containing the yth1-1 mutant, which is truncated at its carboxyl terminus to leave only four zinc finger motifs, are temperature sensitive for growth. Extracts prepared from this mutant are normal in cleavage but deficient in poly(A) addition. While Yth1 in a highly purified PF I preparation cannot be detected by silver staining, its presence as a PF I subunit was confirmed by immunoblot analysis (369). 
The two additional proteins, p58 and p53, that copurified with PF I are encoded by the essential PFS1 and PFS2 genes (unpublished data cited in reference 238). Pfs1 has a zinc knuckle, and Pfs2 has WD-40 repeats.
(v) Poly(A) polymerase. Yeast Pap1 was the first factor of the yeast 3'-end processing machinery to be purified, and its gene, PAP1, was the first to be identified. PAP1 was cloned both by sequencing of the purified 64-kDa protein and by complementation of a temperature-sensitive pap1 allele (263, 351). The yeast and mammalian Pap1 proteins are 47% identical within the first 400 amino acids, a region thought to comprise the catalytic domain and include the nucleotidyltransferase active site (291) (Fig. 3). The carboxyl-terminal regions are not conserved. Monoclonal antibodies which recognize epitopes in the amino- and carboxyl-terminal regions of Pap1 do not recognize the mammalian enzyme (242). An RNA-binding site at the carboxyl-terminus (C-RBS) is thought to interact in part with the phosphate backbone of polynucleotides and is essential for processive activity (517). At least two other contacts with the RNA substrate exist in addition to C-RBS. While their location on the enzyme is not known, one is thought to recognize the last 3 nucleotides of the RNA primer and to help the enzyme discriminate against deoxyribonucleotide substrates, and another base-specific site is proposed to interact with the primer 12 to 14 nucleotides upstream of the 3' end (517).
Mutational analysis has identified two specificity domains, SpD1 and SpD2, located at either end of Pap1 as indicated in Fig. 3 (517, 518). These are probably necessary to recruit Pap1 to the polyadenylation machinery by interaction with specificity factors and to regulate its activity. In agreement with this idea, deletion of SpD1 eliminates the activity of the enzyme in association with the polyadenylation factors but has no effect on its ability to extend an oligo(A) primer. Furthermore, both SpD domains interact with the Fip1 subunit of PF I by two-hybrid analysis. Interestingly, SpD2 partially overlaps with the C-RBS. The presence of recombinant Fip1 increases the Km of Pap1 for RNA 50-fold, prevents the cross-linking of Pap1 to RNA, and results in a shift to a distributive mode of action, consistent with a direct interaction of Fip1 at SpD2 (517). It is interesting that a specificity domain of the mammalian PAP also overlaps with a carboxyl-terminal primer-binding site (291, 460). Unlike the mammalian system, the yeast Pap1 is not required for efficient cleavage of precursor in vitro. However, a mutation in Pap1 conferring temperature-sensitive growth can influence the choice of poly(A) site in the ACT1 transcript (284).
(vi) Poly(A)-binding protein I and poly(A) nuclease. Yeast Pab1 is the major RNP associated with the poly(A) tails of mRNA in both the nucleus and cytoplasm (3, 420, 446). Amino acid sequence analysis of this 70-kDa polypeptide revealed four RRM-type RBDs at its amino-terminal region and a proline-rich carboxyl terminus (67). Pab1 is important for translation initiation (456-458) and for deadenylation-dependent mRNA turnover (71). Recent studies have found that it is also involved in mRNA poly(A) tail formation. Pab1 cofractionates with CF IA (310) and interacts with Rna15 in two-hybrid assays and by coimmunoprecipitation (12).
Cells bearing a pab1 mutation conferring temperature-sensitive growth show aberrantly long poly(A) tails in vivo (400) and in vitro (12, 310). Addition of anti-Pab1 antibodies to a processing extract results in an elongated poly(A) tail but has no effect on the cleavage reaction (12). In a reconstituted poly(A) addition assay, Pab1 was further confirmed to function in limiting the length of the poly(A) tail but was not required for cleavage (239).
Pab1 acts in concert with a poly(A)-specific nuclease (PAN) to affect the poly(A) tail length (120, 274). PAN is composed of at least two subunits, Pan2 (127 kDa) and Pan3 (76 kDa), both encoded by nonessential genes (50, 64). Deletion of the PAN2 and/or PAN3 gene resulted in a similar increase in mRNA poly(A) tail lengths in vivo and in the loss of Pab1-stimulated PAN activity in yeast extracts (50, 64). Pan2 and Pan3 directly interact with each other, as shown by coimmunoprecipitation and two-hybrid analysis (64). Pan2, which interacts with Pab1 (50), is likely to be the catalytic subunit of the PAN complex, since it is a member of the RNase D family of 3'-5' exoribonucleases (306, 321). The proline-rich C terminus of Pab1 is necessary for PAN activity in vitro (285). Recently identified deadenylating nucleases from Xenopus and human also belong to the RNase D family (249). Deadenylating nuclease is localized in both the nucleus and cytoplasm and, unlike PAN, is inhibited by PAB I, the mammalian homolog of the yeast Pab1. A recent study with polyadenylation extracts found that PAN matures newly synthesized poly(A) tails to defined poly(A) tail lengths of 50 to 90 nucleotides (63). In vivo, this process is rapid and appears to precede translation and mRNA degradation. However, it is not clear whether PAN-dependent deadenylation occurs in the nucleus as an integral step of the 3'-end-processing reaction or, instead, as an early cytoplasmic mRNA maturation event. Pab1 appears to play two roles in regulating tail length: (i) suppression of the activity of Pap1 by limiting its access to the RNA substrate (517) and (ii) recruitment of PAN. These negative effects of Pab1 are somewhat alleviated by Pbp1, a protein isolated by two-hybrid analysis as a Pab1-interactor (285). 
In extracts from a strain lacking Pbp1, the tails are shorter, and its proposed function is to help regulate poly(A) tail length by suppressing the activity of Pab1 or perhaps the association of PAN with Pab1.
(vii) Proteins with possible auxiliary or regulatory roles in yeast polyadenylation. Other proteins which are not components of constitutive polyadenylation factors can influence the efficiency or accuracy of processing in yeast. Some of these gene products suggest interactions of polyadenylation with other cell processes or ways of regulating the activity or specificity of the reaction. For example, temperature-sensitive mutations in the essential ESS1/PTF1 gene lead to increased readthrough of certain poly(A) sites and a decrease in the level of total poly(A)+ RNA in the cell (187). This gene encodes a peptidylprolyl-cis/trans-isomerase (PPIase), an enzyme thought to function in protein folding. While in vitro work is necessary to localize the defect to transcription termination or cleavage/polyadenylation, the authors noted that PPIases associate with the phosphorylated C-terminal domain of mammalian RNA pol II (55) and with the U4/U6.U5 tri-snRNP splicing factor (209, 459) and are proposed to promote disassociation of the HIV-1 capsid core (56). These findings led to speculation that the yeast PPIase activity might be involved in the assembly of polyadenylation or transcription complexes or their disassembly at termination sites, a topic discussed later in this review.
Ref2, an RNA-binding protein, is important for the efficient processing of weak poly(A) sites (393). Disruption of the gene encoding Fir1/Pip1 (122, 394), a nonessential protein which interacts with many nuclear proteins such as Pap1, Ref2, Sir4, topoisomerase 1, and lamin, exacerbates the defective processing seen in cells lacking Ref2 (394). The yeast Pap1 was found to interact specifically with two essential proteins, Ufd1 and Uba2, by two-hybrid analysis in vivo and by coimmunoprecipitation in vitro (122). Ufd1 has been implicated in ubiquitin-mediated protein degradation (230), and Uba2 is involved in the conjugation of the ubiquitin-like protein, Smt3, to other proteins (231). Depletion of these proteins from cells affects the efficiency of processing in vitro (122). For another class of proteins, a direct connection to polyadenylation is less obvious. Mutations in several proteins involved in nucleocytoplasmic transport cause the appearance of longer poly(A) tails and transcripts which extend beyond the normal poly(A) sites, and these are discussed below. A deletion of SSM4, a gene of unknown function (283), or overexpression of SSM1, encoding a ribosomal protein (360), can suppress temperature-sensitive mutations in RNA14. The SSM4 deletion also restores proper ACT1 poly(A) site selection in an rna14-3 mutant (284). STS1, a gene implicated in nuclear targeting, protein transport, rRNA stability, and chromosome segregation, suppresses the mRNA-processing defect of rna15-2 by restoring normal levels of the Rna15 protein (11). Mutations in RET1, encoding an RNA pol III subunit, or RRP6, a gene important for 5.8S rRNA 3'-end formation, can partially rescue the growth defect of a temperature-sensitive pap1-1 mutation (61, 62), and a mutation in LCP5, a gene involved in 18S rRNA maturation, is synthetically lethal in combination with the pap1-7 allele (490). The mechanism by which these effects are mediated is not clear. 
Some of the mutations may simply make the cell less dependent on a poly(A) tail for translation or may worsen translation problems due to polyadenylation defects, or they may affect polyadenylation indirectly by affecting nuclear import and export of processing factors and mature mRNA. A 5'-3' exonucleolytic activity degrades the 3'-cleavage products of both the yeast and mammalian reactions. However, this activity is not required for in vitro processing of polyadenylation precursor, since it is not present when the reaction is reconstituted from purified factors. Possible additional roles in transcription termination (discussed below) or in recycling polyadenylation factors by destroying their binding sites have not been investigated. The identity of this exonuclease is not known. The activity is magnesium dependent, and in yeast it may be provided by Rat1, a nuclear exoribonuclease which has been implicated in mRNA transport (10).
Steps in Processing: Assembly, Cleavage, and Polyadenylation
Mammals. The mutually cooperative interactions of CPSF, CstF, CF Im, PAP, and PAB II in catalyzing accurate and efficient cleavage and polyadenylation have been well documented (167, 329, 388, 471, 499; for a review, see reference 473). The following model of mRNA 3'-end formation in mammalian cells (Fig. 4A) is derived from these numerous studies. The initiating step in assembly of a functional cleavage/polyadenylation complex is probably the recognition of signals on the precursor by CPSF and CstF in a process assisted by CF Im. CPSF binds to AAUAAA through CPSF-160 with the help of CPSF-30 and possibly CPSF-100, and CstF binds to the DSE via CstF-64. The individual interactions of CPSF and CstF with their cognate sequences are weak but are stabilized by the cross-factor interaction of CPSF-160 and CstF-77. A final component of the initiation complex is pol II (205). While it is not known when the precise cleavage site is chosen, the CPSF-CstF interactions define the region in which it must lie. The formation of a cleavage-competent complex requires the additional recruitment of CF IIm and PAP. The contacts of CF Im and CF IIm with the other factors and RNA are not known, but PAP at this point probably interacts with CPSF-160.
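The idea that the paired CPSF and CstF contacts bracket the region in which the cleavage site must lie can be illustrated with a simple sequence scan. The Python sketch below is a toy, not a validated poly(A) site predictor: it looks for the AAUAAA hexamer and then for the first U-rich window downstream of it. The window width, the uridine threshold, the maximum search distance (`max_gap`), and the function names are all arbitrary assumptions chosen for illustration and are not values taken from the studies cited here.

```python
import re

HEXAMER = "AAUAAA"  # CPSF recognition motif named in the text


def u_rich_windows(rna, start, end, width=5, min_u=4):
    """Return start indices of windows holding at least `min_u` uridines.

    `width` and `min_u` are illustrative thresholds (assumptions).
    """
    last = min(end, len(rna)) - width + 1
    return [i for i in range(start, last)
            if rna[i:i + width].count("U") >= min_u]


def candidate_sites(rna, max_gap=40):
    """Pair each AAUAAA with the first U-rich window found downstream.

    Returns (hexamer_start, dse_start) tuples, 0-based. The cleavage site
    would be expected somewhere between the two elements.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input
    pairs = []
    for match in re.finditer("(?=" + HEXAMER + ")", rna):
        hex_end = match.start() + len(HEXAMER)
        dse = u_rich_windows(rna, hex_end, hex_end + max_gap)
        if dse:
            pairs.append((match.start(), dse[0]))
    return pairs


example = "GGCC" + "AAUAAA" + "GC" * 10 + "UUUUU" + "GGCC"
print(candidate_sites(example))
```

A hexamer with no U-rich window within the search distance is not reported, which loosely mirrors the point made above that the individual CPSF and CstF interactions are weak on their own and depend on each other for stable binding.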
Yeast. Since many of the yeast factors have only recently been characterized, less is known about how they interact with the RNA precursor and with each other. The available data and the homology to mammalian factors do allow us to construct a working model for assembly and processing in this organism (Fig. 4B). By analogy to the mammalian system, CF II and CF IA are probably prime players in the initiation of complex assembly. Their binding sites are not known, but from the structure of the core polyadenylation site in yeast, they are most probably located upstream of the cleavage site and may correspond to the UA-rich efficiency element and the A-rich positioning element. Extrapolating from the similarity of the best A-rich motifs (AAAAAA and AAAUAA) to the CPSF-binding site, we have placed CF II at this position. However, the cross-linking of Cft2 from highly purified CF II is dependent on the UA-rich sequence and the sequence at or downstream of the cleavage site, a finding which is not in accord with this model. Rna15 of CF IA shows some preference for U-rich RNAs, and CF IA may participate in recognition of the UA-rich motif or U tracts flanking the cleavage site. In yeast, Hrp1 (CF IB) appears to stabilize the assembly of the cleavage complex at the authentic poly(A) site (308). Recent findings suggest that it interacts with the UA-rich element as well (83, 239, 464) and, in doing so, can prevent the cross-linking of Cft2/Ydh1. The positioning of factors in the model is obviously provisional, and further experiments must be done to establish the architecture of the processing complex. A possible sequence of events is that CF II initially identifies the poly(A) site by interacting with both the UA-rich and A-rich elements and that the subsequent recruitment of Hrp1 and CF IA results in a reorganization of these contacts. 
Moreover, the prevalence of adenosine and uridine residues in the yeast 3' untranslated sequence may provide abundant contact points for both Rna15 and Hrp1 if they are generally positioned in the vicinity of the cleavage site by CF II.
CF IA, Hrp1, and CF II are sufficient for accurate cleavage in vitro (239). Polyadenylation requires CF IA, Hrp1, Pap1, Pab1, and PF I (239). PF I is a complex of CF II plus PF I-specific subunits (369). The Yth1 and Pfs1 subunits of PF I probably provide additional contacts with the cleaved RNA which are not shown in the model in Fig. 4B. Direct interaction with the Fip1 subunit of PF I incorporates the catalytic subunit, Pap1, into the polyadenylation holoenzyme (371). Like the mammalian PAB II, Pab1 is necessary to limit the size of the poly(A) tail (12, 239, 310), probably helped in yeast by the opposing action of PAN (64). Findings such as the copurification of Pab1 with CF I over many columns (239, 310) and the depletion of CF I activity by immunoprecipitation of Pap1 from extracts (242) suggest that in the cell, all of the factors may be preassembled into a cleavage/polyadenylation complex which can be dissected apart in vitro. The redundancy of signals at many yeast poly(A) sites suggests a need for proteins which interact with each of these elements and contribute to the stability of the processing complex. RNA-binding proteins which have been implicated directly or indirectly in 3'-end formation in yeast but are not part of t