Department of Genetics and Developmental Biology, University of Connecticut Health Center, Farmington, Connecticut
SUMMARY INTRODUCTION ORIGINS AND PREVALENCE OF ENDOGENOUS dsRNA FATE OF dsRNA IN THE CYTOPLASM Long Cytoplasmic dsRNAs Trigger Nonspecific and Global Effects Interferons. dsRNA-activated protein kinase PKR. 2',5'-AS/RNase L. TLR3-mediated dsRNA response. Other response pathways to long dsRNA Long Cytoplasmic dsRNAs Might Have Sequence-Specific Effects Short Cytoplasmic dsRNAs Elicit Specific Gene Silencing Molecular mechanism of RNAi. Dicer cleaves long dsRNAs into siRNAs. RISC complex. RNAi may be intimately connected to translation. Regulation of RNAi. Persistence and spreading of the RNAi response. Micro-RNAs Are Related to siRNAs and May Use a Similar Pathway Applications of RNAi and siRNA Technology Specificity of siRNA FATE OF dsRNA IN THE NUCLEUS Adenosine Deaminases That Act on RNA Long nuclear dsRNAs are promiscuously edited by ADAR. ADARs can also edit RNAs in a site-selective manner. ADAR structure and function. Biological importance of ADARs. Possible connections between ADARs, PKR, and RNAi. RNAi Machinery in the Nucleus Role of the RNAi machinery in the establishment of heterochromatin. Role of dsRNA in imprinting and X chromosome inactivation. Connection between RNAi and RNA Editing DNA Elimination in Tetrahymena thermophila CONCLUSIONS ACKNOWLEDGMENTS REFERENCES
| SUMMARY |
|---|
| INTRODUCTION |
|---|
Since double-stranded RNA (dsRNA) has not until recently generally been thought to be deliberately expressed in cells, it has commonly been assumed that the major source of cellular dsRNA is viral infections. In this view, the cellular responses to dsRNA would be natural and perhaps ancient antiviral responses. While the cell may certainly react to some dsRNAs as an antiviral response, this does not represent the only response or even perhaps the major one. A number of recent observations have pointed to the possibility that dsRNA molecules are not only seen as evidence of viral infection or recognized for degradation because they cannot be translated. In some instances they may also play important roles in normal cell growth and function. The purpose of this review is to outline our current understanding of the fate of dsRNA in cells, with a focus on the apparent fact that their fates and functions appear to depend critically on not only where in the cell dsRNA molecules are found, but also on how long they are and perhaps on how abundant they are.
| ORIGINS AND PREVALENCE OF ENDOGENOUS dsRNA |
|---|
In recent years this picture has become clearer, and important advances have been made in our understanding not only of the extent and nature of antisense RNA expression within cells, but also of the cellular fates of dsRNAs. Until several years ago, while there were a number of reports of naturally occurring antisense RNA in cells of higher eukaryotes, the number of documented cases was relatively small (188). Unexpectedly, however, more recent results have shown that endogenous antisense RNA is rather common. Several groups have suggested that about 1% of all human genes might be transcribed from both strands (86, 207, 312, 344). Most recently, Yelin et al. (423) and Rosok and Sioud (315) used novel computational tools and expressed sequence tag datasets, along with experimental validation studies, to show that the true number may in fact be far higher; at least 5 to 10% of them are impacted by antisense. This antisense frequently lies in the 5' or 3' untranslated region of mRNAs.
Antisense expression to 3' untranslated regions of mRNAs may indeed turn out to be important for the regulation of some gene expression. Lipman (215) observed that the 3' untranslated regions of about 30% of vertebrate mRNAs are conserved. Why is this so? The purpose of such sequence conservation seems unlikely to reflect a need for numerous highly specific protein-RNA interactions. Rather, an attractive interpretation was that long stretches of conserved 3' untranslated region sequences may function in RNA-RNA interactions (215). Taken together with the recent findings of common antisense transcription, this may further point to a heretofore unappreciated and pervasive role for antisense regulation within the cell.
Importantly, the estimate that about 8% of genes express natural antisense RNA in cells may still be too low. In the work reported by Yelin et al. (423), only polyadenylated RNAs were examined. Some antisense is almost certainly not polyadenylated. Also, not all transcribed sequences were available for analysis by the methods employed, trans-encoded antisense transcripts were not examined, and antisense RNAs that do not span introns were not included. This work also did not address what may be a large number of small noncoding antisense RNAs (169) or dsRNA resulting from bidirectional transcription from repetitive and transposable elements, which constitute almost half of the entire human genome (59). For example, LINEs (long interspersed nuclear elements) are non-long terminal repeat retrotransposons that are present in about 10^5 copies in mammalian genomes, constituting about 17% of genomic DNA (198). Most of these (>99.8%) are defective owing to rearrangements, truncations, and mutations (141, 198). A large amount of RNA in cells is related to LINE elements, and a significant fraction of this RNA is of the antisense orientation (27, 232). One might thus reasonably expect there to be significant amounts of cellular dsRNA related to many or most other repetitive and transposable elements.
The message from all of this is that dsRNA is frequently formed in cells. But what happens to it? Its fate turns out to depend critically on at least two parameters: how long it is, and where it is. In the cytoplasm, dsRNAs longer than 30 bp (called long dsRNAs throughout this article) activate the potent interferon and protein kinase R (PKR) antiviral pathways, resulting in non-sequence-specific effects that can include apoptosis (188). On the other hand, exciting recent work has shown that RNA duplexes of only 21 to 25 bp (short dsRNAs) in the cytoplasm can enter the sequence-specific RNA interference (RNAi) pathway, where they mediate the destruction of targeted mRNAs (123). In the nucleus, many or most long dsRNAs are edited by adenosine deaminases that act on double-stranded RNA (ADARs). Finally, several lines of evidence suggest that nuclear dsRNAs can also lead to gene silencing and heterochromatin formation in an epigenetic, sequence-independent fashion.
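The length and location rules in the preceding paragraph can be summarized as a small lookup. The following is only an illustrative sketch: the function and enum names are hypothetical, the thresholds are the ones quoted above (longer than 30 bp; 21 to 25 bp), and any case the text does not address is reported as unspecified rather than guessed.

```python
from enum import Enum


class Compartment(Enum):
    CYTOPLASM = "cytoplasm"
    NUCLEUS = "nucleus"


def predicted_fate(length_bp: int, compartment: Compartment) -> str:
    """Return the pathway the text associates with a duplex of this size and location."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    is_long = length_bp > 30  # "long dsRNA" as defined in the text
    if compartment is Compartment.CYTOPLASM:
        if is_long:
            return "interferon/PKR response (non-sequence-specific)"
        if 21 <= length_bp <= 25:
            return "RNAi (sequence-specific mRNA destruction)"
        return "not specified in the text"
    # Nuclear dsRNA: the text says many or most long duplexes are edited by ADARs.
    if is_long:
        return "ADAR editing"
    return "not specified in the text"
```

For example, `predicted_fate(22, Compartment.CYTOPLASM)` reports the RNAi pathway, while a 500-bp duplex in the same compartment is assigned to the interferon/PKR response. Real outcomes also depend on abundance and other factors discussed later, which this sketch ignores.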
| FATE OF dsRNA IN THE CYTOPLASM |
|---|
dsRNA-activated protein kinase PKR. PKR is a central player in the cytoplasmic response to dsRNA. PKR is an interferon-inducible, dsRNA-activated Ser/Thr protein kinase (49, 299). This enzyme is normally present only at low levels in cells and exists in an unphosphorylated, inactive form (138, 320). In interferon-treated cells, PKR is found predominantly in the cytoplasm and associated with ribosomes (288). However, a fraction of PKR is also found in the nucleus, primarily in the nucleoli (153, 157, 188), which suggests that PKR has multiple functions in cells, some of which are yet to be identified. In fact, it has been reported that a structured RNA element in the 3' untranslated region of the tumor necrosis factor alpha mRNA binds PKR in the nucleus and that this interaction regulates the splicing of this message (273).
PKR contains two dsRNA-binding domains and a kinase domain (246). Figure 2 shows the essential domain structure of this protein, along with that of some other important cellular dsRNA-binding proteins that will be discussed here. In vitro, PKR has been shown to be activated by binding to RNAs containing extensive duplex secondary structures (228). When dsRNAs enter cells or are produced in the cytoplasm, they may bind to PKR and induce dimerization, which results in a conformational change, the unmasking of its catalytic domain, and autophosphorylation (Fig. 1). The activation of PKR is independent of the sequence of dsRNAs but depends on both their concentration and their length. PKR is activated by low concentrations of dsRNAs but inhibited by higher concentrations (322). Manche et al. (228) showed that while a duplex sequence of 11 bp could bind to PKR, 33 bp was the minimum length for activation, and maximal activation was achieved with 80 bp. Furthermore, simultaneous binding of dsRNA to both dsRNA-binding domains of PKR is required for its activation (356). This conclusion, however, needs to be tempered by the observation that at least some short dsRNAs can also induce the PKR pathway in a concentration-dependent manner (see below, in the RNAi section).
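The length dependence reported by Manche et al. can be written down as a simple threshold function. This is a hedged sketch, not a quantitative model: the names are hypothetical, the three cut-offs (11, 33, and 80 bp) are the values quoted above, and the labels for duplexes shorter than 11 bp and between 33 and 80 bp are assumptions about how to describe ranges the text does not characterize. It deliberately ignores the concentration dependence and the short-dsRNA exception noted in the paragraph.

```python
BIND_MIN_BP = 11      # shortest duplex reported to bind PKR
ACTIVATE_MIN_BP = 33  # minimum duplex length reported to activate PKR
MAXIMAL_BP = 80       # length at which maximal activation was reported


def pkr_response(length_bp: int) -> str:
    """Classify a duplex length against the thresholds quoted in the text."""
    if length_bp < BIND_MIN_BP:
        return "binding not reported"          # assumption: text is silent below 11 bp
    if length_bp < ACTIVATE_MIN_BP:
        return "binds without activating"
    if length_bp < MAXIMAL_BP:
        return "activates (submaximal)"        # assumption: intermediate range
    return "maximal activation"
```

A 21-bp duplex, for instance, falls in the "binds without activating" range under these thresholds, which is one reason short duplexes were initially expected to avoid the PKR response.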
Once activated, PKR phosphorylates a number of substrates, including the α subunit of eukaryotic initiation factor 2 (eIF2α) (324), the transcription factor inhibitor IκB (186), the human immunodeficiency virus Tat protein (243), nuclear factor 90 of activated T cells, NFAT-90 (199), and the M-phase-specific dsRNA-binding phosphoprotein MPP4 (283). Phosphorylation of eIF2α has important consequences for cellular translation. Phosphorylated eIF2α binds to eIF2B very strongly, which impairs the eIF2B-catalyzed guanine nucleotide exchange reaction, resulting in inhibition of protein synthesis (Fig. 1) (50, 322).
Activated PKR can also mediate signal transduction in response to dsRNAs (299). It can phosphorylate IκB, releasing it from the transcription factor NF-κB, which can now be translocated to the nucleus, where it activates the expression of genes having NF-κB binding sites. These genes include beta interferon (369), Fas (70), p53 (55), Bax (103), and others (186, 187, 409). PKR has also been shown to influence the activity of the transcription factors STAT1 and IRF-1 (187, 408, 412), but the mechanism of this activation is still unclear.
Finally, there is evidence that dsRNAs can trigger apoptosis through PKR (103, 108, 208). PKR can activate apoptotic gene expression and induce apoptosis by activation of the Fas-associated death domain/caspase 8 pathway (12) or of caspase 9 (104). Vorburger et al. (395) reported that PKR also plays a role in E2F-1-mediated apoptosis. These authors further showed that PKR-null mouse embryo fibroblasts demonstrated significant resistance to E2F-1-induced apoptosis. Intriguingly, mice expressing PKR without its dsRNA-binding domains were sensitive to virus-induced apoptosis, while mice expressing PKR lacking its catalytic domain were not (13). Recent evidence suggests that activation of the c-Jun NH2-terminal kinases (JNK) family of mitogen-activated protein kinases and RNase L are also part of this apoptotic pathway (208). These results all serve to illustrate the complexity and diversity of effects that dsRNAs, through interaction with PKR, can exert on cells.
2',5'-AS/RNase L. In addition to the PKR pathway, the 2',5'-oligoadenylate synthetase (2',5'-AS)/RNase L pathway responds to dsRNA (Fig. 1). 2',5'-AS is upregulated and activated in interferon-treated and virus-infected cells (173). 2',5'-AS is activated upon binding to dsRNA, and RNA duplexes of at least 70 bp appear to be required for this activation (188, 248). The recent crystal structure of the porcine 2',5'-AS reveals two noncontiguous domains which assemble to form an interface for dsRNA binding. After dsRNA binding, a conformational change leads to enzyme activation (124). Activated 2',5'-AS is capable of polymerizing ATP and other nucleotides into products having novel 2'-5' linkages.
RNase L, a widely expressed cytoplasmic endoribonuclease, dimerizes and is activated upon binding the 2',5'-linked oligoadenylates synthesized by 2',5'-AS (69). Activated RNase L catalyzes the degradation of viral and cellular RNAs (211), including 18S and 28S rRNAs and mRNAs, thus inhibiting protein synthesis (126, 143). RNase L negatively regulates PKR expression and activity and might cleave PKR mRNA. It has been shown that the absence of RNase L leads to selective stabilization of PKR mRNA, extensive eIF2α phosphorylation, and inhibition of viral protein synthesis (177). Activated RNase L can also induce apoptosis (39, 208). However, Martinand et al. showed that dsRNA can also induce an RNase L inhibitor, which blocks the binding of these oligoadenylates to RNase L (233). These observations suggest an additional level of regulation of this cellular response to dsRNAs that remains to be clarified.
TLR3-mediated dsRNA response. Toll-like receptors (TLRs) are a family of innate immune recognition cell surface receptors that recognize a variety of microbial nucleic acid derivatives and metabolites to induce antimicrobial immune responses. Ten human family members have been identified so far. Each TLR family member recognizes distinct pathogen-derived ligands (272, 380). One TLR family member, TLR3, recognizes dsRNA and can induce an antiviral response by the activation of NF-κB and the induction of beta interferon (2, 236). Human TLR3 is expressed in most human tissues, including dendritic cells and intestinal epithelial cells (235). Although TLR3 is a type I transmembrane protein, some (perhaps a different isoform) might exist within the cytoplasm. Although TLR3 lacks an apparent dsRNA-binding domain, it recognizes very specific structural features in dsRNA, because dsDNA, poly(dI:dC), and the single-stranded RNAs poly(rU) and poly(rC) all fail to induce the TLR3-mediated signaling pathway (236). Intracellular TLR3 can recognize some mRNAs, likely through secondary structure features (167). After TLR3 binds to dsRNA, tyrosine phosphorylation occurs in the intracellular Toll/interleukin-1 receptor (TIR) domain, which is essential for downstream signaling (326) via interaction with a number of adaptor proteins (235, 247).
Other response pathways to long dsRNA. It remains possible that other pathways exist by which long cytoplasmic dsRNAs exert effects on cells, but these pathways have not been characterized in molecular detail. For example, dsRNAs appear to be able to directly bind to and inactivate eIF-2, but this effect requires rather high intracellular concentrations of dsRNA (162, 188). In addition, dsRNAs can activate the p38 mitogen-activated protein kinase and JNK pathways, perhaps independently of the PKR or 2',5'-AS/RNase L system (143).
Both immunofluorescence studies and cellular fractionation showed that ADAR1-L is localized in both the cytoplasm and the nucleus, while constitutively expressed ADAR1-S is predominantly present in the nucleus (284). Owing to its cytoplasmic location and its ability to recognize and deaminate dsRNAs, it has been suggested that the cytoplasmic ADAR1-L may play a role in antiviral defense against viruses that replicate in the cytoplasm (18). While no direct evidence for such an activity has yet been reported, it has recently been shown that ADAR1-L activity is especially high in the cytoplasm (414). Thus, a potential important role of ADAR1-L in the cytoplasmic response to long dsRNA cannot be ruled out.
RNAi was first discovered, almost accidentally, in C. elegans by Fire and Mello, who observed that introducing a mixture of sense and antisense RNAs into adult nematodes led to substantially more effective gene silencing than introduction of either strand alone (92, 361). Related phenomena known as cosuppression in many species of plants (264), quelling in the fungus Neurospora crassa (314), and posttranscriptional gene silencing in plants (185, 210, 239, 292, 383, 385, 386, 405) have been described. dsRNA also causes specific gene silencing in Trypanosoma brucei (265), the hydra (220), zebrafish (402), frogs (270), and the slime mold Dictyostelium discoideum (230). Importantly, RNAi has also been observed in Drosophila melanogaster, cultured mammalian cells, mouse embryos, and even adult mice and rats (26, 32, 34, 42, 81, 119, 127, 171, 240, 416, 425). This broad conservation suggests that RNAi is an ancient and general mechanism for gene regulation which might have evolved to have both developmental and antiviral roles.
With the exception of recent work suggesting a role in nuclear gene silencing via heterochromatin induction (see below), essentially all available data are consistent with RNAi's acting primarily or exclusively in the cytoplasm (for example, see reference 429); however, possible RNAi effects in the nucleus have been reported (25, 212). Most of the RNAi machinery in the cell is located primarily in the cytoplasm, and many laboratories have observed that RNAi-mediated gene silencing is more successful when targeting open reading frames or sequences within spliced mRNAs rather than intronic sequences or transcriptional promoter elements.
Molecular mechanism of RNAi. Although intensive biochemical and genetic studies have been carried out on RNAi during the past several years, its detailed mechanism of action has remained elusive. Figure 3 summarizes the key steps of the RNAi pathway as they are now understood, and Table 1 lists some of the key RNAi components that have been described in a variety of model organisms.
Dicer cleaves long dsRNAs into siRNAs. The initial cleavage reaction that produces siRNAs (but not the subsequent cleavage of mRNA targets) is mediated by a multidomain RNase III family enzyme, Dicer (21, 269, 362; reviewed in references 36 and 196). Dicer is a large protein of 220 kDa and contains an N-terminal DExH/DEAH RNA helicase motif/ATPase domain, a PAZ protein-protein interaction domain, two RNase III-like catalytic domains, and a C-terminal dsRNA-binding domain (300) (Fig. 2). This enzyme is relatively well conserved in eukaryotes and has been found both in the nucleus and in the cytoplasm (23). So far, it has been identified in Arabidopsis, Dictyostelium, the fission yeast Schizosaccharomyces pombe, C. elegans, Drosophila melanogaster, and mammalian cells. It has not, however, been found in the budding yeast Saccharomyces cerevisiae. While most organisms have a single identifiable Dicer gene, D. melanogaster has two (dcr-1 and dcr-2), and Arabidopsis has four (331). So far, most biochemical characterizations have been carried out with the Drosophila enzymes, which might have distinct but complementary activities (145, 205, 375). dcr-1 is essential for the processing of mature micro-RNAs (another class of small noncoding RNA; see below), while dcr-2 is essential for the production of functional siRNAs. Consistent with its homology to RNase III class enzymes, Dicer cleaves dsRNAs into 21- to 23-bp double-stranded siRNAs which possess two-nucleotide 3' extensions and phosphates at the 5' ends.
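The geometry of the Dicer products described above (roughly 21-nucleotide strands with two-nucleotide 3' extensions) can be illustrated with a toy "dicing" routine. This is a deliberately simplified sketch under stated assumptions: the duplex is perfectly base paired and represented only by its sense strand, every product is exactly 21 nucleotides, cuts are phased from a fixed offset, and terminal fragments too short to yield a full duplex are discarded. The names `SiRNA`, `dice`, and `reverse_complement` are hypothetical and do not describe how the enzyme actually measures its substrate.

```python
from dataclasses import dataclass
from typing import List

_COMPLEMENT = str.maketrans("ACGU", "UGCA")
SIRNA_LEN = 21  # nucleotides per strand (the text gives a 21 to 23 range)
OVERHANG = 2    # unpaired nucleotides at each 3' end


def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA string (5'->3')."""
    return rna.translate(_COMPLEMENT)[::-1]


@dataclass
class SiRNA:
    sense: str      # 5'->3'
    antisense: str  # 5'->3'


def dice(sense_strand: str) -> List[SiRNA]:
    """Cut a perfect duplex into phased 21-nt duplexes with 2-nt 3' overhangs."""
    duplexes = []
    for i in range(OVERHANG, len(sense_strand) - SIRNA_LEN + 1, SIRNA_LEN):
        sense = sense_strand[i:i + SIRNA_LEN]
        # The antisense cut is staggered by two positions, so it pairs with
        # sense positions i .. i+18 and overhangs past the sense 5' end.
        paired_region = sense_strand[i - OVERHANG:i + SIRNA_LEN - OVERHANG]
        duplexes.append(SiRNA(sense, reverse_complement(paired_region)))
    return duplexes
```

In each product, the first 19 nucleotides of the two strands are mutually complementary and the last two nucleotides of each strand are left unpaired, which is the signature the text attributes to RNase III-type cleavage.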
Dicer function appears to be important for normal development. Mutations in the plant homologue of Dicer caused developmental abnormalities and infertility (107, 142, 149). Similarly, deletion of Dicer in C. elegans led to sterility (174, 183). Drosophila dcr-2 null mutants are viable, fertile, and morphologically normal but have a severe RNAi defect, while dcr-1 mutants displayed no apparent RNAi defect but exhibited morphological defects. This is consistent with a model in which dcr-1 is essential for the micro-RNA pathway while dcr-2 is essential for the RNAi pathway (205). Recently, genetic deletion of the Dicer gene in fission yeast caused defects in chromosome segregation (115). These results further pointed to a nuclear function for this enzyme, as would also be inferred from its proposed activity in processing micro-RNAs, which are generated from precursors transcribed in the nucleus (see below). Recent reports indicate that Dicer is also essential for mouse (22) and zebrafish (406) development.
RISC complex. The composition of RISCs has still not been completely determined and remains somewhat controversial. Biochemical analyses in a number of systems have identified numerous components (37, 40, 41, 67, 85, 120, 144, 216, 253, 269, 291, 363, 410), and recent work has suggested that there are common mechanisms of RNAi between plants and animals (364).
Size estimates of RISC have ranged from less than 200 kDa for a human complex (41), to 360 kDa (269) and 500 kDa (120) for the Drosophila complex, to even larger complexes (216). Possible biochemical components are listed in Table 1 and include Dicer proteins (67, 145, 205, 291, 375), the dsRNA-binding protein R2D2 and its homologues (216, 363), members of the Argonaute gene family in S. pombe (115), plants (AGO1/SGS4) (85), Neurospora crassa (201), Trypanosoma brucei (76), C. elegans (362, 374), D. melanogaster (37, 85, 120, 410), and mammals (67, 370; reviewed in reference 37), RNA helicases (54, 376), components having yet unknown functions (for example, the Drosophila argonaute-2 protein along with the Drosophila homolog of the fragile X syndrome proteins FMRP and FXR1) (41, 144), and putative nucleases that mediate the cleavage of target mRNAs by RISC. The identification of the nuclease that is responsible for mRNA cleavage has been controversial. Recently, the first RISC subunit containing a recognizable nuclease domain (tudor staphylococcal nuclease) was reported (40). However, recent research from another group argued against this possibility (114, 334).
RNAi may be intimately connected to translation. A number of lines of evidence point to a mechanistic connection between the RNAi response and translation, though a direct biochemical connection has not yet been described. First, the cytoplasmic RNAi machinery is commonly found to colocalize with polysomes. For example, in D. melanogaster, components of the RNAi machinery clearly interact with the translation machinery (144). Second, the RNAi response is almost certainly mechanistically related to the micro-RNA pathway, which is thought to regulate translation efficiency (the two pathways share components, and micro-RNAs can be mutated to act as siRNAs) (see below). Third, recent work has suggested that RNAs that are not translated are refractory to siRNA inhibition, while those being actively translated are effective targets. Thus, untranslated RNA virus "negative" strands appear to be resistant to RNAi cleavage, while their complementary "positive" strands are sensitive (24, 172). An alternative interpretation is that "resistant" RNAs are simply packaged in such a way as to be refractory to siRNA interactions. In human immunodeficiency virus studies, some researchers found nontranslated, infecting human immunodeficiency virus type 1 RNAs to be resistant to RNAi (139); however, others saw the opposite result (150). Finally, it is possible that RNAi has a connection to the phenomenon of translation-associated nonsense-mediated mRNA decay, since SMG2 mutants of C. elegans, which are deficient in nonsense-mediated mRNA decay, have also been shown to be deficient in RNAi (51, 68).
Regulation of RNAi. There are a number of ways in which the RNAi response might be regulated. First, individual proteins important for RNAi might be directly regulated in their expression or activity. This aspect of regulation has not yet been systematically evaluated, largely because the list of components of the RNAi machinery is not yet complete. Second, the siRNAs or mRNA targets may contain features that impact recognition by the RNAi machinery. This has been an active area of investigation in many groups and depends not only on which siRNA strand is incorporated into the RISC complex (178, 336) but also on kinetic aspects of target cleavage. Most recently it has been reported that different regions of siRNA have different contributions to cleavage: the 5' ends of siRNAs contribute more to the binding to target mRNA, while the central regions and 3' end of siRNAs contribute to providing a helical geometry which is required for cleavage (114).
Finally, cellular or viral proteins that interfere with the RNAi response pathway might be expressed. Recent work in C. elegans identified a protein, Eri-1, that contains a nucleic acid binding domain and an exonuclease domain and can inhibit RNAi activity. Since Eri-1 is evolutionarily conserved, this negative regulation might also operate in other species (170). In plants, RNAi has an important antiviral activity. However, a tombusviral protein, p19, has been reported to suppress RNAi in infected hosts (44, 195, 384, 421; reviewed in references 146 and 427).
Persistence and spreading of the RNAi response. In some but not all cells, the RNAi response can persist for multiple cell generations and can even spread from cell to cell (92, 112). This effect is most evident in C. elegans (92, 112) and in plants (277, 387). One way in which the RNAi effect could be maintained for long periods of time would be via the activity of an RNA-dependent RNA polymerase (RdRp) activity, which uses siRNAs as primers to convert RNA into dsRNAs that are degraded to produce new siRNAs, called secondary siRNAs, thereby amplifying the gene-silencing effect in the cells. Another outcome of such an activity would be the spreading of the RNAi effect to sequences upstream or downstream of the originally targeted sequence.
An RdRp enzyme is required for RNA silencing in a number of organisms (53, 214, 230, 347, 354) (Table 1). However, RdRp is probably not required for RNAi in D. melanogaster or mammals. Schwartz et al. (337) showed that RNAs lacking a 3' hydroxyl group cannot be extended by RdRp but can nevertheless generate a robust RNAi response in D. melanogaster. Another group drew the same conclusion based on data obtained in mammalian oocytes (358). Further, RNAi is exon specific in D. melanogaster (42), also supporting the conclusion that RdRp is not required. Most recently, Chi et al. (46, 65) showed, using microarrays and human 293T cells, that siRNA-induced gene silencing is highly gene specific and that secondary siRNA is not detectable.
Both in worms and in plants, the RNAi effect can spread systemically, from cell to cell (92, 112, 277, 387). This is likely due to the presence of specific cell surface receptors for siRNAs. It was recently reported that the C. elegans Sid-1 protein mediates siRNA uptake and is essential for systemic RNAi (88). When expressed in D. melanogaster, this protein also allowed the insect cells to take up dsRNA. A recent genetic screen for the RNAi spreading defect in C. elegans isolated genes rsd2, rsd3, rsd4, and rsd6 (rsd stands for RNAi spreading defective), which are required for systemic gene silencing (373).
Micro-RNAs are a class of approximately 22-nucleotide noncoding RNA species in cells. These RNAs are involved in many processes, including regulation of gene expression during development and defense against viruses. While they are not generated from perfect duplex RNA precursors and do not act by perfectly matching their targets through complementary base pairing, micro-RNAs nevertheless must be included in any discussion of RNAi, as they appear to function through the same underlying cellular machinery. There are a number of excellent recent reviews of micro-RNA processing and function (4-6, 14-16, 38, 91, 113, 140, 192, 338, 392). In the past year there has been enormous progress in the biochemical and computational identification of novel micro-RNA species (109, 190, 191, 193, 311). Hundreds of these small RNAs are now known, but the functions of the vast majority of them are still unclear. Recently it has been shown in the mouse that miR-181 is expressed in hematopoietic cells and controls hematopoiesis (45), while the C. elegans lsy-6 micro-RNA controls neuronal left-right asymmetry (160). Intriguingly, even viruses may encode micro-RNAs to regulate host or viral gene expression (290).
Micro-RNAs are transcribed as longer precursors which are processed in two steps. In the first step, the primary transcripts (pri-micro-RNA) are cleaved to shorter precursors of ~70 nucleotides (pre-micro-RNA) by an RNase III family member related to Dicer, called Drosha, in the nucleus (204) (Fig. 2). These precursors exist as highly structured, imperfect hairpin RNAs that are further processed by Dicer in the cytoplasm to mature ~21-nucleotide micro-RNAs (204). Recently, a Ran-dependent importin-β-related receptor, exportin-5, has been shown to mediate efficient nuclear export of pre-micro-RNAs (181, 223, 424).
Unlike many or most siRNAs, however, only one of the two strands produced by Dicer cleavage is generally assembled into a RISC-like structure. The structural components of micro-RNPs may differ somewhat from those in RISCs (257); however, the two complexes have many common components (72, 158, 257, 388).
While most micro-RNAs have unknown functions, a general picture based on detailed mechanistic studies of several individual species is becoming clear (4, 38). The best-characterized micro-RNAs are lin-4 and let-7 of C. elegans. lin-4 and let-7 regulate endogenous genes involved in developmental timing in C. elegans. let-7 and lin-4 mutant worms show abnormal development (203, 310). lin-4 is antisense to sequences in the 3' untranslated region of mRNAs lin-14 and lin-28, while let-7 is complementary to the 3' untranslated region of lin-41. These micro-RNAs act by inhibiting protein synthesis through an unknown translational repression pathway. Evidence from lin-4 studies suggests that mRNA stability, polyadenylation level, and translational initiation are not affected (310).
This type of regulation likely exists as well in mammalian cells because overexpression of miR-30 and miR-21 can repress gene expression without changing mRNA stability (430, 431). It is reasonable to predict that some novel micro-RNAs, if not all, will turn out to regulate gene expression in the same way that lin-4 and let-7 do. However, we cannot exclude the possibility that some micro-RNAs may target mRNA regions other than the 3' untranslated region, and it remains possible that some micro-RNAs may interact functionally with proteins rather than RNAs.
Although siRNAs and micro-RNAs are very similar in how they are produced and assembled into macromolecular complexes, their effects on gene expression are distinct and in at least some instances appear to be related more to how they interact with their targets than with how they are produced. Thus, mutation of a micro-RNA to be perfectly complementary to a target mRNA can lead to RNA degradation, while mutation of an siRNA to be imperfectly complementary to a target can lead to translational inhibition rather than mRNA degradation (66, 432). Also, in plants, micro-RNAs with imperfect complementarity to their targets can nevertheless mediate mRNA cleavage (276). Finally, micro-RNA can direct the cleavage of HOXB8 mRNA in mammalian cells (422), suggesting that micro-RNAs and siRNAs might in some instances regulate gene silencing in an overlapping way.
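The rule of thumb in this paragraph, that the degree of complementarity rather than the origin of the small RNA determines the outcome, can be sketched as follows. This is an illustration only: the function names are hypothetical, only Watson-Crick pairs are counted (G:U wobble pairs are treated as mismatches), and the binary outcome is an oversimplification, since the text itself notes that imperfectly paired plant micro-RNAs can still direct cleavage.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA string (5'->3')."""
    return rna.translate(_COMPLEMENT)[::-1]


def mismatches(small_rna: str, target_site: str) -> int:
    """Count positions where the target site departs from perfect complementarity."""
    if len(small_rna) != len(target_site):
        raise ValueError("small RNA and target site must be the same length")
    perfect_site = reverse_complement(small_rna)
    return sum(a != b for a, b in zip(perfect_site, target_site))


def predicted_outcome(small_rna: str, target_site: str) -> str:
    """Perfect pairing -> cleavage; anything less -> translational repression."""
    if mismatches(small_rna, target_site) == 0:
        return "mRNA cleavage"
    return "translational repression"
```

Under this sketch, the same small RNA sequence gives "mRNA cleavage" against a fully complementary site and "translational repression" once a single mismatch is introduced, mirroring the mutation experiments cited above.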
RNAi has now become a powerful tool for reverse genetics studies and antiviral studies in the laboratory. Researchers are increasingly using RNAi combined with traditional molecular or genetic methods to characterize the functions of proteins in cell growth and development (42, 164, 222, 227, 396). However, caution is warranted, as it has been reported that high doses of siRNAs can in fact lead to activation of the interferon and PKR pathways (29, 351).
The past few years have witnessed the development of a large number of useful approaches that take advantage of RNAi as a tool to study basic biological processes or in the production of novel therapeutics and antiviral agents. There are a number of useful recent reviews on this subject, for uses in both plants and animals (1, 8, 35, 105, 106, 128, 139, 150, 156, 161, 165, 200, 202, 234, 297, 302, 305, 346, 349, 367, 368, 382, 383, 398, 401, 405, 411, 417).
Particular success has already been achieved in the application of RNAi-based technologies to the inhibition of the replication of a number of viruses, including human immunodeficiency virus-1 (52, 150, 202, 268), other retroviruses (139), Hepatitis B virus (241) and recently, influenza A virus (100).
In this rapidly developing area, a number of powerful new methods have been developed for the introduction of siRNAs into cells or animals or for their production within cells. One can readily purchase synthetic RNAs for use in transfection experiments, or one can produce them in the laboratory by in vitro transcription or by digestion of dsRNAs with recombinant Dicer or RNase III (263, 418). There are a number of reports of the development of expression vectors that produce intracellular hairpin structures that can be processed by Dicer into functional siRNAs (30, 274, 275, 425). Both DNA-based (360) and lentiviral RNA virus-based (7, 316) delivery vectors have been described. Recently, a CRE-lox-based strategy has been developed for temporal or tissue-specific knockdown in animals (422).
Finally, the RNAi pathway may have evolved to be a major antiviral pathway in plants. This is evidenced by the evolution of plant virus genes that target the RNAi machinery as a way of evading the host's innate defenses. For example, Flock house virus infects both plant and D. melanogaster cells, and this virus might also be able to infect mammalian cells. In insect cells, infection by this nodavirus leads to accumulation of siRNAs specific for the viral genome. However, the Flock house virus B2 protein functions to block RNAi silencing in both plant and insect cells (209). These results also serve to point out the extent of conservation of the RNAi pathway between the plant and animal kingdoms.
There are three common ways to make 21- to 23-nucleotide siRNAs: chemical synthesis, in vitro transcription by bacteriophage polymerases, and vector-based expression of short hairpin RNAs from RNA polymerase III promoters. Independent studies showed that siRNAs and short hairpin RNAs made by bacteriophage polymerases or by RNA polymerase III promoter-driven vectors can induce the interferon and dsRNA response pathways, while chemically synthesized siRNAs are less able to do so (29, 179). However, the interferon response can be alleviated by adjusting polymerase III-driven vector sequences or by eliminating the 5' triphosphate of bacteriophage polymerase-produced transcripts, which appears to play a role in initiating an interferon response (286, 323). Therefore, caution must be exercised in the design and use of siRNAs.
| FATE OF dsRNA IN THE NUCLEUS |
|---|
|
|
|---|
In eukaryotes, a dsRNA unwinding or modifying activity was first discovered in Xenopus laevis (19, 266, 306). The enzyme, ADAR1, was subsequently found to be a member of a small gene family whose members catalyze the conversion of adenosines to inosines within dsRNA (20, 266, 397) by hydrolytic deamination (296). Mammals express two active enzymes, ADAR1 and ADAR2; C. elegans likewise expresses two forms, adr-1 and adr-2; D. melanogaster, in contrast, has only a single ADAR gene, dADAR. The human ADAR1 and ADAR2 enzymes have distinct but overlapping substrate specificities (206).
Long nuclear dsRNAs are promiscuously edited by ADAR. ADAR editing is highly sensitive to the length of the duplex. Perfect RNA duplexes of less than 15 bp are modified only inefficiently in vitro and perhaps not at all in vivo (266). Optimal activity is generally seen with dsRNAs of at least 25 to 30 bp and preferably greater than 100 bp in length (20, 266). Thus, short RNA stem-loop structures and duplexes are generally refractory to editing, while more extensively base-paired molecules are favored editing substrates. In long perfect duplexes, about 50% of the A's on each strand can be edited in an almost random pattern, with the exception of a clear 5' neighbor preference for A or U (294). The resulting RNAs contain I-U base pairs which make the RNA duplex unstable and may lead to partial or complete unwinding (20). In fact, extensive unwinding of edited duplexes might be considered a likely fate, as RNA helicase activity appears to be closely connected to editing in vivo (28, 308).
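The editing rules just described can be illustrated with a toy model. The following Python sketch is not a biochemical simulation: the function names, the use of strand length as a stand-in for perfect duplex length, the hard 15-nt cutoff, and the choice to edit favored sites first are all assumptions made for illustration. It encodes only the three observations stated above: short duplexes are refractory, roughly half of the adenosines in a long perfect duplex can be edited, and adenosines with a 5' neighbor of A or U are preferred.

```python
import random

# Values taken from the text; how they are combined below is an assumption.
MIN_EDITABLE_BP = 15          # shorter perfect duplexes are edited inefficiently
FAVORED_5PRIME = {"A", "U"}   # reported 5' neighbor preference
MAX_EDIT_FRACTION = 0.5       # "about 50% of the A's" on each strand


def candidate_sites(strand):
    """Return 0-based indices of adenosines whose 5' neighbor is A or U."""
    strand = strand.upper()
    return [
        i for i in range(1, len(strand))
        if strand[i] == "A" and strand[i - 1] in FAVORED_5PRIME
    ]


def hyperedit(strand, rng=None):
    """Toy promiscuous editing: convert about half of the A's to I.

    Assumption: strand length stands in for the length of a perfect duplex,
    and favored sites are drawn before all other adenosines.
    """
    rng = rng or random.Random(0)
    strand = strand.upper()
    if len(strand) < MIN_EDITABLE_BP:
        return strand  # treated as refractory to editing
    all_a = [i for i, base in enumerate(strand) if base == "A"]
    favored = candidate_sites(strand)
    favored_set = set(favored)
    others = [i for i in all_a if i not in favored_set]
    rng.shuffle(favored)
    rng.shuffle(others)
    n_edit = int(len(all_a) * MAX_EDIT_FRACTION)
    chosen = set((favored + others)[:n_edit])
    return "".join("I" if i in chosen else b for i, b in enumerate(strand))
```

Calling `hyperedit` on a strand shorter than 15 nt returns it unchanged, while a longer strand comes back with about half of its adenosines replaced by `I`. In a real duplex the preference is a bias rather than a strict ordering, so the favored-first rule here should be read only as a simplification.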
In the mouse polyoma virus model system, it was found that at late times in infection, large amounts of long dsRNA are produced in the nucleus (189, 219). These molecules are promiscuously edited by ADAR, but the resulting inosine-containing RNAs are not exported to the cytoplasm (189). Further biochemical analysis revealed that many hyperedited RNAs appeared to bind tightly and specifically to a protein complex that resulted in their retention in the nucleus (433). The highly conserved and abundant nuclear protein p54nrb binds hyperedited RNAs with striking specificity. This protein exists in a complex with the splicing factor PSF and the inner nuclear matrix structural protein matrin 3, which confers highly cooperative binding to inosine-containing RNA and leads to nuclear retention, most likely via attachment to the nuclear matrix (433).
p54nrb is the first identified nuclear RNA-binding protein that requires inosine for high-affinity binding to RNA. This protein has also recently been reported to exist in novel nuclear compartments called paraspeckles (93). These data led to the important conclusion that nuclear antisense RNA leads to hyperediting and subsequent nuclear retention of target transcripts. Messages with only one or a few inosines (resulting from editing of short duplex regions in pre-mRNAs) escape and can be delivered to the cytoplasm. This discrimination between selectively edited RNAs and promiscuously edited RNAs provides the cell a useful way in which antisense RNA can regulate gene expression and is diagrammed in Fig. 4.
Among the best-studied examples is the mRNA encoding the mammalian glutamate receptor subunits (135, 339, 355). Interestingly, selective editing in this as well as other genes results from double-stranded secondary structures formed by base pairing between exons and downstream intron elements (79, 131, 221). Transcripts encoding the 2C subtype of the neurotransmitter serotonin receptor also undergo RNA editing events in which genomically encoded adenosine residues are converted to inosines (31, 267). In this system, as for the glutamate receptor, editing requires the interaction of exon sequences with downstream intron sequences. Editing in these systems is not at all promiscuous but rather is directed to specific adenosines that are embedded in favorable secondary structures, often involving an unpaired adenosine (163, 182, 271, 327, 415). Thus, while 15-bp perfect duplexes cannot be edited by ADAR in vitro, a minimal natural selective editing substrate consisting of a 15-bp dsRNA stem with a single-base mismatch was sufficient for editing, though longer substrates were edited more efficiently (133). Very recent studies combining comparative genomics and experimental approaches allowed the exciting discovery of a large number of new ADAR substrates (137). Interestingly, these new substrates follow the general pattern of intron-exon interactions.
The observation that selective editing involves intron-exon interactions leads to several important conclusions that might be important as well for hyperediting. First, editing must be fast, and ADAR must be present in or near the elongating RNA polymerase II enzyme complex. This is consistent both with studies on the general localization of ADAR enzymes throughout the nucleus and to sites of transcription (though they may sometimes concentrate somewhat in the nucleolus) (78, 359) and with the observation that ADAR exists in large ribonucleoprotein particles containing splicing factors (303). Second, there must exist regulation and coordination of the editing versus splicing events in these genes, since rapid splicing would preclude editing, and failure to resolve RNA secondary structures might interfere with the splicing process. Indeed, the involvement of a specific RNA helicase enzyme, the D. melanogaster maleless protein, and its mammalian homologue, RNA helicase A, in the coordination of editing and splicing has been observed (28, 308).
Editing within noncoding regions might also be important for the regulation of gene expression. Morse et al. (255, 256) developed a method to identify inosine-containing RNAs. Using this method, they found ADAR substrates within 3' untranslated regions, introns, and noncoding RNAs in C. elegans and in human brain. Since repetitive elements such as LINE and Alu sequences were found to contain edited bases, and since these elements are commonly present in intronic or untranslated regions, this finding also raised the possibility that ADAR might be involved in regulating the expression or function of repetitive elements or transposons in the mammalian genome (254).
ADAR structure and function. The ADARs from all organisms contain variable numbers of dsRNA-binding domains and a highly conserved C-terminal catalytic domain. Figure 2 illustrates the domain structure of the human ADAR1 long and short isoforms. The longer form, ADAR1-L, is found in both the nucleus and the cytoplasm, while the short form, ADAR1-S, is mostly nuclear.
The nucleocytoplasmic distribution of ADAR1-L is modulated by double-stranded RNA-binding domains, a leucine-rich export signal, and a putative dimerization domain (359). The catalytic domain is sufficient for deaminase activity on some selectively edited minimal substrates but not on long dsRNAs (133). Three dsRNA-binding motifs are important for the catalytic activity of ADAR1, but their contributions differ, with dsRNA-binding domain III being fundamentally important for deaminase activity (217, 218). ADAR1-L (Fig. 2) contains two N-terminal Z-DNA binding domains (335), three double-stranded RNA binding motifs (dsRBD I, II, and III) (180), and a C-terminal deaminase catalytic domain (194, 218). The Z-DNA binding domain is not required for catalytic activity, but it has been suggested that this binding domain might target ADAR1-L to sites of active transcription in the nucleus (132). Recently it has also been shown that ADAR1 is dynamically associated with the nucleolus, from which it can be recruited to sites of editing in the nucleoplasm (63, 325).
The individual ADAR1 dsRNA-binding domains have distinct in vivo localization capabilities, which may be important for chromosomal targeting, substrate recognition, and editing specificity (73). Recent studies also showed that ADAR1 is a nucleocytoplasmic shuttling protein with a nuclear localization signal and nuclear export signals (298). In their active state, ADARs appear to exist as homodimers (47, 98, 151).
Biological importance of ADARs. The importance of ADAR activity for viability and development has been revealed through the construction and analysis of knockout mutant organisms. The first animal in which ADAR was knocked out was D. melanogaster (281). While mutant flies were viable, they exhibited striking defects in adult nervous system function and integrity. C. elegans has two ADAR genes, adr-1 and adr-2, with adr-1 being expressed in most cells of the nervous system and developing vulva (378). Genetic knockouts have shown that, while not essential for viability, both ADARs are important for normal behavior, with mutants showing defects in chemotaxis (378).
More recently, gene knockout studies in transgenic mice have provided very interesting insights into ADAR function in mammals. ADAR1 appears to be more important for development and viability than ADAR2. ADAR2 has only a single essential in vivo target, a CAG codon in the GluR-B gene (134). ADAR2-/- mice die of neurological disorders but appear normal if the GluR-B gene is replaced with one in which a single codon is replaced with an edited version (168). An earlier study indicated that ADAR1-/- homozygous mice die as embryos, while ADAR1+/- mice have defects in erythropoiesis in the liver (399). More recent studies have shown that ADAR1-/- mice die by embryonic day 11.5 with widespread apparent apoptosis. ADAR1-/- fibroblasts are prone to apoptosis induced by serum deprivation (400). Also, Hartner et al. (125) found, studying knockout mice in vivo, that ADAR1 selectively edits two of the five known edited adenosines in the serotonin 5-HT2C receptor pre-mRNA. Further, homozygous knockout of ADAR1 leads to embryo-lethal defects in liver structure and hematopoiesis. Thus, ADAR1 is clearly important in the development of nonneuronal tissues.
Possible connections between ADARs, PKR, and RNAi. Finally, some highly structured RNAs point to complex and subtle effects that relate to the interplay between the different dsRNA response pathways of PKR, RNAi, and ADAR editing. The hepatitis delta virus genome is largely but not completely duplex RNA. While it can be edited in a site-selective manner by ADAR and can activate PKR (48, 313), it cannot be cleaved by Dicer (43). On the other hand, fragile X syndrome transcripts encode trinucleotide repeats that can form RNA hairpins that cannot activate PKR but are efficiently cut by Dicer (122). Also, as mentioned above, ADAR knockouts of C. elegans show chemotaxis defects. However, these defects could be rescued specifically by crossing an adar-/- strain to RNAi-defective strains carrying rde-1 or rde-4 (377).
Recent work has provided direct evidence to connect nuclear siRNAs to heterochromatin assembly. Verdel et al. isolated an RNA-induced initiation of transcriptional gene silencing complex (RITS), which is required for heterochromatin assembly in S. pombe (390). This complex contains Ago1, Chp1, Tas3, and Dicer-cleaved siRNAs. In this complex, Ago1 is known to be a RISC component that binds to siRNAs, while Chp1 has been shown to bind centromeres. Moreover, the siRNAs in the complex were found to be homologous to centromeric repeats. Therefore, there appears to be a direct connection between siRNAs and centromeres. This work suggested a mechanism of epigenetic gene silencing at specific chromosomal loci by siRNAs in S. pombe (80, 111, 390). RITS-mediated epigenetic gene silencing might also be conserved in other systems. For example, in D. melanogaster, Argonaute proteins and polycomb proteins have been shown to be required for repeat-induced transcriptional gene silencing (278, 279). While the molecular mechanism by which nuclear siRNAs can recruit factors necessary for heterochromatin formation remains fairly unclear, interesting genetic screens in C. elegans for mutants defective in RNAi revealed that a large fraction of these mutants encode gene products that are chromatin associated (75).
In S. pombe, small centromeric siRNAs have been observed (309), and Volpe et al. showed that RdRp is bound to the centromeric DNA repeats by chromatin immunoprecipitation (393). These results suggested that RdRp may transcribe the second strand from nascent centromeric transcripts and generate dsRNA. The resulting long dsRNA would then be cleaved by Dicer into RNAi-related siRNAs. The role of RdRp in such gene silencing might be to amplify the RNA signals, leading to maintenance of the repressed state. Martienssen (231) presents a nice model explaining how single-copy elements cannot sustain silencing by this pathway but tandem arrays can. Consistent with this model, silencing at the mating type locus of S. pombe also depends on a sequence similar to centromeric repeats (10) but is not maintained well because there are no tandem repeats. Finally, Schramke et al. (333) recently showed in S. pombe that the introduction of hairpin RNA structures into this organism can induce RNA-mediated, chromatin-based epigenetic gene silencing.
How do the above RNAi-related gene-silencing mechanisms of plants and fission yeast apply to higher eukaryotes? While there is currently no direct biochemical evidence for RNAi-mediated chromatin silencing in higher eukaryotes, there are tantalizing clues that a connection will soon be made. Heterochromatin is commonly associated with chromosomal regions that are rich in repetitive sequences. Transcription from such regions could conceivably generate dsRNAs, which could enter a nuclear RNAi pathway. For a number of years it has been observed that increasing the number of copies of transgenes in mammals often leads to lower rather than higher expression (99) and can lead to regions of locally high concentrations of dsRNA (317). This might result from spurious bidirectional transcription, which in turn leads to gene silencing. Thus, more is not always better when introducing transgenes into cells. Furthermore, it has been reported that RNAi defects relieve the silencing of tandem transgene arrays in Neurospora crassa, Arabidopsis thaliana, C. elegans (53, 176, 258), and D. melanogaster (279).
In the C. elegans germ line, transposons are silenced, but they are mobile in somatic cells (175, 362). Some mutants that cannot silence RNA are also defective in RNAi (348). In D. melanogaster, some mutants carrying mutations in RNAi components lost heterochromatic silencing (279). In one interesting study, antisense RNA from a human gene locus was shown to lead to gene silencing and methylation of CpG islands, suggesting that in mammalian cells nuclear dsRNA can induce transcriptional gene silencing associated with DNA methylation within promoter regions, as has been seen in plants (379). Finally, RNA synthesis has now been found in regions of the genome that were once thought to be transcriptionally silent. For example, it has recently been shown that a human centromere is transcriptionally competent (319).
Role of dsRNA in imprinting and X chromosome inactivation. Finally, there is a possibility that RNAi-related gene silencing mechanisms may even extend to genomic imprinting and X chromosome inactivation in the nucleus. While the mechanisms of genomic imprinting and X chromosome inactivation have not been completely uncovered, in both cases long nuclear dsRNAs have been suggested to play a critical role. Genomic imprinting affects many mammalian genes and results in the expression of those genes from only one of the two parental chromosomes (for reviews, see references 304 and 352). So far, about 20% of known imprinted genes are associated with antisense transcripts, most of which are noncoding RNA and may have regulatory functions. Recent work showed that the Air antisense RNA, which overlaps the maternally expressed Igf2r gene, has an active role and is required for genomic imprinting (353). X chromosome inactivation is the transcriptional silencing of one X chromosome in female mammalian cells (see reference 169 for a recent review). It requires a region of the X chromosome known as the X inactivation center. Within the X inactivation center, there are two noncoding transcripts, Xist and Tsix. Tsix and Xist have the potential to form dsRNAs, and together they regulate the choice of X chromosome inactivation (reviewed in references 293 and 391).
As we have seen, the dsRNA-induced RNA editing machinery appears to be active throughout the nucleus. In higher eukaryotes, therefore, dsRNAs derived from centromeric DNA repeats or tandem transgene arrays are quite likely to be efficient substrates for ADARs. In vitro data showed that RNAi is inhibited if the dsRNA is edited by ADAR2 (329). Edited dsRNAs contain mismatched I-U base pairs, which may lead to partial or complete unwinding of duplexes. Such molecules are poor substrates for Dicer cleavage. Also, when C. elegans mutants lacking ADAR activity were examined, it was found that normally expressed transgenes were now silenced by the RNAi machinery, indicating that ADAR activity can modulate the RNAi response in the nucleus (184).
Thus, in order for the RNAi machinery to play a key role in heterochromatin formation in higher eukaryotes, it would have to function in such a way as to overcome the editing machinery or to synergize with it. It is possible that RNA editing does in fact occur in heterochromatic regions; however, some dsRNA sequences might remain relatively refractory to editing owing to the inherent base preferences of the ADAR enzymes, and this subset of dsRNAs would then be targeted for Dicer cleavage.