SUMMARY
Major aspects of the pathway of de novo arginine biosynthesis via acetylated intermediates in microorganisms must be revised in light of recent enzymatic and genomic investigations. The enzyme N-acetylglutamate synthase (NAGS), which used to be considered responsible for the first committed step of the pathway, is present in a limited number of bacterial phyla only and is absent from Archaea. In many Bacteria, shorter proteins related to the Gcn5-related N-acetyltransferase family appear to acetylate L-glutamate; some are clearly similar to the C-terminal, acetyl-coenzyme A (CoA) binding domain of classical NAGS, while others are more distantly related. Short NAGSs can be single gene products, as in Mycobacterium spp. and Thermus spp., or fused to the enzyme catalyzing the last step of the pathway (argininosuccinase), as in members of the Alteromonas-Vibrio group. How these proteins bind glutamate remains to be determined. In some Bacteria, a bifunctional ornithine acetyltransferase (i.e., using both acetylornithine and acetyl-CoA as donors of the acetyl group) accounts for glutamate acetylation. In many Archaea, the enzyme responsible for glutamate acetylation remains elusive, but possible connections with a novel lysine biosynthetic pathway have recently emerged from genomic investigations. In some Proteobacteria (notably Xanthomonadaceae) and Bacteroidetes, the carbamoylation step of the pathway appears to involve N-acetylornithine or N-succinylornithine rather than ornithine. The product N-acetylcitrulline is deacetylated by an enzyme that is also involved in the provision of ornithine from acetylornithine; this is an important metabolic function, as ornithine itself can become essential as a source of other metabolites. This review emphasizes the biochemical and evolutionary implications of these findings.
INTRODUCTION
Microbial genes and enzymes involved in de novo arginine biosynthesis have enjoyed increasing attention over the last fifty years. Early studies focused on the steps of the pathway, but arginine biosynthesis soon became a paradigm for the analysis of regulatory mechanisms; this aspect has remained a major center of interest ever since (reviewed in references 12 and 31). However, further enzymatic and genomic exploration of the microbial world recently disclosed surprises that brought the pathway itself back into focus (Fig. 1). This is the subject of the present review. In order to keep the number of references within reasonable limits, the reader is referred to Caldovic and Tuchman (11) and Charlier and Glansdorff (12) for reviews of most of the early work.
Fig. 1. Reticulated view of acetylation steps in the arginine biosynthesis pathway. The scheme focuses on the crossroads where either ArgA or ArgJ is acetylating L-glutamate. The steps of the “classical” pathway (initially described for E. coli) are numbered from 1 to 8. In fact, many organisms never use this classical pathway, and N-acetyl-L-ornithine now also appears to be a major crossroads. For instance, in addition to its transformation either into ornithine and acetate through ArgE (step 5) or into ornithine and N-acetyl-L-glutamate through ArgJ (step 5′), acetylornithine can be transformed into N-acetyl-citrulline (step 5″) in the presence of carbamoylphosphate (cp) through ArgF′, a homologue of the well-studied ornithine carbamoyltransferase ArgF (step 6). ArgE then allows the process to go directly to citrulline (step 6′) and subsequently to arginine. Box 1 underlines this dual activity of ArgE (EC 3.5.1.16), which can deacetylate either N-acetyl-ornithine or N-acetyl-citrulline (50).
The first committed step (EC 2.3.1.1) of arginine biosynthesis is acetylation of L-glutamate at the N-α position (Fig. 1). Acetylation of the early precursors of arginine distinguishes them from the analogous intermediates in proline biosynthesis and prevents spontaneous cyclization of the semialdehyde arginine precursor. In the classical arginine pathway, the flow of acetylated precursors runs as far as acetylornithine. The formation of the subsequent intermediate, ornithine, is catalyzed by either of two enzymes, (i) acetylornithine deacetylase (acetylornithinase [AO]) (EC 3.5.1.16) or (ii) ornithine acetyltransferase (OAT) (EC 2.3.1.35), which recycles the acetyl group on glutamate. This acetyl cycle is of obvious energetic significance and is frequently referred to as “more evolved” (see, however, our concluding remarks, below). Ornithine is converted into arginine via citrulline and argininosuccinate. This textbook picture is now challenged at two levels: (i) the very identity, mechanism of action, and origin of the enzymes responsible for glutamate acetylation in different microorganisms and (ii) the extension of the sequence of acetylated intermediates beyond acetylornithine in a number of Bacteria.
De novo arginine biosynthesis via N-acetylglutamate is a feature characteristic of many prokaryotes, fungi, and plants (including unicellular algae) but not of animals (2); Caldovic and Tuchman (11) have reviewed the synthesis and metabolic role of N-acetylglutamate in animals. The present analysis focuses mainly on prokaryotes, where the new discoveries were made. The first comprehensive analysis of genes and enzymes of arginine biosynthesis in plants was published recently (69). In many respects, the situations in plants and in prokaryotes such as Pseudomonas sp. are similar (69).
GLUTAMATE ACETYLATION IN BACTERIA
N-Acetylglutamate Synthase
Early studies of arginine biosynthesis in Escherichia coli and Pseudomonas aeruginosa had disclosed an enzyme catalyzing the formation of acetylglutamate from acetyl-coenzyme A (CoA) and glutamate (EC 2.3.1.1). N-Acetylglutamate synthase (NAGS), hereafter referred to as “classical NAGS,” is the product of a single gene (argA) encoding two domains: (i) the N-terminal domain contains a carbamate kinase fold also present in acetylglutamate kinase (NAGK) (EC 2.7.2.8, ArgB), which catalyzes the next step in the pathway (58, 59), and (ii) the C-terminal domain contains an acetyl-CoA binding fold present in enzymes of the vast Gcn5-related N-acetyltransferase (GNAT) family that transfer the acetyl group from acetyl-CoA to a variety of N-terminal amino groups (21, 53, 75). This genetic structure suggests that the N-terminal domain of NAGS is responsible for efficient glutamate binding. With most GNAT enzymes, including E. coli ArgA, the reaction proceeds by the formation of a ternary complex between the protein and the substrates (sequential mechanism) and not by the formation of an acetylated enzyme intermediate (ping-pong bi-bi mechanism) (3, 75).
NAGS is also found in fungi and in vertebrates; in the latter its only function appears to be the provision of acetylglutamate as a cofactor of carbamoylphosphate synthase (11). Within each group—bacteria, fungi, or vertebrates—NAGS amino acid sequences display obvious similarities. Between each one of these groups, however, the similarities become very weak, suggesting ancient divergence or perhaps even independent origin. Nevertheless the limited similarities found between the DNA segments encoding the C-terminal regions of mammalian and Neurospora NAGS were used to clone the mouse and human NAGS genes (10, 51). Moreover, a two-domain structure similar to that of bacterial NAGS was suggested for the mammalian enzyme (51). Of note is that fungal NAGS is active only when associated with NAGK (37, 56), whereas the mammalian gene can complement an E. coli argA auxotroph by itself (10). This fact and the lack of similarity between fungal NAGS and NAGK (1) suggest that in fungi it may be NAGK and not NAGS that provides the glutamate binding site. In the plant Arabidopsis thaliana, there are two bimodular NAGS genes similar to those found in E. coli and Pseudomonas but with an insert of 90 codons in the N-terminal domain (69).
Ornithine Acetyltransferase
Figure 1 shows that another option for synthesizing acetylglutamate is OAT (ArgJ). This ornithine N-acetyltransferase (EC 2.3.1.35), discovered in Micrococcus glutamicus (70), was first characterized as an alternative to acetylornithine deacetylase (ArgE, EC 3.5.1.16) because it is a transacetylase that recycles the acetyl group from acetylornithine on glutamate, a reversible reaction (Fig. 1). Much later, complementation tests and biochemical experiments with purified enzyme established that some OATs could also synthesize acetylglutamate de novo from acetyl-CoA and glutamate (43, 45). Not all OATs possess this dual activity, however: some are bifunctional whereas others do not use acetyl-CoA as the acetyl donor (43, 76). In the most studied instance (Geobacillus [formerly Bacillus] stearothermophilus), the acetylation activity is about five times lower with acetyl-CoA as the substrate than with acetylornithine; moreover, in the former case, the reaction is irreversible, whereas the acetyl group exchange between acetylornithine and acetylglutamate is fully reversible (76). Active OAT derives from a preprotein that undergoes self-catalyzed cleavage next to a conserved T residue that becomes the catalytic nucleophile after autoproteolysis (44).
There is no detectable similarity between bifunctional OATs (EC 2.3.1.35/EC 2.3.1.1) and classical NAGS (EC 2.3.1.1). This may not be surprising since, unlike NAGS, OAT proceeds by a ping-pong bi-bi catalytic mechanism requiring the formation of a covalent enzyme-acetyl intermediate involving the above-mentioned invariant T residue. Moreover, in contrast to the substrate recognition pattern observed in GNAT enzymes (21, 53, 75), the CoA moiety of acetyl-CoA does not appear to enter the catalytic site of OAT (76).
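The kinetic distinction drawn above can be made explicit with the standard steady-state rate laws for two-substrate reactions. The block below is a textbook sketch and is not taken from the studies cited here; the symbols ($A$ for the acetyl donor, $B$ for glutamate, $V_{\max}$, $K_A$, $K_B$, and $K_{iA}$) are notation introduced only for this illustration.

```latex
% Sequential (ternary-complex) mechanism, as reported for GNAT enzymes
% such as classical NAGS:
v = \frac{V_{\max}\,[A][B]}{K_{iA}K_B + K_B[A] + K_A[B] + [A][B]}

% Ping-pong bi-bi mechanism, as reported for OAT
% (covalent acetyl-enzyme intermediate):
v = \frac{V_{\max}\,[A][B]}{K_B[A] + K_A[B] + [A][B]}
```

The two forms differ only in the constant term $K_{iA}K_B$. Its absence in the ping-pong case is what produces parallel rather than intersecting lines in double-reciprocal plots, the usual experimental criterion for telling the two mechanisms apart.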
Table 1 lists OATs that have been characterized in vitro or in vivo (i.e., by complementation tests) as mono- or bifunctional. It should be stressed that most of the genomic annotations reading “bifunctional OATs” in the public databases are invalid since they have been proposed on the sole basis of sequence similarity with G. stearothermophilus OAT and are not supported by enzymatic or genetic evidence.
Table 1. Functional assignments for acetylation of glutamate and deacetylation of acetylornithine among Bacteria and Archaea
Mono- or bifunctional OATs have been found in organisms possessing a classical NAGS (81). In the former case (e.g., P. aeruginosa), NAGS assumes the anaplerotic function of priming the acetyl cycle with a first molecule of acetylglutamate. In the latter case (e.g., N. gonorrhoeae), there is functional redundancy, but it should be kept in mind that the acetylornithine-dependent transacetylation may be quantitatively the most important reaction, as in G. stearothermophilus (see above). There is an OAT in Arabidopsis (69), but its functional pattern is not known.
The X-ray crystal structure of a monofunctional OAT of Streptomyces clavuligerus involved in clavulanic acid biosynthesis (19), rather than primary metabolism (23), has been determined, but there is presently no in silico clue allowing us to distinguish monofunctional from bifunctional OATs. The phylogenetic tree (Fig. 2) rooted by the paralogous OATs involved in clavulanic acid biosynthesis does not show a clear separation between monofunctional and bifunctional ArgJ enzymes. Indeed, the monofunctional enzyme of P. aeruginosa and the bifunctional one of N. gonorrhoeae have a common node. However, it appears puzzling that these bacteria share a gene set (the presence of both argA and argJ) and thus appear to be exceptions in their respective classes, as most other monofunctional OATs are accompanied by a short arg(A) gene (see below) and the few other organisms known to use a bifunctional OAT have only argJ. This may explain why they are grouped on a common node.
Fig. 2. Phylogenetic tree of the experimentally studied ArgJ enzymes. The amino acid sequences of the ArgJ proteins that have been experimentally studied and characterized as being either mono- or bifunctional were multiply aligned with the program MUSCLE (22). This alignment was then used to reconstruct a phylogenetic tree with PhyML, a fast and efficient program based on a maximum-likelihood analysis of a distance tree (33). The tree was rooted with the two paralogues of ArgJ that are present in Streptomyces clavuligerus (19). The root is marked by a black oval. The presence/absence of enzymatic activities involved in acetyl group transfer (respective EC numbers) is indicated close to each species name. Arg(A) refers to S-NAGS, as discussed in the text. Mono- and bifunctional ArgJ enzymes are bracketed separately. The enzymatic profiles of P. aeruginosa and N. gonorrhoeae are framed to underline their similarity. Note that the genome of S. clavuligerus has not been entirely sequenced.
The existence of bifunctional OATs raises a question of qualitative importance. Can OAT actually replace NAGS? Since the completion of a number of genome sequencing projects, a few instances indeed have emerged where no NAGS, but a bifunctional OAT, appears to be present. This is the case in B. subtilis (reinterpretation of old complementation data [55, 81]), G. stearothermophilus, and Thermotoga spp., assuming that T. maritima, whose genome was completely sequenced, is similar to T. neapolitana, which has a bifunctional OAT (26, 43). The lack of biochemical information on the catalytic properties of most genomically identified OATs precludes further identification. At our present state of knowledge, it is nevertheless safe to assume that a number of microorganisms, perhaps many, depend on a bifunctional OAT for the synthesis of acetylglutamate. Recent observations, however, have revealed that there are other alternatives to OAT for “life without a classical NAGS,” as discussed in the next section.
Substitutes for Classical NAGS: Short NAGS
Evidence of substitutes for classical NAGS first came from genetic observations of Campylobacter jejuni, an ε-proteobacterium (35), and two marine psychrophilic γ-Proteobacteria, Moritella abyssi and M. profunda (82).
The genome of C. jejuni does not contain a classical NAGS or OAT gene. The argO gene of C. jejuni is part of an argCOBD operon, and it complements E. coli argA mutants. This argO gene encodes a 146-amino-acid polypeptide that shows low similarity to E. coli streptothricin acetyltransferase and is broadly related to the GNAT family of N-acetyltransferases. The gene product was thus able to catalyze the first step of arginine biosynthesis in this organism, although argO does not appear to be homologous to the classical argA gene of other bacteria (Fig. 3).
Fig. 3. Glutamate acetylation: summary of the various ways of carrying out a single reaction, EC 2.3.1.1. The classical or most likely alternative candidates that encode the EC 2.3.1.1 activity are listed. For each are given the protein name, information on its function, its homology, and its taxonomic distribution. For the various combinations involving the (A) component, the domain structure is also outlined. CK is the carbamate kinase domain homologous to ArgB. NAT is the domain homologous to N-acetyltransferases (in gray).
M. abyssi and M. profunda display an unusual structure for the argH gene that codes for the argininosuccinate lyase catalyzing the last step (EC 4.3.2.1) of arginine biosynthesis (82). The argH gene is extended by a 170-codon stretch able to complement an argA E. coli mutant. In contrast with C. jejuni argO, the cognate amino acid sequence is clearly similar to the C-terminal domain of NAGS (Fig. 3). The gene is present at the end of an argE/CBFGH(A) operon, where argE—coding for an acetylornithinase (AO)—and the rest of the cluster are expressed divergently, a pattern characteristic of many enteric bacteria. After the original report (82), the argH(A) gene was found in similar genetic contexts in closely related bacteria belonging to the Alteromonas-Vibrio group (81). Two of them—Pseudoalteromonas haloplanktis (46) and Idiomarina loihiensis (36)—do not harbor a genetically identifiable NAGS or OAT (Fig. 4). Since both bacteria are Arg+, the ArgH(A) protein appears as a new type of arginine biosynthetic enzyme able to catalyze in vivo both the first committed step of the pathway and the last one. Functional redundancy occurs in some organisms of this group: several have a gene for a putative classical NAGS and one of them, Colwellia psychrerythraea, has adjacent genes for an ArgH(A) fusion and a classical NAGS (Fig. 4); it also has an OAT (47). More surprisingly, Pseudoalteromonas atlantica displays an argC to argH cluster containing argA next to argH. Therefore, the comparison between the two Pseudoalteromonas species and their close relatives Idiomarina and Colwellia allows us to propose scenarios explaining the argH(A) fusion (Fig. 4; see concluding remarks for details).
Fig. 4. Possible scenarios for the fusion event creating the argH(A) gene. The scenarios derive from a comparison of the genetic contexts of argA and argH in the closely related species Pseudoalteromonas sp., I. loihiensis, and C. psychrerythraea. This comparison was made with data extracted from two online resources, Integrated Microbial Genomes (http://img.jgi.doe.gov/cgi-bin/pub/main.cgi) and The SEED (http://theseed.uchicago.edu/FIG/index.cgi), as well as from unpublished tools designed by F. Lemoine, O. Lespinet, and B. Labedan to visualize synteny blocks in many multiply-aligned bacterial genomes. Box 1 shows a simplified species tree based on 16S rRNA sequences compared with those of Shewanella and Moritella, where argA is not closely linked to the argC-to-argH cluster.
Screening of complete genomes for homologues to the arg(A) part of argH(A) by looking for virtual argH/arg(A) fusions brought to light a large array of organisms, including some Archaea, where such a sequence is present though not necessarily fused with argH or linked to it. In Thermus thermophilus and Deinococcus spp. (both D. radiodurans and D. geothermalis), the sequence is adjacent to an argGH operon and the genetic context suggests that it is part of it. Remarkably, most of these organisms do not display a homologue of classical argA but have an argJ gene. In at least two organisms, T. thermophilus and S. coelicolor, OAT is monofunctional (Fig. 2). Therefore, both probably depend on this short version of acetylglutamate synthase for arginine synthesis (81). Interestingly, B. subtilis has a bifunctional ArgJ (like G. stearothermophilus) but no Arg(A) homologue (a functionally meaningful combination), whereas other Bacilli have an Arg(A) homologue and an OAT of unknown specificity (Table 1). The related G. stearothermophilus has a bifunctional OAT, but the genome is not yet known in its entirety.
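The inference made in this paragraph (which enzyme most plausibly supplies acetylglutamate, given the genes a genome carries) can be sketched as a small decision rule. The sketch below is only an illustration of that reasoning: the `ArgInventory` class, its field names, and the `candidate_sources` function are hypothetical and do not correspond to any published tool. The two example inventories restate what the text says about T. thermophilus and B. subtilis.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ArgInventory:
    """Hypothetical summary of the glutamate-acetylation genes found in a genome."""
    has_classical_argA: bool                   # two-domain (classical) NAGS gene
    has_short_argA: bool                       # arg(A) homologue (S-NAGS), free or fused to argH
    has_argJ: bool                             # OAT gene
    argJ_bifunctional: Optional[bool] = None   # None = no experimental evidence


def candidate_sources(inv: ArgInventory) -> List[str]:
    """List the candidate de novo sources of acetylglutamate for an inventory."""
    sources = []
    if inv.has_classical_argA:
        sources.append("classical NAGS")
    if inv.has_short_argA:
        sources.append("S-NAGS")
    if inv.has_argJ:
        if inv.argJ_bifunctional is True:
            sources.append("bifunctional OAT")
        elif inv.argJ_bifunctional is None:
            sources.append("OAT (specificity unknown)")
        # A monofunctional OAT only recycles the acetyl group, so it is
        # not counted as a de novo source.
    return sources or ["unknown"]


# Inventories as described in the text (illustrative encoding only).
t_thermophilus = ArgInventory(False, True, True, argJ_bifunctional=False)
b_subtilis = ArgInventory(False, False, True, argJ_bifunctional=True)

print(candidate_sources(t_thermophilus))  # ['S-NAGS']
print(candidate_sources(b_subtilis))      # ['bifunctional OAT']
```

Leaving `argJ_bifunctional` as `None` when no enzymatic or genetic evidence exists keeps the rule from repeating the annotation problem noted earlier, where bifunctionality is assigned from sequence similarity alone.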
Independently of these observations, Errey and Blanchard (24) characterized in Mycobacterium tuberculosis a short version of NAGS which is actually retrieved from the genome of Mycobacterium species when databases are screened for arg(A) homologues, as outlined above (81). In this organism as well, there is an OAT that appears to be monofunctional on the basis of genetic information.
Short NAGSs (S-NAGSs) thus also constitute an option for the synthesis of acetylglutamate. Since they lack the N-terminal domain of classical NAGS, and as the cognate M. tuberculosis protein displays an extremely high Km for glutamate (>600 mM), it is possible that an efficient glutamate binding site is provided by another protein, such as NAGK, that would associate with NAGS to form an active acetylating complex. This hypothesis is reminiscent of the association between yeast NAGS and NAGK, without which NAGS remains inactive (see above). In the case of ArgH(A), it is possible that association with ArgH somehow stabilizes the protein and/or enhances its activity. Obviously, further studies will be necessary to determine the properties of these proteins and what their molecular context is in vivo.
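To see why a Km above 600 mM argues for an additional glutamate-binding partner, it helps to put numbers into the Michaelis-Menten relation v/Vmax = [S]/(Km + [S]). The sketch below assumes simple Michaelis-Menten behavior for the isolated enzyme. The glutamate concentrations are hypothetical values chosen for illustration and are not measurements from the cited work; only the 600 mM bound comes from the text.

```python
def fractional_rate(substrate_mM: float, km_mM: float) -> float:
    """Michaelis-Menten rate as a fraction of Vmax: [S] / (Km + [S])."""
    if substrate_mM < 0 or km_mM <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return substrate_mM / (km_mM + substrate_mM)


KM_LOWER_BOUND_MM = 600.0  # from the text: Km for glutamate is > 600 mM

# Hypothetical glutamate concentrations (mM), for illustration only.
for glutamate_mM in (10.0, 100.0):
    upper_bound = fractional_rate(glutamate_mM, KM_LOWER_BOUND_MM)
    print(f"{glutamate_mM:6.1f} mM glutamate -> v/Vmax <= {upper_bound:.3f}")
```

Because the true Km is larger than 600 mM, each printed value is an upper bound: even at 100 mM glutamate the isolated enzyme would run at no more than about 14% of Vmax.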
Retrieval of close homologues of the C-terminal domain of classical NAGS likely identifies only a fraction of bacterial acetylglutamate synthases, as already shown by the argO gene of Campylobacter (see above). In the absence of functional data, it is not possible to predict which of the many Gcn5-related N-acetyltransferases could play this role. In some cases, however, the genetic context indicates promising candidates. For example, in Xanthomonas campestris, a gene for a putative acetyltransferase of the GNAT family was identified within a cluster of arginine biosynthetic genes between argB and argC (17). However, although this gene has been called argA, it does not appear to be homologous to arg(A).
Feedback Inhibition of Ornithine Synthesis
The metabolic flow of acetylated ornithine precursors is regulated by feedback inhibition. In microorganisms using an acetylornithinase but no acetyl recycling enzyme, it is NAGS that catalyzes the first committed step of the pathway, and it is inhibited by arginine. When an OAT recycles the acetyl group, NAGK can become flow controlling; it is actually inhibited by arginine in many organisms, whether NAGS is present or not (11, 12, 59, 71; also see below). For example, in Corynebacterium glutamicum and T. maritima, which have no recognizable NAGS or S-NAGS but an OAT (monofunctional in the former and probably bifunctional in the latter, by analogy with T. neapolitana [43]), arginine inhibits NAGK (see references 26 and 63, correcting an earlier report [72] for T. maritima). In P. aeruginosa, where both NAGS and OAT are present, both NAGS and NAGK are inhibited by arginine (34).
However, OAT itself is also a “logical” target for efficient feedback control. In G. stearothermophilus, arginine does not inhibit NAGK (62), but ornithine strongly inhibits both activities of the bifunctional OAT (61). Arginine inhibits OAT in Thermus aquaticus (72).
The group of V. Rubio recently disclosed a structural basis for feedback inhibition of NAGS and NAGK by arginine (59). In T. maritima and P. aeruginosa, the arginine-sensitive NAGK is a ring-like homohexamer where arginine binds to each dimeric subunit at a site close to the interdimeric junction. In E. coli, the arginine-insensitive NAGK is a homodimer, whereas the arginine-sensitive NAGS is a homohexamer, as is NAGK in T. maritima and P. aeruginosa. Since NAGK is homologous to the N-terminal domain of NAGS, Ramon-Maiques et al. (59) were able to localize the molecular signature for arginine inhibition in that domain. However, Mycobacterium S-NAGS, where this domain is absent, is also inhibited by arginine (24). Both the arginine and glutamate binding sites thus remain to be identified in S-NAGS.
In the yeast Saccharomyces cerevisiae, both NAGS and NAGK are inhibited by arginine, but NAGS must be associated with NAGK to remain both functional and sensitive to arginine. Deletion of the NAGS gene decreases the sensitivity of the kinase, while making the kinase insensitive renders NAGS insensitive as well (1, 56). The situation appears to be similar in Neurospora crassa (83).
The Lysine Connection
In a seminal paper, Nishida et al. (54) described a novel pathway for lysine biosynthesis in T. thermophilus. As in fungi, this pathway uses aminoadipic acid (AAA) as an intermediate but, instead of proceeding to lysine via saccharopine, it converts AAA to lysine by a series of acetylated intermediates analogous to those leading from glutamate to ornithine (Fig. 1, steps 1 to 5) and differing by one CH2 group only. Thermus was known to have two unlinked genes annotated argC, one of them being part of an arginine operon (4), as well as two genes annotated argB, one argD homologue (alias lysJ) and one argE homologue (alias lysK). The latter two were shown to code for bifunctional enzymes, acting on substrates of both the new AAA pathway and ornithine biosynthesis (48, 49). In E. coli, N-acetylornithine transaminase (ArgD) and N-succinyl-L,L-diaminopimelate:alpha-ketoglutarate aminotransferase (DapC), an enzyme of lysine biosynthesis, are one and the same protein (40).
Of particular interest from the point of view of glutamate acetylation is the protein LysX, which was shown by genetic disruption to catalyze the synthesis of the first intermediate in the conversion of AAA to lysine in Thermus, plausibly N-acetyl-α-aminoadipate (54, 60). LysX enzymes form a subfamily inside the ATP-dependent carboxylate-amino ligase RimK family. Considering the chemical similarity between the substrates and products of the reactions catalyzed by NAGS or S-NAGS and LysX, respectively (even if the mechanisms of those reactions are different since the LysX-catalyzed reaction involves ATP), the question arises whether a LysX homologue could not be involved in glutamate acetylation in some organisms (Fig. 3). The substrate specificity of Thermus LysX remains to be investigated; disruption of the cognate gene does not bring about arginine auxotrophy, but acetylglutamate is probably synthesized by the short version of ArgA, coregulated with an argG-argH operon (81) (and not by ArgJ, as erroneously assumed in reference 54, since Thermus ArgJ is monofunctional [4]). LysX homologues have been detected in several organisms (mostly Archaea; see the next section), but their exact function remains to be defined. Some of them appear in a genetic context that suggests a possible involvement in arginine biosynthesis, such as in Chloroflexus aurantiacus, somewhat related to Thermus. In Lactobacillus plantarum (which has no argA gene), the presence of a LysX homologue closely linked to an argC argBE cluster, distinct from the bipolar carA argCJDBF operon (7), is intriguing (see Table 2).
Table 2. LysX homologues detected in Archaea and Bacteria and belonging to various subfamilies of the RimK family
GLUTAMATE ACETYLATION IN ARCHAEA
Much less information on glutamate acetylation is available for Archaea than for Bacteria. No classical NAGS gene was detected in the archaeal genomes sequenced so far, but argJ is present in several euryarchaeota, mainly methanogens and Archaeoglobus fulgidus (Table 1). The only archaeal OAT to have been studied enzymatically (in Methanocaldococcus jannaschii [43]) appears to be monofunctional, but screening the genome of this organism does not yield sequences homologous to the S-NAGS discussed above. The protein responsible for acetylglutamate synthesis in M. jannaschii thus remains elusive. On the other hand, several archaeal genomes (methanogens) that harbor a putative argJ gene also display a homologue of arg(A) encoding S-NAGS (Table 1). In the absence of enzymatic data on the cognate ArgJ proteins, it is not possible to speculate on the metabolic significance of these sequences. One thing is clear, however: the classical NAGS does not appear to belong to the archaeal patrimony.
Nishida et al. (54) opened a new perspective on arginine biosynthesis in Archaea in the paper discussed in the previous section, where they suggested that the novel AAA pathway for lysine biosynthesis discovered in the bacterium T. thermophilus might operate in the euryarchaeote Pyrococcus sp. to synthesize both lysine and ornithine. In fact, not only the pyrococci but also many euryarchaeota and crenarchaeota turned out to contain putative arginine genes (Table 2), from argB to argH (8, 14, 42). According to Nishida et al. (54), the genes annotated as argB, argC, argD, and argE might code for bifunctional proteins acting on molecules differing by one CH2 group only. The exact function of these genes is still uncertain, however; at least some of the genes of the diaminopimelate pathway for lysine biosynthesis appear to be present in the genomes of pyrococci (74). In Sulfolobus, lysine was found to regulate the synthesis of mRNA from a gene cluster annotated argC, argB (lysZ), lysM (lrp-like), lysW, lysX, argD (lysJ), and argE (lysK), but involvement of these genes in arginine biosynthesis could not be excluded, and no enzyme assays were carried out (8).
What concerns us here, however, is whether a homologue of LysX, identified in Thermus by genetic disruption as the first enzyme of the new AAA pathway, could be involved in glutamate acetylation in some Archaea (Fig. 3). LysX homologues have indeed been detected in several archaeal genomes in a genetic context that suggests a possible involvement in arginine biosynthesis (Table 2). For example, the complete sequences of the extreme haloalkaliphile Natronomonas pharaonis (25) and the halophile Haloarcula marismortui (5) show a near complete arg cluster from argB to argH. Apparently, argA and argJ homologues are missing, but between argH and argC, one finds a short gene encoding a putative protein and a lysX homologue called argX (annotated as a putative regulator for no clear reason in reference 25). Often, lysX homologues are adjacent to putative argC, argB, and argD genes, as in thermococci and Sulfolobales. In Haloarcula marismortui, as in N. pharaonis, the operon is almost complete with the structure argF (annotated arcB, normally reserved for catabolic ornithine carbamoyltransferase [OTC]), argE, argD, argB, argC, lysX/argX, unknown sequence, argH, and argG. Table 2 further shows that there are in archaeal methanogens other homologues belonging to the RimK family, such as members of the subfamilies MptN (tetrahydromethanopterin:α-L-glutamate ligase) and CofF (γ-F420-2:α-L-glutamate ligase). However, in both cases, no arg genes are found in the neighborhood of the encoding genes. Note that M. jannaschii has both mptN and cofF genes, but it is impossible with the present data to determine whether one of them could be involved in acetylglutamate synthesis.
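The genetic-context argument used here (a lysX homologue is a candidate when arg genes flank it, whereas mptN and cofF are not because no arg genes lie nearby) amounts to a simple neighborhood test on gene order. The sketch below illustrates that test on the H. marismortui operon order quoted above. The function name, the window size, and the rule of recognizing arg genes by the prefix of their annotation are assumptions made for illustration, not a description of how the original screening was performed.

```python
from typing import List

# Operon order for Haloarcula marismortui as given in the text.
H_MARISMORTUI_OPERON = [
    "argF", "argE", "argD", "argB", "argC",
    "lysX/argX", "unknown", "argH", "argG",
]


def arg_neighbors(gene_order: List[str], target: str, window: int = 2) -> List[str]:
    """Return arg-annotated genes lying within `window` positions of `target`."""
    idx = gene_order.index(target)  # raises ValueError if target is absent
    start = max(0, idx - window)
    stop = min(len(gene_order), idx + window + 1)
    return [
        gene
        for pos, gene in enumerate(gene_order[start:stop], start=start)
        if pos != idx and gene.startswith("arg")
    ]


print(arg_neighbors(H_MARISMORTUI_OPERON, "lysX/argX"))
# ['argB', 'argC', 'argH']
```

A non-empty result only flags a candidate for experimental follow-up; as the text stresses, genetic context alone does not establish that a LysX homologue acetylates glutamate.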
The pattern of glutamate acetylation in Archaea may thus turn out to be as varied as it is in Bacteria: Arg(A)-like polypeptides (S-NAGS), other putative acetyltransferases, bifunctional OATs (yet to be identified in this domain, however), and LysX homologues are candidates for this metabolic function (Fig. 3).
THE FATE OF ACETYLATED PRECURSORS BEYOND ACETYLORNITHINE
Our textbook vision of arginine biosynthesis beyond acetylornithine was shattered by the recent discovery of acetylornithine carbamoyltransferase (AOTC) (ArgF′) in the γ-proteobacterium Xanthomonas campestris (50, 65). AOTC catalyzes the formation of acetylcitrulline from acetylornithine but not that of citrulline from ornithine (Fig. 1, Box 1). The enzyme ArgF′ (EC 2.1.3.9) is a distant homologue of the classical OTC (EC 2.1.3.3) but lacks an OTC-specific motif, the SMG motif, and the substrate-binding mechanism of this enzyme appears to be different from that of both ornithine and aspartate carbamoyltransferases (68). X. campestris AOTC is presently the only carbamoyltransferase known to catalyze this reaction in the pure form, but genes similar to argF′, also lacking the SMG motif, were identified in other species: X. axonopodis, Xylella fastidiosa, Bacteroides thetaiotaomicron, Cytophaga hutchinsonii, Tannerella forsythensis, and Prevotella ruminicola (50). Obviously, if acetylcitrulline is formed instead of citrulline, a downstream acetylated compound has to be deacetylated in order to provide arginine as an end product for protein synthesis; Shi et al. (66) actually purified from X. campestris a deacetylase active on acetylcitrulline. Morizono et al. (50) showed that this enzyme, encoded by a gene formerly annotated argE because of its similarity with the cognate E. coli gene, is in fact not a novel deacetylase specific for acetylcitrulline, as postulated by Shi et al. (66), but is active with acetylornithine as well. Moreover, since argF′ complements an OTC-deficient E. coli mutant but the cognate protein does not carbamoylate ornithine in vitro (66), it is likely that E. coli ArgE itself also deacetylates acetylcitrulline (Fig. 1, Box 1). Acetylornithinases are known to be active with a number of acetylated amino acids (see reference 12).
B. fragilis argF′ was shown to be essential to its host, but despite its apparently close relationship to the X. campestris homologue, no AOTC activity could be detected in vitro (50, 64). Very recently, this B. fragilis ArgF′ protein was crystallized and shown to be a novel N-succinyl-L-ornithine transcarbamylase (SOTC) involved in the arginine biosynthesis of this bacterium (67). Since the disruption of the encoding gene, argF′Bf, renders B. fragilis auxotrophic for arginine, it appears that the arginine pathway in Bacteroides is different from the canonical pathway of most organisms (67). Interestingly, a single mutation converts B. fragilis SOTC into an AOTC (67). The occurrence of succinylornithine as an intermediate of arginine catabolism in certain Bacteria is well known (41, 73), but no carbamoyltransferase using this substrate had been reported until now.
The fact that X. campestris acetylcitrulline deacetylase is also active on acetylornithine should be emphasized, since a metabolism bypassing ornithine would probably be deficient; indeed, ornithine itself is a pivotal metabolite (Fig. 5). It is required for incorporation in hydroxamate siderophores (77) and in lipids found in abundance in Bacteria, some of which are related to species utilizing the AOTC pathway (16, 28). It was even suggested that these lipids may play an important role in organisms pathogenic for eukaryotes; this is precisely the case for Xylella, Xanthomonas, Bacteroides, and Tannerella. Moreover, under certain circumstances, ornithine may become the main source of putrescine, a precursor of polyamines, which are vital metabolites (see reference 12). It can be noted that, in principle, an OAT presenting the broad substrate specificity pattern observed in G. stearothermophilus (76) could play the same metabolic role as AO in organisms using an AOTC if it proved to be active on acetylcitrulline.
Fig. 5. Ornithine is a pivotal compound in cell metabolism. Arginine biosynthesis and known arginine catabolic pathways are outlined. The name of each pathway, framed with the type of line used to draw it, is identified by the enzyme initiating the pathway. Only key intermediates are mentioned, including the six main products deriving from ornithine. cp, carbamoylphosphate. For a recent review of arginine metabolism, see reference 41.
CONCLUDING REMARKS ON THE EVOLUTION OF ARGININE BIOSYNTHESIS
Recent discoveries have led to a reappraisal of glutamate acetylation in microorganisms. From established functional evidence, at least three options for this metabolic step exist: the classical NAGS, the so-called S-NAGS (fused with ArgH or independent), and the bifunctional OAT (Fig. 3). S-NAGSs belong to the vast GNAT family but are not all homologous to the C-terminal domain of classical NAGS (Fig. 3). In some Bacteria or in Archaea, other, as-yet-undetected acetyltransferases may be responsible for the synthesis of acetylglutamate.
Classical two-domain NAGS does not appear to be the ancestral glutamate acetylating enzyme; it occurs in Proteobacteria but is absent from many major divisions of Bacteria, including those proposed to branch off early in different phylogenies (Thermotoga, Aquifex, or Planctomycetes [Table 1; also see below]). NAGS has not been found in Archaea either. Owing to its wide distribution, including early branching bacterial divisions, OAT appears to have preceded classical NAGS, but it is not known whether the ancestral OAT was bifunctional, a prerequisite to qualify as putative primeval glutamate acetylase. On the other hand, the short, Arg(A)-like precursor of NAGS found in several Bacteria and Archaea (Table 1) is probably related to an ancestral acetyltransferase that was present in the last universal common ancestor (LUCA). It may have preceded OAT in the role of acetylglutamate synthase. In the bacterial domain, this short acetyltransferase would have given rise to the bimodular, classical NAGS by fusion with NAGK (Fig. 3). In members of the Alteromonas-Vibrio group, it would have become fused with argH either directly, as originally proposed (see Fig. 7 in reference 81), or, perhaps more likely, as the result of a deletion occurring in an organism carrying an argA gene close to argH, as the comparison of Colwellia and two different Pseudoalteromonas spp. suggests (Fig. 4). Indeed, in P. atlantica, argH is followed by a complete argA, but in P. haloplanktis, only argH(A) is present; in Colwellia, on the other hand, argH(A) is closely followed by a complete argA. It is conceivable that the P. haloplanktis configuration results from a deletion joining argH to the C-terminal domain of an adjacent argA, whereas the Colwellia configuration could result from the insertion in the chromosome of a circularized argA (resident or acquired horizontally) by a crossover between argH(A) and argA at the level of the argA C-terminal domain. Alternatively, argA could have been duplicated in tandem in the immediate neighborhood of argH before the fusion occurred. At any rate, these events would be ancient, since the two NAT domains present in Colwellia are distantly related (they have only 26% identity).
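The 26% identity quoted above for the two Colwellia NAT domains is a pairwise statistic derived from a sequence alignment. As a minimal sketch of how such a value can be obtained, assuming two sequences that have already been aligned, identity may be counted over the ungapped columns. The function name `percent_identity` and the toy sequences are hypothetical, and the choice of denominator is an assumption; the source does not state which convention was used.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: the two strings come from an existing pairwise alignment,
    so they have equal length and use `gap` as the gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0   # columns where both sequences carry a residue
    identical = 0  # compared columns where the residues match

    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a.upper() == res_b.upper():
            identical += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment, not real NAT domain sequences.
domain_1 = "MKV-LAGE"
domain_2 = "MRVQLSGD"
print(round(percent_identity(domain_1, domain_2), 1))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so values reported by different tools are not always directly comparable.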
In many other instances, S-NAGS would have remained a single gene product (Fig. 3), sometimes encoded in an arginine operon-like gene cluster (as in Thermus and Deinococcus), but association of the protein with NAGK, which would provide a glutamate binding site, appears likely. These events may have occurred in organisms having lost an ancestral bifunctional OAT or originally devoid of such an enzyme. The origin of NAGS in fungi is not clear, but as acetyl-CoA is a poor substrate for fungal OAT, at least in yeast (15), efficient glutamate acetylation may have necessitated association of NAGS with NAGK, though not as a covalent complex. In vertebrates, the bimodular structure of NAGS (51) is reminiscent of that of bacterial NAGS and may point to a bacterial origin. In plants, the similarity with classical prokaryotic NAGS is even stronger (69).
When attempting to correlate the distribution of known and putative glutamate acetylases with the branching pattern of the bacterial subdivisions, one is confronted at the outset with a problem, namely, that this pattern is very deep and therefore remains controversial: whereas some authors still consider extreme thermophilic Bacteria (Aquificales, Thermotogales) to be the earliest, deepest branches of the bacterial tree, others have emphasized the possibility that in Bacteria extreme thermophily may have been acquired by convergence (79). In keeping with this view, a reappraisal of bacterial phylogeny placed the essentially nonthermophilic Planctomyces spp. at the base of the tree, with the additional, intriguing feature that these organisms appear to possess the equivalent of a nucleus (9, 27). The controversy over the first lines of divergence in the bacterial domain is ongoing (6, 13, 18, 20, 30, 39, 80), and at present, the distribution of known glutamate-acetylating enzymes is compatible with either scheme. Advocates of the ancestry of extreme thermophilic Bacteria might invoke the bifunctionality of OAT in Thermotoga, which has no NAGS, to consider it primeval. On the other hand, there is an OAT in nonthermophilic Planctomyces but its functional pattern is not known; Planctomyces has an arg(A)-like sequence as well and no NAGS, but this is also true of the extreme thermophilic bacterium Aquifex (81).
Finally, it should be emphasized that any attempt at drawing a comprehensive evolutionary history of the early steps in the arginine pathway will have to integrate functional information about lysX. Its presence in several organisms (many Archaea) close to genes annotated as arg determinants (Fig. 3 and Table 2) may indeed suggest a primeval function for LysX, perhaps already in the LUCA. The LUCA was probably a genetically redundant and promiscuous community (29, 38, 78, 80). Several alternatives for glutamate acetylation may therefore have been already present in the LUCA population.
Another conclusion of this review is that in many Bacteria, OAT (ArgJ) appears to have predated AO (ArgE), since the latter emerges among Proteobacteria; it may have been recruited from a pool of deac(et)ylases, perhaps after the loss of OAT or independently, or it may even have been acquired laterally. The traditional view that OAT is “more evolved” than AO should therefore be qualified from the chronological point of view.
The presence of an AO in organisms endowed with an AOTC (Fig. 1) rather than an OTC may, however, have a deeper origin than in other bacterial subdivisions. Indeed, the carbamoyltransferase phylogenetic tree suggests that X. campestris AOTC and the related sequences form a family that arose very early in the course of evolution (52), perhaps even before OTC, since AOTC-like proteins branch off close to the root of the tree (see Fig. 1 in reference 52). If primordial ornithine and lysine biosynthesis proceeded from aminoadipate and glutamate by the bifunctional AAA/ornithine pathway postulated by Nishida et al. (54), N-acetyllysine and N-acetylornithine would constitute the penultimate intermediates of the pathway; a deacetylase of this ancestral pathway (ArgE) would be required to produce both lysine and ornithine but also to deacetylate the acetylcitrulline that would be produced if the primordial carbamoyltransferase was an AOTC rather than an OTC. In this view, OTC could have evolved from AOTC. Today, AOTC-like sequences occur mostly in plant pathogens (Xanthomonas, Xylella) and in intestinal or mouth commensals (Bacteroides, Tannerella), which, at first sight, could suggest that AOTC evolved from OTC, perhaps to escape the action of OTC-inhibiting substances produced by the host or by the bacterium itself, as in the case of Pseudomonas syringae ArgK, an OTC insensitive to the phaseolotoxin produced by the bacterium (57). However, AOTC may have been reacquired in these pathogens from an ancestral state of arginine biosynthesis surviving in some cell lineages. Further investigations on the phylogeny of carbamoyltransferases and deacetylases should throw some light on this question. It is worth pointing out that the transacetylation rates of G. stearothermophilus OAT for both lysine and ornithine are very similar (76). Therefore, in principle at least, the deacetylase required to deliver both lysine and ornithine from a putative AAA/ornithine pathway could be replaced in some organisms by an OAT with broad substrate specificity.
This hypothesis of an ancestral AAA/ornithine pathway could have another virtue: it contrasts with the idea that the formation of acetylated arginine precursors could have arisen by recruiting enzymes from the classical, non-acetylated proline biosynthetic pathway. If this had occurred in the LUCA, we would expect the proline pathway to be present in Archaea; in fact, it is mostly absent, except in a few methanogens and in N. pharaonis (where it might constitute a late acquisition), whereas most Archaea appear to synthesize proline by cyclization of ornithine (32). The latter may have derived from an ancestral AAA/ornithine pathway; being a source of proline may therefore have been an essential function of ornithine very early in metabolic evolution.
OUTLOOK
The novel information discussed in this review raises a number of questions that further genomic research and enzyme characterization could resolve. Whether S-NAGS associates in vivo with other glutamate-binding proteins to utilize glutamate efficiently is one of them. How the activity of S-NAGS is feedback inhibited by arginine is another. The enzymology and structure of the ArgH(A) enzyme are a new topic which is presently being explored; the presence of this protein in human pathogens (V. parahaemolyticus and V. vulnificus) and perhaps in pathogens of fishes or other marine organisms may lead to medical and ecological applications. The functional annotation of most OATs requires substantiation; as many organisms possessing an OAT also feature an S-NAGS gene, identifying the protein actually responsible for glutamate acetylation in vivo will require complementation tests with E. coli argA and argE mutants as well as biochemical analyses. OAT structural studies could also identify the possible “signature” of bifunctional OATs. Still other questions are as follows. Which enzymes are responsible for glutamate acetylation in microorganisms where no short or classical NAGS could be identified and where OAT, when present, does not use acetyl-CoA as a substrate? How ancient is the AOTC pathway, and was it already present in the LUCA? The perspectives opened by comparing arginine biosynthesis with the new lysine biosynthetic pathway discovered in Thermus are fascinating. The new phylogeny proposed for Bacteria, with Planctomyces as the earliest possible branch, close to the LUCA, enhances the interest of these investigations.
ACKNOWLEDGMENTS
N.G. is grateful to G. T. Taylor for his hospitality at the Marine Sciences Research Center of Stony Brook University during the preparation of the manuscript.
B.L. thanks the CNRS and ANR for financial support.
- American Society for Microbiology