SUMMARY
A comprehensive classification system for transmembrane molecular transporters has been developed and recently approved by the transport panel of the nomenclature committee of the International Union of Biochemistry and Molecular Biology. This system is based on (i) transporter class and subclass (mode of transport and energy coupling mechanism), (ii) protein phylogenetic family and subfamily, and (iii) substrate specificity. Almost all of the more than 250 identified families of transporters include members that function exclusively in transport. Channels (115 families), secondary active transporters (uniporters, symporters, and antiporters) (78 families), primary active transporters (23 families), group translocators (6 families), and transport proteins of ill-defined function or of unknown mechanism (51 families) constitute distinct categories. Transport mode and energy coupling prove to be relatively immutable characteristics and therefore provide primary bases for classification. Phylogenetic grouping reflects structure, function, mechanism, and often substrate specificity and therefore provides a reliable secondary basis for classification. Substrate specificity and polarity of transport prove to be more readily altered during evolutionary history and therefore provide a tertiary basis for classification. With very few exceptions, a phylogenetic family of transporters includes members that function by a single transport mode and energy coupling mechanism, although a variety of substrates may be transported, sometimes with either inwardly or outwardly directed polarity. In this review, I provide cross-referencing of well-characterized constituent transporters according to (i) transport mode, (ii) energy coupling mechanism, (iii) phylogenetic grouping, and (iv) substrates transported. The structural features and distribution of recognized family members throughout the living world are also evaluated. 
The tabulations should facilitate familial and functional assignments of newly sequenced transport proteins that will result from future genome sequencing projects.
“To know truly is to know by causes.” Francis Bacon
“To me life consists simply in this, in the fluctuation between two poles, in the hither and thither between the two foundation pillars of the world.” Hermann Hesse
Transport systems serve the cell in numerous capacities (118-123). First, they allow entry of all essential nutrients into the cytoplasmic compartment and subsequently into organelles, allowing metabolism of exogenous sources of carbon, nitrogen, sulfur, and phosphorus. Second, they provide a means for the regulation of metabolite concentrations by catalyzing the excretion of end products of metabolic pathways from organelles and cells. Third, they mediate the active extrusion of drugs and other toxic substances from either the cytoplasm or the plasma membrane. Fourth, they mediate uptake and efflux of ionic species that must be maintained at concentrations that differ drastically from those in the external milieu. The maintenance of conditions conducive to life requires a membrane potential, requisite ion concentration gradients, and appropriate cytoplasmic concentrations of all essential trace minerals that participate as cofactors in metabolic processes. Such conditions are required for the generation of bioelectricity as well as for the maintenance of enzymatic activities. Fifth, transporters participate in the secretion of proteins, complex carbohydrates, and lipids into and beyond the cytoplasmic membrane, and these macromolecules serve a variety of biologically important roles in protection against environmental insult and predation, in communication with members of the same and different species, and in pathogenesis. Sixth, transport systems allow the transfer of nucleic acids across cell membranes, allowing genetic exchange between organisms and thereby promoting species diversification. Seventh, transporters facilitate the uptake and release of pheromones, alarmones, hormones, neurotransmitters, and a variety of other signaling molecules that allow a cell to participate in the biological experience of multicellularity. 
Finally, transport proteins allow living organisms to conduct biological warfare, secreting, for example, antibiotics, antiviral agents, antifungal agents, and toxins of humans and other animals that may confer upon the organism producing such an agent a selective advantage for survival purposes. Many of these toxins are themselves channel-forming proteins or peptides that serve a cell-disruptive transport function. Thus, from a functional standpoint, the importance of molecular transport to all facets of life cannot be overestimated.
The importance of transport processes to biological systems was recognized more than half a century ago (43, 82). Thanks largely to concerted efforts on the part of Jacques Monod and his coworkers at the Pasteur Institute in Paris, who studied the mechanism of action of the Escherichia coli lactose permease, the involvement of specific carrier proteins in transport became established (22, 113). Since these early studies, tremendous progress has been made in understanding the molecular bases of transport phenomena, and the E. coli lactose permease has frequently been at the forefront (45, 60, 143). Initially, transport processes were characterized from physiological standpoints using intact cells. Cell “ghosts” in which the cytoplasmic contents had been released by osmotic shock proved useful, particularly as applied to human red blood cells and later to bacteria. Work with such systems provided detailed kinetic descriptions of transport processes, and by analogy with chemical reactions catalyzed by enzymes, the proteinaceous nature of all types of permeases became firmly established (reviewed by Kaback [58]).
With the advent of gene-sequencing technologies, the primary structures of permeases first became available. Hydrophobicity analyses of these sequences revealed the strikingly hydrophobic nature of various types of integral membrane transporters (19, 68, 70, 95). Current multidisciplinary approaches are slowly yielding three-dimensional structural information about transport systems. However, since only a few such systems have yielded to X-ray crystallographic analyses (see, for example, references 26, 140, and 142 as well as Table 21 below), we still base our views of solute transport on molecular models that provide reasonable pictures of transport systems and the processes they catalyze without providing absolute assurance of accuracy (45, 59, 143).
It is well recognized that any two proteins that can be shown to be homologous (i.e., that exhibit sufficient primary and/or secondary structural similarity to establish that they arose from a common evolutionary ancestor) will in general prove to exhibit strikingly similar three-dimensional structures (32), although a few exceptions have been noted (127). Furthermore, the degree of tertiary structural similarity correlates well with the degree of primary structural similarity. For this reason, phylogenetic analyses allow application of modeling techniques to a large number of related proteins and additionally allow reliable extrapolation from one protein member of a family of known structure to others of unknown structure. Thus, once three-dimensional structural data are available for any one family member, these data can be applied to all other members within limits dictated by their degrees of sequence similarity. The same cannot be assumed for members of two independently evolving families or for any two proteins for which common descent has not been established.
Similar arguments apply to mechanistic considerations. Thus, the mechanism of solute transport is likely to be similar for all members of a permease family, and variations on a specific mechanistic theme will be greatest when the sequence divergence is greatest. By contrast, for members of any two independently evolving permease families, the transport mechanisms may be strikingly different. Knowledge of these considerations allows unified mechanistic deductive approaches to be correctly applied to the largest numbers of transport systems, even when evidence is obtained piecemeal from the study of different systems.
The capacity to deduce and extrapolate structural and mechanistic information illustrates the value of phylogenetic data. However, another benefit that may result from the study of molecular phylogeny is to allow an understanding of the mechanistic restrictions that were imposed upon an evolving family due to architectural constraints. Specific architectural features may allow one family to diversify in function with respect to substrate specificity, substrate affinity, velocity of transport, polarity of transport, and even mechanism of energy coupling. By contrast, the architectural constraints imposed on a second family may not allow functional diversification. Knowledge of the architectural constraints imposed on a permease family provides a clear clue as to the reliability of functional predictions for uncharacterized but related gene products revealed, for example, by genome sequencing. Conversely, the functional diversity of the members of a permease family must be assumed to reflect architectural constraints, and thus phylogenetic and functional analyses lead to architectural predictions.
Finally, phylogenetic analyses provide valuable information about the evolutionary process itself. One can sometimes glean clues regarding the time of appearance of a family, the organismal type in which the family arose, and the pathway taken for the emergence of the family during evolutionary history. Occasionally, it is also possible to ascertain whether or not two distinct families arose independently of each other.
Over the past decade, my laboratory has devoted considerable effort to the phylogenetic characterization of permease families (118-120). This work has led us to formulate a novel classification system superficially similar to that implemented years ago for enzymes by the Enzyme Commission. The transporter classification (TC) system has been reviewed and recommended for adoption by a panel of experts chaired by A. Kotyk of the International Union of Biochemistry and Molecular Biology (IUBMB). In contrast to the Enzyme Commission, which based its classification system solely on function, we have chosen to classify permeases on the basis of both function and phylogeny. In this review, I describe our proposal, point out some of its strengths, and emphasize its flexibility for the future inclusion of yet-to-be-discovered transporters. We hope that the TC system will prove to be as useful as the enzyme classification system. Earlier treatises concerning the TC system and transport protein evolution have appeared (121-123, 127).
A detailed description of the TC system can be found on our World Wide Web site (http://www-biology.ucsd.edu/~msaier/transport/). This site will be continuously updated as new relevant physiological, biochemical, genetic, biophysical, and sequence data become available. Thanks to the participation of Andrei Lupas and the SmithKline-Beecham bioinformatics group (5), the TC system is being automated so that new sequences will automatically appear in multiple alignments and phylogenetic trees with minimal human intervention. The system will also provide a user-friendly search tool, called TransBase, so that the TC system can be readily accessed by keyword, TC number, gene name, protein name, sequence, and sequence motif. These advances will render the TC system increasingly accessible to the entire scientific community worldwide. In return, members of the scientific community are strongly encouraged to communicate novel findings and corrections to me by E-mail, phone, fax, or snail mail.
TRANSPORT NOMENCLATURE
Communication of concepts relevant to transmembrane transport phenomena generally depends upon the use of a uniform, well-defined and accepted, universally understood set of terms that can be used by the international community of scientists regardless of national origin or discipline of training. In this section I therefore present the terms currently in use in the field and mention which of these terms have been recommended for adoption by the TC panel of the IUBMB. It is anticipated that the acceptance of these terms will greatly facilitate the interchange of information by scientists and students of transport internationally.
Almost all transmembrane transport processes are mediated by integral membrane proteins, sometimes functioning in conjunction with extracytoplasmic receptors or receptor domains as well as with cytoplasmic energy-coupling and regulatory proteins or protein domains (51, 112, 130, 139). Each such complex of these proteins and/or protein domains is referred to as a transport system, transporter, porter, permease system, or permease. These are all equivalent terms that are in general use by members of the transport community. A permease (porter) is a protein or protein complex that catalyzes a vectorial reaction, irrespective of whether or not it also catalyzes a chemical or electron transfer reaction that drives the vectorial process. Thus, many transport systems can be thought of as catalytic proteins or protein complexes analogous to enzymes or enzyme complexes. By definition, transporters facilitate vectorial rather than, or in addition to, chemical reactions. The preferred terms for these transport systems are transporters or porters.
Permease-mediated transport can occur by any one of three distinct but related processes. First and simplest is facilitated, equilibrative, or protein-mediated diffusion, a process that is not coupled to metabolic energy and therefore cannot give rise to concentration gradients of the transported substrate across the membrane. Two primary modes of facilitated transport have been recognized in biological systems: channel type and carrier type (Fig. 1). In channel-type facilitated diffusion, the solute passes in a diffusion-limiting process from one side of the membrane to the other via a channel or pore that is lined by appropriately hydrophilic (for hydrophilic substrates), hydrophobic (for hydrophobic substrates), or amphipathic (for amphipathic substrates) amino acyl residue moieties of the constituent protein(s). The structures of several such channel proteins have now been examined and elucidated by X-ray crystallographic techniques (see below). In carrier-type facilitated diffusion, some part of the transporter is classically presumed to pass through the membrane together with the substrate (143, 151). Whether or not this presumption is correct is not known, as no classical carrier has yet yielded to the analytical tools of the X-ray crystallographer.
FIG. 1. Scheme illustrating the currently recognized primary types of transporters found in nature. These proteins are initially divided into channels and carriers. Channels are subdivided into α-helical protein channels, β-barrel protein porins (mostly in the outer membranes of gram-negative bacteria and eukaryotic organelles), toxin channels, and peptide channels. Carriers are subdivided into primary active carriers, secondary active carriers (including uniporters), and group translocators that modify their substrates during transport. Primary sources of chemical energy that can be coupled to transport include pyrophosphate bond (i.e., ATP) hydrolysis, decarboxylation, and methyl transfer. Oxidation-reduction reactions, light absorption, and mechanical devices can also be coupled to transport (see text). Secondary active transport is driven by ion and other solute (electro)chemical gradients created by primary active transport systems. The only well-established group-translocating system found in nature is the bacterial phosphoenolpyruvate:sugar PTS, which phosphorylates its sugar substrates during transport.
Carriers usually exhibit rates of transport that are several orders of magnitude lower than those of channels. Moreover, in contrast to most channels, they exhibit stereospecific substrate specificities. Although both channels and carriers may exhibit the phenomenon of saturation kinetics, this is a more common characteristic of carriers. Very few carriers have been shown to be capable of functioning by a channel-type mechanism, and the few that exhibit this capacity generally do so only after the protein has been modified, either by covalent or noncovalent ligand binding or by imposition of a large membrane potential. Moreover, while most channels are oligomeric complexes, many carriers can function as monomeric proteins. These observations led to the suggestion that channels and carriers are fundamentally, not superficially, different.
If energy expenditure is coupled to transmembrane solute translocation, then a system catalyzing facilitated diffusion can become an active transporter. Such a system is considered to be a primary active transporter if a primary source of energy (i.e., a chemical reaction, light absorption, or electron flow) is coupled to the process. It is considered to be a secondary active transporter if a secondary source of energy (i.e., an ion electrochemical gradient, termed the proton motive force [PMF] in the case of protons or the sodium motive force [SMF] in the case of sodium ions), generated at the expense of a primary energy source, is coupled to the process. The transport panel considered all of these terms to be acceptable.
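The distinction drawn in this paragraph can be restated as a small decision rule. The following Python sketch is offered only as an illustration; the function name and the string labels for the energy sources are hypothetical choices made for this example and are not part of the TC system or of any published software.

```python
from typing import Optional

# Energy sources named in the text. The exact string labels are
# illustrative assumptions chosen for this sketch.
PRIMARY_SOURCES = {"chemical reaction", "light absorption", "electron flow"}
SECONDARY_SOURCES = {"proton motive force", "sodium motive force"}


def classify_energy_coupling(energy_source: Optional[str]) -> str:
    """Return the transport category implied by the coupled energy source.

    None means that no energy source is coupled to translocation.
    """
    if energy_source is None:
        return "facilitated diffusion"
    source = energy_source.strip().lower()
    if source in PRIMARY_SOURCES:
        return "primary active transport"
    if source in SECONDARY_SOURCES:
        return "secondary active transport"
    raise ValueError(f"unrecognized energy source: {energy_source!r}")
```

For example, a system coupled to the PMF would be reported as a secondary active transporter, while the same system with no coupled energy source would be reported as catalyzing facilitated diffusion.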
Active transporters (or porters) can function by uniport, symport, or antiport. Uniporters (the preferred term), also called single-species transporters or facilitated diffusion carriers (the less-preferred terms), catalyze the transport of a single molecular species, and transport therefore occurs independently of the movement of other molecular species. Symporters (the preferred term), also classically called cotransporters, catalyze the transport of two or more molecular species in the same direction. The fact that a single point mutation in a symporter can convert a carrier into a uniporter (41, 62, 66, 75, 147) emphasizes the superficial distinction between these two types of carriers. Antiporters (the preferred term), also called countertransporters, exchange transporters, and exchangers, catalyze the exchange of one or more molecular species for another. Antiport processes can be subdivided into two categories: antiport of like molecules (i.e., solute-solute antiport) and antiport of unlike molecules (i.e., solute-cation antiport). Many uniporters and symporters also catalyze solute-solute antiport, sometimes at rates that are substantially greater than those of uniport or symport. Some carriers catalyze solute-solute antiport at rates that exceed those of uniport or symport by 10³- to 10⁵-fold, and uniport via these carriers is of little or no physiological consequence (110). Such systems are said to be obligatory antiporters or exchangers.
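The three modes defined above depend only on how many molecular species move and in which directions. A minimal Python sketch of that rule follows; the function name, the "in"/"out" direction labels, and the species names used in the usage note are assumptions made for illustration and do not describe any particular characterized transporter.

```python
from typing import Iterable, Tuple


def classify_porter_mode(fluxes: Iterable[Tuple[str, str]]) -> str:
    """Classify a coupled transport event as uniport, symport, or antiport.

    `fluxes` is a collection of (species, direction) pairs, where direction
    is "in" or "out". Repeated identical pairs are counted once.
    """
    distinct = set(fluxes)
    if not distinct:
        raise ValueError("at least one transported species is required")
    directions = {direction for _, direction in distinct}
    if not directions <= {"in", "out"}:
        raise ValueError("direction must be 'in' or 'out'")
    if len(distinct) == 1:
        return "uniport"   # a single species moving in one direction
    if len(directions) == 1:
        return "symport"   # two or more species moving in the same direction
    return "antiport"      # species moving in opposite directions
```

Under these assumptions, `[("sugar", "in"), ("H+", "in")]` is reported as symport, and `[("solute", "in"), ("solute", "out")]` is reported as antiport, which corresponds to the solute-solute antiport of like molecules described in the text.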
Accelerative solute-solute antiport or countertransport has long been considered to be a diagnostic characteristic of carriers. Early transport kineticists concluded that its demonstration eliminated the possibility that a transporter functions by a channel-type mechanism and suggested that clear boundaries exist between carriers and channels (79, 135). Subsequent observations that certain “carriers” could apparently be converted into “channels” by chemical treatment (16, 17, 28, 29, 56), by imposition of large membrane potentials (131, 132, 149), or by ligand binding (13) led many students of transport to consider these boundaries indistinct. Our in silico phylogenetic and protein structural analyses suggest that these examples may be special cases and tend to reemphasize the importance of the channel-versus-carrier distinction (123, 127).
A few carriers modify their substrates during transport. The best-characterized such system is the bacterial phosphotransferase system (PTS), which phosphorylates its sugar substrates using phosphoenolpyruvate as the phosphoryl donor. Sugars taken up from the external milieu via the PTS are thus released into the cytoplasm as sugar-phosphates. Any process in which the substrate is modified during transport is termed group translocation. Although originally proposed in different form by Peter Mitchell as a general mechanism, its occurrence appears to be highly restricted in nature.
CONSIDERATIONS FOR THE SYSTEMATIC CLASSIFICATION OF TRANSMEMBRANE SOLUTE TRANSPORTERS
The introduction by Linnaeus of a universal classification system for living organisms allowed the rationalization of the tremendous complexity of biological relationships into an evolutionary framework. Similarly, the introduction by the international Enzyme Commission of a universal enzyme classification system greatly advanced our understanding of the functional relationships of these proteins. Although protein-domain classification systems have been suggested, no comparable classification system has yet been proposed for proteins that catalyze vectorial reactions rather than (or in addition to) chemical reactions. In this section I describe the proposal for a universal system of classification for transporters based on both function and phylogeny.
As noted above, enzymes have long been classified in accordance with the directives and recommendations of the Enzyme Commission (31). The commission developed its directives decades ago, long before protein sequence data became available. Their system of classification was based solely on function. It was tacitly assumed that proteins of similar catalytic function would be closely related and that they therefore should be grouped together. We now know, however, that two different enzymes catalyzing exactly the same reaction sometimes exhibit completely different amino acid sequences and three-dimensional structures, function by entirely different mechanisms, and apparently evolved independently of each other, converging only with respect to the final reactions catalyzed. The enzyme classification system is thus limited in that it reflects only the reactions catalyzed by and the substrate specificities of the enzymes. It does not recognize the phylogenetic origins of these proteins and therefore does not reflect structural or mechanistic features.
As has been extensively documented, molecular phylogeny provides a reliable guide to protein structure and mechanism of action. It also provides an indication (albeit less definitive) of the specific process catalyzed and the substrate acted upon (127). Since the former characteristics are fundamental traits of a protein while the latter characteristics are more superficial traits, sometimes merely reflective of single amino acyl residue changes in a protein, it would be reasonable to suggest that as more and more sequence and phylogenetic data become available, these data should be used to provide the most reliable basis for protein classification. Since single amino acyl residue substitutions in permeases can give rise to different substrate-binding specificities (15, 23, 44, 94), these characteristics should be used in the final level of classification rather than in a primary level. We conclude that recognition of the evolutionary process provides a reliable guide to structure, mechanism, and function, although a few exceptions may exist (102, 127). If molecular phylogenetic studies can accurately retrace the evolutionary process, they should be used as a basis for any rational system of protein classification.
Some of the enzymes classified within the enzyme classification system are asymmetrically situated within an anisotropic, hydrophobic lipid membrane that separates two aqueous compartments. The resultant asymmetry allows these enzymes to catalyze vectorial as well as chemical modification reactions, as clearly enunciated decades ago by Peter Mitchell (83-85). Some of these integral membrane enzymes do, in fact, catalyze transmembrane transport of ions or other small solutes. However, most currently recognized solute permeases do not catalyze a chemical reaction and consequently are not included within the enzyme classification system. The comprehensive system of permease classification proposed here has the potential to encompass all types of transporters, both those that are currently recognized and those that are yet to be discovered.
THE TC SYSTEM
Early studies revealed that transport proteins could be grouped into families based exclusively on the degrees of similarity observed for their amino acid sequences (118). The significance of family assignment remained questionable until the study of internal gene duplications that had occurred during the evolution of some of these families established that these families had arisen independently of each other, at different times in evolutionary history, following different routes (119). In this section I will evaluate and utilize both function and molecular phylogeny for the purpose of conceptualizing transport protein characterization and classification (see also reference 120).
According to the proposed classification system, now recommended by the transport nomenclature panel of the IUBMB, transporters are grouped on the basis of five criteria, and each of these criteria corresponds to one of the five entries within the TC number for a particular permease. Thus, a permease-specific TC number has five components, V, W, X, Y, and Z. V corresponds to the transporter class, while W corresponds to the subclass (see Table 1). X specifies the permease family (or superfamily), while Y represents the subfamily in a family (or the family in a superfamily) in which a particular permease is found. Finally, Z delineates the substrate or range of substrates transported as well as the polarity of transport (in or out). Any two transport proteins in the same subfamily of a permease family that transport the same substrate(s) using the same mechanism are given the same TC number, regardless of whether they are orthologs (i.e., arose in distinct organisms by speciation) or paralogs (i.e., arose within a single organism by gene duplication). The mode of regulation proves not to correlate with phylogeny and was probably superimposed on permeases late in the evolutionary process. Regulation is therefore not used as a basis for classification.
TABLE 1. Classes and subclasses of transporters in the TC system
There are four recognized classes of transporters: channels, porters, primary active transporters, and group translocators (Table 1). Sequenced homologs of unknown function or mechanism and functionally characterized permeases for which sequence data are not available are included in a distinct class, class 9. Deficiencies in our knowledge will presumably be eliminated with time as more sequenced permeases become characterized biochemically and as sequences become available for the functionally but not molecularly characterized permeases. One additional class (class 8) is reserved for auxiliary transport proteins. It should be noted that each subclass of transporters has a two-digit TC number (V.W); each family has a three-digit TC number (V.W.X); each subfamily has a four-digit TC number (V.W.X.Y); and each permease type has a five-digit TC number (V.W.X.Y.Z).
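The hierarchical structure of a TC number lends itself to simple programmatic handling. The Python sketch below parses a five-component identifier and reports the subclass, family, subfamily, and permease-type prefixes. It assumes that V is written as a digit and W as a capital letter (as in the subclass designations used in this review) and that X, Y, and Z are written as integers; the names `TCNumber`, `parse_tc_number`, and `same_family` are hypothetical, and the identifiers used in the usage note are illustrative strings rather than assignments taken from the tables.

```python
import re
from typing import NamedTuple

# Assumed textual form: a digit class, a letter subclass, then three integers.
_TC_PATTERN = re.compile(r"^(\d+)\.([A-Z])\.(\d+)\.(\d+)\.(\d+)$")


class TCNumber(NamedTuple):
    transporter_class: int  # V
    subclass: str           # W
    family: int             # X
    subfamily: int          # Y
    permease_type: int      # Z

    def prefix(self, depth: int) -> str:
        """Join the first `depth` components with dots.

        depth 2 = subclass (V.W), 3 = family (V.W.X),
        4 = subfamily (V.W.X.Y), 5 = permease type (V.W.X.Y.Z).
        """
        if not 1 <= depth <= 5:
            raise ValueError("depth must be between 1 and 5")
        return ".".join(str(part) for part in self[:depth])


def parse_tc_number(text: str) -> TCNumber:
    """Parse a string such as 'V.W.X.Y.Z' into its five components."""
    match = _TC_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a five-component TC number: {text!r}")
    v, w, x, y, z = match.groups()
    return TCNumber(int(v), w, int(x), int(y), int(z))


def same_family(a: TCNumber, b: TCNumber) -> bool:
    """Two permeases belong to the same family if they share V.W.X."""
    return a.prefix(3) == b.prefix(3)
```

With this sketch, parsing the illustrative string "2.A.1.5.1" yields the family prefix "2.A.1" and the subfamily prefix "2.A.1.5", and `same_family` would group it with "2.A.1.7.3" but not with "2.A.2.5.1".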
As mentioned above, the primary level of classification in the TC system is based on mode of transport and energy-coupling source. The classes and subclasses of transporters currently recognized are listed below.
1.A. α-Type channels. Transmembrane channel proteins of this class are ubiquitously found in the membranes of all types of organisms from bacteria to higher eukaryotes. These transporters usually catalyze the movement of solutes by an energy-independent process by passage through a transmembrane aqueous pore without evidence for a carrier-mediated mechanism. These channel proteins consist largely of α-helical spanners, although β-strands may be present and may even contribute to the channel. Outer membrane porin-type channel proteins are excluded from this class and are instead included in class 1.B.
1.B. β-Barrel porins. These proteins form transmembrane pores that usually allow the energy-independent passage of solutes across a membrane. The transmembrane portions of these proteins consist exclusively of β-strands that usually form β-barrels. Porin-type proteins are found in the outer membranes of gram-negative bacteria, mitochondria, plastids, and possibly acid-fast gram-positive bacteria.
1.C. Pore-forming toxins. These proteins and peptides are synthesized by one cell and secreted for insertion into the membrane of another cell, where they form transmembrane pores. They may exert their toxic effects by allowing the free flow of electrolytes and other small molecules across the membrane, or they may allow entry into the target cell cytoplasm of a toxin protein that ultimately kills or controls the cell. Both protein (large) and ribosomally synthesized peptide (small) toxins are included in this category.
1.D. Non-ribosomally synthesized channels. These molecules, often chains of L- and D-amino acids as well as other small molecular building blocks such as hydroxy acids (i.e., lactate and β-hydroxybutyrate), form oligomeric transmembrane ion channels. Voltage may induce channel formation by promoting assembly of the oligomeric transmembrane pore-forming structure. These “depsipeptides” are often made by bacteria and fungi as agents of biological warfare. Other substances, completely lacking amino acids, may also be capable of channel formation.
2.A. Porters (uniporters, symporters, and antiporters). Transport systems are included in this subclass if they utilize a carrier-mediated process to catalyze uniport (a single species is transported either by facilitated diffusion or in a membrane potential-dependent process if the solute is charged), antiport (two or more species are transported in opposite directions in a tightly coupled process, not coupled to a direct form of energy other than chemiosmotic energy), and/or symport (two or more species are transported together in the same direction in a tightly coupled process, not coupled to a direct form of energy other than chemiosmotic energy).
2.B. Non-ribosomally synthesized porters. These substances, like non-ribosomally synthesized channels, may be depsipeptides or non-peptide-like substances. Such a porter complexes a solute such as a cation in its hydrophilic interior and facilitates translocation of the complex across the membrane by exposing its hydrophobic exterior and moving from one side of the bilayer to the other. If the free porter can cross the membrane in the uncomplexed form, the transport process can be electrophoretic (the charged molecule moves down its electrochemical gradient), but if only the complex can cross the membrane, transport may be electroneutral, because one charged substrate is exchanged for another.
2.C. Ion gradient-driven energizers. Normally, outer membrane porins (1.B) of gram-negative bacteria catalyze passive transport of solutes across the membrane, but coupled to “energizers,” they may accumulate their substrates in the periplasm against large concentration gradients. These energizers use the PMF across the cytoplasmic membrane, probably by allowing the electrophoretic transport of protons and conveying conformational change to the outer membrane receptors or porins. Homologous energizers drive bacterial flagellar motility (A. Lupas et al., unpublished results). The mechanism is poorly understood, but these energizers undoubtedly couple proton (H+) or sodium (Na+) fluxes through themselves in order to energize the process.
Category 3: Primary Active Transporters
These transporters use a primary source of energy to drive active transport of a solute against a concentration gradient. A secondary ion gradient is not considered a primary energy source because it is created by the expenditure of a primary energy source. Primary energy sources known to be coupled to transport are chemical, electrical, and solar.
3.A. Diphosphate bond hydrolysis-driven transporters. Transport systems are included in this subclass if they hydrolyze the diphosphate bond of inorganic pyrophosphate, ATP, or another nucleoside triphosphate to drive the active uptake and/or extrusion of a solute(s). The transport protein may or may not be transiently phosphorylated, but the substrate is not phosphorylated. These transporters are found universally in all living organisms.
3.B. Decarboxylation-driven transporters. Transport systems that drive solute (e.g., ion) uptake or extrusion by decarboxylation of a cytoplasmic substrate are included in this subclass. These transporters are currently thought to be restricted to prokaryotes.
3.C. Methyl transfer-driven transporters. A single characterized multisubunit protein family currently falls into this subclass, the Na+-transporting methyltetrahydromethanopterin:coenzyme M methyltransferase. These transporter complexes are currently thought to be restricted to members of the Archaea.
3.D. Oxidoreduction-driven transporters. Transport systems that drive transport of a solute (e.g., an ion) energized by the exothermic flow of electrons from a reduced substrate to an oxidized substrate are included in this subclass. These transporters are universal, although some families are restricted to one domain or another.
3.E. Light absorption-driven transporters. Transport systems that utilize light energy to drive transport of a solute (e.g., an ion) are included in this subclass. One family (fungal and archaeal rhodopsin) is found in archaea and eukaryotes, but the other (photosynthetic reaction center) is found only in bacteria and chloroplasts of eukaryotes.
4.A. Phosphotransfer-driven group translocators. Transport systems of the bacterial phosphoenolpyruvate:sugar PTS are included in this class. The product of the reaction, derived from extracellular sugar, is a cytoplasmic sugar-phosphate. No porters of the PTS have been identified in the archaeal or eukaryotic domain.
8.A. Auxiliary transport proteins. Proteins that in some way facilitate transport across one or more biological membranes but do not themselves participate directly in transport are included in this class. These proteins always function in conjunction with one or more established transport systems. They may provide a function connected with energy coupling to transport, play a structural role in complex formation, serve a biogenic or stability function, or function in regulation.
9.A. Transporters of unknown biochemical mechanism. Transport protein families of unknown classification are grouped in this subclass and will be classified elsewhere when the transport mode and energy-coupling mechanism have been characterized. These families include at least one member for which a transport function has been established, but either the mode of transport or the energy-coupling mechanism is not known.
9.B. Putative but uncharacterized transport proteins. Putative transport protein families are grouped in this subclass and will either be classified elsewhere when the transport function of a member becomes established or be eliminated from the TC system if the proposed transport function is disproven. These families include a member(s) for which a transport function has been suggested, but evidence for such a function is not yet compelling.
9.C. Functionally characterized transport proteins with unidentified sequences. Transporters of particular physiological significance will be included in this category even though a family assignment cannot be made. When their sequences are identified, they will be assigned to an established family. This is the only protein subclass that includes individual proteins rather than protein families.
FAMILIES OF TRANSPORTERS
The current index of transport system families is presented in Table 2. There are more than 250 entries, each of which usually describes a single family. Some of these families are actually large superfamilies with more than a thousand currently sequenced members, e.g., the voltage-gated ion channel (VIC) family (TC 1.A.1) (88), the major facilitator superfamily (MFS) (TC 2.A.1) (96, 125), and the ATP-binding cassette (ABC) superfamily (TC 3.A.1) (130, 139). Others are very small families with only one or a few currently sequenced members. Most families, however, are currently intermediate in size, with between 5 and 500 sequenced members.
Table 2. Complete index of families of transport proteins in the TC system
All of the families included in Table 2 will undoubtedly expand with time, and new families will be identified. The availability of new protein sequences will occasionally allow two or more currently recognized families to be placed together under a single TC number. In a few cases, two families are already known for which some evidence is available suggesting that they are related, e.g., the monovalent cation:proton antiporter-1 (CPA1) and CPA2 families (TC 2.A.36 and 2.A.37), the nucleobase-cation symporter-1 (NCS1) and NCS2 families (TC 2.A.39 and 2.A.40), as well as the l-lysine exporter, resistance to homoserine/threonine, and cadmium resistance families (TC 2.A.75, 2.A.76, and 2.A.77, respectively) (124, 148). Such evidence is usually based on limited sequence and/or sequence motif similarities, common function, and/or similar protein size, topology, and structure. When “missing link” sequences or three-dimensional structural data become available so that proteins of two families can be unequivocally grouped together within a single family, the lower TC number will be adopted for all of the family members, and the higher TC number will be abandoned.
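The renumbering rule for merged families can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the ordering used (numeric components compared as integers, so that 2.A.9 precedes 2.A.100) is an assumption about how "lower TC number" should be read.

```python
from typing import Tuple


def tc_sort_key(tc_number: str) -> Tuple[Tuple[int, int, str], ...]:
    """Split a family-level TC number such as '2.A.36' into comparable parts.

    Assumption: digits are compared numerically and letters alphabetically.
    """
    parts = tc_number.split(".")
    return tuple((0, int(p), "") if p.isdigit() else (1, 0, p) for p in parts)


def surviving_tc_number(first: str, second: str) -> str:
    """Return the TC number kept when two families are merged (the lower one)."""
    return min(first, second, key=tc_sort_key)


# CPA1 and CPA2 are cited in the text as possibly related families.
print(surviving_tc_number("2.A.37", "2.A.36"))
```

Comparing the components numerically rather than as plain strings avoids ranking a hypothetical 2.A.100 below 2.A.9.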
The rigorous criteria used to delimit a family have been defined previously (121, 122). Briefly, in order for two proteins to belong to the same family, they must exhibit a region of 60 residues or more, in comparable portions of the two proteins, that has a comparison score in excess of 9 standard deviations (27). At this value, the probability that the degree of sequence similarity observed for these two proteins occurred by chance is less than 10^-19 (25). It is considered that this degree of sequence similarity could not have arisen either by chance or by a convergent evolutionary process (32, 118). A minimum of 60 residues was arbitrarily selected because many protein domains in water-soluble proteins are of about this size.
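The family criterion stated above reduces to a simple predicate. The following Python sketch assumes that the comparison score has already been computed by some alignment program; the `Comparison` record and its field names are hypothetical and serve only to make the two thresholds from the text explicit.

```python
from dataclasses import dataclass

MIN_REGION_LENGTH = 60      # residues ("60 residues or more")
MIN_COMPARISON_SCORE = 9.0  # standard deviations ("in excess of 9")


@dataclass(frozen=True)
class Comparison:
    """Hypothetical summary of one pairwise comparison."""
    region_length: int         # length of the similar region, in residues
    score_sd: float            # comparison score, in standard deviations
    comparable_portions: bool  # region lies in comparable portions of both proteins


def same_family(c: Comparison) -> bool:
    """Apply the two thresholds described in the text."""
    return (
        c.comparable_portions
        and c.region_length >= MIN_REGION_LENGTH
        and c.score_sd > MIN_COMPARISON_SCORE
    )


print(same_family(Comparison(region_length=75, score_sd=11.2, comparable_portions=True)))
```

The length threshold is inclusive and the score threshold is strict, following the wording "60 residues or more" and "in excess of 9 standard deviations".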
The complete TC system is available on our web site (http://www-biology.ucsd.edu/~msaier/transport/), where the descriptions, primary references, and list of functionally characterized protein members of all families are provided. The whole-genome analysis data upon which this classification system was initially based are found on an included subportion of this web site, which was constructed under the guidance of Ian Paulsen (100, 101). This site will be updated continuously as new information becomes available. Anyone noting errors or incomplete listings is encouraged to contact me to provide the missing information and references.
As noted above, members of a transporter family generally utilize a single mode of transport and energy-coupling mechanism, thus justifying the use of these functional categories as the primary basis for classification. However, a few exceptions have been noted. First, the arsenite efflux permease (ArsAB; TC 3.A.4) of E. coli consists of two proteins, ArsA and ArsB. ArsB is an integral membrane protein that presumably provides the transport pathway for the extrusion of arsenite and antimonite (134, 153). ArsA is an ATPase that energizes ArsB-mediated transport. However, when ArsB alone is present, as in the case of the arsenical resistance pump of Staphylococcus aureus, transport is driven by the PMF (14). Expression of the E. coli arsB gene in the absence of the arsA gene similarly gives rise to PMF-driven transport. The presence or absence of the ArsA protein thus determines the mode of energy coupling.
The ArsB protein is a member of a large superfamily of ion transporters, the ion transporter superfamily, in which at least two families exhibit the unusual capacity of being able to incorporate auxiliary constituents that alter the transport characteristics of the carrier (107, 127). Such promiscuous use of energy is exceptionally rare and has been documented in only a very few instances. When such an effect is reported, we shall usually classify the permease in accordance with the more complicated energy-coupling mechanism (in this case, as an ATP-driven primary active transporter [class 3] rather than as a secondary carrier [class 2]). However, in this unique case, the TC nomenclature panel of the IUBMB has recommended that a second family describing the PMF-driven ArsB homologs be included in the TC system (TC 2.A.45), as many ArsB homologs function by ATP-independent, ArsA-independent mechanisms.
Examples of secondary carrier families in which promiscuous transport modes have been reported include the mitochondrial carrier family (TC 2.A.29) and the triose phosphate/nucleotide sugar transporter (TP-NST) family (TC 2.A.50). Proteins of both families are apparently restricted to eukaryotic organelles. Members of these families normally catalyze carrier-mediated substrate-substrate antiport and are therefore classified as secondary carriers. However, treatment of mitochondrial carrier family members with chemical reagents, such as N-ethylmaleimide or Ca2+ (16, 17, 28, 29, 56), or imposition of a large membrane potential (ΔΨ) across a membrane into which a TP-NST family member has been incorporated (131, 132, 149), has been reported to convert these antiport-catalyzing carriers into anion-selective channels capable of functioning by uniport. Another secondary carrier that may be capable of exhibiting channel-like properties is the KefC protein of E. coli (13), which is a member of the CPA2 family (TC 2.A.37). “Tunneling” of ions and other solutes through carriers with little or no conformational change has been discussed (42). Again, the more complicated carrier-type mechanism, which appears to be relevant under most physiological conditions, provides the basis for classifying these proteins (i.e., as class 2 carriers rather than class 1 channels).
CHARACTERISTICS OF THE FAMILIES
Table 3 summarizes some of the key characteristics of most of the transporter families that we have identified. Categories 1.D and 2.B (non-ribosomally synthesized channels and carriers, respectively), 8 (auxiliary transport proteins), and 9.B and 9.C (putative but uncharacterized transporters) have been omitted (compare Table 1 with Table 3). Table 3 provides the family TC numbers, the abbreviations of the families, and the substrates of transporters included within each family. Substrates that are transported by a single transporter are separated by commas, while substrates of different transporters within the family are separated by semicolons. Thus, in the major intrinsic protein (MIP) family (TC 1.A.8), aquaporins generally transport water but not organic compounds, while glycerol facilitators generally transport short, straight-chain polyols but not water. A few members of the family may transport both (see reference 97 for a review). A recent report has provided evidence that a member of the MIP family can accommodate anions (154), but this observation is of uncertain physiological significance.
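The comma and semicolon convention used for substrate entries can be read mechanically. The Python sketch below is illustrative only: the function name is hypothetical, and the sample entry is an invented string loosely modeled on the MIP family description, not a quotation from Table 3.

```python
from typing import List


def parse_substrates(entry: str) -> List[List[str]]:
    """Split a substrate entry into one list of substrates per transporter.

    Semicolons separate different transporters within the family; commas
    separate substrates shared by a single transporter. Caveat: a substrate
    name that itself contains a comma would be split incorrectly by this
    naive approach.
    """
    transporters = []
    for group in entry.split(";"):
        substrates = [s.strip() for s in group.split(",") if s.strip()]
        if substrates:
            transporters.append(substrates)
    return transporters


# Invented example entry: one water-selective type; one polyol/urea type.
print(parse_substrates("water; glycerol, urea"))
```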
Table 3. Properties of families of transport systems included in the TC system
Table 3 also includes the size ranges of the individual protein members of the families and the numbers of (putative) transmembrane α-helical segments (TMSs) included within the permease polypeptide chains. All members of a family usually exhibit similar topological features, although several exceptions have been noted. When a homo- or heterooligomeric structure has been established for an intact permease, this fact is also indicated. Finally, the kingdoms in which members of the family have been identified, the approximate number of members that have been identified in each family, and representative examples of well-characterized members are also provided. The table is largely self-explanatory, but detailed information as well as primary and secondary references are provided on our web site and may be available in book form in the near future (Saier et al., unpublished data).
CROSS-REFERENCING PERMEASES BY ACCESSION NUMBER
Protein accession numbers can generally be used to find protein sequences of any sequenced protein referred to in the TC system. An accession number never changes once entered into a database. It therefore provides a quick and easy means of identifying a specific protein sequence. Moreover, it allows access to the database description of the sequenced protein, including structural, topological, and functional information. SwissProt (SP) database entries provide the most detailed information about the proteins, and SwissProt accession numbers are therefore provided when available. When not available, other accession numbers will be provided.
The accession numbers of all representative transport proteins included in the tables of the current TC system can be found on our web site. Accession numbers usually consist of one or two letters followed by four, five, or six digits. A given letter is used by only one database: GenBank (GB), SP, or Protein Information Resource (PIR). Thus, for example, O, P, and Q are used exclusively by SP; D, J, K, L, M, U, X, Y, and Z are used exclusively by GB; and A, B, C, H, I, and S are used exclusively by PIR. However, when AB, AE, or AF is followed by a six-digit number, this is an alternative GB accession number, and when JC, JH, or JN is followed by a four-digit number, this is a PIR accession number. It should be noted that a single sequenced protein may have multiple accession numbers, but no SP or PIR accession number refers to more than one protein. Because a GB accession number refers to a nucleotide sequence that may encode multiple proteins, a GB accession number may provide the sequences of several proteins.
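The prefix rules just described can be written as a small lookup. The Python sketch below encodes only the conventions stated in this section; it is not a validator for accession formats introduced by the databases since then. The function name and the sample accession strings are invented for illustration.

```python
import re
from typing import Optional

# Single-letter prefixes and the database that uses them, as listed in the text.
SP_LETTERS = set("OPQ")
GB_LETTERS = set("DJKLMUXYZ")
PIR_LETTERS = set("ABCHIS")
# Two-letter prefixes with their required digit counts, as listed in the text.
GB_TWO_LETTER = {"AB", "AE", "AF"}   # followed by six digits
PIR_TWO_LETTER = {"JC", "JH", "JN"}  # followed by four digits

_ACCESSION = re.compile(r"([A-Z]{1,2})(\d{4,6})")


def database_for(accession: str) -> Optional[str]:
    """Return 'SP', 'GB', or 'PIR', or None if the rules in the text do not apply."""
    match = _ACCESSION.fullmatch(accession.strip().upper())
    if match is None:
        return None
    letters, digits = match.groups()
    if len(letters) == 2:
        if letters in GB_TWO_LETTER and len(digits) == 6:
            return "GB"
        if letters in PIR_TWO_LETTER and len(digits) == 4:
            return "PIR"
        return None
    if letters in SP_LETTERS:
        return "SP"
    if letters in GB_LETTERS:
        return "GB"
    if letters in PIR_LETTERS:
        return "PIR"
    return None


# Made-up accession strings, used only to exercise each rule.
for example in ("P12345", "U12345", "S12345", "AF123456", "JC1234"):
    print(example, database_for(example))
```

Returning `None` for anything outside the stated rules keeps the function from guessing about prefixes the text does not mention.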
A table entitled Cross-Referencing Permeases by Accession Number is included in our web site. In this table, accession numbers for most of the proteins included in the TC system as of June 1999 are tabulated in alphabetical and numerical order. These may be of general utility to the student of transport, as their availability allows one to easily search all databases using the various BLAST search tools (3). Knowledge of a TC number allows one to quickly identify (i) the protein referred to, (ii) the transport system of which that protein is a constituent, (iii) the substrate specificity of that system, (iv) the family to which that permease belongs, (v) the mode of transport used, (vi) the energy-coupling mechanism used, and (vii) many of the characteristics of that permease family. Thus, cross-referencing by accession number is useful when trying to identify the family to which a newly sequenced protein belongs.
As noted above, one needs only to conduct a BLAST search, and all sufficiently similar homologs will be displayed. When the accession number of any one of these retrieved homologs is shown to correspond to one of the established members of a family, the family to which the newly sequenced protein belongs is immediately known. Furthermore, by identifying the proteins with the highest BLAST scores (smallest P values), one can immediately recognize the closest homologs. This information provides an indication of the most likely substrate specificity, energy-coupling mechanism, and physiological function of the newly sequenced permease protein. Cross-referencing of accession numbers and TC numbers therefore provides a simple and rapid approach to the initial characterization of a newly sequenced porter. My colleagues and I, working with Andrei Lupas (SmithKline-Beecham), are currently developing a search tool based on the TC system that will allow anyone to search the complete TC system using sequence, sequence motif, accession number, gene name, protein name, family name, etc. (A. Lupas et al., unpublished data).
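The lookup described in this section can be sketched as follows. This Python example is not the search tool under development; it assumes that BLAST hits have already been reduced to (accession number, P value) pairs and that a cross-reference table from accession number to family TC number is at hand. The function name, the accession strings, and their family assignments are placeholders invented for illustration.

```python
from typing import Dict, List, Optional, Tuple

Hit = Tuple[str, float]  # (accession number, BLAST P value)


def assign_family(
    hits: List[Hit], known_members: Dict[str, str]
) -> Optional[Tuple[str, str, float]]:
    """Return (family TC number, accession, P value) for the closest known homolog.

    Only hits whose accession number appears in the cross-reference table are
    considered; among those, the smallest P value marks the closest homolog.
    Returns None if no hit corresponds to an established family member.
    """
    recognized = [(acc, p) for acc, p in hits if acc in known_members]
    if not recognized:
        return None
    accession, p_value = min(recognized, key=lambda hit: hit[1])
    return known_members[accession], accession, p_value


# Placeholder cross-reference table and hit list (invented accessions).
cross_reference = {"P00001": "2.A.1", "Q00002": "3.A.1"}
blast_hits = [("X00001", 1e-5), ("P00001", 1e-40), ("Q00002", 1e-12)]
print(assign_family(blast_hits, cross_reference))
```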
GROUPING TRANSPORT SUBSTRATES BASED ON BIOLOGICAL SIGNIFICANCE
In 1993 and again in 1996, Monica Riley presented an extensive tabulation of E. coli gene products (114, 115). To facilitate this endeavor, enzymes were classified based on the nature of the molecule(s) (substrates) acted on. In order to cross-reference transport systems based on substrate specificities, a basis for classifying potential substrates had to be devised. We have done so, creating a system that includes virtually all currently recognized transport substrates. This system of cross-referencing transporters is described here.
All known transport substrates have been classified into eight categories (Table 4). These categories are I, inorganic molecules; II, carbon compounds; III, amino acids and derivatives; IV, bases and derivatives; V, vitamins, cofactors, signaling molecules, and their precursors; VI, drugs, dyes, sterols, and toxic substances; VII, macromolecules; and VIII, miscellaneous compounds.
Table 4. Classification of transport system substrates based on biological significance
Most inorganic molecules (category I) are cationic or anionic. However, some channel proteins are nonselective, and aquaporins of the MIP family (TC 1.A.8) transport water selectively. The four defined subcategories for category I therefore include A, nonselective; B, water; C, cations; and D, anions. Inorganic compounds not falling into one of these subcategories are classified as others (subcategory E), and this subcategory can be subdivided in the future if desired.
Carbon compounds (category II) have similarly been grouped into four defined subcategories: A, sugars, polyols, and their derivatives; B, monocarboxylates; C, di- and tricarboxylates; and D, noncarboxylic organic anions (organophosphates, phosphonates, sulfonates, and sulfates). Subcategory E (others) encompasses all other carbon compounds.
Amino acids and their derivatives (category III) have been subdivided into A, amino acids and conjugates; B, amines, amides, and polyamines; C, peptides; D, other related organocations; and E, others. Bases and their derivatives (category IV) have been subcategorized into A, nucleobases; B, nucleosides; C, nucleotides; D, other related derivatives; and E, others. Vitamins, cofactors, and cofactor precursors (category V) have been subcategorized into A, vitamins and vitamin or cofactor precursors; B, enzyme and redox cofactors; C, siderophores and siderophore-iron complexes; D, signaling molecules; and E, others. Drugs, dyes, sterols, and toxics (category VI) have been subcategorized into A, multiple drugs and dyes; B, specific drugs; C, bile salts and conjugates; D, sterols and conjugates; and E, others. Category VII is devoted to macromolecules: A, complex carbohydrates; B, proteins; C, nucleic acids; D, lipids; and E, others. Finally, category VIII (miscellaneous) encompasses any transport substrate that does not fall into categories I to VII. So far no transport substrate has been relegated to category VIII, and very few of those in categories I to VII have fallen into the “other” category.
A few compounds belong to more than one category. For example, bile acids fall into both category II.B and category VI.C. Theoretically, oligosaccharides (e.g., lactose, raffinose, and maltooligosaccharides) could be classified either in II.A or in VII.A. We have elected to put oligosaccharides into category II.A and reserve category VII.A for larger molecules such as polysaccharides, teichoic acids, and lipooligosaccharides. Thus, category II.A generally refers to smaller carbohydrates normally taken up by cells for purposes of carbon catabolism, while category VII.A refers to structural carbohydrates that are synthesized by cells and exported intact.
Some permease systems transport a range of compounds that fall into more than one category. For example, a single ABC export system may catalyze efflux of multiple drugs (VI.A) and peptides (III.C), and it may also facilitate phospholipid flipping between the two bilayers of a membrane (VII.D). Such permeases are rare, but when they do occur, they will be included in all applicable categories.
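The category codes described in this section lend themselves to a simple lookup structure. The Python sketch below transcribes the categories and subcategories named in the text; the function names and the example transporter label are hypothetical. It also shows how a permease with substrates in several categories would be listed under each applicable code.

```python
from typing import Dict, List

CATEGORIES: Dict[str, str] = {
    "I": "inorganic molecules",
    "II": "carbon compounds",
    "III": "amino acids and derivatives",
    "IV": "bases and derivatives",
    "V": "vitamins, cofactors, signaling molecules, and their precursors",
    "VI": "drugs, dyes, sterols, and toxic substances",
    "VII": "macromolecules",
    "VIII": "miscellaneous compounds",
}

# Subcategories A to D as given in the text; E is "others" throughout.
_DEFINED: Dict[str, List[str]] = {
    "I": ["nonselective", "water", "cations", "anions"],
    "II": ["sugars, polyols, and their derivatives", "monocarboxylates",
           "di- and tricarboxylates", "noncarboxylic organic anions"],
    "III": ["amino acids and conjugates", "amines, amides, and polyamines",
            "peptides", "other related organocations"],
    "IV": ["nucleobases", "nucleosides", "nucleotides", "other related derivatives"],
    "V": ["vitamins and vitamin or cofactor precursors", "enzyme and redox cofactors",
          "siderophores and siderophore-iron complexes", "signaling molecules"],
    "VI": ["multiple drugs and dyes", "specific drugs",
           "bile salts and conjugates", "sterols and conjugates"],
    "VII": ["complex carbohydrates", "proteins", "nucleic acids", "lipids"],
}
SUBCATEGORIES: Dict[str, Dict[str, str]] = {
    cat: dict(zip("ABCDE", names + ["others"])) for cat, names in _DEFINED.items()
}


def describe(code: str) -> str:
    """Expand a code such as 'VI.C' or 'VIII' into its description."""
    category, _, sub = code.partition(".")
    name = CATEGORIES[category]
    return name if not sub else f"{name}: {SUBCATEGORIES[category][sub]}"


def index_by_category(substrate_codes: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """List each transporter under every category code that applies to it."""
    index: Dict[str, List[str]] = {}
    for transporter, codes in substrate_codes.items():
        for code in codes:
            index.setdefault(code, []).append(transporter)
    return index


# Hypothetical ABC exporter handling drugs, peptides, and phospholipids.
print(index_by_category({"example ABC exporter": ["VI.A", "III.C", "VII.D"]}))
print(describe("VI.C"))
```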
DISTRIBUTION OF TRANSPORTER TYPES BASED ON SUBSTRATE SPECIFICITY
As described above, Table 4 groups potential transport substrates according to structure and biological significance. This system of substrate classification has been used to cross-reference transport systems according to the types of substrates and the specific substrate(s) transported.
Table 5 presents the distribution of transporter types based on substrate specificity. In this table, permeases are categorized into four groups: α-type channels, β-type porins, primary carriers (regardless of the primary source of energy utilized and including PTS-type group translocators), and secondary carriers (including uniporters, symporters, and antiporters). Transporter types of unknown mode of transport or energy-coupling mechanism (categories 9.A and 9.B) were not included in Table 5.
Table 5. Distribution of transporter families based on substrate specificity
α-Type channel proteins (TC 1.A) generally either are nonselective (13 types) or function in the transport of inorganic ions (I.C and I.D; 18 families) or proteins (VII.B; 5 families). One type (aquaporins in the MIP family; TC 1.A.8) transports water, while another type of the same family (glycerol facilitators of the MIP family) transports straight-chain polyols and small organic molecules such as urea. Some MIP family proteins may transport both water and small, neutral organic molecules, but with the possible exception of a single MIP family member (154), none of the MIP family channel proteins have been shown to be selective for ions or larger molecules. Besides MIP family members, only two other recognized channel families include members that are specific for organic compounds. These families are the urea transporter family (TC 1.A.44) and the phospholemman (PLM) family (TC 1.A.27). The PLM family includes members that transport organic anions. Channel-forming toxins (TC 1.C) are generally nonspecific, or they exhibit weak charge selectivity (i.e., anion selective or cation selective).
Porins (TC 1.B) are pore-forming proteins that exhibit β-barrel structures or variations on the β-barrel structural theme. They are localized to the outer membranes of gram-negative bacteria, mitochondria, and chloroplasts. They exhibit a wider range of substrate selectivities than do the α-type channel proteins cited above (Table 5). However, like channel protein types, most porins either are nonselective, exhibit some degree of anionic or cationic selectivity, or function in the export of macromolecules across the outer membranes of gram-negative bacteria. Many porins allow passage of any molecule smaller than a certain cutoff size (usually about 500 to 1,000 Da). Of the macromolecular export porin types, more than half export proteins, but several transport complex carbohydrates, and at least one functions in DNA transport.
Porins exhibit a wide variety of specificities for organic substrates. The maltoporin of E. coli (TC 1.B.3.1.1) and the raffinose porin of E. coli (TC 1.B.15.1.1) are both inducible by their sugar substrates, but while maltoporin is quite specific for maltooligosaccharides, the raffinose porin transports a variety of oligosaccharides. Other porins have been reported to preferentially transport organophosphates (TC 1.B.1.1.2), fatty acids (TC 1.B.9.1.1), nucleosides (TC 1.B.10.1.1), or organic solvents such as toluene (TC 1.B.9.2.1). Still another type apparently exhibits specificity for short-chain amides and urea (TC 1.B.16.1.1). Members of the outer membrane receptor family (TC 1.B.14) import vitamin B12 and a variety of iron-siderophore complexes in a process that is coupled to the PMF via the TonB-dependent energy-coupling system (TC 2.C.1). To what degree these proteins are biochemically selective is not always clear, although they are often encoded within operons that exhibit specific induction properties, and these systems are constituents of transenvelope transport complexes.
Primary carriers are in general highly specific for one or a few related substrates, and like channels, they are almost always selective for inorganic ions or macromolecules. Those specific for organic molecules of small or intermediate sizes belong to either of two superfamilies, the ABC superfamily (TC 3.A.1) or the PTS functional superfamily (TC 4.A.1-6). The PTS is actually a group translocating system, since it phosphorylates its substrates using phosphoenolpyruvate as the phosphoryl donor. For the purpose of tabulating substrate specificities as presented in Table 5, we have grouped this functional superfamily together with the active transporters. The energy-coupling mechanisms used for the transport of ions are diverse, involving pyrophosphate bond hydrolysis, decarboxylation, methyl transfer, oxidoreduction (both hydride shift and electron flow), and light absorption (Fig. 1). By contrast, all macromolecular primary active transporters use ATP or GTP hydrolysis to drive export, although a few macromolecular secondary active exporters use the PMF as the energy source for transport.
Secondary carrier types (TC class 2, subclass 2.A) exhibit a very different spectrum of substrate specificities. None is nonselective or water selective, but many are selective for specific inorganic cations or anions, and a few appear to function in the export of lipids, proteins, or complex carbohydrates, as is characteristic of primary active transporters. Others function in the transport of the many different types of small organic molecules found in biological systems. Thus, every class of molecules included in Table 5 is transported by one or more currently identified secondary carrier(s). For example, members of four transporter families are known to transport sugars and polyols; 13 types transport monocarboxylates, and 12 types transport di- and tricarboxylates. Eighteen types function in the transport of amino acids and their conjugates. It is clear that secondary carriers are primarily responsible for the transport of small organic molecules in virtually all living organisms.
The last column in Table 5 reveals the total numbers of families involved in the transport of the various types of biologically important compounds. About equal numbers of families are concerned with transport of inorganic and organic compounds, with most of the 105 families for inorganic molecules transporting ionic species. Thirty-three families are concerned with carbon source uptake, while 34 are concerned with the uptake of nitrogen-containing compounds (amino acids, bases, and their derivatives). Only 14 families include members that take up compounds in category V (i.e., vitamins, cofactors, signaling molecules, and related compounds), while only 8 families are concerned with transport of hydrophobic substances (category VI). Macromolecules are exported via the transporters of 26 families. While inorganic molecules and macromolecules are transported by all four types of systems, small organic substances are transported almost exclusively by secondary carriers (Table 5).
SUBSTRATE SELECTIVITIES
Cytoplasmic Membrane Channel Proteins (Excluding Porins)
α-Type channel proteins (TC category 1.A) and pore-forming toxins (TC category 1.C) are largely responsible for the diffusion-limiting flux of inorganic ions between the cell cytoplasm and the external milieu or between intracellular compartments of eukaryotic cells. α-Type channel proteins are ubiquitous, but they are particularly prevalent in animals that use electrical signaling for purposes of neuronal signaling and the control of muscle contraction. Thus, in contrast to all other types of transporters, α-type ion channel protein families are primarily restricted to animals. Pore-forming toxins are most frequently produced by bacteria, but they can target the membranes of either prokaryotic or eukaryotic cells.
Table 6 provides a more detailed breakdown of the substrate selectivities of channel-type proteins. The majority of these channel types either are nonselective or merely exhibit a charge preference, preferring inorganic cations over anions or anions over cations. Many of these nonselective channel types are toxic proteins or peptides (CAPs) that are secreted by one cell in order to kill another. Most other channel types exhibit a striking degree of specificity. Seven of these are selective for chloride and other anions, two are specific for Na+, and two are selective for K+. Five channel types are specific for Ca2+. Only MIP and urea transporter family members transport small neutral molecules, as noted above. Five families include members that preferentially transport proteins, but one of these, the holin functional superfamily, consists of 16 families of functionally (but not phylogenetically) similar proteins. The holins (156) and Bcl-2 (1) are bacterial and animal proteins, respectively, that promote cell suicide or apoptosis. Members of the diphtheria toxin and botulinum and tetanus toxin families transport bacterial toxins into target animal cells (69, 86). The proteins of the MscL family may provide protection against osmotic downshift, but they have been shown to be capable of catalyzing protein export as well (2, 8, 12).
Table 6. Substrate selectivities of cytoplasmic membrane channel proteins (excluding porins)
Table 7 provides a detailed breakdown of channel-type proteins according to their substrate specificities. The individual families represented in each category are tabulated according to TC number. The name or abbreviation, source, and mode of regulation (when known) are also provided. Among the nonselective channels is the MscL channel of E. coli, which has been reported to exhibit a slight preference for cations over anions and to also transport proteins, while the MscS channel of E. coli has been reported to exhibit a slight anionic preference. Holins function primarily in protein export, while connexins and innexins function to form gap junctions between adjacent animal cells in vertebrates and invertebrates, respectively. Otherwise, all nonselective channel proteins or peptides are designed for export from the cell of synthesis for the purpose of biological warfare. These proteins and peptides are derived from phages, bacteria, and eukaryotes. Archaeal protein and peptide toxins that function by pore formation have not yet been characterized functionally.
Table 7. Classification of channel proteins (excluding porins) according to substrate specificity
Low-specificity cation-selective channels include both acetylcholine- and serotonin-activated ligand-gated ion channel family members, ATP-sensitive ATP-gated cation channel family members, glutamate-gated ion channel family members (all of animal origin), and the MscL family proteins noted above (of bacterial origin). Other cation-selective channels include the members of the nonselective cation channel 1 (NSCC1) and NSCC2 families. Two families, the influenza virus matrix-2 channel and CybB families, selectively transport protons. Still others exhibit selectivity for a particular cation: Na+, K+, NH4+, or Ca2+.
Three of the anion-selective channel types listed appear to transport Cl− selectively, although at least three additional families include members that transport other anions as well. Thus, chloride channel (ClC) family proteins transport a variety of inorganic anions, while members of the PLM family transport a wide range of monovalent inorganic and organic anions (Table 7). Three types of ligand-gated channel-forming members of the ligand-gated ion channel family specifically transport chloride, and a high degree of selectivity may be a characteristic of the voltage-regulated organellar ClC (O-ClC) proteins. The organellar ClC family is not related to the ClC family. Proteins of the large and ubiquitous ClC family are found in all three domains of life (bacteria, archaea, and eukarya), but only eukaryotic members have been functionally characterized. The epithelial ClC family includes members that can transport a range of anions.
Bacterial Outer Membrane Porins
All characterized integral membrane proteins in the outer (lipopolysaccharide-containing) membranes of gram-negative bacteria are believed to consist largely of β-structure rather than α-structure, and structural features may provide a targeting signal for the outer membrane (18, 47). Among these proteins are the oligomeric (mostly trimeric) porins, several of which have been structurally characterized (48, 55, 81) (see below). These proteins can transport small molecules nonselectively, or they can be highly selective for a single class of molecules (18, 90, 150). They are also found in the outer membranes of mitochondria and plant plastids (11, 40) and may be present in the outer mycolic acid-containing membranes of acid-fast gram-positive bacteria, such as species of mycobacteria and Nocardia (61, 109, 133). Because of their unique subcellular locations and structures, outer membrane β-barrel porins are classified separately from the α-type channel proteins.
Lipopolysaccharide-containing outer membranes of gram-negative bacteria provide an unusually effective barrier against hydrophobic dyes, detergents, and hydrophobic and amphipathic drugs. However, by virtue of the presence of β-barrel-type porins in these structures, the membranes are generally permeable to hydrophilic molecules smaller than 650 Da (91). While some of these porins are essentially nonspecific, others appear to exhibit a high degree of selectivity. Tables 8 and 9 summarize the substrate specificities of various recognized outer membrane porins. All of these proteins are derived from gram-negative bacteria except for members of the mitochondrial and plastid porin family (TC 1.B.8). Table 8 presents the breakdown of substrate selectivities. The conclusions reached are in some cases based on detailed biochemical analyses, but in other cases, physiological data were used to derive the conclusions presented. Thus, not all of the porins represented may prove to be as selective as indicated. A significant percentage of the porins are either nonselective or selective only with respect to the charge of the transported species. Among the anion-selective porins, some function primarily in the transport of phosphate, pyrophosphate, nucleotides, and/or fatty acids. Other small-substrate-selective porins have been reported to exhibit specificity for nucleosides, oligosaccharides, short-chain amides, or toluene. Still other porin-like proteins are designed for the export of drugs and heavy metals, while others function in the import of iron complexes and the vitamin precursor cobalamin. Several porin types apparently function in the export of complex carbohydrates and proteins. In most of these cases, the degree of specificity exhibited by these porins has not been studied extensively.
Substrate selectivities of bacterial outer membrane porins
Classification of outer membrane porins (most from gram-negative bacteria) according to substrate specificity
Table 9 provides a detailed breakdown of porin types according to their physiologically relevant substrate specificities. The individual TC numbers and families are presented, allowing the reader to trace the proteins cited and to identify the primary references so as to be able to examine the experimental evidence concerning their specificities.
Secondary Carriers
Secondary carriers catalyze (i) the transmembrane transport of a single molecular species (uniport), (ii) the cotransport of a solute with a cation (symport), (iii) the countertransport of a solute against a cation (antiport), or (iv) the exchange of one solute for another (solute-solute antiport). Several can catalyze more than one such process (i.e., uniport or symport as well as solute-solute antiport), and single mutations can interconvert uniporters and symporters (143, 151). Some can cotransport several cations while countertransporting other cations. These proteins exhibit a wide variety of topological types and substrate specificities. They are responsible for the transport of most organic solutes across biological membranes, particularly those of eukaryotes that lack nutrient uptake permeases of the ABC superfamily (130).
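The four processes defined above differ only in how many species move per transport cycle and in their relative directions. The following Python sketch restates those definitions under simplifying assumptions: the function name, the "in"/"out" direction labels, and the species names are hypothetical and introduced here for illustration only. It does not distinguish cation antiport from solute-solute antiport, which would require knowing the identity of each species.

```python
from collections import Counter


def classify_secondary_transport(movements):
    """Classify one transport cycle of a secondary carrier.

    `movements` maps each transported species (any label) to the
    direction it moves, "in" or "out". Labels are illustrative only.
    """
    if not movements:
        raise ValueError("at least one transported species is required")
    invalid = set(movements.values()) - {"in", "out"}
    if invalid:
        raise ValueError(f"unknown direction(s): {sorted(invalid)}")

    # A single species moving alone is uniport by definition.
    if len(movements) == 1:
        return "uniport"

    per_direction = Counter(movements.values())
    # All species move the same way: cotransport (symport).
    if len(per_direction) == 1:
        return "symport"
    # Exactly one species each way: countertransport (antiport).
    if max(per_direction.values()) == 1:
        return "antiport"
    # Several species cotransported while others are countertransported.
    return "symport+antiport"


print(classify_secondary_transport({"solute": "in"}))
print(classify_secondary_transport({"solute": "in", "H+": "in"}))
print(classify_secondary_transport({"solute": "out", "H+": "in"}))
```

The mixed label covers the carriers noted above that cotransport several cations while countertransporting others.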
Table 10 tabulates secondary carriers according to substrate. Secondary carriers are known that transport almost any inorganic ion of biological importance. Virtually all inorganic mono-, di-, and trivalent cations as well as a wide variety of biologically important inorganic anions are substrates of these transporters. In addition, all classes of organic molecules are transported by secondary carriers (Table 10). Only a few secondary carriers are believed to function in the export of macromolecules (complex polysaccharides, lipids, and proteins).
Substrate selectivities of secondary carriers
Table 11 provides a detailed breakdown of secondary transporters according to substrate, but in contrast to Table 10, Table 11 provides TC number, family, energy-coupling mechanism, and organismal distribution. Monovalent cations are generally transported either in symport with or by antiport against one or more other cations. Thirteen families are primarily concerned with the catalysis of monovalent cation transport. Di- and trivalent cations are probably taken up by uniport or by H+ or Na+ symport, and efflux is probably mediated by H+ antiport. In many of these cases, the energy-coupling mechanism is not well established. Members of 18 families mediate the transport of these ions, and several of these families include members that can catalyze either uptake or efflux.
Classification of secondary carriers according to substrate specificity
A large variety of inorganic anions bearing one, two, or three negative charges can be accommodated by secondary carriers, some functioning with inwardly directed polarity and others with outwardly directed polarity (entry I.D). Sometimes fairly close homologs function with opposite polarity, as noted above for multivalent cation permeases. The mechanisms of energy coupling are known for most of these permeases. Sixteen families are represented under entry I.D.
Sugars and polyols are most frequently transported by MFS permeases, and 6 of the 29 MFS families are concerned with sugar transport. However, three other families (solute-sodium symporter, glycoside-pentoside-hexuronide, and L-rhamnose transporter [RhaT]) are also represented under entry II.A. Of these three families, the glycoside-pentoside-hexuronide family appears to be distantly related to the MFS, based on PSI-BLAST results (125) as well as hydropathy analyses. The same could not be demonstrated for the solute-sodium symporter and RhaT families. The RhaT family may be distantly related to proteins of another superfamily, the drug-metabolite transporter superfamily (D. L. Jack and M. H. Saier, Jr., unpublished results). Monocarboxylates are most often taken up by H+ symport (76), although other mechanisms are sometimes operative. Protein members of 14 families catalyze monocarboxylate transport. Di- and tricarboxylates are also usually accumulated in the cell cytoplasm by H+ symport, and 13 families are involved. Surprisingly, members of just 2 of the 14 families that transport monocarboxylates also transport dicarboxylates. Thus, 12 families are monocarboxylate specific, while 11 are di- and tricarboxylate specific. Organophosphates are the only noncarboxylic organic anions represented under entry II.D (Table 11), and only two families, the TP-NST family and the MFS, are involved. Inorganic phosphate antiport is the primary mechanism believed to be operative under most physiological conditions for organophosphate ester transport via members of both of these families.
Amino acids and their conjugates (entry III.A) can be taken up by H+ or Na+ symport or by substrate-substrate antiport. Twenty characterized families are involved in amino acid transport. Three of these families (amino acid-polyamine-organocation, amino acid/auxin permease, and hydroxy/aromatic amino acid permease) appear to be distantly related to each other, and they constitute the putative amino acid transporter superfamily (155; D. L. Jack, I. T. Paulsen, and M. H. Saier, Jr., submitted for publication). Three families (L-lysine exporter, resistance to homoserine/threonine, and carboxylate/amino acid/amine transporter) appear to be concerned with amino acid efflux in prokaryotes. Amines, amides, and polyamines (entry III.B) are substrates of permeases from nine distinct families, and the same energy-coupling mechanisms observed for amino acids are operative. All of these families include members that can transport amino acids and are therefore listed under entry III.A as well as entry III.B. Four families of secondary carriers appear to mediate peptide uptake, and the mechanism involved is probably proton symport for members of all four families. Two of these families (MFS and proton-dependent oligopeptide transporter) may be distantly related to each other, as indicated by the results of PSI-BLAST searches with iterations (96, 125). Only the resistance-nodulation-cell division (RND) family of secondary permeases has been shown to catalyze peptide export.
Nucleobases (entry IV.A in Tables 10 and 11) are taken up by the two possibly related families, NCS1 and NCS2. The proteins of these two families are of similar sizes and topologies, transport similar substrates, and exhibit limited sequence similarity. The multidrug endosomal transporter family includes members that may transport nucleobases into endosomes of animals. Nucleosides (entry IV.B) are transported by H+ or Na+ symport or by uniport, and seven families are involved. Only two families of obligatory antiporters (entry IV.C) appear to mediate nucleotide transport.
Vitamins and their precursors (entry V.A in Tables 10 and 11) and intact cofactors (entry V.B) are taken up into cells by cation symport or product antiport, and 12 families have been identified that provide these functions. Two of these families include members that take up both vitamins and intact enzyme or redox cofactors, but two additional families that transport the latter compounds do not transport vitamins or cofactor precursors. One family within the MFS has recently been shown to transport iron-siderophore complexes (72), and three families have been shown to include members that may transport bacterial signaling molecules such as homoserine lactone derivatives (Table 11).
Drugs and other toxic substances (entries VI.A to VI.C in Tables 10 and 11) appear to be expelled from cells exclusively by proton antiport, and nine families of secondary carriers appear to mediate these processes. Surprisingly, Na+ antiport has not been demonstrated for any such system. Three of these families include members that can exhibit a high degree of specificity for a single compound. One additional family, the bile acid:Na+ symporter family, appears to include members that transport bile salts but not drugs. Two families, the proton-dependent oligopeptide transporter family and the MFS, include members that have been shown to catalyze drug uptake. This fact may reflect the accidental usage of a carrier designed to transport one substrate for transport of another due to low degrees of specificity. Only four families of secondary carriers are involved in the export of macromolecules (category VII). One of these, the polysaccharide transporter family, is specific for complex carbohydrates, while twin-arginine-targeting family members are specific for redox proteins.
Primary Carriers for Inorganic Ions
Primary carriers may function by either a carrier-type mechanism or a channel-type mechanism, but by definition, the transmembrane transport process is always energized by a primary source of energy (chemical, electrical, or solar energy). These pumps are exceptionally important in biological systems because they are responsible for establishing the ion gradients and membrane potentials upon which secondary carriers are dependent for energization. Primary active transporters are believed to be mechanistically more complex than channels or secondary carriers because their transport activities depend on superimposed catalytic activities that break chemical bonds, pass electrons from a donor molecule to an acceptor, or result in the absorption of light energy. The vast majority of these transport systems function either for the pumping of inorganic ions or for the secretion of macromolecules.
Data regarding the substrate specificities of primary carriers for inorganic ions are summarized in Table 12. Protons and Na+ ions are each transported by four distinct energy-coupling mechanisms, and two of these mechanisms (ATP hydrolysis and electron flow) are known to be utilized for the transport of both ions. Both ions are transported by primary pumps exclusively in the outward direction. Protons can additionally be extruded by hydride transfer (an unusual type of redox reaction for the energization of a vectorial process) and by light absorption (mediated by bacteriorhodopsin and its homologs in archaea and by photosynthetic reaction centers in bacteria and chloroplasts). Na+ extrusion can additionally be driven by decarboxylation of a carboxylic acid in bacteria and perhaps in archaea and by methyl transfer in archaea. Light-driven ion transport via bacterio- or halorhodopsin and methyl transfer-driven Na+ efflux via a methyl coenzyme M-dependent mechanism are so far restricted to the archaeal domain, and each of these processes is restricted to just one small group of archaea. Decarboxylation-driven Na+ efflux has to date been characterized exclusively in bacteria, but homologs of the decarboxylase subunits, including the Na+-transporting integral membrane β-subunits of these decarboxylases, are found in the archaeon Archaeoglobus fulgidus. The functions of the archaeal subunits have not yet been ascertained. Plants, protozoans, archaea, and bacteria possess proteins that belong to a unique family of vacuolar H+-transporting pyrophosphatases. In plants, these enzymes pump protons into the vacuolar lumen, thereby generating a transmembrane PMF. It has been suggested that these enzymes may be relics of ancient systems that existed before the advent of ATP (6).
Substrate selectivities of primary carriers for inorganic ions
Permease proteins of three families function in ATP or pyrophosphate hydrolysis-driven proton efflux, and six different families probably mediate electron flow-driven proton extrusion. Three families have been shown to mediate ATP hydrolysis-dependent Na+ pumping, and four may catalyze electron flow-dependent Na+ expulsion. The Na+-transporting NADH dehydrogenase family is not homologous to the H+- or Na+-transporting NADH dehydrogenase family. Recently published evidence has shown that the proteins of the latter family are capable of replacing Na+ with H+ (64). The commonly assumed equivalence of H+ and Na+ as substrates of primary carriers often, but perhaps not always, applies.
Only one family of primary carriers apparently mediates K+ active transport, and members of this family, the P-type ATPase family, occur in various structural forms (Table 12). These pumps function by K+:Na+ or K+:H+ antiport in animals but possibly by K+ uniport in bacteria. An Na+ extrusion P-type ATPase is found in Saccharomyces cerevisiae. In spite of major differences in substrate recognition and subunit composition for the various P-type ATPases, the mechanisms of transport and energy coupling are likely to be similar. However, since the bacterial K+-transporting ATPases and the eukaryotic Na+-K+ ATPases cluster on completely different segments of the phylogenetic tree (4, 36), significant mechanistic differences can be expected.
Primary pumps that drive divalent cation efflux or uptake always utilize ATP hydrolysis, and either two or three families may be involved (Table 12). Closely related P-type ATPases specific for Cu2+ can function with either inwardly or outwardly directed polarity, depending on the system. Bacterial Cd2+-transporting P-type ATPases have been shown to catalyze efflux of several heavy metal ions (Zn2+, Co2+, Ni2+, and Pb2+) as well as Cd2+ (7, 49, 80).
Anion transport can be driven by ATP hydrolysis either via ArsAB systems (TC 3.A.4), which catalyze efflux, or via ABC systems (TC 3.A.1), which catalyze uptake. In the case of chloride, halorhodopsin can utilize light absorption to drive Cl− uptake into the halobacterial cell (92, 144). A single amino acid substitution can convert the outwardly directed proton pump of bacteriorhodopsin into an inwardly directed chloride pump (129, 145). The aspartate-to-threonine substitution at position 85 in bacteriorhodopsin appears to alter both the ion selectivity and the direction of transport. Bacteriorhodopsin and halorhodopsin thus have a common transport mechanism, as expected from their high degree of sequence similarity (52, 67), and a single residue in these proteins strongly influences the ionic specificity.
Table 13 summarizes the varied substrate specificities of ABC permeases. These primary pumps are surprisingly versatile with respect to both the substrate transported and the polarity of pumping. Phylogenetic analyses have revealed that the uptake permeases cluster separately from the efflux permeases (130). ABC transporters can recognize almost any type of substrate that might be of biological interest, regardless of whether it is organic or inorganic, small, intermediate, or large. The architectural basis for this remarkable degree of versatility is likely to prove extremely interesting.
Varied specificities of ABC permeases
Table 14 provides a detailed summary of the pumping activities of well-characterized primary active transporters and group translocators. Although the variation in substrate specificity is extensive, much of this versatility is due to the activities of ABC-type permeases, as noted above. Excluding this one superfamily and the group translocating PTS-type sugar permeases, almost all primary pumps are specific either for inorganic ions or for macromolecules. Macromolecular pumps will be discussed in the next section.
Classification of primary carriers according to substrate specificity (excluding macromolecular transporters)
CELLULAR MACROMOLECULAR EXPORT SYSTEMS
Table 15 tabulates the transport systems that catalyze the export of macromolecules. The majority of these systems utilize ATP hydrolysis to drive transport, but several also appear to exhibit a dependency on the PMF. PMF-dependent exporters for complex carbohydrates may include those of the polysaccharide transporter family, while those for proteins include members of the twin-arginine-targeting family. Bacterial holins and certain channel-forming toxins are probably energy-independent protein exporters and importers, respectively. Bacterial MscL channels and mammalian Bcl-2 channels probably also function by energy-independent mechanisms. While three recognized families participate in polysaccharide export, 13 tabulated families participate in protein transport. The mitochondrial and chloroplast envelope protein transport systems can be thought of either as matrix uptake systems or as cytoplasmic export systems. It should be noted that the protein-specific holins and ABC exporters as well as the diphtheria and the botulinum and tetanus toxin importers are relatively simple in structure. ABC export systems may function with trans-envelope protein complexes (157, 158). The more general systems, which transport many proteins, however, consist of large complexes of multiple protein constituents. Only a single type of export system tabulated, the type IV secretory pathway family (TC 3.A.7), mediates export of nucleoprotein complexes. However, another type of system, the bacterial competence-related DNA transformation transporter family (TC 3.A.11), mediates uptake of naked single-stranded DNA in bacteria competent for natural transformation, and a poorly characterized family of systems, the septal DNA translocator family (TC 9.A.16), may function in the transmembrane transport of double-stranded DNA. 
Two types of active transport systems (ABC and P-type ATPases) are believed to mediate phospholipid flipping from the inner leaflet to the outer leaflet of a biomembrane, although the anion exchanger and RND families of secondary carriers include members that have been reported to do the same (4, 46, 137) (Table 15). These last-mentioned transport systems represent the only examples in which macromolecular export systems are ubiquitous, being found in eukaryotes as well as prokaryotes.
Classification of cellular macromolecular export systems (excluding porins)
CLASSIFICATION OF TRANSPORTERS OF UNKNOWN MECHANISM
Several families of proteins are known in which one or more members have been shown to function as transporters, but either the mode of transport (channel versus carrier) or the energy source driving solute accumulation or expulsion has not been determined. Consequently, it is not possible to assign the transporter family to a defined category (1-4). Such families fall into TC category 9.A. Additionally, families of proteins in which no member of the family has been shown to be a transporter are known, although some indirect experimental evidence, or inferences based on topological analyses and/or operon gene product analyses, supports such a possibility. Such families fall into TC category 9.B. Finally, functionally characterized transporters lacking an identified sequence fall into TC category 9.C. The families listed in these categories will either be transferred to one of the established categories when their transport mechanism becomes defined or be eliminated from the TC system if it is shown that these proteins are not actual transporters. In this section, the families that constitute TC class 9.A will be discussed. Those of classes 9.B and 9.C will not be considered further here.
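Family designations such as TC 9.A.20 or TC 2.A.29 follow a fixed three-part pattern: a class digit, a subclass letter, and a family number. As an aid to readers who wish to sort tabulated families programmatically, the following Python sketch parses that pattern. It is a minimal illustration: the names `TCFamily`, `parse_tc_family`, and `CLASS_LABELS`, as well as the short label strings, are assumptions introduced here and are not part of the TC system itself.

```python
import re
from typing import NamedTuple

# Short labels for the top-level classes mentioned in this review.
# The wording of the labels is an assumption made for illustration.
CLASS_LABELS = {
    1: "channel",
    2: "secondary carrier",
    3: "primary active transporter",
    4: "group translocator",
    9: "incompletely characterized transporter",
}


class TCFamily(NamedTuple):
    tc_class: int   # e.g. 2 in "TC 2.A.29"
    subclass: str   # e.g. "A"
    family: int     # e.g. 29

    @property
    def label(self) -> str:
        return CLASS_LABELS.get(self.tc_class, "unassigned class")


# Accepts "TC 2.A.29" or "2.A.29"; anything else is rejected.
_TC_PATTERN = re.compile(r"^\s*(?:TC\s+)?(\d+)\.([A-Z])\.(\d+)\s*$")


def parse_tc_family(text: str) -> TCFamily:
    match = _TC_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a three-part TC family number: {text!r}")
    tc_class, subclass, family = match.groups()
    return TCFamily(int(tc_class), subclass, int(family))


entry = parse_tc_family("TC 9.A.20")
print(entry, entry.label)
```

Because the class digit is the leading field, a family that is moved from category 9.A to an established category once its mechanism is defined simply receives a new designation; nothing in the parsing changes.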
Table 16 tabulates families of known transporters for which no member has yet been clearly defined in terms of either its mode of transport (channel or carrier) or its energy-coupling mechanism. Many of these permeases belong to families that include members which are specific for inorganic ions. Eleven families are inorganic ion specific, and 10 of these are cation specific. Most of these families include members that are specific for a single ion or a few closely related ions. However, one family (low-affinity cation transporter, TC 9.A.20) transports a variety of cations, exhibiting unexpectedly broad specificity.
Classification of transport systems functioning by an unknown mechanism (class 9.A) according to substrate specificity
Some of the category 9.A permeases (belonging to six distinct families) exhibit specificity for small organic compounds. These compounds vary from amides and amines, including urea and uric acid, to peptides and vitamin precursors. Thus, a variety of organocations, organoanions, and neutral molecules are transported. One family (polysaccharide transporter) transports complex polysaccharides, probably by a PMF-dependent mechanism, but the energy-coupling mechanism is still poorly defined. Considerations to be discussed in the next section allow prediction of the modes of action of several of these systems. Putative transporters (category 9.B) are not discussed here but can be evaluated by consideration of the information provided on our web site.
PREDICTIONS OF TRANSPORT MODE BASED ON PROTEIN TOPOLOGY
Examination of the topologies of families of recognized α-type channels (TC 1.A) and secondary carriers (TC 2.A) reveals that these two functional types of transporters differ fundamentally both in polypeptide structure and in oligomeric composition. This finding suggests that channels and carriers truly represent distinct types of proteins. The structural distinction between the two principal functional types of transporters is evaluated in this section.
As illustrated in Fig. 2, most families of cellular integral membrane α-type channel proteins include members that possess three or fewer TMSs per polypeptide chain (Fig. 2A), while almost all families of secondary carriers include members that possess eight or more TMSs (Fig. 2B). When permease families of unknown transport mode are examined (Fig. 2C), some are found to fall into the 1 to 3 TMS range observed for most channel families, while others fall into the 8 to 14 TMS range observed for most carrier families. It can be anticipated that most of the former proteins will prove to be channels, while the latter will mostly prove to be carriers. The disproportionate number of families of unknown mechanism of action with about 6 TMSs leads to the possibility that new types of transporters, not yet characterized, may be found among these families.
Established or predicted topologies for channel proteins (A), carrier proteins (B), and proteins of unknown transport mode (C). The proteins included in A are the channel proteins of TC category 1.A, while the carriers represented in B are the families of TC category 2.A. Because most primary carriers of category 3 consist of heterooligomers, many of very complex structure, these were not included in the analyses depicted.
Interestingly, very few carriers have been shown to be capable of functioning as channels under any experimental set of conditions. Two of those that do exhibit this unusual property prove to consist of polypeptide chains that have 6 TMSs. Families of such transporters include the mitochondrial carrier family (TC 2.A.29) and the TP-NST family (TC 2.A.50) (17, 28, 29, 132). The E. coli KefB and KefC proteins of the CPA2 family (TC 2.A.37) also seem to have the capacity to function either by a carrier-type K+:H+ antiport mechanism or by a K+-specific channel-type mechanism (38, 39). Proteins of this last-mentioned family exhibit 10 to 14 putative TMSs and therefore have the topology of a typical carrier. They exhibit channel-type activities following treatment with certain chemicals. Ambivalent modes of transport for members of a few other secondary carrier families have also been noted (see reference 121 for further consideration of this point).
On the basis of all of the observations summarized in this section, we propose that, with only a few exceptions, channel proteins are structurally and functionally different from carriers. Channels are proposed to generally consist of oligomeric structures in which the monomeric protein subunits exhibit ≤3 TMSs. Some exceptions to this rule have resulted from the fusion of non-transport-regulatory domains to the channel-forming constituents of transporters (88). Regardless of topology, however, the channel generally results from the proper association of multiple channel-forming subunits or domains. Carriers, on the other hand, are proposed to generally consist of functional monomers that exhibit 8 to 14 TMSs or, less frequently, of functional dimers that have 4 to 7 TMSs. In these situations, the transport pathway requires the participation of just one or, at most, two polypeptide chains. The numbers of known exceptions to this topological rule are small (Fig. 2).
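The topological rule proposed above can be restated as a simple lookup on the number of TMSs per polypeptide chain. The following Python sketch is offered only as an illustration of that rule: the function name and the returned labels are assumptions introduced here, the thresholds are those stated in the text (3 or fewer TMSs, 4 to 7 TMSs, and 8 to 14 TMSs), and the known exceptions, such as channels bearing fused regulatory domains or carriers that display channel-type activity, are deliberately not modeled.

```python
def predict_transport_mode(tms_per_chain: int) -> str:
    """Heuristic guess of transport mode from TMS count per chain.

    This encodes the general rule only; it is a prediction aid,
    not a substitute for functional characterization.
    """
    if tms_per_chain < 1:
        raise ValueError("an integral membrane protein needs at least 1 TMS")
    if tms_per_chain <= 3:
        # Channels: oligomers of subunits with three or fewer TMSs.
        return "channel (oligomeric)"
    if tms_per_chain <= 7:
        # Less frequent carrier arrangement; families with about 6 TMSs
        # may also include transporter types not yet characterized.
        return "carrier (possibly dimeric) or uncharacterized type"
    if tms_per_chain <= 14:
        # Typical carriers: functional monomers with 8 to 14 TMSs.
        return "carrier (monomeric)"
    return "outside the ranges covered by the rule"


for n in (2, 6, 12):
    print(n, predict_transport_mode(n))
```

Applied to a family of unknown mechanism, such a lookup yields only a working hypothesis of the kind discussed for Fig. 2C.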
RECOGNIZED DISTRIBUTION OF TRANSPORTER FAMILIES IN THE THREE DOMAINS OF LIFE
Our studies of the distribution of proteins within the various families of transporters have revealed that most families are restricted to just one of the major domains of life, bacteria, archaea, or eukarya. Other families are ubiquitous, being found in all three domains. If lateral transfer and fixation of genetic material occurred appreciably between these three domains during the past two billion years, one would expect many families to be ubiquitous. Our observations have therefore led to the suggestion that the ubiquitous families are among the oldest families and that they existed before divergence of the three major domains of organisms, some three billion years ago. The domain-specific families are therefore those that arose late, after the “great split.” Alternatively, some of these families may have diverged in sequence from their ancestral system at rates that exceed those observed for the recognized ubiquitous families. Even if this occurred, however, the lack of recognizable homologs in the other domains suggests the absence of appreciable lateral transfer.
Table 17 summarizes the distribution of the identified families of various channel types, secondary carriers, primary carriers, group translocators, and transporters of unknown mechanism in the three domains of life: bacteria, archaea, and eukaryotes. Regardless of transporter category, the distribution is similar. Thus, many families of transport systems are found exclusively in either bacteria or eukaryotes, and four have been identified only in archaea. Many of these families may prove to exist in only one of the three major domains of life, and most such families probably arose within that domain after the three domains of living organisms separated from each other. It is also possible that some ancient families that existed prior to the divergence of archaea and eukaryotes from bacteria will prove to be restricted to just one or two domains because a particular transport mode or energy-coupling mechanism is incompatible with (or disadvantageous to) the organisms within a particular domain. It is particularly noteworthy in this regard that although hundreds of genes of the PTS (TC 4.A) have been sequenced from bacteria and many of the genes encoding the cytoplasmic constituents of the PTS function in regulation rather than in transport, not a single such gene has yet been found within an archaeal or eukaryotic genome (J. Reizer and M. H. Saier, Jr., unpublished results). Similarly, although ABC-type efflux pumps are universal, extracytoplasmic receptor-dependent ABC-type uptake permeases (TC 3.A.1) as well as receptor-dependent tripartite ATP-independent periplasmic transporter-type uptake permeases (TC 2.A.56) are found only in prokaryotes (107). These observations have led us to conclude that horizontal transmission and fixation of genetic material across the three domains of life has occurred rarely, at least in the case of genes encoding many types of transporters, during the past two billion years.
Distribution of transporter families in the three domains of life
Several families are found ubiquitously in all three domains of living organisms or are found in at least two of these domains (Table 17). We predict that many (but not necessarily all) of the latter families will prove to be represented in all three domains. The lower representation of transporter types in the archaeal domain presumably reflects, at least in part, the paucity of both sequence data and functional analyses reported for this domain. It should be noted that the vast majority of ubiquitous families (about two-thirds) are families of secondary carriers. The distribution of transporter types and the identification of the relevant families are presented in Table 18 for the various channel types, in Table 19 for the various carrier families, and in Table 20 for the various types of primary active transporters.
Kingdom distribution of channel families: α-type (1.A), porins (1.B), and toxins (1.C)
Kingdom distribution of secondary carrier families (TC 2.A)
Kingdom distribution of primary active transporter families (including group translocators)
Several interesting conclusions derived from the data in Tables 17 to 20 can be tentatively made. First, of the families of channels, only three families (MIP, VIC, and ClC) are ubiquitous. A fourth such family may prove to be the metal ion transporter family (TC 9.A.17). Second, except for these families and two families (MscL and MscS) specific to bacteria, all families of α-type protein channels are found exclusively in eukaryotes. Third, the vast majority of these eukaryotic families are restricted to animals. Finally, the vast majority of protein and peptide toxin families, holin families, and β-strand-type porin families are restricted to bacteria. In the case of the toxin families, this unequal distribution may reflect, at least to some extent, my greater focus (and that of research scientists in general) on bacterial toxins rather than those of eukaryotes.
The distribution of secondary carriers is not so polarized. Thus, 31 carrier families (40%) are ubiquitous, compared to 25 (32%) and 14 (19%) that are specific to bacteria and eukaryotes, respectively. Only eight (10%) are found in two of the three kingdoms. If lateral transfer of genetic material coding for transporters has been minimal, as we have proposed (119, 120), then it would appear that a large proportion of the secondary carrier families came into existence early, before the split between the three domains, compared to channel or primary carrier families.
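The shares quoted above can be recomputed from the counts as a quick arithmetic check. The short Python sketch below assumes that the four categories together account for all secondary carrier families considered; the dictionary keys and the function name `shares` are illustrative only. Values recomputed in this way may differ by a percentage point from those quoted in the text because of rounding.

```python
# Counts of secondary carrier families by domain distribution, as given in the text.
carrier_counts = {
    "ubiquitous": 31,
    "bacteria only": 25,
    "eukaryotes only": 14,
    "two of three domains": 8,
}


def shares(counts):
    """Return each category's share of the total, as a percentage."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


total = sum(carrier_counts.values())
percentages = shares(carrier_counts)
for name, pct in percentages.items():
    # One decimal place is kept so that rounding effects remain visible.
    print(f"{name}: {carrier_counts[name]} of {total} ({pct:.1f}%)")
```

The same function can be applied to the channel and primary carrier counts of Tables 18 and 20 once those counts are tabulated in the same form.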
Primary carrier families are found solely in bacteria, ubiquitously, and in bacteria plus eukaryotes, in decreasing numbers in that order. The relatively large percentage of systems in the last category is due to the presence of three families of H+-pumping electron or hydride-transferring carriers that are found only in mitochondria and/or chloroplasts of eukaryotes in addition to bacteria. Since both of these eukaryotic organelles are believed to have arisen from bacteria long after the split between bacteria and eukaryotes (93), the actual proportion of primary active transporter families specific to bacteria may be considered substantially greater, while that of carriers shared by bacteria and eukaryotes may be smaller (see lightface values in parentheses in Table 17). Finally, one family, the fungal-archaeal rhodopsin family (TC 3.E.1), is unusual in that although these light-driven ion transporters are restricted to one small group of archaea, homologs that may not function in transport are found in yeasts and other fungi. This may represent one of those rare examples where distant homologs of a transporter family have evolved to serve very different functions (see below). The results summarized in Tables 18 to 20 provide a detailed breakdown of channels, secondary carriers, and primary carriers, respectively, and the TC numbers of the families in each category are provided so that the reader can easily identify the relevant families.
It has been noted that archaeal metabolic enzymes and transporters frequently resemble the homologous bacterial sequences more than those of the corresponding eukaryotic proteins, although archaeal proteins of DNA replication, transcription, and translation are more similar to those of eukaryotes (20, 34). This observation has been interpreted to suggest that archaea are mosaic organisms, with nucleic acid and protein-biosynthetic enzymes derived primarily from an early eukaryotic precursor cell, while transport and metabolic functions are derived primarily from a primordial bacterium. If such a “fusion” event was responsible for the generation of the archaeal lineage, a significant number of transporter families should prove to be restricted to bacteria and archaea but lacking in eukaryotes. The availability of four complete archaeal genome sequences has allowed resolution of this question. Of the 200 families represented in Table 17, only 7 (3.5%) are shared by bacteria and archaea but not by eukaryotes. Similarly, very few families are represented in bacteria and eukaryotes but not archaea. Moreover, some of these last-mentioned families are represented only in eukaryotic organelles, suggesting a more recent bacterial origin. Thus, very few families may prove to be restricted to just two of the three domains of living organisms. An alternative view concerning the origin of archaea, such as that proposed recently by Poole et al. (103), may be worth considering.
TRANSPORT PROTEINS FOR WHICH THREE-DIMENSIONAL STRUCTURAL DATA ARE AVAILABLE
An ultimate understanding of transport will depend upon detailed structural data for each of the major classes of transport systems. Until recently, few or no such data were available. The approach of X-ray crystallography has yielded very significant advances in understanding the three-dimensional structures of certain classes of integral membrane proteins. Most of these proteins are of prokaryotic origin, and they do not yet include the major classes represented by secondary carriers, group translocators, and ATP-driven primary pumps. However, channel-type proteins and both light- and electron flow-driven proton pumps are now structurally understood at high resolution.
Table 21 lists the transport proteins for which high-resolution three-dimensional structural data are available. Four types of channel proteins (α-helix-forming channels, β-barrel porins, peptide channels, and protein toxin channels) are represented, as are electron flow-driven and light absorption-driven proton pumps. The structures of these and other membrane proteins have been discussed by Sakai and Tsukihara (128). The fact that no chemically driven primary carriers, no facilitators or secondary carriers, and no group translocators are represented means that we are currently far from a structural understanding of transport. Although the structures of several water-soluble domains (receptors or energy-coupling proteins) of some of these systems have been determined (i.e., ABC-type receptors, the transhydrogenase hydride transfer domains and pumps, and the energy-coupling proteins of the PTS) (105, 106, 126), the structures of the integral membrane constituents of these systems are still unsolved. In fact, high-resolution structures are not available for a single transport system within one of these categories, even though these types of transporters represent the major types found in nature. Much work will be required before molecular transport can be put on a firm structural basis.
Transporters for which three-dimensional structural data have been reported
TRANSPORTER FAMILIES INCLUDING NONTRANSPORTING HOMOLOGS
Of the currently recognized 250-plus families of established transporters, we have noted that only 7 include transmembrane proteins that have been shown to function in a capacity other than transport. Of these seven families, four include homologs that are believed to serve as receptors (Table 22). In the case of the ammonium transporter family of NH3 (or NH4+) transporters, a yeast homolog, Mep2p, acts as both a sensor and a transporter. In the MFS and amino acid-polyamine-organocation superfamilies, the putative transcriptional regulatory sensors have not been shown to be incapable of transporting their ligands, although the available evidence argues against such transport (57, 65). In the case of the MFS receptors, protein domains that interfere with transport function may be required to convert a transporter into a signaling receptor (57, 65). This scenario is reminiscent of the sensory rhodopsins, for which interaction with transducer proteins blocks proton transport (159). In the case of the RND superfamily, a homologous integral membrane domain serves as a sterol-binding domain, and this domain is found in several receptors, and even an enzyme, 3-hydroxy-3-methylglutaryl (HMG)-coenzyme A reductase (Table 22). This provides one of the best-documented examples of a family of transporters that has truly diverged in function. As noted above, the established bacteriorhodopsin family includes sensory rhodopsins that mediate phototaxis as well as homologs in S. cerevisiae that probably do not function in transport (52, 67, 159). Indirect evidence suggests that most of these yeast proteins are integral membrane heat shock or organic solvent shock proteins (53, 108). They lack the conserved lysine to which retinal binds in Schiff's base linkage in the archaeal proteins. However, a homologous retinal-containing photoreceptor has recently been identified in Neurospora crassa (9). It is possible that the fungal chaperone proteins contain noncovalent retinal and/or energize protein folding by catalyzing proton transport through themselves.
Transporter families including nontransporter homologs
Finally, water-soluble constituents, and possibly also the integral membrane transporter domains of both the ABC and PTS superfamilies, have been shown to function in various nontransport capacities (50, 126, 136). Thus, the extracytoplasmic receptors of ABC permeases have homologs that are domains within bacterial transcription factors (90) as well as eukaryotic neurotransmitter receptors of the glutamate-gated ion channel family (TC 1.7) (87, 141). Similarly, a few homologs of the ATP-hydrolyzing ABC proteins function in catalysis of bacterial cytoplasmic processes, and some ABC permease homologs function in regulation of other transporters (21). Bacterial PTS II.A proteins and protein domains function in regulatory processes, sometimes in addition to their transport functions and sometimes instead of their transport function (126). A few PTS transporters (II.C constituents) also serve as sensory transducers (24, 71, 78). Nevertheless, it seems surprising that so few transporter homologs function in a nontransport capacity. This fact greatly facilitates the annotation and functional assignment of putative proteins whose sequences are (or will be) revealed by genome sequencing. It also suggests that transporters evolved as a class of proteins independently of other protein types, such as enzymes, structural proteins, and regulatory proteins.
AUXILIARY TRANSPORT PROTEINS
Proteins that in some way facilitate transport across one or more biological membranes but do not themselves participate directly in transport are classified as auxiliary proteins (Table 23). These proteins by definition always function in conjunction with one or more transport proteins. They may provide a function connected with energy coupling to transport, play a structural role in complex formation, or serve a regulatory function (see section 8.A in Table 2). Examples include the membrane fusion proteins (TC 8.A.1), which provide a periplasmic bridge between primary, energy-coupled efflux permeases in the cytoplasmic membranes of gram-negative bacteria and outer membrane factors (TC 1.B.17), which provide porin-type channel functions across the outer membranes (63, 99, 138, 157, 158). Membrane fusion protein family proteins allow solute export across both membranes of the gram-negative bacterial cell in a single energy-coupled step (10, 30, 73).
Families of auxiliary transport proteins
Other proteins that span the cytoplasmic membrane with large domains in the extracytoplasmic space of the gram-negative or gram-positive bacterial cell and sometimes function with additional cytoplasmic domains include members of the cytoplasmic membrane-periplasmic auxiliary 1 (MPA1; TC 8.A.3) and MPA2 (TC 8.A.4) families (33, 98, 152). These proteins are believed to function directly in export and possibly also in the regulation of complex carbohydrate export by virtue of the protein tyrosine kinase activities that are associated with their cytoplasmic domains (147).
A most interesting set of auxiliary transport proteins is the TonB family (TC 2.C.1) of heterotrimeric protein complexes that allow transmission of energy in the form of the PMF across the inner membranes of gram-negative bacteria to energize uptake of iron-siderophore complexes and vitamin B12 across outer membranes via proteins of the outer membrane receptor (TC 1.B.14) family. The latter proteins exhibit structural features superficially resembling those of outer membrane porins (37, 74, 99). However, they differ from typical porins in being monomeric and exhibiting 22 antiparallel β-strands in the β-barrel structure. The heterooligomeric TonB-ExbBD complex may prove to transport protons, explaining their capacity to respond to the PMF. Limited sequence similarity of these proteins to the MotAB proteins (TC 1.A.45) (A. Lupas, personal communication) further suggests this possibility.
Other auxiliary proteins include the energy-coupling proteins of the bacterial phosphoenolpyruvate-dependent sugar-transporting PTS (categories 4.A.1 to 4.A.6). Enzymes I and HPr proteins (TC 8.A.7 and 8.A.8, respectively) serve as phosphoryl transfer proteins, thereby providing both energy-coupling and enzyme-catalytic functions (104, 111). The enzymes I are homologous to phosphoenolpyruvate synthases and pyruvate:phosphate dikinases that normally function in phosphoenolpyruvate synthesis (117). We have suggested that the PTS evolved relatively late and depended on the conversion of preexisting phosphoenolpyruvate synthases into phosphoenolpyruvate-dependent phosphoryl transfer enzymes of the PTS (112).
Finally, many proteins are clearly implicated in transport, but they appear to play indirect and ill-defined roles in the process. These proteins include the rBAT (TC 8.A.9) and MinK (TC 8.A.10) family members. rBAT and MinK are believed to function in conjunction with amino acid carriers and potassium ion channels, respectively (35, 77, 116, 146). They may play roles in stability and subcellular targeting.
Many additional auxiliary proteins are included in the tables describing porters of TC categories 1 to 4. Because of their tight association with particular transport systems, they are described as constituents of these systems rather than as auxiliary proteins of the 8.A class.
CONCLUSIONS AND PERSPECTIVES
In this article I have described a comprehensive classification system for transport proteins that has the theoretical potential to include all transmembrane transport systems found in all living organisms on Earth. We have attempted to design this system so that it can accommodate new information and incorporate new systems as these become available with minimal alteration in structure. We have designated this system the transporter classification (TC) system of the Transport Commission of the IUBMB. This system is based on a combination of functional and phylogenetic characteristics of transporters and their constituents. The incorporation of phylogenetic data is a departure from the classification system devised by the Enzyme Commission years ago for the classification of enzymes, but the use of phylogenetic information provides many advantages, as discussed in the introductory section. Thus, phylogeny provides the most reliable guide to structure, function, and mechanism, and it provides valuable information concerning the evolutionary history of a family. The TC system should be capable of incorporating any novel type of molecular transporter that may be discovered in the future as well as the ever-increasing numbers of novel transporters that fall into existing families. Rules have been presented that allow the systematic consolidation of families as evolutionary links between them become available. Our goal is to eventually automate the incorporation of novel transporters into the system without (or with minimal) human intervention. Since the classification system is based on both function and phylogeny, achievement of this goal will require automation of tree construction as each new sequence becomes available in public databases, as well as the incorporation of biochemical, genetic, and physiological data as these become part of the scientific literature. 
As additional genomes are sequenced, the achievement of this goal will also require that screening techniques and annotation of novel families and family members be streamlined. In conjunction with Andrei Lupas and the bioinformatics group at SmithKline-Beecham (5), automation is now being implemented. Our web site will soon serve as a search tool that, for the analysis of transport proteins, will hopefully prove to be as useful as the BLAST search tools of the National Center for Biotechnology Information. Continual revamping of in silico methods for achieving these goals represents a major challenge that will require cooperation on the parts of computational scientists, molecular biologists, and cell physiologists. Exactly how these goals should best be achieved cannot easily be anticipated, as they are likely to be tightly coupled to technological advances through the years.
Currently recognized transporters include simple proteins as well as large multisubunit complexes that either facilitate passive diffusion of molecules across membranes or use one or more types of energy to drive transport. A large number of potential energy-yielding reactions have already been shown to be coupled to transport. These include several distinct chemical reactions, such as bond breakage reactions (e.g., decarboxylation and pyrophosphate bond hydrolysis), chemical group transfer reactions (e.g., hydride and methyl transfer), and electron flow. In addition, light absorption and the flow of ions down electrochemical gradients can be used to drive transport. In the reverse direction, ion transport can function to drive flagellar rotation, ATP synthesis, or active transport across the outer membranes of gram-negative bacteria. Variations on the established themes as well as entirely new themes are likely to be revealed by the efforts of future investigators. Perhaps, as three-dimensional structural data become available for the major classes of primary and secondary active carriers as well as group translocators, we will be able to delineate the mechanistic details of these processes. As totally new transport modes, not yet imagined, may be revealed, the transport biologist has exciting new discoveries to look forward to. The classification system proposed here, based on both function and phylogeny, is designed to accommodate any such discoveries and will hopefully aid in delineating the applicability of the structural, mechanistic, and evolutionary principles established with a few model systems to the hundreds of transporter types currently recognized and yet to be discovered.
ACKNOWLEDGMENTS
I wish to acknowledge valuable discussions with R. Apweiler, A. Bairoch, A. Goffeau, A. Kotyk, A. Lupas, H. Nikaido, I. T. Paulsen, J. Reizer, J. Schroeder, M. K. Sliwinski, and T.-T. Tseng. I am particularly grateful to Milda Simonaitis, Donna Yun, Monica Mistry, Francisco Solis and Mary Beth Hiller for their assistance with the preparation of the manuscript. Finally, I am indebted to the many students in my laboratory who conducted phylogenetic analyses, designed novel software, constructed our web site, and provided me with an unlimited source of information and inspiration. Without their invaluable participation, the classification system described in this review would never have been formulated.
Work in my laboratory was supported by USPHS grants 5RO1 AI21702 from the National Institute of Allergy and Infectious Diseases and 9RO1 GM55434 from the National Institute of General Medical Sciences, as well as by the M. H. Saier, Sr., Memorial Research Fund.
I wish to dedicate this treatise to my mother, Lucelia Bates Saier, in gratitude for her love, encouragement, confidence, and support.
ADDENDUM IN PROOF
Our recent unpublished results have defined a novel superfamily of secondary carriers consisting of 13 families. We have designated this superfamily the drug/metabolite transporter (DMT) superfamily (TC no. 2.A.7) (D. L. Jack and M. H. Saier, Jr., unpublished data). Some of the families included in the DMT superfamily had been included in our previous TC system, but others were previously unrecognized. The 13 currently recognized families of the DMT superfamily are as follows:
2.A.7.1— the 4 TMS small multidrug resistance (SMR) family (previously the SMR family, 2.A.7)
2.A.7.2— the 5 TMS bacterial/archaeal transport (BAT) family (previously unrecognized)
2.A.7.3— the 10 TMS drug/metabolite exporter (DME) family (previously the CAAT family, 2.A.78)
2.A.7.4— the 10 TMS plant carboxylate/amine transporter (P-CAT) family (previously unrecognized)
2.A.7.5— the 10 TMS glucose/ribose uptake (GRU) family (previously part of the RhaT family, 2.A.9)
2.A.7.6— the 10 TMS L-rhamnose transporter (RhaT) family (previously part of the RhaT family, 2.A.9)
2.A.7.7— the 10 TMS RarD (RarD) family (previously unrecognized)
2.A.7.8— the 10 TMS Caenorhabditis elegans ORF (CEO) family (previously unrecognized)
2.A.7.9— the 6-8 TMS triose-phosphate transporter (TPT) family (previously part of the TP-NST family, 2.A.50)
2.A.7.10— the 10-12 TMS UDP-N-acetylglucosamine:UMP antiporter (UAA) family (previously part of the TP-NST family, 2.A.50)
2.A.7.11— the 10-12 TMS UDP-galactose:UMP antiporter (UGA) family (previously part of the TP-NST family, 2.A.50)
2.A.7.12— the 10-12 TMS CMP-sialate:CMP antiporter (CSA) family (previously part of the TP-NST family, 2.A.50)
2.A.7.13— the 10 TMS GDP-mannose:GMP antiporter (GMA) family (previously part of the TP-NST family, 2.A.50)
As homology has been established for all of these members of the DMT superfamily, they will be included under TC entry 2.A.7, and TC entry numbers 2.A.9, 2.A.50, and 2.A.58 will be assigned to other families of secondary carriers (see our website).
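The hierarchical TC numbers used in the list above lend themselves to mechanical parsing. The following Python sketch, offered as an illustration rather than as part of the TC system itself, splits an identifier such as 2.A.7.1 into its components and tests whether an entry falls under a given family number such as 2.A.7. The names `TCNumber`, `parse_tc`, and `in_family` are hypothetical, as is the field name `entry` for the fifth component seen in identifiers such as 1.B.6.1.3.

```python
from typing import NamedTuple, Optional


class TCNumber(NamedTuple):
    tc_class: int                    # e.g. 2 (the leading digit of 2.A.7.1)
    subclass: str                    # e.g. "A"
    family: int                      # e.g. 7
    subfamily: Optional[int] = None  # e.g. 1 in 2.A.7.1
    entry: Optional[int] = None      # hypothetical name for a fifth component


def parse_tc(text: str) -> TCNumber:
    """Parse a dotted TC number with three to five components."""
    parts = text.strip().split(".")
    if not 3 <= len(parts) <= 5:
        raise ValueError(f"expected 3 to 5 components, got {text!r}")
    subclass = parts[1]
    if len(subclass) != 1 or not subclass.isalpha():
        raise ValueError(f"subclass must be a single letter in {text!r}")
    # int() raises ValueError for non-numeric components.
    numbers = [int(p) for p in (parts[0], *parts[2:])]
    tc_class, family, *rest = numbers
    rest += [None] * (2 - len(rest))  # pad missing subfamily/entry
    return TCNumber(tc_class, subclass, family, rest[0], rest[1])


def in_family(member: str, family: str) -> bool:
    """True if `member` shares class, subclass, and family with `family`."""
    return parse_tc(member)[:3] == parse_tc(family)[:3]


if __name__ == "__main__":
    for tc in ("2.A.7.1", "2.A.7.13", "2.A.50"):
        print(tc, parse_tc(tc), in_family(tc, "2.A.7"))
```

Comparing only the first three components reflects the consolidation described here, in which every constituent family of the DMT superfamily is listed under entry 2.A.7.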
Recently, UreI of Helicobacter pylori (spQ09068) was functionally characterized (D. L. Weeks, S. Eskandari, D. R. Scott, and G. Sachs, Science 287:482–485, 2000). UreI (and AmiS of Pseudomonas aeruginosa [spQ51417]) are members of the putative amide transporter (Ami) family, previously designated TC no. 9.A.15 (Tables 2 and 3). Members of this family were known to be encoded within operons that also encode amidases and ureases, and consequently these proteins were assumed to transport urea and short-chain aliphatic amides such as acetamide (S. A. Wilson, R. J. Williams, L. H. Pearl, and R. E. Drew, J. Biol. Chem. 270:18818–18824, 1995). Weeks et al. have shown that UreI of H. pylori, a 6 TMS protein of 195 amino acyl residues, forms an H+-gated urea channel. A histidyl residue (His 123), localized to a periplasmic loop of the protein, is essential for H+ stimulation of channel activity. UreI-mediated urea transport is urea specific, passive, nonsaturable, relatively temperature independent, and nonelectrogenic. It is the H+-gated urea channel that regulates cytoplasmic urease, the enzyme that allows survival and colonization of the stomach by H. pylori. The Ami family (TC no. 9.A.15 in Tables 2 and 3) has therefore been renamed the urea/amide channel (UAC) family and assigned TC no. 1.A.45. The TC number of the Mot family has been changed from 1.A.45 to 1.A.46.
A.-M. Marini, J.-Y. Springael, W. B. Frommer, and B. André (Mol. Microbiol. 35:378–385, 2000) have recently provided convincing evidence that the soybean SAT1 protein, which had been characterized as an NH4+ channel on the basis of its ability to complement an NH4+ transport defect in a mutant strain of Saccharomyces cerevisiae, is not in fact an NH4+ channel protein but instead is probably a transcription factor. SAT1 apparently restores NH4+ uptake in the yeast mutant strain by interfering with inhibition of one of the three NH4+ transporters of S. cerevisiae, Mep3 (Marini et al.). Mep3 is a member of the ammonium transporter (Amt) family (TC no. 2.A.49). TC no. 1.A.26 is therefore no longer assigned to the SAT family and has been reassigned to the plant plasmodesmata (PPD) family (see our website).
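The renumbering described in this addendum can be recorded as a simple lookup from old to new TC numbers. The Python sketch below is illustrative only; the names `RENUMBERED` and `update_tc` are hypothetical. Each old number is translated in a single step rather than by following chains, because 1.A.45 is both the new number of the UAC family and the former number of the Mot family.

```python
# Old TC number -> (new TC number, family abbreviation), as stated in the addendum.
# TC 1.A.26 is not listed: the number itself is unchanged and has only been
# reassigned from the SAT family to the PPD family.
RENUMBERED = {
    "9.A.15": ("1.A.45", "UAC"),  # Ami family, renamed the urea/amide channel family
    "1.A.45": ("1.A.46", "Mot"),  # Mot family, moved from 1.A.45 to 1.A.46
}


def update_tc(old_number: str) -> str:
    """Translate a pre-addendum TC number in one step (no chained lookups)."""
    new_number, _family = RENUMBERED.get(old_number, (old_number, None))
    return new_number


if __name__ == "__main__":
    for old in ("9.A.15", "1.A.45", "2.A.49"):
        print(old, "->", update_tc(old))
```

Following the mapping repeatedly would wrongly send the former Ami family (9.A.15) on to 1.A.46, which is why the lookup is applied exactly once per identifier.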
Considerable evidence is accumulating for the presence of multiple porins in the outer mycolate-containing membranes of certain high-G+C gram-positive bacteria. These bacteria include Mycobacterium tuberculosis (B. Kartman, S. Stengler, and M. Niederweis, J. Bacteriol. 181:6543–6546, 1999; R. Senaratne et al., J. Bacteriol. 180:3541–3547, 1998), Mycobacterium smegmatis (M. Niederweis et al., Mol. Microbiol. 33:933–945, 1999; C. Raynaud et al., Microbiology 145:1359–1367, 1999), Mycobacterium bovis (T. Lichtinger et al., FEBS Lett. 454:349–355, 1999), Nocardia farcinica (F. G. Riess et al., Mol. Microbiol. 29:139–150, 1998), Nocardia asteroides (F. G. Riess et al., Arch. Microbiol. 171:173–182, 1999), and Rhodococcus erythropolis (T. Lichtinger, G. Reiss, and R. Benz, J. Bacteriol. 182:764–770, 2000). One of these proteins is the OmpATb protein of M. tuberculosis, which has been reported to be a member of the OmpA-OmpF porin (OOP) family (TC no. 1.B.6.1.3; see our website); MspA of M. smegmatis, another such protein, is a member of a novel family which we have called the mycobacterial porin (MBP) family (TC no. 9.B.24) (M. Niederweis et al., 1999). A third such protein is a partially sequenced protein from Rhodococcus erythropolis which we have provisionally referred to as the R. erythropolis porin (REP; TC no. 9.C.3) (Lichtinger et al., 2000). The partial sequence available for the latter protein does not exhibit significant similarity to any sequence in the current databases.
The available sequence data suggest that the outer membrane porins of gram-positive bacteria will prove to belong to several distinct families. Although the few fully sequenced proteins currently available from mycolate-containing membranes have been placed under category 1.B (β-barrel porins), it should be noted that structural data are not yet available for any of these proteins. Consequently, they may prove to be more appropriately assigned to a different category in the future.
Copyright © 2000 American Society for Microbiology