Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, pathways, authors, ontology terms, etc. (e.g. eve, embryo, zen, allele)
  • Use OR to search for either of two terms (e.g. fly OR drosophila) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean search syntax is supported: e.g. dros* for partial matches or fly AND NOT embryo to exclude a term

Search results 201 to 300 out of 38750 for *

Category restricted to ProteinDomain (x)

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Chromo domain
Type: Domain
Description: The CHROMO (CHRomatin Organization MOdifier) domain is a conserved region of around 60 amino acids, originally identified in Drosophila modifiers of variegation. These are proteins that alter the structure of chromatin to the condensed morphology of heterochromatin, a cytologically visible condition where gene expression is repressed. In one of these proteins, Polycomb, the chromo domain has been shown to be important for chromatin targeting. Proteins that contain a chromo domain appear to fall into three classes. The first class includes proteins having an N-terminal chromo domain followed by a region termed the chromo shadow domain, with weak but significant sequence similarity to the N-terminal chromo domain, e.g. Drosophila and human heterochromatin protein Su(var)205 (HP1). The second class includes proteins with a single chromo domain, e.g. Drosophila protein Polycomb (Pc), mammalian modifier 3, human Mi-2 autoantigen and several yeast and Caenorhabditis elegans hypothetical proteins. In the third class, paired tandem chromo domains are found, e.g. in mammalian DNA-binding/helicase proteins CHD-1 to CHD-4 and yeast protein CHD1.
Functional dissection of chromo domain proteins suggests a mechanistic role for chromo domains in targeting these proteins to specific regions of the nucleus. The mechanism of targeting may involve protein-protein and/or protein-nucleic acid interactions. Several lines of evidence show that the HP1 chromo domain is a methyl-specific histone binding module, whereas two protein components of the Drosophila dosage compensation complex, MSL3 and MOF, contain chromo domains that bind to RNA in vitro.
The high-resolution structures of HP1-family chromo and chromo shadow domains reveal a conserved chromo domain fold consisting of three β-strands packed against an α-helix. The chromo domain fold belongs to the OB (oligonucleotide/oligosaccharide binding)-fold class found in a variety of prokaryotic and eukaryotic nucleic acid binding proteins.
Protein Domain
Name: Retrotransposon gag domain
Type: Domain
Description: Transposable elements (TEs) promote various chromosomal rearrangements more efficiently, and often more specifically, than other cellular processes. Retrotransposons are structurally similar to retroviruses and are bounded by long terminal repeats. This entry represents eukaryotic Gag or capsid-related retrotransposon-related proteins, including retrotransposon-derived protein PEG10 from Mus musculus, which binds its own mRNA and self-assembles into virion-like capsids. There is a central motif, QGXXEXXXXXFXXLXXH, that is common to Retroviridae Gag proteins but is poorly conserved.
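As an illustration of how the central motif quoted above could be located in a protein sequence, here is a minimal Python sketch. It assumes that X stands for any residue and, because the entry describes the motif as poorly conserved, it optionally tolerates a few mismatches at the fixed positions. The function name, the mismatch parameter and the example sequence are hypothetical and are not part of the database entry.

```python
# Motif as quoted in the entry; X is assumed to mean "any residue".
GAG_MOTIF = "QGXXEXXXXXFXXLXXH"


def find_motif(sequence, motif=GAG_MOTIF, max_mismatches=0):
    """Yield (start, window, mismatches) for every window of the sequence
    whose fixed (non-X) motif positions differ in at most max_mismatches
    places. Start positions are 0-based."""
    fixed = [(i, ch) for i, ch in enumerate(motif) if ch != "X"]
    width = len(motif)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        mismatches = sum(1 for i, ch in fixed if window[i] != ch)
        if mismatches <= max_mismatches:
            yield start, window, mismatches


if __name__ == "__main__":
    # Made-up sequence for demonstration only, not a real Gag protein.
    toy = "MKQGAAEAAAAAFAALAAHKK"
    for hit in find_motif(toy):
        print(hit)
```

Raising max_mismatches trades specificity for sensitivity; any threshold used in practice would need to be checked against known family members.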
Protein Domain
Name: Integrase, catalytic core
Type: Domain
Description: The retroviral integrase is the enzyme responsible for the insertion of a DNA copy of the viral genome into host DNA, an essential step in the replication cycle of these viruses. Integrases comprise three functional and structural domains: an N-terminal zinc finger, a central core domain that contains the catalytic residues, and a C-terminal DNA-binding domain.
The integrase catalytic domain catalyses a series of reactions to integrate the viral genome into a host chromosome. In the first step, it removes two 3' end nucleotides from each strand of the linear viral DNA, leaving overhanging CA-OH ends. In the second step, the processed 3' ends are covalently joined to the 5' ends of the target DNA. In the third step, which probably involves additional cellular enzymes, unpaired nucleotides at the viral 5' ends are removed and the ends are joined to the target site 3' ends, generating an integrated provirus flanked by five base-pair direct repeats of the target site DNA.
The crystal structure of the catalytic domain shows a dimeric structure, with each monomer containing a five-stranded β-sheet and six α-helices. This fold is characteristic of the polynucleotidyltransferase superfamily, whose members include RNase H, the bacteriophage Mu transposase, and the E. coli Holliday junction resolving enzyme, RuvC. The catalytic domain of integrase contains the DD35E triad motif. As in other DNA-binding proteins containing this motif, these acidic residues coordinate a divalent Mg2+ ion in the resting enzyme. Substituting any one of these residues abolishes both the processing and integration activities of integrase.
The integrase catalytic domain is also found in various transposase proteins.
Protein Domain
Name: Leucine-rich repeat, typical subtype
Type: Repeat
Description: Leucine-rich repeats (LRR) consist of 2-45 motifs of 20-30 amino acids in length that generally fold into an arc or horseshoe shape. LRRs occur in proteins ranging from viruses to eukaryotes, and appear to provide a structural framework for the formation of protein-protein interactions.
Proteins containing LRRs include tyrosine kinase receptors, cell-adhesion molecules, virulence factors, and extracellular matrix-binding glycoproteins, and are involved in a variety of biological processes, including signal transduction, cell adhesion, DNA repair, recombination, transcription, RNA processing, disease resistance, apoptosis, and the immune response.
Sequence analyses of LRR proteins suggested the existence of several different subfamilies of LRRs. The significance of this classification is that repeats from different subfamilies never occur simultaneously and have most probably evolved independently. It is, however, now clear that all major classes of LRR have curved horseshoe structures with a parallel beta sheet on the concave side and mostly helical elements on the convex side. At least six families of LRR proteins, characterised by different lengths and consensus sequences of the repeats, have been identified. Eleven-residue segments of the LRRs (LxxLxLxxN/CxL), corresponding to the β-strand and adjacent loop regions, are conserved in LRR proteins, whereas the remaining parts of the repeats (herein termed variable) may be very different. Despite the differences, each of the variable parts contains two half-turns at both ends and a "linear" segment (as the chain follows a linear path overall), usually formed by a helix, in the middle. The concave face and the adjacent loops are the most common protein interaction surfaces on LRR proteins.
3D structures of some LRR protein-ligand complexes show that the concave surface of the LRR domain is ideal for interaction with an α-helix, thus supporting earlier conclusions that the elongated and curved LRR structure provides an outstanding framework for achieving diverse protein-protein interactions. Molecular modeling suggests that the conserved pattern LxxLxL, which is shorter than the previously proposed LxxLxLxxN/CxL, is sufficient to impart the characteristic horseshoe curvature to proteins with 20- to 30-residue repeats. This entry represents the most populated subfamily of leucine-rich repeats.
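The two consensus patterns mentioned above, LxxLxLxxN/CxL and the shorter LxxLxL, can be expressed as regular expressions. The following Python sketch is illustrative only: it assumes that x means any residue and that the consensus positions are strictly L (or N/C), which is a simplification made here rather than something stated in the entry. The function name and the example sequence are hypothetical.

```python
import re

# Lookahead groups let overlapping candidate segments be reported.
LRR_LONG = re.compile(r"(?=(L..L.L..[NC].L))")   # LxxLxLxxN/CxL, 11 residues
LRR_SHORT = re.compile(r"(?=(L..L.L))")          # LxxLxL, 6 residues


def find_lrr_segments(sequence, pattern=LRR_LONG):
    """Return (start, segment) pairs for each consensus match (0-based)."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    # Made-up sequence for demonstration only.
    toy = "AALKELDLSGNQLTS"
    print(find_lrr_segments(toy, LRR_LONG))
    print(find_lrr_segments(toy, LRR_SHORT))
```

A strict pattern of this kind will miss repeats in which other residues occupy the consensus positions, so it is best read as a first-pass filter rather than a classifier.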
Protein Domain
Name: Leucine rich repeat 4
Type: Repeat
Description: This entry represents 2 copies of a leucine rich repeat. Leucine rich repeats are short sequence motifs present in a number of proteins with diverse functions and cellular locations. These repeats are usually involved in protein-protein interactions. Each leucine rich repeat is composed of a beta-alpha unit. These units form elongated non-globular structures. Leucine rich repeats are often flanked by cysteine rich domains.
Protein Domain
Name: Thioredoxin, conserved site
Type: Conserved_site
Description: Thioredoxins are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of two cysteine thiol groups to a disulphide, accompanied by the transfer of two electrons and two protons. The net result is the covalent interconversion of a disulphide and a dithiol. In the NADPH-dependent protein disulphide reduction, thioredoxin reductase (TR) catalyses the reduction of oxidised thioredoxin (trx) by NADPH using FAD and its redox-active disulphide; reduced thioredoxin then directly reduces the disulphide in the substrate protein.
Thioredoxin is present in prokaryotes and eukaryotes and the sequence around the redox-active disulphide bond is well conserved. All thioredoxins contain a cis-proline located in a loop preceding β-strand 4, which makes contact with the active site cysteines and is important for stability and function. Thioredoxin belongs to a structural family that includes glutaredoxin, glutathione peroxidase, bacterial protein disulphide isomerase DsbA, and the N-terminal domain of glutathione transferase. Thioredoxins have a beta-alpha unit preceding the motif common to all these proteins.
A number of eukaryotic proteins contain domains evolutionarily related to thioredoxin; most of them are protein disulphide isomerases (PDI). PDI is an endoplasmic reticulum multi-functional enzyme that catalyses the formation and rearrangement of disulphide bonds during protein folding. All PDIs contain two or three (ERp72) copies of the thioredoxin domain, each of which contributes to disulphide isomerase activity, but which are functionally non-equivalent. Moreover, PDI exhibits chaperone-like activity towards proteins that contain no disulphide bonds, i.e. behaving independently of its disulphide isomerase activity. The various forms of PDI which are currently known are:
  • PDI major isozyme; a multifunctional protein that also functions as the beta subunit of prolyl 4-hydroxylase, as a component of oligosaccharyl transferase, as thyroxine deiodinase, as glutathione-insulin transhydrogenase and as a thyroid hormone-binding protein.
  • ERp60 (ER-60; 58 kDa microsomal protein). ERp60 was originally thought to be a phosphoinositide-specific phospholipase C isozyme and later to be a protease.
  • ERp72.
  • ERp5.
Bacterial proteins that act as thiol:disulphide interchange proteins, allowing disulphide bond formation in some periplasmic proteins, also contain a thioredoxin domain. These proteins include:
  • Escherichia coli DsbA (or PrfA) and its orthologues in Vibrio cholerae (TcpG) and Haemophilus influenzae (Por).
  • E. coli DsbC (or XpRA) and its orthologues in Erwinia chrysanthemi and H. influenzae.
  • E. coli DsbD (or DipZ) and its H. influenzae orthologue.
  • E. coli DsbE (or CcmG) and orthologues in H. influenzae.
  • Rhodobacter capsulatus (Rhodopseudomonas capsulata) HelX, and Rhizobiaceae CycY and TlpA.
This entry represents a conserved site found in the thioredoxin domain. This site contains two cysteines that form the redox-active disulphide bond.
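The NADPH-dependent reduction described above can be summarised as two coupled reactions. This is a schematic sketch only: the notation Trx-S2 / Trx-(SH)2 for oxidised and reduced thioredoxin, and Protein-S2 / Protein-(SH)2 for the substrate, is chosen here for illustration and does not come from the entry.

```latex
\begin{align*}
\text{Trx-S}_2 + \text{NADPH} + \text{H}^+
  &\xrightarrow{\ \text{TR (FAD)}\ } \text{Trx-(SH)}_2 + \text{NADP}^+ \\
\text{Trx-(SH)}_2 + \text{Protein-S}_2
  &\rightleftharpoons \text{Trx-S}_2 + \text{Protein-(SH)}_2
\end{align*}
```

Each reaction moves two electrons and two protons, which matches the disulphide/dithiol interconversion stated in the description.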
Protein Domain
Name: Thioredoxin-like fold
Type: Domain
Description: Several biological processes regulate the activity of target proteins through changes in the redox state of thiol groups (S2 to SH2), where a hydrogen donor is linked to an intermediary disulphide protein. Such processes include the ferredoxin/thioredoxin system, the NADP/thioredoxin system, and the glutathione/glutaredoxin system. Several of these disulphide proteins share a common structure, consisting of a three-layer alpha/beta/alpha core.
Protein Domain
Name: Thioredoxin domain
Type: Domain
Description: This entry represents the thioredoxin domain.
Thioredoxins are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of two cysteine thiol groups to a disulphide, accompanied by the transfer of two electrons and two protons. The net result is the covalent interconversion of a disulphide and a dithiol. In the NADPH-dependent protein disulphide reduction, thioredoxin reductase (TR) catalyses the reduction of oxidised thioredoxin (trx) by NADPH using FAD and its redox-active disulphide; reduced thioredoxin then directly reduces the disulphide in the substrate protein.
Thioredoxin is present in prokaryotes and eukaryotes and the sequence around the redox-active disulphide bond is well conserved. All thioredoxins contain a cis-proline located in a loop preceding β-strand 4, which makes contact with the active site cysteines and is important for stability and function. Thioredoxin belongs to a structural family that includes glutaredoxin, glutathione peroxidase, bacterial protein disulphide isomerase DsbA, and the N-terminal domain of glutathione transferase. Thioredoxins have a beta-alpha unit preceding the motif common to all these proteins.
A number of eukaryotic proteins contain domains evolutionarily related to thioredoxin; most of them are protein disulphide isomerases (PDI). PDI is an endoplasmic reticulum multi-functional enzyme that catalyses the formation and rearrangement of disulphide bonds during protein folding. All PDIs contain two or three (ERp72) copies of the thioredoxin domain, each of which contributes to disulphide isomerase activity, but which are functionally non-equivalent. Moreover, PDI exhibits chaperone-like activity towards proteins that contain no disulphide bonds, i.e. behaving independently of its disulphide isomerase activity. The various forms of PDI which are currently known are:
  • PDI major isozyme; a multifunctional protein that also functions as the beta subunit of prolyl 4-hydroxylase, as a component of oligosaccharyl transferase, as thyroxine deiodinase, as glutathione-insulin transhydrogenase and as a thyroid hormone-binding protein.
  • ERp60 (ER-60; 58 kDa microsomal protein). ERp60 was originally thought to be a phosphoinositide-specific phospholipase C isozyme and later to be a protease.
  • ERp72.
  • ERp5.
Bacterial proteins that act as thiol:disulphide interchange proteins, allowing disulphide bond formation in some periplasmic proteins, also contain a thioredoxin domain. These proteins include:
  • Escherichia coli DsbA (or PrfA) and its orthologues in Vibrio cholerae (TcpG) and Haemophilus influenzae (Por).
  • E. coli DsbC (or XpRA) and its orthologues in Erwinia chrysanthemi and H. influenzae.
  • E. coli DsbD (or DipZ) and its H. influenzae orthologue.
  • E. coli DsbE (or CcmG) and orthologues in H. influenzae.
  • Rhodobacter capsulatus (Rhodopseudomonas capsulata) HelX, and Rhizobiaceae CycY and TlpA.
Protein Domain
Name: Thioredoxin
Type: Family
Description: Thioredoxins are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of two cysteine thiol groups to a disulphide, accompanied by the transfer of two electrons and two protons. The net result is the covalent interconversion of a disulphide and a dithiol. In the NADPH-dependent protein disulphide reduction, thioredoxin reductase (TR) catalyses the reduction of oxidised thioredoxin (trx) by NADPH using FAD and its redox-active disulphide; reduced thioredoxin then directly reduces the disulphide in the substrate protein.
Thioredoxin is present in prokaryotes and eukaryotes and the sequence around the redox-active disulphide bond is well conserved. All thioredoxins contain a cis-proline located in a loop preceding β-strand 4, which makes contact with the active site cysteines and is important for stability and function. Thioredoxin belongs to a structural family that includes glutaredoxin, glutathione peroxidase, bacterial protein disulphide isomerase DsbA, and the N-terminal domain of glutathione transferase. Thioredoxins have a beta-alpha unit preceding the motif common to all these proteins.
A number of eukaryotic proteins contain domains evolutionarily related to thioredoxin; most of them are protein disulphide isomerases (PDI). PDI is an endoplasmic reticulum multi-functional enzyme that catalyses the formation and rearrangement of disulphide bonds during protein folding. All PDIs contain two or three (ERp72) copies of the thioredoxin domain, each of which contributes to disulphide isomerase activity, but which are functionally non-equivalent. Moreover, PDI exhibits chaperone-like activity towards proteins that contain no disulphide bonds, i.e. behaving independently of its disulphide isomerase activity. The various forms of PDI which are currently known are:
  • PDI major isozyme; a multifunctional protein that also functions as the beta subunit of prolyl 4-hydroxylase, as a component of oligosaccharyl transferase, as thyroxine deiodinase, as glutathione-insulin transhydrogenase and as a thyroid hormone-binding protein.
  • ERp60 (ER-60; 58 kDa microsomal protein). ERp60 was originally thought to be a phosphoinositide-specific phospholipase C isozyme and later to be a protease.
  • ERp72.
  • ERp5.
Bacterial proteins that act as thiol:disulphide interchange proteins, allowing disulphide bond formation in some periplasmic proteins, also contain a thioredoxin domain. These proteins include:
  • Escherichia coli DsbA (or PrfA) and its orthologues in Vibrio cholerae (TcpG) and Haemophilus influenzae (Por).
  • E. coli DsbC (or XpRA) and its orthologues in Erwinia chrysanthemi and H. influenzae.
  • E. coli DsbD (or DipZ) and its H. influenzae orthologue.
  • E. coli DsbE (or CcmG) and orthologues in H. influenzae.
  • Rhodobacter capsulatus (Rhodopseudomonas capsulata) HelX, and Rhizobiaceae CycY and TlpA.
This entry represents the thioredoxin protein family.
Protein Domain
Name: Multicopper oxidase, C-terminal
Type: Domain
Description: Copper is one of the most prevalent transition metals in living organisms and its biological function is intimately related to its redox properties. Since free copper is toxic, even at very low concentrations, its homeostasis in living organisms is tightly controlled by subtle molecular mechanisms. In eukaryotes, before being transported inside the cell via the high-affinity copper transporters of the CTR family, the copper(II) ion is reduced to copper(I). In blue copper proteins such as cupredoxin, the copper(I) ion form is stabilised by a constrained His2Cys coordination environment.
Multicopper oxidases oxidise their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre; dioxygen binds to the trinuclear centre and, following the transfer of four electrons, is reduced to two molecules of water. There are three spectroscopically different copper centres found in multicopper oxidases: type 1 (or blue), type 2 (or normal) and type 3 (or coupled binuclear). Multicopper oxidases consist of 2, 3 or 6 of these homologous domains, which also share homology with the cupredoxins azurin and plastocyanin. Structurally, these domains consist of a cupredoxin-like fold, a β-sandwich consisting of 7 strands in 2 β-sheets, arranged in a Greek-key β-barrel. Multicopper oxidases include:
  • Ceruloplasmin (ferroxidase), a 6-domain enzyme found in the serum of mammals and birds that oxidizes different inorganic and organic substances; it exhibits internal sequence homology that appears to have evolved from the triplication of a Cu-binding domain similar to that of laccase and ascorbate oxidase.
  • Laccase (urushiol oxidase), a 3-domain enzyme found in fungi and plants, which oxidizes different phenols and diamines. CueO is a laccase found in Escherichia coli that is involved in copper resistance.
  • Ascorbate oxidase, a 3-domain enzyme found in higher plants.
  • Nitrite reductase, a 2-domain enzyme containing type-1 and type-2 copper centres.
  • Fission yeast fio1 (also known as SpAC1F7.08), a multicopper oxidase that contains three cupredoxin domains and may function together with Frp1 in iron and copper uptake in S. pombe.
In addition to the above enzymes, there are a number of other proteins that are similar to the multicopper oxidases in terms of structure and sequence, some of which have lost the ability to bind copper. These include: copper resistance protein A (copA) from a plasmid in Pseudomonas syringae; domain A of the (non-copper binding) blood coagulation factors V (Fa V) and VIII (Fa VIII); yeast FET3, required for ferrous iron uptake; and yeast FET5 (YFL041w), which, like FET3, is an iron transport multicopper oxidase required for high-affinity Fe2+ ion uptake. FET5 is targeted to the vacuole via the AP-3 pathway.
This entry represents the C-terminal domain of multicopper oxidase.
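The four-electron reduction of dioxygen at the trinuclear centre, as described above, corresponds to the following half-reaction. The four protons are added here to balance the equation; the entry itself mentions only the electrons and the two water molecules.

```latex
\[
\text{O}_2 + 4\,e^- + 4\,\text{H}^+ \longrightarrow 2\,\text{H}_2\text{O}
\]
```

The four electrons are those accepted from substrate at the mononuclear (type 1) copper centre and passed on to the trinuclear centre.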
Protein Domain
Name: Multicopper oxidase, N-terminal
Type: Domain
Description: Copper is one of the most prevalent transition metals in living organisms and its biological function is intimately related to its redox properties. Since free copper is toxic, even at very low concentrations, its homeostasis in living organisms is tightly controlled by subtle molecular mechanisms. In eukaryotes, before being transported inside the cell via the high-affinity copper transporters of the CTR family, the copper(II) ion is reduced to copper(I). In blue copper proteins such as cupredoxin, the copper(I) ion form is stabilised by a constrained His2Cys coordination environment.
Multicopper oxidases oxidise their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre; dioxygen binds to the trinuclear centre and, following the transfer of four electrons, is reduced to two molecules of water. There are three spectroscopically different copper centres found in multicopper oxidases: type 1 (or blue), type 2 (or normal) and type 3 (or coupled binuclear). Multicopper oxidases consist of 2, 3 or 6 of these homologous domains, which also share homology with the cupredoxins azurin and plastocyanin. Structurally, these domains consist of a cupredoxin-like fold, a β-sandwich consisting of 7 strands in 2 β-sheets, arranged in a Greek-key β-barrel. Multicopper oxidases include:
  • Ceruloplasmin (ferroxidase), a 6-domain enzyme found in the serum of mammals and birds that oxidizes different inorganic and organic substances; it exhibits internal sequence homology that appears to have evolved from the triplication of a Cu-binding domain similar to that of laccase and ascorbate oxidase.
  • Laccase (urushiol oxidase), a 3-domain enzyme found in fungi and plants, which oxidizes different phenols and diamines. CueO is a laccase found in Escherichia coli that is involved in copper resistance.
  • Ascorbate oxidase, a 3-domain enzyme found in higher plants.
  • Nitrite reductase, a 2-domain enzyme containing type-1 and type-2 copper centres.
  • Fission yeast fio1 (also known as SpAC1F7.08), a multicopper oxidase that contains three cupredoxin domains and may function together with Frp1 in iron and copper uptake in S. pombe.
In addition to the above enzymes, there are a number of other proteins that are similar to the multicopper oxidases in terms of structure and sequence, some of which have lost the ability to bind copper. These include: copper resistance protein A (copA) from a plasmid in Pseudomonas syringae; domain A of the (non-copper binding) blood coagulation factors V (Fa V) and VIII (Fa VIII); yeast FET3, required for ferrous iron uptake; and yeast FET5 (YFL041w), which, like FET3, is an iron transport multicopper oxidase required for high-affinity Fe2+ ion uptake. FET5 is targeted to the vacuole via the AP-3 pathway.
This entry represents the N-terminal domain (or coupled binuclear) of multicopper oxidase.
Protein Domain
Name: Multicopper oxidase, copper-binding site
Type: Binding_site
Description: This entry represents a conserved region containing the Cu-binding site found in multicopper oxidases. Multicopper oxidases oxidise their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre; dioxygen binds to the trinuclear centre and, following the transfer of four electrons, is reduced to two molecules of water. There are three spectroscopically different copper centres found in multicopper oxidases: type 1 (or blue), type 2 (or normal) and type 3 (or coupled binuclear). Multicopper oxidases consist of 2, 3 or 6 of these homologous domains, which also share homology with the cupredoxins azurin and plastocyanin. Structurally, these domains consist of a cupredoxin-like fold, a β-sandwich consisting of 7 strands in 2 β-sheets, arranged in a Greek-key β-barrel. Multicopper oxidases include:
  • Ceruloplasmin (ferroxidase), a 6-domain enzyme found in the serum of mammals and birds that oxidizes different inorganic and organic substances; it exhibits internal sequence homology that appears to have evolved from the triplication of a Cu-binding domain similar to that of laccase and ascorbate oxidase.
  • Laccase (urushiol oxidase), a 3-domain enzyme found in fungi and plants, which oxidizes different phenols and diamines. CueO is a laccase found in Escherichia coli that is involved in copper resistance.
  • Ascorbate oxidase, a 3-domain enzyme found in higher plants.
  • Nitrite reductase, a 2-domain enzyme containing type-1 and type-2 copper centres.
Protein Domain
Name: Multicopper oxidase, second cupredoxin domain
Type: Domain
Description: Copper is one of the most prevalent transition metals in living organisms and its biological function is intimately related to its redox properties. Since free copper is toxic, even at very low concentrations, its homeostasis in living organisms is tightly controlled by subtle molecular mechanisms. In eukaryotes, before being transported inside the cell via the high-affinity copper transporters of the CTR family, the copper(II) ion is reduced to copper(I). In blue copper proteins such as cupredoxin, the copper(I) ion form is stabilised by a constrained His2Cys coordination environment.
Multicopper oxidases oxidise their substrate by accepting electrons at a mononuclear copper centre and transferring them to a trinuclear copper centre; dioxygen binds to the trinuclear centre and, following the transfer of four electrons, is reduced to two molecules of water. There are three spectroscopically different copper centres found in multicopper oxidases: type 1 (or blue), type 2 (or normal) and type 3 (or coupled binuclear). Multicopper oxidases consist of 2, 3 or 6 of these homologous domains, which also share homology with the cupredoxins azurin and plastocyanin. Structurally, these domains consist of a cupredoxin-like fold, a β-sandwich consisting of 7 strands in 2 β-sheets, arranged in a Greek-key β-barrel. Multicopper oxidases include:
  • Ceruloplasmin (ferroxidase), a 6-domain enzyme found in the serum of mammals and birds that oxidizes different inorganic and organic substances; it exhibits internal sequence homology that appears to have evolved from the triplication of a Cu-binding domain similar to that of laccase and ascorbate oxidase.
  • Laccase (urushiol oxidase), a 3-domain enzyme found in fungi and plants, which oxidizes different phenols and diamines. CueO is a laccase found in Escherichia coli that is involved in copper resistance.
  • Ascorbate oxidase, a 3-domain enzyme found in higher plants.
  • Nitrite reductase, a 2-domain enzyme containing type-1 and type-2 copper centres.
  • Fission yeast fio1 (also known as SpAC1F7.08), a multicopper oxidase that contains three cupredoxin domains and may function together with Frp1 in iron and copper uptake in S. pombe.
In addition to the above enzymes, there are a number of other proteins that are similar to the multicopper oxidases in terms of structure and sequence, some of which have lost the ability to bind copper. These include: copper resistance protein A (copA) from a plasmid in Pseudomonas syringae; domain A of the (non-copper binding) blood coagulation factors V (Fa V) and VIII (Fa VIII); yeast FET3, required for ferrous iron uptake; and yeast FET5 (YFL041w), which, like FET3, is an iron transport multicopper oxidase required for high-affinity Fe2+ ion uptake. FET5 is targeted to the vacuole via the AP-3 pathway.
This entry represents the second cupredoxin domain of multicopper oxidases. This domain is also present in proteins that have lost the ability to bind copper.
Protein Domain
Name: CCB3/YggT
Type: Family
Description: This family includes YlmG from bacteria, and CCB3 and YlmG homologue proteins (YLMG1-1, YLMG1-2, YLMG2) from Arabidopsis. This family also includes uncharacterised protein YggT, associated with bacteria outside of the cyanobacteria. YlmG might be involved in chloroplast and cyanobacterial division processes. Cofactor maturation pathways such as the CCB system (system IV) for cytochrome c-heme attachment are conserved in all organisms performing oxygenic photosynthesis. The CCB system consists of four proteins: CCB1-4. CCB2 and CCB4 are paralogues derived from a unique cyanobacterial ancestor. Orthologues are conserved in higher plants.
Protein Domain
Name: Aspartate/ornithine carbamoyltransferase
Type: Family
Description: This family contains two related enzymes:
  • Aspartate carbamoyltransferase (ATCase) catalyses the conversion of aspartate and carbamoyl phosphate to carbamoylaspartate, the second step in the de novo biosynthesis of pyrimidine nucleotides. In prokaryotes ATCase consists of two subunits: a catalytic chain (gene pyrB) and a regulatory chain (gene pyrI), while in eukaryotes it is a domain in a multi-functional enzyme (called URA2 in yeast, rudimentary in Drosophila, and CAD in mammals) that also catalyses other steps of the biosynthesis of pyrimidines.
  • Ornithine carbamoyltransferase (OTCase) catalyses the conversion of ornithine and carbamoyl phosphate to citrulline. In mammals, this enzyme participates in the urea cycle and is located in the mitochondrial matrix. In prokaryotes and eukaryotic microorganisms it is involved in the biosynthesis of arginine. In some bacterial species it is also involved in the degradation of arginine (the arginine deaminase pathway).
It has been shown that these two enzymes are evolutionarily related. The predicted secondary structures of both enzymes are similar and there are some regions of sequence similarity. One of these regions includes three residues which have been shown, by crystallographic studies, to be implicated in binding the phosphoryl group of carbamoyl phosphate.
Protein Domain
Name: Aspartate carbamoyltransferase
Type: Family
Description: Aspartate carbamoyltransferase (ATCase) catalyses the formation of carbamoyl-aspartate in the pyrimidine biosynthesis pathway, by the association of aspartate and carbamoyl-phosphate. This is the commitment step in the Escherichia coli pathway and is regulated by feedback inhibition by CTP, the final product of the pathway.
The structural organisation of the ATCase protein varies considerably between different organisms. In bacteria such as E. coli, Salmonella typhimurium and Serratia marcescens, ATCase is a dodecamer of 2 catalytic (c) trimers and 3 regulatory (r) dimers. The catalytic domains are coded for by the pyrB gene, and the regulatory domains by pyrI. In Gram-positive bacteria such as Bacillus subtilis, ATCase exists as a trimer of catalytic subunits but, unlike in E. coli, it neither contains nor binds to regulatory subunits. In eukaryotes, ATCase is found as a single domain in a multifunctional enzyme that contains activity for glutamine amidotransferase, carbamoylphosphate synthetase, dihydroorotase, and aspartate carbamoyltransferase.
Protein Domain
Name: Aspartate/ornithine carbamoyltransferase, carbamoyl-P binding
Type: Domain
Description: This entry contains two related enzymes: Aspartate carbamoyltransferase ( ) (ATCase) catalyzes the conversion of aspartate and carbamoyl phosphate to carbamoylaspartate, the second step in the de novo biosynthesis of pyrimidine nucleotides [ ]. In prokaryotes ATCase consists of two subunits: a catalytic chain (gene pyrB) and a regulatory chain (gene pyrI), while in eukaryotes it is a domain in a multi-functional enzyme (called URA2 in yeast, rudimentary in Drosophila, and CAD in mammals []) that also catalyzes other steps of the biosynthesis of pyrimidines. Ornithine carbamoyltransferase ( ) (OTCase) catalyzes the conversion of ornithine and carbamoyl phosphate to citrulline. In mammals this enzyme participates in the urea cycle [ ] and is located in the mitochondrial matrix. In prokaryotes and eukaryotic microorganisms it is involved in the biosynthesis of arginine. In some bacterial species it is also involved in the degradation of arginine [] (the arginine deiminase pathway). It has been shown [] that these two enzymes are evolutionarily related. The predicted secondary structures of both enzymes are similar and there are some regions of sequence similarity. One of these regions includes three residues which have been shown, by crystallographic studies [ ], to be implicated in binding the phosphoryl group of carbamoyl phosphate and may also play a role in trimerization of the molecules [ ]. The carboxyl-terminal, aspartate/ornithine-binding domain is described in a separate entry.
Protein Domain
Name: Aspartate/ornithine carbamoyltransferase, Asp/Orn-binding domain
Type: Domain
Description: This family contains two related enzymes: Aspartate carbamoyltransferase ( ) (ATCase) catalyzes the conversion of aspartate and carbamoyl phosphate to carbamoylaspartate, the second step in the de novo biosynthesis of pyrimidine nucleotides [ ]. In prokaryotes ATCase consists of two subunits: a catalytic chain (gene pyrB) and a regulatory chain (gene pyrI), while in eukaryotes it is a domain in a multi-functional enzyme (called URA2 in yeast, rudimentary in Drosophila, and CAD in mammals []) that also catalyzes other steps of the biosynthesis of pyrimidines. Ornithine carbamoyltransferase ( ) (OTCase) catalyzes the conversion of ornithine and carbamoyl phosphate to citrulline. In mammals this enzyme participates in the urea cycle [] and is located in the mitochondrial matrix. In prokaryotes and eukaryotic microorganisms it is involved in the biosynthesis of arginine. In some bacterial species it is also involved in the degradation of arginine [ ] (the arginine deiminase pathway). It has been shown [] that these two enzymes are evolutionarily related. The predicted secondary structures of both enzymes are similar and there are some regions of sequence similarity. One of these regions includes three residues which have been shown, by crystallographic studies [ ], to be implicated in binding the phosphoryl group of carbamoyl phosphate; this region is described in a separate entry. The carboxyl-terminal, aspartate/ornithine-binding domain is connected to the amino-terminal domain by two α-helices, which comprise a hinge between domains [].
Protein Domain
Name: Putative S-adenosyl-L-methionine-dependent methyltransferase
Type: Family
Description: This is a family of putative S-adenosyl-L-methionine (SAM)-dependent methyltransferases [ , , ].
Protein Domain
Name: Heavy metal-associated domain, HMA
Type: Domain
Description: Proteins that transport heavy metals in micro-organisms and mammals share similarities in their sequences and structures. These proteins provide an important focus for research, some being involved in bacterial resistance to toxic metals, such as lead and cadmium, while others are involved in inherited human syndromes, such as Wilson's and Menkes diseases [ ]. A conserved domain has been found in a number of these heavy metal transport or detoxification proteins [ ]. The domain, which has been termed Heavy-Metal-Associated (HMA), contains two conserved cysteines that are probably involved in metal binding. Structure solution of the fourth HMA domain of the Menkes copper-transporting ATPase shows a well-defined structure comprising a four-stranded antiparallel β-sheet and two α-helices packed in an α-β sandwich fold [ ]. This fold is common to other domains and is classified as "ferredoxin-like".
Protein Domain
Name: Derlin
Type: Family
Description: The endoplasmic reticulum (ER) of the yeast Saccharomyces cerevisiae (Baker's yeast) contains a proteolytic system able to selectively degrade misfolded lumenal secretory proteins. For examination of the components involved in this degradation process, mutants were isolated. They could be divided into four complementation groups. The mutations led to stabilisation of two different substrates for this process, and the classes were called der for degradation in the ER. DER1 was cloned by complementation of the der1-2 mutation. The DER1 gene codes for a novel, hydrophobic protein that is localized to the ER. Deletion of DER1 abolished degradation of the substrate proteins, suggesting that the function of the Der1 protein may be specifically required for the degradation process associated with the ER [ ]. Interestingly this family seems distantly related to the Rhomboid family of membrane peptidases. This family may also mediate degradation of misfolded proteins.
Protein Domain
Name: Tetraspanin/Peripherin
Type: Family
Description: Tetraspanins are a distinct family of proteins containing four transmembrane domains, a small outer loop (EC1), a larger outer loop (EC2), a small inner loop (IL) and short cytoplasmic tails. They contain characteristic structural features, including 4-6 conserved extracellular cysteine residues, and polar residues within transmembrane domains. A fundamental role of tetraspanins appears to be organizing other proteins into a network of multimolecular membrane microdomains, sometimes called the 'tetraspanin web'. This entry represents tetraspanin proteins. It also recognises a number of peripherins. These are related retinal-specific members of the tetraspanin family which are located at the rims of the photoreceptor disks, where they may act jointly in disk morphogenesis [ ].
Protein Domain
Name: Myb domain, plants
Type: Domain
Description: This DNA-binding domain is restricted to (but common in) plant proteins, many of which also contain a response regulator domain. The domain appears related to the Myb-like DNA-binding domain [ , ].
Protein Domain
Name: Myb domain
Type: Domain
Description: The myb-type HTH domain is a DNA-binding, helix-turn-helix (HTH) domain of approximately 55 amino acids, typically occurring in a tandem repeat in eukaryotic transcription factors. The domain is named after the retroviral oncogene v-myb, and its cellular counterpart c-myb, which encode nuclear DNA-binding proteins that specifically recognise the sequence YAAC(G/T)G [ , ]. Myb proteins contain three tandem repeats of 51 to 53 amino acids, termed R1, R2 and R3. This repeat region is involved in DNA-binding and R2 and R3 bind directly to the DNA major groove. The major part of the first repeat is missing in retroviral v-Myb sequences and in plant myb-related (R2R3) proteins []. A single myb-type HTH DNA-binding domain occurs in TRF1 and TRF2. The 3D-structure of the myb-type HTH domain forms three α-helices [ ]. The second and third helices, connected via a turn, comprise the helix-turn-helix motif. Helix 3 is termed the recognition helix as it binds the DNA major groove, as in other HTHs.
Protein Domain
Name: CheY-like superfamily
Type: Homologous_superfamily
Description: CheY is a member of the response regulator family in bacterial two-component signalling systems, where CheY receives the signal from the sensor partner, usually a histidine protein kinase. Signal transduction involves phosphotransfer, whereby the histidine kinase phosphorylates a conserved aspartate in the response regulator to activate responses to environmental signals [ ]. CheY is a single domain protein that folds into a compact globular unit with a flavodoxin-like fold consisting of a three-layer alpha/beta/alpha sandwich with 21345 beta topology, where the phosphorylation region lies in a cavity. Other members of the response regulator family contain a CheY-like receiver domain, which is often found N-terminal to a DNA-binding effector domain. Examples include NarL (nitrate/nitrite response regulator), NtrC (nitrogen regulatory protein C), Spo0A and Spo0F (sporulation response) from Bacillus, PhoA and PhoB cyclin-dependent kinases from Aspergillus, among others. AmiR, the positive regulator of the amidase operon in Pseudomonas, is an unusual member of the bacterial response regulator family; AmiR is able to bind RNA and uses ligand-regulated activation rather than phospho-activation. It has a CheY-like fold at its N terminus, but contains two subdomains in a C-terminal extension, one forming a coiled-coil and the other a long α-helix. As such AmiR may represent a new family of RNA-binding response regulators [ ]. CheY-like domains can be found in other protein families as well. Examples include the receiver domain of the ethylene receptor (ETR1) from Arabidopsis, which is involved in ethylene detection and signal transduction [ ], and the N-terminal 'wing' domain of ornithine decarboxylase from Lactobacilli, which catalyses the conversion of ornithine to putrescine at the beginning of the polyamine pathway [ ]. 
The N-terminal domain of the circadian clock protein, KaiA, from cyanobacteria, acts as a pseudo-receiver domain, but lacks the conserved aspartyl residue required for phosphotransfer in response regulators [].
Protein Domain
Name: Signal transduction response regulator, receiver domain
Type: Domain
Description: Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions [ ]. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk []. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more []. Two-component systems are comprised of a sensor histidine kinase (HK) and its cognate response regulator (RR) []. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK. A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response [ , ]. Bipartite response regulator proteins are involved in a two-component signal transduction system in bacteria, and certain eukaryotes like protozoa, that functions to detect and respond to environmental changes [ ]. These systems have been detected during host invasion, drug resistance, motility, phosphate uptake, osmoregulation, and nitrogen fixation, amongst others []. 
The two-component system consists of a histidine protein kinase environmental sensor that phosphorylates the receiver domain of a response regulator protein; phosphorylation induces a conformational change in the response regulator, which activates the effector domain, triggering the cellular response []. The domains of the two-component proteins are highly modular, but the core structures and activities are maintained. The response regulators act as phosphorylation-activated switches to effect a cellular response, usually by transcriptional regulation. Most of these proteins consist of two domains, an N-terminal response regulator receiver domain, and a variable C-terminal effector domain with DNA-binding activity. This entry represents the response regulator receiver domain, which belongs to the CheY family, and receives the signal from the sensor partner in the two-component system.
Protein Domain
Name: Band 7 domain
Type: Domain
Description: The band-7 protein family comprises a diverse set of membrane-bound proteins characterised by the presence of a conserved domain, the band-7 domain, also known as SPFH or PHB domain [ ]. The exact function of the band-7 domain is not known, but examples from animal and bacterial stomatin-type proteins demonstrate binding to lipids and the ability to assemble into membrane-bound oligomers that form putative scaffolds []. A variety of proteins belong to this family. These include the prohibitins, cytoplasmic anti-proliferative proteins and stomatin, an erythrocyte membrane protein. Bacterial HflC protein also belongs to this family. Note: Band 4.1 and Band 7 proteins refer to human erythrocyte membrane proteins separated by SDS polyacrylamide gels and stained with Coomassie blue [ ].
Protein Domain
Name: Prohibitin
Type: Family
Description: This entry describes proteins similar to prohibitin (a lipid raft-associated integral membrane protein). Individual proteins of the SPFH (band 7) domain superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes [ ]. These microdomains, in addition to being stable scaffolds, may also be dynamic units with their own regulatory functions [, ]. Prohibitin is a mitochondrial inner-membrane protein which may act as a chaperone for the stabilization of mitochondrial proteins. Human prohibitin forms a hetero-oligomeric complex with Bap-37 (prohibitin 2, an SPFH domain carrying homologue). This complex may protect non-assembled membrane proteins against proteolysis by the m-AAA protease [, ]. Prohibitin and Bap-37 yeast homologues have been implicated in yeast longevity [] and in the maintenance of mitochondrial morphology. Sequence comparisons suggest that the prohibitin gene is an analogue of Cc, a Drosophila melanogaster gene that is vital for normal development []. Genes that negatively regulate proliferation inside the cell are of considerable interest because of the implications in processes such as development and cancer []. Prohibitin acts as a cytoplasmic anti-proliferative protein, is widely expressed in a variety of tissues and inhibits DNA synthesis. Studies have suggested that prohibitin may be a suppressor gene and is associated with tumour development and/or progression of at least some breast cancers [].
Protein Domain
Name: Zinc finger, PHD-finger
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the PHD (plant homeodomain) zinc finger domain [ ], which is a C4HC3 zinc-finger-like motif found in nuclear proteins thought to be involved in chromatin-mediated transcriptional regulation. 
The PHD finger motif is reminiscent of, but distinct from, the C3HC4 type RING finger. The function of this domain is not yet known but in analogy with the LIM domain it could be involved in protein-protein interaction and be important for the assembly or activity of multicomponent complexes involved in transcriptional activation or repression. Alternatively, the interactions could be intra-molecular and be important in maintaining the structural integrity of the protein. In similarity to the RING finger and the LIM domain, the PHD finger is thought to bind two zinc ions.
Protein Domain
Name: Chromo/chromo shadow domain
Type: Domain
Description: The CHROMO (CHRomatin Organization MOdifier) domain [ , , , ] is a conserved region of around 60 amino acids, originally identified in Drosophila modifiers of variegation. These are proteins that alter the structure of chromatin to the condensed morphology of heterochromatin, a cytologically visible condition where gene expression is repressed. In one of these proteins, Polycomb, the chromo domain has been shown to be important for chromatin targeting. Proteins that contain a chromo domain appear to fall into 3 classes. The first class includes proteins having an N-terminal chromo domain followed by a region termed the chromo shadow domain, with weak but significant sequence similarity to the N-terminal chromo domain [], e.g. Drosophila and human heterochromatin protein Su(var)205 (HP1). The second class includes proteins with a single chromo domain, e.g. Drosophila protein Polycomb (Pc); mammalian modifier 3; human Mi-2 autoantigen and several yeast and Caenorhabditis elegans hypothetical proteins. In the third class paired tandem chromo domains are found, e.g. in mammalian DNA-binding/helicase proteins CHD-1 to CHD-4 and yeast protein CHD1. Functional dissections of chromo domain proteins suggest a mechanistic role for chromo domains in targeting chromo domain proteins to specific regions of the nucleus. The mechanism of targeting may involve protein-protein and/or protein/nucleic acid interactions. Several lines of evidence show that the HP1 chromo domain is a methyl-specific histone binding module, whereas two protein components of the Drosophila dosage compensation complex, MSL3 and MOF, contain chromo domains that bind to RNA in vitro [ ]. The high resolution structures of HP1-family protein chromo and chromo shadow domains reveal a conserved chromo domain fold motif consisting of three β-strands packed against an α-helix. 
The chromo domain fold belongs to the OB (oligonucleotide/oligosaccharide binding)-fold class found in a variety of prokaryotic and eukaryotic nucleic acid binding proteins [ ].
Protein Domain
Name: SNF2, N-terminal
Type: Domain
Description: This domain is found in proteins involved in a variety of processes including transcription regulation (e.g., SNF2, STH1, brahma, MOT1), DNA repair (e.g., ERCC6, RAD16, RAD5), DNA recombination (e.g., RAD54), and chromatin unwinding (e.g., ISWI) as well as a variety of other proteins with little functional information (e.g., lodestar, ETL1) [ , , ]. SNF2 functions as the ATPase component of the SNF2/SWI multisubunit complex, which utilises energy derived from ATP hydrolysis to disrupt histone-DNA interactions, resulting in the increased accessibility of DNA to transcription factors []. Proteins that contain this domain appear to be distantly related to the DEAX box helicases.
Protein Domain
Name: Helicase superfamily 1/2, ATP-binding domain
Type: Domain
Description: Helicases have been classified in 5 superfamilies (SF1-SF5). All of the proteins bind ATP and, consequently, all of them carry the classical Walker A (phosphate-binding loop or P-loop) and Walker B (Mg2+-binding aspartic acid) motifs. For the two largest groups, commonly referred to as SF1 and SF2, a total of seven characteristic motifs has been identified [ ]. These two superfamilies encompass a large number of DNA and RNA helicases from archaea, eubacteria, eukaryotes and viruses that seem to be active as monomers or dimers. RNA and DNA helicases are considered to be enzymes that catalyze the separation of double-stranded nucleic acids in an energy-dependent manner []. The various structures of SF1 and SF2 helicases present a common core with two α-β RecA-like domains [, ]. The structural homology with the RecA recombination protein covers the five contiguous parallel beta strands and the tandem alpha helices. ATP binds to the amino proximal α-β domain, where the Walker A (motif I) and Walker B (motif II) are found. The N-terminal domain also contains motif III (S-A-T) which was proposed to participate in linking ATPase and helicase activities. The carboxy-terminal α-β domain is structurally very similar to the proximal one even though it is bereft of an ATP-binding site, suggesting that it may have originally arisen through gene duplication of the first one. Some members of helicase superfamilies 1 and 2 are listed below: DEAD-box RNA helicases. The prototype of DEAD-box proteins is the translation initiation factor eIF4A. The eIF4A protein is an RNA-dependent ATPase which functions together with eIF4B as an RNA helicase []. DEAH-box RNA helicases. Mainly pre-mRNA-splicing factor ATP-dependent RNA helicases []. Eukaryotic DNA repair helicase RAD3/ERCC-2, an ATP-dependent 5'-3' DNA helicase involved in nucleotide excision repair of UV-damaged DNA. Eukaryotic TFIIH basal transcription factor complex helicase XPB subunit. 
An ATP-dependent 3'-5' DNA helicase which is a component of the core-TFIIH basal transcription factor, involved in nucleotide excision repair (NER) of DNA and, when complexed to CAK, in RNA transcription by RNA polymerase II. It acts by opening DNA either around the RNA transcription start site or the DNA. Eukaryotic ATP-dependent DNA helicase Q. A DNA helicase that may play a role in the repair of DNA that is damaged by ultraviolet light or other mutagens. Bacterial and eukaryotic antiviral SKI2-like helicase. SKI2 has a role in the 3'-mRNA degradation pathway, repressing dsRNA virus propagation by specifically blocking translation of viral mRNAs, perhaps recognizing the absence of CAP or poly(A). Bacterial DNA-damage-inducible protein G (DinG). A probable helicase involved in DNA repair and perhaps also replication []. Bacterial primosomal protein N' (PriA). PriA protein is one of seven proteins that make up the restart primosome, an apparatus that promotes assembly of replisomes at recombination intermediates and stalled replication forks. Bacterial ATP-dependent DNA helicase recG. It has a critical role in recombination and DNA repair, helping process Holliday junction intermediates to mature products by catalyzing branch migration. It has a DNA unwinding activity characteristic of helicases with a 3' to 5' polarity. A variety of DNA and RNA virus helicases and transcription factors. This entry represents the DNA-binding domain of classical SF1 and SF2 helicases. It does not recognize bacterial DinG and eukaryotic Rad3, which differ from other SF1-SF2 helicases by the presence of a large insert after the Walker A (see ).
Protein Domain
Name: Helicase, C-terminal
Type: Domain
Description: Helicases have been classified in 5 superfamilies (SF1-SF5). For the two largest groups, commonly referred to as SF1 and SF2, a total of seven characteristic motifs has been identified [ ]. These two superfamilies encompass a large number of DNA and RNA helicases from archaea, eubacteria, eukaryotes and viruses. This entry represents the C-terminal domain found in proteins belonging to the helicase superfamilies 1 and 2. Included in this group is the eukaryotic translation initiation factor 4A (eIF4A), a member of the DEA(D/H)-box RNA helicase family. The structure of the carboxyl-terminal domain of eIF4A has been determined; it has a parallel α-β topology that superimposes, with minor variations, on the structures and conserved motifs of the equivalent domain in other, distantly related helicases [ ].
Protein Domain
Name: NAD kinase/diacylglycerol kinase-like domain superfamily
Type: Homologous_superfamily
Description: ATP-NAD kinases ( ) catalyse the phosphorylation of NAD to NADP utilizing ATP and other nucleoside triphosphates as well as inorganic polyphosphate as a source of phosphorus. ATP-NAD kinase contains two domains, where domain 1 has an alpha/beta topology that is related in structure to the N-terminal of phosphofructokinase, and domain 2 has an atypical β-sandwich topology made of four structural repeats of beta(3) units [ , ].
Protein Domain
Name: Diacylglycerol kinase, catalytic domain
Type: Domain
Description: The DAG-kinase catalytic domain or DAGKc domain is present in mammalian lipid kinases, such as diacylglycerol (DAG), ceramide and sphingosine kinases, as well as in related bacterial proteins [ , ]. Eukaryotic DAG-kinase () catalyses the phosphorylation of DAG to phosphatidic acid, thus modulating the balance between the two signaling lipids. At least ten different isoforms have been identified in mammals, which form 5 groups characterised by different functional domains, such as the calcium-binding EF hand (see ), PH (see ), SAM (see ), DAG/PE-binding C1 domain (see ) and ankyrin repeats (see ) [ ]. In bacteria, an integral membrane DAG kinase forms a homotrimeric protein that lacks the DAGKc domain (see ). In contrast, the bacterial yegS protein is a soluble cytosolic protein that contains the DAGKc domain in the N-terminal part. YegS is a lipid kinase with two structural domains, wherein the active site is located in the interdomain cleft, C-terminal to the DAGKc domain which forms an alpha/beta fold [ ]. The tertiary structure resembles that of NAD kinases and contains a metal-binding site in the C-terminal region [, ]. This domain is usually associated with an accessory domain (see ).
Protein Domain
Name: Sugar/inositol transporter
Type: Family
Description: The sugar transporters belong to a superfamily of membrane proteins responsible for the binding and transport of various carbohydrates, organic alcohols, and acids in a wide range of prokaryotic and eukaryotic organisms [ ]. These integral membrane proteins are predicted to comprise twelve membrane spanning domains. It is likely that the transporters have evolved from an ancient protein present in living organisms before the divergence into prokaryotes and eukaryotes []. In mammals, these proteins are expressed in a number of organs []. This family includes sugar transporters and the myo-inositol transporters.
Protein Domain
Name: Sugar transporter, conserved site
Type: Conserved_site
Description: The sugar transporters belong to a superfamily of membrane proteins responsible for the binding and transport of various carbohydrates, organic alcohols, and acids in a wide range of prokaryotic and eukaryotic organisms [ ]. These integral membrane proteins are predicted to comprise twelve membrane spanning domains. It is likely that the transporters have evolved from an ancient protein present in living organisms before the divergence into prokaryotes and eukaryotes []. In mammals, these proteins are expressed in a number of organs [].
Protein Domain
Name: Major facilitator, sugar transporter-like
Type: Family
Description: This entry represents a subfamily of the major facilitator superfamily. Members in this family include sugar transporters, which are responsible for the binding and transport of various carbohydrates, organic alcohols, and acids in a wide range of prokaryotic and eukaryotic organisms [ ]. Most but not all members of this family catalyse sugar transport []. Recent genome-sequencing data and a wealth of biochemical and molecular genetic investigations have revealed the occurrence of dozens of families of primary and secondary transporters. Two such families have been found to occur ubiquitously in all classifications of living organisms. These are the ATP-binding cassette (ABC) superfamily and the major facilitator superfamily (MFS), also called the uniporter-symporter-antiporter family. While ABC family permeases are in general multicomponent primary active transporters, capable of transporting both small molecules and macromolecules in response to ATP hydrolysis, the MFS transporters are single-polypeptide secondary carriers capable only of transporting small solutes in response to chemiosmotic ion gradients. Although well over 100 families of transporters have now been recognised and classified, the ABC superfamily and MFS account for nearly half of the solute transporters encoded within the genomes of microorganisms. They are also prevalent in higher organisms. The importance of these two families of transport systems to living organisms can therefore not be overestimated []. The MFS was originally believed to function primarily in the uptake of sugars but subsequent studies revealed that drug efflux systems, Krebs cycle metabolites, organophosphate:phosphate exchangers, oligosaccharide:H+ symport permeases, and bacterial aromatic acid permeases were all members of the MFS. These observations suggested that the MFS is far more widespread in nature and far more diverse in function than had been thought previously. 
17 subgroups of the MFS have been identified [ ]. Evidence suggests that the MFS permeases arose by a tandem intragenic duplication event in the early prokaryotes. This event generated a 12-transmembrane-spanner (TMS) protein topology from a primordial 6-TMS unit. Surprisingly, all currently recognised MFS permeases retain the two six-TMS units within a single polypeptide chain, although in 3 of the 17 MFS families, an additional two TMSs are found [ ]. Moreover, the well-conserved MFS specific motif between TMS2 and TMS3 and the related but less well conserved motif between TMS8 and TMS9 [] prove to be a characteristic of virtually all of the more than 300 MFS proteins identified. This family includes sugar and other types of transporters.
Protein Domain
Name: Major facilitator superfamily domain
Type: Domain
Description: Transporters can be grouped in two classes, primary and secondary carriers. The primary active transporters drive solute accumulation or extrusion by using ATP hydrolysis, photon absorption, electron flow, substrate decarboxylation or methyl transfer. If charged molecules are unidirectionally pumped as a consequence of the consumption of a primary cellular energy source, an electrochemical potential results. This potential can then be used to drive the active transport of additional solutes via secondary carriers. Among the different transporters, the two largest families that occur ubiquitously in all classifications of organisms are the ATP-Binding Cassette (ABC) primary transporter superfamily (see ) and the Major Facilitator Superfamily (MFS). The MFS transporters are single-polypeptide secondary carriers capable only of transporting small solutes in response to chemiosmotic ion gradients [ , ]. They function as uniporters, symporters or antiporters. In addition, their solute specificities are also diverse. MFS proteins contain 12 transmembrane regions (with some variations). The 3D-structure of human GLUT1, an archetype of the major facilitator superfamily, has been solved [ ]. Helices 1-5, 8, 10-12 are arranged in a 9-member barrel-like manner, delimiting a hydrophilic central channel. Helix 7 is located in the centre of the channel, suggesting a role in regulating transport of solutes through the channel. Some proteins known to belong to the MFS superfamily are listed below: Sugar transporters. The largest family, they can function by uniport, solute-solute antiport or solute-cation symport depending on the system or conditions (see ). Drug:H+ antiporters or multidrug transporters. The extrusion of cytotoxic drugs from multidrug resistant cells by overexpressed multidrug transporters is an important cause of failure of the drug-based treatment of patients with cancers or infections by pathogenic microorganisms. Organophosphate:Pi antiporters (OPA). 
Small permeases restricted to bacteria. Oligosaccharide:H+ symporters (OHS). Permeases restricted to bacteria. Metabolite:H+ symporters (MHS). Nitrate/nitrite symporter (NNP). This family is present in bacteria, fungi and plants. It catalyzes either nitrate uptake or nitrite efflux. Phosphate:H+ symporters (PHS). It is present only in fungi and plants. Nucleoside:H+ symporters (NHS). Small permeases restricted to Gram-negative bacteria. Oxalate/formate antiporters (OFA). Present in bacteria, archaea and eukaryotes. Sialate:H+ symporters (SHS). Small permeases restricted to Gram-negative bacteria. Monocarboxylate porters (MCP). Anion:cation symporters (ACS). Aromatic acid:H+ symporters (AAHS). They transport a variety of aromatic acids as well as cis,cis-muconate. One member of this family (PCAK) serves as a chemoreceptor allowing the bacteria to swim up concentration gradients of its substrate [ ]. Cyanate permeases (CP). Small bacterial proteins of around 400 residues. Proton-dependent oligopeptide transporters (POT). AAHS and POT are the most divergent MFS families. This entry represents the MFS superfamily domain, which consists of twelve transmembrane helices. This domain can be found in glycerol-3-phosphate transporter from Escherichia coli, which transports glycerol-3-phosphate into the cytoplasm and inorganic phosphate into the periplasm [ ]. The E. coli proton/sugar transporter lactose permease (LacY) also carries this domain, and acts to couple lactose and H+ translocation [, ].
Protein Domain
Name: Transcription termination and cleavage factor, C-terminal domain
Type: Domain
Description: The C-terminal section of cleavage stimulation and termination factor CstF-64 (CSTF2) and its yeast orthologue Rna15 forms a discrete structure that is crucial for mRNA 3'-end processing. This domain interacts with Pcf11 and possibly PC4, thus linking CSTF2 to transcription, transcriptional termination, and cell growth. Proteins containing this domain also include the Pti1 protein from budding yeast. Pti1 is an essential component of CPF (cleavage and polyadenylation factor) involved in 3'-end formation of snoRNA and mRNA.
Protein Domain
Name: Cleavage stimulation factor subunit 2, hinge domain
Type: Domain
Description: The hinge domain of cleavage stimulation factor subunit 2 (CSTF2) proteins is necessary for binding to the subunit CstF-77 within the polyadenylation complex and for subsequent nuclear localisation. This suggests that nuclear import of a pre-formed CSTF complex is an essential step in polyadenylation. Accurate and efficient polyadenylation is essential for transcriptional termination, nuclear export, translation, and stability of eukaryotic mRNAs. CSTF2 is an important regulatory subunit of the polyadenylation complex.
Protein Domain
Name: Potassium transporter
Type: Family
Description: This is a family of K+ transporters that are conserved across phyla, with bacterial (KUP), yeast (HAK), and plant (AtKT/POT) sequences as members. POT1 from Arabidopsis thaliana plays an essential role in telomere maintenance.
Protein Domain
Name: Peptidase T2, asparaginase 2
Type: Family
Description: Threonine peptidases are characterised by a threonine nucleophile at the N terminus of the mature enzyme. The threonine peptidases belong to clan PB or are unassigned, clan T-. The type example for this clan is the archaean proteasome beta component of Thermoplasma acidophilum. This group of sequences has a signature that places them in MEROPS peptidase family T2 (clan PB(T)). The glycosylasparaginases are threonine peptidases. Also in this family is L-asparaginase, which catalyses the following reaction: L-asparagine + H2O = L-aspartate + NH3. Glycosylasparaginase catalyses: N4-(beta-N-acetyl-D-glucosaminyl)-L-asparagine + H2O = N-acetyl-beta-glucosaminylamine + L-aspartate, cleaving the GlcNAc-Asn bond that links oligosaccharides to asparagine in N-linked glycoproteins. The enzyme is composed of two non-identical alpha/beta subunits joined by strong non-covalent forces, has one glycosylation site located in the alpha subunit, and plays a major role in the degradation of glycoproteins.
Protein Domain
Name: Ribosomal protein L5
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein L5, ~180 amino acids in length, is one of the proteins from the large ribosomal subunit. In Escherichia coli, L5 is known to be involved in binding 5S RNA to the large ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups: eubacterial L5; algal chloroplast L5; cyanelle L5; archaebacterial L5; mammalian L11; Tetrahymena thermophila L21; Dictyostelium discoideum (slime mold) L5; Saccharomyces cerevisiae (baker's yeast) L16 (39A); and plant mitochondrial L5.
Protein Domain
Name: Ribosomal protein L5, conserved site
Type: Conserved_site
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein L5, ~180 amino acids in length, is one of the proteins from the large ribosomal subunit. In Escherichia coli, L5 is known to be involved in binding 5S RNA to the large ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups: eubacterial L5; algal chloroplast L5; cyanelle L5; archaebacterial L5; mammalian L11; Tetrahymena thermophila L21; Dictyostelium discoideum (slime mold) L5; Saccharomyces cerevisiae (baker's yeast) L16 (39A); and plant mitochondrial L5. This entry represents a short conserved sequence found in the N-terminal region of these proteins.
Protein Domain
Name: Ribosomal protein L5 domain superfamily
Type: Homologous_superfamily
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein L5, ~180 amino acids in length, is one of the proteins from the large ribosomal subunit. In Escherichia coli, L5 is known to be involved in binding 5S RNA to the large ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups: eubacterial L5; algal chloroplast L5; cyanelle L5; archaebacterial L5; mammalian L11; Tetrahymena thermophila L21; Dictyostelium discoideum (slime mold) L5; Saccharomyces cerevisiae (baker's yeast) L16 (39A); and plant mitochondrial L5. This superfamily represents the L5 structural domain, which forms a 2-layer, mainly antiparallel β-sheet.
Protein Domain
Name: NAC domain
Type: Domain
Description: The NAC domain (for Petunia hybrida (Petunia) NAM and for Arabidopsis ATAF1, ATAF2, and CUC2) is an N-terminal module of ~160 amino acids, which is found in proteins of the NAC family of plant-specific transcriptional regulators (no apical meristem (NAM) proteins). NAC proteins are involved in developmental processes, including formation of the shoot apical meristem, floral organs and lateral shoots, as well as in plant hormonal control and defence. The NAC domain is accompanied by diverse C-terminal transcriptional activation domains. The NAC domain has been shown to be a DNA-binding domain (DBD) and a dimerization domain. The NAC domain can be subdivided into five subdomains (A-E). Each subdomain is distinguishable by blocks of heterogeneous amino acids or gaps. While NAC domains are rich in basic amino acids (R, K and H) as a whole, the distribution of positive and negative amino acids in each subdomain is unequal. Subdomains C and D are rich in basic amino acids but poor in acidic amino acids, while subdomain B contains a high proportion of acidic amino acids. Putative nuclear localization signals (NLS) have been detected in subdomains C and D. The DBD is contained within a 60 amino acid region located within subdomains D and E. The overall structure of the NAC domain monomer consists of a very twisted antiparallel β-sheet, which packs against an N-terminal α-helix on one side and one shorter helix on the other side, surrounded by a few helical elements. The structure suggests that the NAC domain mediates dimerization through conserved interactions including a salt bridge, and DNA binding through the NAC dimer face rich in positive charges.
Protein Domain
Name: Fatty acid hydroxylase
Type: Domain
Description: This entry includes fatty acid and carotene hydroxylases and sterol desaturases. Beta-carotene hydroxylase is involved in zeaxanthin synthesis by hydroxylating beta-carotene, but the enzyme may be involved in other pathways. This family includes C-5 sterol desaturase and C-4 sterol methyl oxidase (SMO). Members of this family are involved in cholesterol biosynthesis and in the biosynthesis of plant cuticular wax. These enzymes contain two copies of an HXHH motif, involved in coordination of two iron atoms in the catalytic centre. Members of this family are ER integral membrane proteins with a mushroom-like structure consisting of four transmembrane helices (TM1-TM4) that anchor them to the membrane, capped by a cytosolic domain containing the unique histidine-coordinated dimetal centre. The plant SMO amino acid sequences possess three histidine-rich motifs (HX3H, HX2HH and HX2HH), characteristic of the small family of membrane-bound non-haem iron oxygenases that are involved in lipid oxidation.
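The histidine-rich motif names quoted in this entry (HX3H, HX2HH) can be read as simple sequence patterns in which X stands for any residue. The following is a minimal Python sketch of scanning a sequence for them, assuming a literal reading of the motif names; the function name `scan_histidine_motifs` and the toy sequence are hypothetical and not part of the database entry, and real family assignment relies on profile-based signatures rather than plain regular expressions.

```python
import re

# Literal reading of the motif names from the description (X = any residue).
# Lookahead groups are used so that overlapping occurrences are all reported.
MOTIFS = {
    "HX3H": re.compile(r"(?=(H.{3}H))"),
    "HX2HH": re.compile(r"(?=(H.{2}HH))"),
}

def scan_histidine_motifs(sequence):
    """Return {motif_name: [(start_index, matched_text), ...]} for a sequence."""
    seq = sequence.upper()
    return {
        name: [(m.start(), m.group(1)) for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }

# Toy sequence invented for illustration only (not a real protein).
toy = "AAHKLMHGGHAVHHTT"
hits = scan_histidine_motifs(toy)
```

Note that under this literal reading every HX2HH occurrence also matches HX3H, so a single stretch of sequence can be reported under both names.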
Protein Domain
Name: Cytochrome b561/ferric reductase transmembrane
Type: Domain
Description: Cytochromes b561 constitute a class of intrinsic membrane proteins containing two haem molecules that are involved in ascorbate (vitamin C) regeneration. They have been suggested to function as electron transporters, shuttling electrons across membranes from ascorbate to an acceptor molecule. The one-electron oxidation product of ascorbate, monodehydro-ascorbate (MDHA), has been shown to function as an electron acceptor for mammalian and plant cytochromes b561. The cytochrome b561-catalysed reduction of MDHA results in the regeneration of the fully reduced ascorbate molecule. Cytochromes b561 have been identified in a large number of phylogenetically distant species, but are absent in prokaryotes. Most species contain three or four cytochrome b561 paralogous proteins. Members of the cytochrome b561 protein family are characterised by a number of structural features that are likely to play an essential part in their function. They are highly hydrophobic proteins with six transmembrane helices (named TMH1 through TMH6), four conserved histidine residues, probably coordinating the two haem molecules, and predicted substrate-binding sites for ascorbate and MDHA. The functionally relevant and structurally most conserved region in the cytochrome b561 family is the four-helix core formed by TMH2 to TMH5, with an amino acid composition that is very well conserved in the inner surface and somewhat less conserved in the outer surface of the core. The two terminal helices (TMH1 and TMH6) are less conserved. The entry represents a conserved region containing six transmembrane helices, found in cytochrome b561 and homologous proteins including some ferric reductases.
Protein Domain
Name: RNA polymerase Rpb7-like, N-terminal
Type: Domain
Description: Rpb7 is a subunit of eukaryotic RNA polymerase (RNAP) II that is homologous to Rpa43 of RNAP I, Rpc8/Rpc25 of RNAP III, and RpoE of archaeal RNAP. Rpb7 binds to Rpb4 to form a heterodimer. This complex is thought to interact with the nascent RNA strand during RNA polymerase II elongation and plays a part in transcription, mRNA transport and DNA repair. In RNA polymerase I, Rpa43 is at least one of the subunits contacted by the transcription factor TIF-IA. The N terminus of Rpb7 and its homologues has an SHS2 domain that is involved in protein-protein interaction.
Protein Domain
Name: S1 domain
Type: Domain
Description: The S1 domain was originally identified in ribosomal protein S1 but is found in a large number of proteins involved in RNA metabolism. It belongs to the OB-fold family. The structure of the S1 RNA-binding domain from the Escherichia coli polynucleotide phosphorylase has been determined using NMR methods and consists of a five-stranded antiparallel beta barrel. Conserved residues on one face of the barrel and adjacent loops form the putative RNA-binding site. The structure of the S1 domain is very similar to that of cold shock proteins. This suggests that they may both be derived from an ancient nucleic acid-binding protein. This entry does not include translation initiation factor IF-1 S1 domains.
Protein Domain
Name: UDP-glucose/GDP-mannose dehydrogenase
Type: Family
Description: Enzymes in this family catalyse the NAD-dependent alcohol-to-acid oxidation of nucleotide-linked sugars. Examples include UDP-glucose 6-dehydrogenase, GDP-mannose 6-dehydrogenase, UDP-N-acetylglucosamine 6-dehydrogenase, UDP-N-acetyl-D-galactosaminuronic acid dehydrogenase, UDP-N-acetyl-D-mannosaminuronic acid dehydrogenase and UDP-N-acetyl-D-mannosamine dehydrogenase. These enzymes are most often involved in the biosynthesis of polysaccharides and are often found in operons devoted to that purpose. All of these enzymes contain three domains, corresponding to the N-terminal, central, and C-terminal regions respectively.
Protein Domain
Name: UDP-glucose/GDP-mannose dehydrogenase, C-terminal
Type: Domain
Description: The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyse the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate. The enzymes have a wide range of functions. In plants, UDP-glucose dehydrogenase is an important enzyme in the synthesis of hemicellulose and pectin, which are the components of newly formed cell walls; while in zebrafish UDP-glucose dehydrogenase is required for cardiac valve formation. In Xanthomonas campestris, a plant pathogen, UDP-glucose dehydrogenase is required for virulence. GDP-mannose dehydrogenase catalyses the formation of GDP-mannuronic acid, which is the monomeric unit from which the exopolysaccharide alginate is formed. Alginate is secreted by a number of bacteria, which include Pseudomonas aeruginosa and Azotobacter vinelandii. In P. aeruginosa, alginate is believed to play an important role in the bacterium's resistance to antibiotics and the host immune response, while in A. vinelandii it is essential for the encystment process. This entry represents the C-terminal substrate-binding domain of these enzymes. Structural studies indicate that this domain forms an incomplete dinucleotide-binding fold.
Protein Domain
Name: UDP-glucose/GDP-mannose dehydrogenase, N-terminal
Type: Domain
Description: The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyse the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate. The enzymes have a wide range of functions. In plants, UDP-glucose dehydrogenase is an important enzyme in the synthesis of hemicellulose and pectin, which are the components of newly formed cell walls; while in zebrafish UDP-glucose dehydrogenase is required for cardiac valve formation. In Xanthomonas campestris, a plant pathogen, UDP-glucose dehydrogenase is required for virulence. GDP-mannose dehydrogenase catalyses the formation of GDP-mannuronic acid, which is the monomeric unit from which the exopolysaccharide alginate is formed. Alginate is secreted by a number of bacteria, which include Pseudomonas aeruginosa and Azotobacter vinelandii. In P. aeruginosa, alginate is believed to play an important role in the bacterium's resistance to antibiotics and the host immune response, while in A. vinelandii it is essential for the encystment process. This entry represents the N-terminal NAD(+)-binding domain. Structural studies indicate that this domain forms an α-β structure containing the six-stranded parallel beta sheet characteristic of the dinucleotide-binding Rossmann fold.
Protein Domain
Name: UDP-glucose 6-dehydrogenase, eukaryotic type
Type: Family
Description: UDP-glucose 6-dehydrogenase is involved in the biosynthesis of UDP-glucuronic acid (UDP-GlcA), a critical precursor for glycan synthesis. In vertebrates it is involved in the biosynthesis of matrix glycosaminoglycans (hyaluronan, chondroitin sulfate, and heparan sulfate), which play significant roles in signaling, inflammation, morphogenesis, cancer growth and matrix organisation. In plants, it provides nucleotide sugars for cell-wall polymers. Proteins in this entry include human UDP-glucose 6-dehydrogenase. This entry represents UDP-glucose 6-dehydrogenases mainly from eukaryotes, although subsets of bacterial proteins can also be found in it. Most bacterial-type UDP-glucose 6-dehydrogenases are covered by a separate entry.
Protein Domain
Name: ATPase, AAA-type, core
Type: Domain
Description: AAA ATPases (ATPases Associated with diverse cellular Activities) form a large protein family and play a number of roles in the cell including cell-cycle regulation, protein proteolysis and disaggregation, organelle biogenesis and intracellular transport. Some of them function as molecular chaperones, subunits of proteolytic complexes or independent proteases (FtsH, Lon). They also act as DNA helicases and transcription factors. AAA ATPases belong to the AAA+ superfamily of ring-shaped P-loop NTPases, which act via the energy-dependent unfolding of macromolecules. There are six major clades of AAA domains (proteasome subunits, metalloproteases, domains D1 and D2 of ATPases with two AAA domains, the MSP1/katanin/spastin group and BCS1 and its homologues), as well as a number of deeply branching minor clades. They assemble into oligomers (often hexamers) that form a ring-shaped structure with a central pore. These proteins produce a molecular motor that couples ATP binding and hydrolysis to changes in conformational states that act upon a target substrate, either translocating or remodelling it. They are found in all living organisms and share the common feature of a highly conserved AAA domain called the AAA module. This domain is responsible for ATP binding and hydrolysis. It contains 200-250 residues, including two classical motifs, Walker A (GX4GKT) and Walker B (HyDE). The functional variety seen between AAA ATPases is in part due to their extensive number of accessory domains and factors, and to their variable organisation within oligomeric assemblies, in addition to changes in key functional residues within the ATPase domain itself.
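The Walker A pattern quoted in this entry, GX4GKT, can be read as a glycine, any four residues, then glycine-lysine-threonine. The short Python sketch below scans a sequence for that literal pattern; the function name `find_walker_a` and the toy sequence are hypothetical and used only for illustration, and actual AAA domain assignment is done with profile-based signatures rather than a single regular expression.

```python
import re

# Literal reading of GX4GKT from the description:
# G, any four residues, then G, K, T.
WALKER_A = re.compile(r"G.{4}GKT")

def find_walker_a(sequence):
    """Return (start_index, matched_text) for each non-overlapping GX4GKT match."""
    return [(m.start(), m.group()) for m in WALKER_A.finditer(sequence.upper())]

# Toy sequence invented for illustration only (not a real protein).
toy = "MKLLGPPGTGKTLLAKA"
hits = find_walker_a(toy)
```

A match only indicates that the literal pattern is present; it is not by itself evidence that the sequence is an AAA ATPase.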
Protein Domain
Name: Peptidase S16, active site
Type: Active_site
Description: Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This signature defines the active site of the serine peptidases belonging to the MEROPS peptidase family S16 (lon protease family, clan SF). These proteases are dependent on the hydrolysis of ATP for their activity and have a serine in their active site. They include: Bacterial ATP-dependent proteases. The prototype of these bacterial enzymes is the Escherichia coli La protease (gene lon). La is capable of hydrolysing large proteins; it degrades short-lived regulatory proteins (such as rcsA and sulA) and abnormal proteins. It is a cytoplasmic protein of 87 kDa that associates as a homotetramer. 
Its proteolytic activity is stimulated by single-stranded DNA. Eukaryotic mitochondrial matrix proteases. The prototype of these enzymes is the yeast PIM1 protease. It is a mitochondrial matrix protein of 120 kDa that associates as a homohexamer. It catalyses the initial step of mitochondrial protein degradation. Haemophilus influenzae lon-B (HI1324), a protein which does not contain the ATP-binding domain, but possesses a slightly divergent form of the catalytic domain.
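The relationship between the linear order of the catalytic triad and clan membership described in this entry (HDS for clan PA, DHS for clan SB, SDH for clan SC) can be expressed as a small lookup. The Python sketch below is illustrative only: the function names are hypothetical, the residue positions in the example follow commonly cited chymotrypsin and subtilisin numbering and are given as an assumption rather than taken from this entry, and real clan assignment in MEROPS rests on structural and evolutionary evidence, not on residue order alone.

```python
# Triad orders and clans exactly as listed in the description.
TRIAD_ORDER_TO_CLAN = {
    "HDS": "PA (chymotrypsin)",
    "DHS": "SB (subtilisin)",
    "SDH": "SC (carboxypeptidase)",
}

def triad_order(positions):
    """Given {residue_letter: sequence_position}, return letters sorted by position."""
    return "".join(sorted(positions, key=positions.get))

def clan_for_triad(positions):
    """Look up the clan whose triad order matches, or None if the order is not listed."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(positions))

# Example positions (assumed, commonly cited numbering; for illustration only).
chymotrypsin_like = {"H": 57, "D": 102, "S": 195}
subtilisin_like = {"D": 32, "H": 64, "S": 221}
```

For example, `clan_for_triad(chymotrypsin_like)` looks up the order HDS, while an order that is not in the table simply returns `None`.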
Protein Domain
Name: Lon protease
Type: Family
Description: Lon protease belongs to the S16 peptidase family and is an ATP-dependent serine protease that mediates the selective degradation of mutant and abnormal proteins, as well as certain short-lived regulatory proteins. It is required for cellular homeostasis and for survival from DNA damage and developmental changes induced by stress. In pathogenic bacteria, it is required for the expression of virulence genes that promote cell infection. Lon (La) protease was the first ATP-dependent protease to be purified from E. coli. The enzyme is a homotetramer of 87 kDa subunits, with one proteolytic and one ATP-binding site per monomer, making it structurally less complex than other known ATP-dependent proteases. Despite this relative structural simplicity, Lon recognises its substrates directly, without delegating the task of substrate recognition to other enzymes.
Protein Domain
Name: Peptidase S16, Lon proteolytic domain
Type: Domain
Description: Lon (also known as endopeptidase La) is a multi-domain ATP-dependent protease found throughout all kingdoms of life. It is involved in protein quality control and several regulatory processes. All Lon proteases contain an ATPase domain belonging to the AAA+ superfamily of molecular machines, and a proteolytic domain with a serine-lysine catalytic dyad in which a lysine assists the catalytic serine in proteolytic cleavage. Lon proteases can be divided into two subfamilies: A type (A-Lons), which have a large multi-lobed N-terminal domain together with the ATPase and protease domains, and B type (B-Lons), which lack an N domain, but have a membrane-anchoring region emerging from the ATPase domain. B-Lons are found in Archaea, in which they are the lone membrane-anchored ATP-dependent protease. The soluble A-Lons are found in all bacteria and in eukaryotic cell organelles, such as mitochondria and peroxisomes, and are needed for recovery from various stress conditions. The Lon proteolytic domain forms peptidase family S16 of clan SJ. The structure of the Lon proteolytic domain consists of six alpha helices and ten beta strands.
Protein Domain
Name: Ribosomal protein S5 domain 2-type fold
Type: Homologous_superfamily
Description: Domain 2 of the ribosomal protein S5 has a left-handed, 2-layer α/β fold with a core structure consisting of β(3)-α-β-α. Domains with this fold are found in numerous RNA/DNA-binding proteins, as well as in kinases from the GHMP kinase family. Proteins containing this α/β fold domain include: Translational machinery components (ribosomal proteins S5 and S9, and domain IV of elongation factors EF-G and eEF-2). Ribonuclease P protein (RNase P). Ribonuclease PH (domain 1), as well as various exosome complex exonucleases (RRP41, RRP42, RRP43, RRP45, RRP46, MTR3, ECX1, ECX2). DNA modification proteins (DNA mismatch repair proteins MutL and PMS2, DNA gyrase B, DNA topoisomerase II, IV-B and VI-B). GHMP kinases that transfer a phosphoryl group from ATP to an acceptor (galactokinase, homoserine kinase, and mevalonate kinase). Caenorhabditis elegans early switch protein Xol-1 (a divergent member of the GHMP kinase family that has lost the ATP-binding site). Hsp90 chaperone (middle domain), which is related to the DNA gyrase/MutL family; this domain contains an extra C-terminal α/β subdomain. Imidazole glycerol phosphate dehydratase, which contains a duplication consisting of two structural repeats of this fold. The catalytic domain of ATP-dependent protease Lon (La), which contains an extra C-terminal α/β subdomain. Formaldehyde-activating enzyme FAE, which contains a modification of this fold consisting of an extra α/β unit after strand 2.
Protein Domain
Name: Ribosomal protein S5 domain 2-type fold, subgroup
Type: Homologous_superfamily
Description: Domain 2 of the ribosomal protein S5 has a left-handed β-α-β fold that is found in numerous RNA/DNA-binding proteins, as well as in kinases from the GHMP kinase family. Proteins containing this β-α-β fold domain include: Translational machinery components (ribosomal proteins S5 and S9, and domain IV of elongation factors EF-G and eEF-2). Ribonuclease P protein (RNase P). DNA modification proteins (DNA mismatch repair proteins MutL and PMS2, DNA gyrase B, DNA topoisomerase II, IV-B and VI-B). GHMP kinases that transfer a phosphoryl group from ATP to an acceptor (galactokinase, homoserine kinase, and mevalonate kinase). Caenorhabditis elegans early switch protein Xol-1 (a divergent member of the GHMP kinase family that has lost the ATP-binding site).
Protein Domain
Name: Peptide methionine sulfoxide reductase MsrB
Type: Family
Description: The oxidation of methionine residues in proteins is considered to be one of the consequences of oxidative damage to cells, which in many cases leads to the loss of biological activity. Peptide methionine sulphoxide reductase (Msr) reverses the inactivation of many proteins due to the oxidation of critical methionine residues by reducing methionine sulphoxide (MetO) to methionine. Methionine (Met) can be oxidised to the R and S diastereomers of methionine sulfoxide (MetO). Methionine sulfoxide reductases A (MsrA) and B (MsrB) reduce MetO back to Met in a stereospecific manner, acting on the S and R forms, respectively. Msr is present in most living organisms. Many bacteria, particularly pathogens, possess methionine sulfoxide reductase MsrA and MsrB as a fusion form (MsrAB). This entry includes MsrB and the fusion form of these enzymes.
Protein Domain
Name: Peptide methionine sulphoxide reductase MsrA domain
Type: Domain
Description: Peptide methionine sulphoxide reductase (Msr) reverses the inactivation of many proteins due to the oxidation of critical methionine residues by reducing methionine sulphoxide, Met(O), to methionine [ ]. It is present in most living organisms, and the cognate structural gene belongs to the so-called minimum gene set [, ]. The domains MsrA and MsrB reduce different epimeric forms of methionine sulphoxide. This group represents MsrA, the crystal structure of which has been determined in a number of organisms. In Mycobacterium tuberculosis, the MsrA structure has been determined to 1.5 Angstrom resolution [ ]. In contrast to the three catalytic cysteine residues found in previously characterised MsrA structures, M. tuberculosis MsrA represents a class containing only two functional cysteine residues. The overall structure shows no resemblance to the structures of MsrB from other organisms, though the active sites show approximate mirror symmetry. In each case, conserved amino acid motifs mediate the stereo-specific recognition and reduction of the substrate. In a number of pathogenic bacteria including Neisseria gonorrhoeae, the MsrA and MsrB domains are fused, with MsrA being N-terminal to MsrB. This arrangement is reversed in Treponema pallidum. In N. gonorrhoeae and Neisseria meningitidis a thioredoxin domain is fused to the N terminus. This may function to reduce the active sites of the downstream MsrA and MsrB domains.
Protein Domain
Name: PADRE domain
Type: Family
Description: This entry represents the Pathogen and abiotic stress response, cadmium tolerance, disordered region-containing (PADRE) domain, which is specifically found in plants. PADRE typically occurs in small single-domain proteins with a bipartite architecture. PADRE contains conserved sequence motifs at the N terminus, and its C terminus includes an intrinsically disordered region with multiple phosphorylation sites. This domain is associated with plant defense in response to diverse stress stimuli and has a role in disease resistance to fungi [ ].
Protein Domain
Name: RidA family
Type: Family
Description: The YjgF/YER057c/UK114 family of proteins is conserved in all domains of life [ ]. A phylogenetic analysis by Lambrecht et al. has divided the Rid family into a widely distributed archetypal RidA (YjgF) subfamily and seven other subfamilies (Rid1 to Rid7) that are largely confined to bacteria and often co-occur in the same organism with RidA and each other [ ]. This entry represents part of the RidA subfamily. Its members include the mammalian endoribonuclease UK114, the yeast YER057C and the bacterial protein YjgF. However, this entry does not include RutC, which is represented in a separate entry. YjgF has enamine/imine deaminase activity and can accelerate the release of ammonia from reactive enamine/imine intermediates of the pyridoxal 5'-phosphate-dependent threonine dehydratase (IlvA). Therefore, YjgF is also known as RidA (reactive intermediate/imine deaminase A) [ ]. Although RidA subfamily members share protein sequence and structural similarity, they may have different functions. For instance, rat ribonuclease UK114, also known as Hrsp12 or Psp1, is an endoribonuclease that inhibits translation by cleaving mRNA [, ]. Budding yeast YER057C (also known as Hmf1) is involved in maintenance of the mitochondrial genome [].
Protein Domain
Name: Endoribonuclease L-PSP/chorismate mutase-like
Type: Domain
Description: This entry represents the β-α-β-α-β(2) domains common both to bacterial chorismate mutase and to members of the YjgF/Yer057p/UK114 family. These proteins form trimers with a three-fold symmetry with three closely-packed β-sheets. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. This domain homotrimerizes, forming a distinct intersubunit cavity that may serve as a small molecule binding site [ , , , ]. Chorismate mutase (CM) is an enzyme of the aromatic amino acid biosynthetic pathway that catalyses the reaction at the branch point of the pathway leading to the three aromatic amino acids, phenylalanine, tryptophan and tyrosine (chorismic acid is the last common intermediate, and CM leads to the L-phenylalanine/L-tyrosine branch). It is part of the shikimate pathway, which is present only in bacteria, fungi and plants. The structures of chorismate mutase enzymes from Bacillus subtilis [ ] and Thermus thermophilus have been solved and were shown to have a catalytic homotrimer, with the active sites being located at the subunit interfaces, where residues from two subunits contribute to each site. The YjgF/YER057c/UK114 family is a large, highly conserved, and widely distributed family of proteins found in bacteria, archaea and eukaryotes [ ]. YjgF (renamed RidA) deaminates reactive enamine/imine intermediates of pyridoxal 5'-phosphate (PLP)-dependent enzyme reactions. The YjgF/YER057c/UK114 family of proteins is conserved in all domains of life, suggesting that reactive enamine/imine metabolites are of concern to all organisms []. 
This family includes: YjgF (RidA) [ ]; the yeast growth inhibitor YER057c (protein HMF1), which appears to play a role in the regulation of metabolic pathways and cell differentiation [ ]; the mammalian 14.5 kDa translational inhibitor protein UK114, also known as L-PSP (liver perchloric acid-soluble protein), with endoribonucleolytic activity that directly affects mRNA translation and can induce disaggregation of the reticulocyte polysomes into 80S ribosomes [ ]; RutC from E. coli, which is essential for growth on uracil as sole nitrogen source and is thought to reduce aminoacrylate peracid to aminoacrylate [ ]; and YabJ from B. subtilis, which is required for adenine-mediated repression of purine biosynthetic genes [ ]. Structurally, these proteins are homotrimers with clefts between the monomeric subunits that are proposed to have some functional relevance [, , ].
Protein Domain
Name: YjgF/YER057c/UK114 family
Type: Family
Description: The YjgF/YER057c/UK114 family (also known as the Rid family) of proteins is conserved in all domains of life [ ]. A phylogenetic analysis by Lambrecht et al. has divided the Rid family into a widely distributed archetypal RidA (YjgF) subfamily and seven other subfamilies (Rid1 to Rid7) that are largely confined to bacteria and often co-occur in the same organism with RidA and each other [ ]. Although the family members share high levels of protein sequence and structure similarity, their functions vary widely across different species []. Structurally, these proteins are homotrimers with clefts between the monomeric subunits that are proposed to have some functional relevance [ , , ]. This family includes: YjgF (also known as RidA or 2-iminobutanoate/2-iminopropanoate deaminase), which displays enamine/imine deaminase activity and can accelerate the release of ammonia from reactive enamine/imine intermediates of the pyridoxal 5'-phosphate-dependent threonine dehydratase (IlvA) [ , ]; the yeast growth inhibitor YER057c (protein HMF1), which appears to play a role in the regulation of metabolic pathways and cell differentiation [ ]; the mammalian 14.5 kDa translational inhibitor protein UK114, also known as L-PSP (liver perchloric acid-soluble protein), with endoribonucleolytic activity that directly affects mRNA translation and can induce disaggregation of the reticulocyte polysomes into 80S ribosomes [ ]; RutC from E. coli, which is essential for growth on uracil as sole nitrogen source and is thought to reduce aminoacrylate peracid to aminoacrylate [ ]; and YabJ from B. subtilis, which is required for adenine-mediated repression of purine biosynthetic genes [ ].
Protein Domain
Name: RidA, conserved site
Type: Conserved_site
Description: This entry represents a conserved site found towards the C terminus of the YjgF family members. YjgF has enamine/imine deaminase activity and can accelerate the release of ammonia from reactive enamine/imine intermediates of the pyridoxal 5'-phosphate-dependent threonine dehydratase (IlvA). Therefore, YjgF is also known as RidA (reactive intermediate/imine deaminase A) [ ]. The RidA family members are widely distributed in eubacteria, archaea and eukaryotes. Although they share sequence and structural similarity, they may have different functions. In this C-terminal domain, the highly conserved Arg107 of human hp14.5 (also known as ribonuclease UK114) forms salt bridges with the carboxylate oxygens of benzoate [ ], while the corresponding Arg105 of Escherichia coli TdcF forms hydrogen bonds with the carboxylate oxygens of serine, threonine, and 2-oxobutanoate []. The conserved Tyr17 and Glu120 residues of E. coli TdcF have been suggested to play a role in substrate binding and positioning of a water molecule used for imine hydrolysis [].
Protein Domain
Name: Photosynthetic reaction centre, L/M
Type: Family
Description: The photosynthetic apparatus in non-oxygenic bacteria consists of light-harvesting (LH) protein-pigment complexes LH1 and LH2, which use carotenoid and bacteriochlorophyll as primary donors [ ]. LH1 acts as the energy collection hub, temporarily storing the energy before its transfer to the photosynthetic reaction centre (RC) []. Electrons are transferred from the primary donor via an intermediate acceptor (bacteriopheophytin) to the primary acceptor (quinone Qa), and finally to the secondary acceptor (quinone Qb), resulting in the formation of ubiquinol QbH2. RC uses the excitation energy to shuffle electrons across the membrane, transferring them via ubiquinol to the cytochrome bc1 complex in order to establish a proton gradient across the membrane, which is used by ATP synthetase to form ATP [ , , ]. The core complex is anchored in the cell membrane, consisting of one unit of RC surrounded by LH1; in some species there may be additional subunits [ ]. RC consists of three subunits: L (light), M (medium), and H (heavy). Subunits L and M provide the scaffolding for the chromophore, while subunit H contains a cytoplasmic domain []. In Rhodopseudomonas viridis, there is also a non-membranous tetrahaem cytochrome (4Hcyt) subunit on the periplasmic surface. This entry describes the photosynthetic reaction centre L and M subunits, and the homologous D1 (PsbA) and D2 (PsbD) photosystem II (PSII) reaction centre proteins from cyanobacteria, algae and plants. The D1 and D2 proteins only show approximately 15% sequence homology with the L and M subunits; however, the conserved amino acids correspond to the binding sites of the photochemically active cofactors. 
As a result, the reaction centres (RCs) of purple photosynthetic bacteria and PSII display considerable structural similarity in terms of cofactor organisation. The D1 and D2 proteins occur as a heterodimer that forms the reaction core of PSII, a multisubunit protein-pigment complex containing over forty different cofactors, which are anchored in the cell membrane in cyanobacteria, and in the thylakoid membrane in algae and plants. Upon absorption of light energy, the D1/D2 heterodimer undergoes charge separation, and the electrons are transferred from the primary donor (chlorophyll a) via pheophytin to the primary acceptor quinone Qa, then to the secondary acceptor Qb, which, like the bacterial system, culminates in the production of ATP. However, PSII has an additional function over the bacterial system. At the oxidising side of PSII, a redox-active residue in the D1 protein reduces P680, the oxidised tyrosine then withdrawing electrons from a manganese cluster, which in turn withdraws electrons from water, leading to the splitting of water and the formation of molecular oxygen. PSII thus provides a source of electrons that can be used by photosystem I to produce the reducing power (NADPH) required to convert CO2 to glucose [ , ]. Also in this entry is the light-dependent chlorophyll f synthase (ChlF) from cyanobacteria such as Chlorogloeopsis fritschii. ChlF synthesizes chlorophyll f or chlorophyllide f, which is able to absorb far red light, probably by oxidation of chlorophyll a or chlorophyllide a and reduction of plastoquinone [ ].
Protein Domain
Name: Expansin, cellulose-binding-like domain
Type: Domain
Description: Expansins are secreted proteins of 25 to 27 kDa that were isolated first from young cucumber seedlings and subsequently from other plant tissues. Expression of expansin genes correlates with growth of cells. Increase in expansin content also occurs during fruit ripening. Expansins act on the cell wall to promote its extensibility. The model for their mechanism of action postulates that expansins break non-covalent bonds between cell-wall polysaccharides, thereby permitting pressure-dependent expansion of the cell [ , ]. Group-I pollen allergens of grasses have limited but significant sequence homology to expansin. These proteins are the main causative agent of hay fever and seasonal asthma induced by grass pollen. Extracts containing group-I allergens are also active in loosening cell walls. Group-I pollen allergens and related proteins in vegetative tissues have been classified as β-expansins, whereas the earlier discovered expansins are now referred to as α-expansins []. Expansin-like proteins are also found in some fungi. In Trichoderma reesei an expansin-like protein (Cel12A) acts as a glycoside hydrolase on xyloglucan and 1,4-β-glucan. These hydrolytic actions differ from the action of expansins, which induce wall extension by a non-hydrolytic mechanism [ ]. Expansins consist of two domains closely packed and aligned so as to form a long, shallow groove with potential to bind a glycan backbone of ~10 sugar residues. The N-terminal cysteine-rich domain has distant sequence similarity to family-45 endoglucanases (EG45-like domain). The ~90-residue C-terminal domain may function as a cellulose-binding domain (CBD). It is composed of eight β-strands assembled into two antiparallel β-sheets. The two β-sheets are at slight angles to each other and form a β-sandwich similar to the Ig fold []. This entry represents the expansin C-terminal CBD-like domain.
Protein Domain
Name: Expansin/pollen allergen, DPBB domain
Type: Domain
Description: This entry represents an N-terminal domain that has a Barwin-like double-psi β-barrel (DPBB) structure. It is found in proteins such as expansins and pollen allergens. The major timothy grass pollen allergen Phl p 1 is one of the most potent and frequently recognised environmental allergens [ ].
Protein Domain
Name: RlpA-like protein, double-psi beta-barrel domain
Type: Domain
Description: Rare lipoprotein A (RlpA) contains a conserved region that has the double-psi β-barrel (DPBB) fold [ , ]. RlpA is a bacterial septal ring protein and a lytic transglycosylase that contributes to rod shape and daughter cell separation in Pseudomonas aeruginosa []. It has been shown to act as a prc mutant suppressor in Escherichia coli []. The DPBB fold is often an enzymatic domain. Proteins containing this domain are quite diverse and may have several different functions. Another example of this domain is found in the N terminus of pollen allergen []. Some studies show that the full-length RlpA protein from Pseudomonas aeruginosa is an outer membrane protein that is a lytic transglycosylase with specificity for peptidoglycan lacking stem peptides. Residue D157 in Pseudomonas aeruginosa RlpA is critical for lytic activity []. Beta barrels are commonly observed in protein structures. They are classified in terms of two integral parameters: the number of strands in the sheet, n, and the shear number, S, a measure of the stagger of the strands in the β-sheet. These two parameters have been shown to determine the major geometrical features of β-barrels. Six-stranded β-barrels with a pseudo-twofold axis are found in several proteins. One involving parallel strands forming two psi structures is known as the double-psi barrel. The first psi structure consists of the loop connecting strands beta1 and beta2 (a 'psi loop') and the strand beta5, whereas the second psi structure consists of the loop connecting strands beta4 and beta5 and the strand beta2. All the psi structures in double-psi barrels have a unique handedness, in that beta1 (beta4), beta2 (beta5) and the loop following beta5 (beta2) form a right-handed helix. The unique handedness may be related to the fact that the twisting angle between the parallel pair of strands is always larger than that between the antiparallel pair [].
Protein Domain
Name: Expansin/Lol pI
Type: Family
Description: Expansins are unusual proteins that mediate cell wall extension in plants [ ]. They are believed to act as a sort of chemical grease, allowing polymers to slide past one another by disrupting non-covalent hydrogen bonds that hold many wall polymers to one another. This process is not degradative and hence does not weaken the wall, which could otherwise rupture under internal pressure during growth. Sequence comparisons indicate at least four distinct expansin cDNAs in rice and at least six in Arabidopsis. The proteins are highly conserved in size and sequence (75-95% amino acid sequence similarity between any pairwise comparison), and phylogenetic trees indicate that this multigene family formed before the evolutionary divergence of monocotyledons and dicotyledons [ ]. Sequence and motif analyses show no similarities to known functional domains that might account for expansin action on wall extension. It is thought that several highly-conserved tryptophans may function in expansin binding to cellulose, or other glycans. The high conservation of the family indicates that the mechanism by which expansins promote wall extension tolerates little variation in protein structure. Grass pollens, such as pollen from timothy grass, represent a major cause of type I allergy [ ]. Interestingly, expansins share a high degree of sequence similarity with the Lol p I family of allergens.
Protein Domain
Name: Small GTPase
Type: Family
Description: Small GTPases form an independent superfamily within the larger class of regulatory GTP hydrolases. This superfamily contains proteins that control a vast number of important processes and possess a common, structurally preserved GTP-binding domain [ , ]. Sequence comparisons of small G proteins from various species have revealed that they are conserved in primary structures at the level of 30-55% similarity []. Crystallographic analysis of various small G proteins revealed the presence of a 20 kDa catalytic domain that is unique for the whole superfamily [ , ]. The domain is built of five alpha helices (A1-A5), six β-strands (B1-B6) and five polypeptide loops (G1-G5). A structural comparison of the GTP- and GDP-bound forms allows one to distinguish two functional loop regions: switch I and switch II that surround the gamma-phosphate group of the nucleotide. The G1 loop (also called the P-loop) that connects the B1 strand and the A1 helix is responsible for the binding of the phosphate groups. The G3 loop provides residues for Mg(2+) and phosphate binding and is located at the N terminus of the A2 helix. The G1 and G3 loops are sequentially similar to Walker A and Walker B boxes that are found in other nucleotide binding motifs. The G2 loop connects the A1 helix and the B2 strand and contains a conserved Thr residue responsible for Mg(2+) binding. The guanine base is recognised by the G4 and G5 loops. The consensus sequence NKXD of the G4 loop contains Lys and Asp residues directly interacting with the nucleotide. Part of the G5 loop located between B6 and A5 acts as a recognition site for the guanine base []. The small GTPase superfamily can be divided into at least 8 different families, including: Arf small GTPases. GTP-binding proteins involved in protein trafficking by modulating vesicle budding and uncoating within the Golgi apparatus. Ran small GTPases. GTP-binding proteins involved in nucleocytoplasmic transport. 
Required for the import of proteins into the nucleus and also for RNA export. Rab small GTPases. GTP-binding proteins involved in vesicular traffic. Rho small GTPases. GTP-binding proteins that control cytoskeleton reorganisation. Ras small GTPases. GTP-binding proteins involved in signalling pathways. Sar1 small GTPases. Small GTPase component of the coat protein complex II (COPII) which promotes the formation of transport vesicles from the endoplasmic reticulum (ER). Mitochondrial Rho (Miro). Small GTPase domain found in mitochondrial proteins involved in mitochondrial trafficking. Roc small GTPases domain. Small GTPase domain always found associated with the COR domain.
Protein Domain
Name: Ran GTPase
Type: Family
Description: Small GTPases form an independent superfamily within the larger class of regulatory GTP hydrolases. This superfamily contains proteins that control a vast number of important processes and possess a common, structurally preserved GTP-binding domain [, ]. Sequence comparisons of small G proteins from various species have revealed that they are conserved in primary structures at the level of 30-55% similarity []. Crystallographic analysis of various small G proteins revealed the presence of a 20 kDa catalytic domain that is unique for the whole superfamily [ , ]. The domain is built of five alpha helices (A1-A5), six β-strands (B1-B6) and five polypeptide loops (G1-G5). A structural comparison of the GTP- and GDP-bound forms allows one to distinguish two functional loop regions: switch I and switch II that surround the gamma-phosphate group of the nucleotide. The G1 loop (also called the P-loop) that connects the B1 strand and the A1 helix is responsible for the binding of the phosphate groups. The G3 loop provides residues for Mg(2+) and phosphate binding and is located at the N terminus of the A2 helix. The G1 and G3 loops are sequentially similar to Walker A and Walker B boxes that are found in other nucleotide binding motifs. The G2 loop connects the A1 helix and the B2 strand and contains a conserved Thr residue responsible for Mg(2+) binding. The guanine base is recognised by the G4 and G5 loops. The consensus sequence NKXD of the G4 loop contains Lys and Asp residues directly interacting with the nucleotide. Part of the G5 loop located between B6 and A5 acts as a recognition site for the guanine base []. The small GTPase superfamily can be divided into 8 different families: Arf small GTPases. GTP-binding proteins involved in protein trafficking by modulating vesicle budding and uncoating within the Golgi apparatus. Ran small GTPases. GTP-binding proteins involved in nucleocytoplasmic transport. 
Required for the import of proteins into the nucleus and also for RNA export. Rab small GTPases. GTP-binding proteins involved in vesicular traffic. Rho small GTPases. GTP-binding proteins that control cytoskeleton reorganisation. Ras small GTPases. GTP-binding proteins involved in signalling pathways. Sar1 small GTPases. Small GTPase component of the coat protein complex II (COPII) which promotes the formation of transport vesicles from the endoplasmic reticulum (ER). Mitochondrial Rho (Miro). Small GTPase domain found in mitochondrial proteins involved in mitochondrial trafficking. Roc small GTPases domain. Small GTPase domain always found associated with the COR domain. Ran (or TC4) is an evolutionarily conserved member of the Ras superfamily of small GTPases that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Ran has been implicated in a large number of processes, including nucleocytoplasmic transport, RNA synthesis, processing and export, and cell cycle checkpoint control [, ]. Ran plays a crucial role in both import/export pathways and determines the directionality of nuclear transport. Import receptors (importins) bind their cargos in the cytoplasm where the concentration of RanGTP is low (due to action of RanGAP), and release their cargos in the nucleus where the concentration of RanGTP is high (due to action of RanGEF) [, ]. Export receptors (exportins) respond to RanGTP in the opposite manner. Furthermore, it has been shown that nuclear transport factor 2 (NTF2) stimulates efficient nuclear import of a cargo protein. NTF2 binds specifically to RanGDP and to the FXFG repeat-containing nucleoporins [ ]. Ran is generally included in the RAS 'superfamily' of small GTP-binding proteins [ ], but it is only slightly related to the other RAS proteins. It also differs from RAS proteins in that it lacks cysteine residues at its C terminus and is therefore not subject to prenylation. Instead, Ran has an acidic C terminus. 
It is, however, similar to RAS family members in requiring a specific guanine nucleotide exchange factor (GEF) and a specific GTPase activating protein (GAP) as stimulators of overall GTPase activity. Ran consists of a core domain that is structurally similar to the GTP-binding domains of other small GTPases but, in addition, Ran has a C-terminal extension consisting of an unstructured linker and a 16-residue α-helix that is located opposite the "Switch I" region in the RanGDP structure [ ]. Three regions of Ran change conformation depending on the nucleotide bound: the Switch I and II regions, which interact with the bound nucleotide, and the C-terminal extension. In RanGDP, the C-terminal extension contacts the core of the protein, while in RanGTP, the extension extends away from the core, most likely due to a steric clash between the switch I region and the linker part of the C-terminal extension. This suggests that the C-terminal extension in RanGDP is crucial for shielding residues in the core domain and preventing the switch regions from adopting a GTP-like form. This prevents binding of transport factors to RanGDP that would otherwise lead to uncoordinated interaction between importin beta-like proteins and cellular factors.
Protein Domain
Name: Small GTPase Rho
Type: Family
Description: Small GTPases form an independent superfamily within the larger class of regulatory GTP hydrolases. This superfamily contains proteins that control a vast number of important processes and possess a common, structurally preserved GTP-binding domain [ , ]. Sequence comparisons of small G proteins from various species have revealed that they are conserved in primary structures at the level of 30-55% similarity []. Crystallographic analysis of various small G proteins revealed the presence of a 20 kDa catalytic domain that is unique for the whole superfamily [ , ]. The domain is built of five alpha helices (A1-A5), six β-strands (B1-B6) and five polypeptide loops (G1-G5). A structural comparison of the GTP- and GDP-bound forms allows one to distinguish two functional loop regions: switch I and switch II that surround the gamma-phosphate group of the nucleotide. The G1 loop (also called the P-loop) that connects the B1 strand and the A1 helix is responsible for the binding of the phosphate groups. The G3 loop provides residues for Mg(2+) and phosphate binding and is located at the N terminus of the A2 helix. The G1 and G3 loops are sequentially similar to Walker A and Walker B boxes that are found in other nucleotide binding motifs. The G2 loop connects the A1 helix and the B2 strand and contains a conserved Thr residue responsible for Mg(2+) binding. The guanine base is recognised by the G4 and G5 loops. The consensus sequence NKXD of the G4 loop contains Lys and Asp residues directly interacting with the nucleotide. Part of the G5 loop located between B6 and A5 acts as a recognition site for the guanine base []. The small GTPase superfamily can be divided into at least 8 different families, including: Arf small GTPases. GTP-binding proteins involved in protein trafficking by modulating vesicle budding and uncoating within the Golgi apparatus. Ran small GTPases. GTP-binding proteins involved in nucleocytoplasmic transport. 
Required for the import of proteins into the nucleus and also for RNA export. Rab small GTPases. GTP-binding proteins involved in vesicular traffic. Rho small GTPases. GTP-binding proteins that control cytoskeleton reorganisation. Ras small GTPases. GTP-binding proteins involved in signalling pathways. Sar1 small GTPases. Small GTPase component of the coat protein complex II (COPII) which promotes the formation of transport vesicles from the endoplasmic reticulum (ER). Mitochondrial Rho (Miro). Small GTPase domain found in mitochondrial proteins involved in mitochondrial trafficking. Roc small GTPases domain. Small GTPase domain always found associated with the COR domain. This entry represents the Rho subfamily of Ras-like small GTPases. The small GTPase-like protein LIP2 (light insensitive period 2) from Arabidopsis thaliana is implicated in control of the plant circadian rhythm [ ]. The crystal structures of a number of the members of this entry have been determined: Rnd3/RhoE [], RhoA [] and Cdc42 [].
Protein Domain
Name: Small GTPase, Ras-type
Type: Family
Description: Small GTPases form an independent superfamily within the larger class of regulatory GTP hydrolases. This superfamily contains proteins that control a vast number of important processes and possess a common, structurally preserved GTP-binding domain [ , ]. Sequence comparisons of small G proteins from various species have revealed that they are conserved in primary structures at the level of 30-55% similarity []. Crystallographic analysis of various small G proteins revealed the presence of a 20kDa catalytic domain that is unique for the whole superfamily [ , ]. The domain is built of five α-helices (A1-A5), six β-strands (B1-B6) and five polypeptide loops (G1-G5). A structural comparison of the GTP- and GDP-bound forms allows one to distinguish two functional loop regions, switch I and switch II, that surround the gamma-phosphate group of the nucleotide. The G1 loop (also called the P-loop) that connects the B1 strand and the A1 helix is responsible for the binding of the phosphate groups. The G3 loop provides residues for Mg2+ and phosphate binding and is located at the N terminus of the A2 helix. The G1 and G3 loops are sequentially similar to Walker A and Walker B boxes that are found in other nucleotide-binding motifs. The G2 loop connects the A1 helix and the B2 strand and contains a conserved Thr residue responsible for Mg2+ binding. The guanine base is recognised by the G4 and G5 loops. The consensus sequence NKXD of the G4 loop contains Lys and Asp residues directly interacting with the nucleotide. Part of the G5 loop located between B6 and A5 acts as a recognition site for the guanine base [ ]. The small GTPase superfamily can be divided into at least 8 different families, including: Arf small GTPases. GTP-binding proteins involved in protein trafficking by modulating vesicle budding and uncoating within the Golgi apparatus. Ran small GTPases. GTP-binding proteins involved in nucleocytoplasmic transport. 
Required for the import of proteins into the nucleus and also for RNA export. Rab small GTPases. GTP-binding proteins involved in vesicular traffic. Rho small GTPases. GTP-binding proteins that control cytoskeleton reorganisation. Ras small GTPases. GTP-binding proteins involved in signalling pathways. Sar1 small GTPases. Small GTPase component of the coat protein complex II (COPII) which promotes the formation of transport vesicles from the endoplasmic reticulum (ER). Mitochondrial Rho (Miro). Small GTPase domain found in mitochondrial proteins involved in mitochondrial trafficking. Roc small GTPase domain. Small GTPase domain always found associated with the COR domain. Ras proteins are small GTPases that regulate cell growth, proliferation and differentiation. The different Ras isoforms (H-ras, N-ras and K-ras) generate distinct signal outputs, despite interacting with a common set of activators and effectors. Ras is activated by guanine nucleotide exchange factors (GEFs) that release GDP and allow GTP binding. Many RasGEFs have been identified. These are sequestered in the cytosol until activation by growth factors triggers recruitment to the plasma membrane or Golgi, where the GEF colocalizes with Ras. Active GTP-bound Ras interacts with several effector proteins: among the best characterised are the Raf kinases, phosphatidylinositol 3-kinase (PI3K), RalGEFs and NORE/MST1. Ras proteins are synthesized as cytosolic precursors that undergo post-translational processing to be able to associate with cell membranes [ ]. First, protein farnesyl transferase, a cytosolic enzyme, attaches a farnesyl group to the cysteine residue of the CAAX motif. Second, the farnesylated CAAX sequence targets Ras to the cytosolic surface of the ER where an endopeptidase removes the AAX tripeptide. Third, the α-carboxyl group on the now carboxy-terminal farnesylcysteine is methylated by isoprenylcysteine carboxyl methyltransferase. 
Finally, after methylation, Ras proteins take one of two routes to the cell surface, which is dictated by a second targeting signal that is located immediately amino-terminal to the farnesylated cysteine. N-ras and H-ras are expressed stably on the plasma membrane, on Golgi in transfected cells, and at least transiently on the ER. Ras has also been visualized on endosomes.
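The two sequence motifs named in this description, the NKXD consensus of the G4 loop and the C-terminal CAAX box, can be illustrated with a minimal Python sketch. This is not part of the database entry: the function names, the toy sequence and the loose reading of CAAX (a cysteine followed by any three residues at the C terminus) are assumptions made only for illustration.

```python
import re

# G4 loop consensus from the description: N-K-X-D, where X is any residue.
NKXD = re.compile(r"NK[A-Z]D")

# Loose CAAX box (assumption): a cysteine followed by exactly three residues
# at the C terminus; "A" and "X" are both treated as "any residue" here.
CAAX = re.compile(r"C[A-Z]{3}$")


def find_nkxd(seq):
    """Return the 0-based start position of every NKXD match in seq."""
    return [m.start() for m in NKXD.finditer(seq.upper())]


def strip_aax(seq):
    """Mimic the endopeptidase step: drop the AAX tripeptide if a CAAX box is present."""
    seq = seq.upper()
    return seq[:-3] if CAAX.search(seq) else seq


# Hypothetical toy sequence, not a real protein.
toy = "MTEYKLVVVGNKCDLAAGGCVLS"
print(find_nkxd(toy))   # positions of NKXD-like matches
print(strip_aax(toy))   # sequence left ending in the (farnesylated) cysteine
```

A pattern match of this kind only flags candidate sites; it says nothing about whether the residues sit in the right structural context.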
Protein Domain
Name: NmrA-like domain
Type: Domain
Description: NmrA is a negative transcriptional regulator of various fungi, involved in the post-translational modulation of the GATA-type transcription factor AreA [ ]. NmrA lacks the canonical GXXGXXG NAD-binding motif and has altered residues at the catalytic triad, including a Met instead of the critical Tyr residue. NmrA may bind nucleotides but appears to lack any dehydrogenase activity. It lacks most of the active site residues of the SDR (short-chain dehydrogenases/reductases) family, but has an NAD(P)-binding motif similar to the extended SDR family, GXXGXXG []. This domain can also be found in other atypical SDRs, such as HSCARG (an NADPH sensor) [ ] and PCBER (phenylcoumaran benzylic ether reductase) [].
Protein Domain
Name: GPI mannosyltransferase 2
Type: Family
Description: This entry represents GPI mannosyltransferase 2, also known as PIG-V in humans or Gpi18 in fungi. PIG-V is a mannosyltransferase that transfers the second mannose in glycosylphosphatidylinositol (GPI) biosynthesis [ , , ]. GPI is a glycolipid that anchors many proteins to the eukaryotic cell surface [].
Protein Domain
Name: Alkaline ceramidase TOD1/Probable hexosyltransferase MUCI70
Type: Family
Description: This entry represents a group of proteins mainly found in plants, including MUCI70 and TOD1 from Arabidopsis. They share a Rossmann-like fold found in glycosyltransferases. MUCI70 is a predicted glycosyltransferase essential for the accumulation of seed mucilage, a gelatinous wall rich in unbranched rhamnogalacturonan I (RG I), and for shaping the surface morphology of seeds [ , ]. Together with IRX14, it is required for xylan and pectin synthesis in seed coat epidermal (SCE) cells. TOD1 is an endoplasmic reticulum ceramidase that catalyses the hydrolysis of ceramides into sphingosine and free fatty acids at alkaline pH (e.g. pH 9.5) [ ]. It is involved in the regulation of turgor pressure in guard cells and pollen tubes [, ].
Protein Domain
Name: Ubiquitin system component CUE
Type: Domain
Description: This domain promotes intramolecular monoubiquitination and has a dual role in mono- and poly-ubiquitination recognition, being involved in binding ubiquitin-conjugating enzymes (UBCs) [ , , ]. CUE domains also occur in two proteins of the IL-1 signal transduction pathway, tollip and TAB2.
Protein Domain
Name: Oxoglutarate/iron-dependent dioxygenase
Type: Domain
Description: Enzymes with the Fe(2+) and 2-oxoglutarate (2OG)-dependent dioxygenase domain typically catalyse the oxidation of an organic substrate using a dioxygen molecule, mostly by using ferrous iron as the active site cofactor and 2OG as a co-substrate which is decarboxylated to succinate and CO2 [ ]. Iron 2OG dioxygenase domain proteins are widespread among eukaryotes and bacteria. In metazoans, prolyl hydroxylases containing the domain act as oxygen sensors and catalyse the hydroxylation of conserved prolyl residues in hypoxia-inducible transcription factor (HIF) alpha [, ]. In plants, Fe(II) 2OG dioxygenase domain enzymes catalyse the formation of plant hormones such as ethylene and gibberellins, and of pigments such as anthocyanidins and flavones. In bacteria and fungi, Fe(II) 2OG dioxygenase domain enzymes participate in the biosynthesis of antibiotics such as penicillin and cephalosporin. The eukaryotic and bacterial protein AlkB, which also has this structural domain, is involved in DNA repair [, ]. The iron 2OG dioxygenase domain has a conserved β-barrel structure [ ] with a double-stranded β-helix core fold, which is the predominant class of the cupin superfamily ('cupa' means a small barrel in Latin) []. Two histidines and an aspartate residue bind a metal ion (generally iron, but in some cases another metal) that is directly involved in catalysis. A conserved arginine or lysine residue nearer the C terminus acts as the basic residue that interacts with the acidic substrate.
Protein Domain
Name: Isopenicillin N synthase-like superfamily
Type: Homologous_superfamily
Description: Isopenicillin N synthase (IPNS) catalyses conversion of the linear tripeptide delta-(L-alpha-aminoadipoyl)-L-cysteinyl-D-valine (ACV) to isopenicillin N (IPN), the central step in biosynthesis of the beta-lactam antibiotics [ ]. IPNS is a nonhaem-Fe2+-dependent enzyme. It belongs to a class of nonhaem Fe2+-containing enzymes which includes 2-oxoglutarate-dependent dioxygenases, 2-oxoglutarate-dependent hydroxylases, and enzymes involved in ethylene formation and anthocyanidin biosynthesis. The IPNS structure shows that the active site is buried within the hydrophobic pocket of an eight-stranded jelly roll barrel [ , ].
Protein Domain
Name: Non-haem dioxygenase N-terminal domain
Type: Domain
Description: This is the highly conserved N-terminal region of proteins with 2-oxoglutarate/Fe(II)-dependent dioxygenase activity [ ].
Protein Domain
Name: Pyridoxal phosphate-dependent transferase, major domain
Type: Homologous_superfamily
Description: Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). Pyridoxal 5'-phosphate (PLP) is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination [ , , ]. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors []. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy []. PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the ε-amino group of an active site lysine residue on the enzyme. The α-amino group of the substrate displaces the lysine ε-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic [ ]. The monomer of PLP-dependent transferases consists of two domains, a large domain and a small domain. This entry represents the large domain, which has a 3-layer α/β/α sandwich topology [ ]. This domain can be found in the following PLP-dependent transferase families: Aspartate aminotransferase (AAT)-like enzymes, such as aromatic amino acid aminotransferase AroAT, glutamine aminotransferase and kynureninase [ ]. Beta-eliminating lyases, such as tyrosine phenol lyase and tryptophanase [ ]. Pyridoxal-dependent decarboxylases, such as DOPA decarboxylase and glutamate decarboxylase beta (GadB) [ ]. Cystathionine synthase-like enzymes, such as cystalysin, methionine gamma-lyase (MGL), and cysteine desulphurase (IscS) [ ]. GABA-aminotransferase-like enzymes, such as ornithine aminotransferase and serine hydroxymethyltransferase [ ]. Ornithine decarboxylase major domain [ ].
Protein Domain
Name: Pyridoxal phosphate-dependent transferase
Type: Homologous_superfamily
Description: Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). Pyridoxal 5'-phosphate (PLP) is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination [, , ]. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors []. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy []. PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the ε-amino group of an active site lysine residue on the enzyme. The α-amino group of the substrate displaces the lysine ε-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic [ ]. This entry represents the major region of PLP-dependent transferases. This domain has a three-layer α/β/α sandwich topology, with mixed β-sheets of 7 strands. The major region can be found in the following PLP-dependent transferase families: Aspartate aminotransferase (AAT)-like enzymes, such as aromatic amino acid aminotransferase AroAT, glutamine aminotransferase and kynureninase [ ]. Beta-eliminating lyases, such as tyrosine phenol lyase and tryptophanase [ ]. Pyridoxal-dependent decarboxylases, such as DOPA decarboxylase and glutamate decarboxylase beta (GadB) [ ]. Cystathionine synthase-like enzymes, such as cystalysin, methionine gamma-lyase (MGL), and cysteine desulphurase (IscS) [ ]. GABA-aminotransferase-like enzymes, such as ornithine aminotransferase and serine hydroxymethyltransferase [ ]. Ornithine decarboxylase major domain [ ].
Protein Domain
Name: Aminotransferase, class I/classII
Type: Domain
Description: Aminotransferases share certain mechanistic features with other pyridoxal-phosphate dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a lysine residue. On the basis of sequence similarity, these various enzymes can be grouped [ ] into class I and class II. This entry includes proteins from both subfamilies, including class I LL-diaminopimelate aminotransferase, chloroplastic from Arabidopsis thaliana (Dap) and class II Histidinol-phosphate aminotransferase from Listeria welshimeri (HisC). Dap consists of two domains, a large domain and a small domain. This entry represents the large domain, which has a 3-layer α/β/α sandwich topology [].
Protein Domain
Name: Pyridoxal phosphate-dependent transferase, small domain
Type: Homologous_superfamily
Description: Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). Pyridoxal 5'-phosphate (PLP) is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination [ , , ]. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors []. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy []. PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the ε-amino group of an active site lysine residue on the enzyme. The α-amino group of the substrate displaces the lysine ε-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic [ ]. The monomer of PLP-dependent transferases consists of two domains, a large domain and a small domain. This entry represents the small domain, which has a complex α/β structure [ ]. It can be found in the following PLP-dependent transferase families: Aspartate aminotransferase (AAT)-like enzymes, such as aromatic amino acid aminotransferase AroAT, glutamine aminotransferase and kynureninase [ ]. Beta-eliminating lyases, such as tyrosine phenol lyase and tryptophanase [ ]. Pyridoxal-dependent decarboxylases, such as DOPA decarboxylase and glutamate decarboxylase beta (GadB) [ ]. Cystathionine synthase-like enzymes, such as cystalysin, methionine gamma-lyase (MGL), and cysteine desulphurase (IscS) [ ]. GABA-aminotransferase-like enzymes, such as ornithine aminotransferase and serine hydroxymethyltransferase [ ]. Ornithine decarboxylase [ ].
Protein Domain
Name: Histone acetyltransferase domain, MYST-type
Type: Domain
Description: Histone acetyltransferases (HATs) fall into at least four different families based on sequence conservation within the HAT domain [ ]. The MYST family is the largest family of HATs and is named after the founding members: MOZ, Ybf2/Sas3, Sas2 and Tip60. MYST proteins mediate many biological functions including gene regulation, DNA repair, cell-cycle regulation and development [] and have been shown to acetylate several non-histone substrates []. MYST proteins are autoregulated by posttranslational modifications []. The MYST-type HAT domain contains three regions: a central region associated with acetyl-CoA cofactor binding and catalysis, and flanking N- and C-terminal regions harbouring, respectively, a C2HC-type zinc finger and a helix-turn-helix DNA-binding motif. The N- and C-terminal segments directly flanking the catalytic core are likely to play an important role in histone substrate binding [, ]. The catalytic mechanism for the MYST-type HAT domain is still unresolved but seems to involve a conserved glutamate that functions to abstract a proton from lysine to promote the nucleophilic attack on the acetyl carbonyl carbon of acetyl-CoA [, , , ].
Protein Domain
Name: RNA binding activity-knot of a chromodomain
Type: Domain
Description: This is a novel knotted tudor domain which is required for binding to RNA. The knot influences the loop conformation of the helical turn Ht2 (residues 61-63) that is located at the side opposite the knot in the tudor domain-chromo domain; stabilisation of Ht2 is essential for RNA binding [ ].
Protein Domain
Name: Ribosomal RNA adenine methyltransferase KsgA/Erm
Type: Family
Description: The bacterial enzyme KsgA catalyses the transfer of a total of four methyl groups from S-adenosyl-L-methionine (S-AdoMet) to two adjacent adenosine bases in 16S rRNA. This enzyme and the resulting modified adenosine bases appear to be conserved in all species of eubacteria, eukaryotes, and archaea, and in eukaryotic organelles. Bacterial resistance to the aminoglycoside antibiotic kasugamycin involves inactivation of KsgA and resulting loss of the dimethylations, with modest consequences to the overall fitness of the organism [ , , ]. In contrast, the yeast orthologue, Dim1, is essential. In Saccharomyces cerevisiae (Baker's yeast), and presumably in other eukaryotes, the enzyme performs a vital role in pre-rRNA processing in addition to its methylating activity []. Another orthologue is the eukaryotic transcription factor B (TFB), which has a second function; this enzyme is a nuclear-encoded mitochondrial transcription factor and is essential for mitochondrial gene expression []. The best conserved region in these enzymes is located in the N-terminal section and corresponds to a region that is probably involved in S-adenosyl methionine (SAM) binding. The rRNA adenine N-6-methyltransferases Erm methylate a single adenosine base in 23S rRNA. They confer resistance to the MLS-B group of antibiotics [ , ]. Despite their sequence similarity to KsgA, the two enzyme families have strikingly different levels of regulation that remain to be elucidated.
Protein Domain
Name: Ribosomal RNA adenine methylase transferase, conserved site
Type: Conserved_site
Description: The bacterial enzyme KsgA catalyses the transfer of a total of four methyl groups from S-adenosyl-L-methionine (S-AdoMet) to two adjacent adenosine bases in 16S rRNA. This enzyme and the resulting modified adenosine bases appear to be conserved in all species of eubacteria, eukaryotes, and archaea, and in eukaryotic organelles. Bacterial resistance to the aminoglycoside antibiotic kasugamycin involves inactivation of KsgA and resulting loss of the dimethylations, with modest consequences to the overall fitness of the organism [ , , ]. In contrast, the yeast orthologue, Dim1, is essential. In Saccharomyces cerevisiae (Baker's yeast), and presumably in other eukaryotes, the enzyme performs a vital role in pre-rRNA processing in addition to its methylating activity [ ]. Another orthologue is the eukaryotic transcription factor B (TFB), which has a second function; this enzyme is a nuclear-encoded mitochondrial transcription factor and is essential for mitochondrial gene expression []. The best conserved region in these enzymes is located in the N-terminal section and corresponds to a region that is probably involved in S-adenosyl methionine (SAM) binding. This signature pattern covers that highly conserved N-terminal region.
Protein Domain
Name: Ribosomal RNA adenine dimethylase
Type: Family
Description: The bacterial enzyme KsgA catalyses the transfer of a total of four methyl groups from S-adenosyl-L-methionine (S-AdoMet) to two adjacent adenosine bases in 16S rRNA. This enzyme and the resulting modified adenosine bases appear to be conserved in all species of eubacteria, eukaryotes, and archaea, and in eukaryotic organelles. Bacterial resistance to the aminoglycoside antibiotic kasugamycin involves inactivation of KsgA and resulting loss of the dimethylations, with modest consequences to the overall fitness of the organism [ , , ]. In contrast, the yeast orthologue, Dim1, is essential. In Saccharomyces cerevisiae (Baker's yeast), and presumably in other eukaryotes, the enzyme performs a vital role in pre-rRNA processing in addition to its methylating activity []. Another orthologue is the eukaryotic transcription factor B (TFB), which has a second function; this enzyme is a nuclear-encoded mitochondrial transcription factor and is essential for mitochondrial gene expression []. The best conserved region in these enzymes is located in the N-terminal section and corresponds to a region that is probably involved in S-adenosyl methionine (SAM) binding.
Protein Domain
Name: Ribosomal RNA adenine methylase transferase, N-terminal
Type: Domain
Description: The bacterial enzyme KsgA catalyses the transfer of a total of four methyl groups from S-adenosyl-L-methionine (S-AdoMet) to two adjacent adenosine bases in 16S rRNA. This enzyme and the resulting modified adenosine bases appear to be conserved in all species of eubacteria, eukaryotes, and archaea, and in eukaryotic organelles. Bacterial resistance to the aminoglycoside antibiotic kasugamycin involves inactivation of KsgA and resulting loss of the dimethylations, with modest consequences to the overall fitness of the organism [ , , ]. In contrast, the yeast orthologue, Dim1, is essential. In Saccharomyces cerevisiae (Baker's yeast), and presumably in other eukaryotes, the enzyme performs a vital role in pre-rRNA processing in addition to its methylating activity []. Another orthologue is the eukaryotic transcription factor B (TFB), which has a second function; this enzyme is a nuclear-encoded mitochondrial transcription factor and is essential for mitochondrial gene expression []. The best conserved region in these enzymes is located in the N-terminal section and corresponds to a region that is probably involved in S-adenosyl methionine (SAM) binding. This entry represents the N-terminal domain of rRNA adenine methylase transferases.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom