Search our database by keyword

Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, pathways, authors, ontology terms, etc. (e.g. eve, embryo, zen, allele)
  • Use OR to search for either of two terms (e.g. fly OR drosophila) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean search syntax is supported, e.g. dros* for partial matches, or fly AND NOT embryo to exclude a term.
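The operators listed above can be combined programmatically before a query is pasted into the search box. The following is a minimal sketch only: the helper names (`phrase`, `any_of`, `excluding`, `prefix`) are hypothetical and are not part of this website or any API it may expose; they simply assemble strings in the syntax described in the examples.

```python
def phrase(text: str) -> str:
    """Wrap a multi-word phrase in double quotes so it is searched as a unit."""
    return f'"{text}"'


def any_of(*terms: str) -> str:
    """Join terms with OR so that a hit on any one of them matches."""
    return " OR ".join(terms)


def excluding(term: str, unwanted: str) -> str:
    """Require one term while excluding another with AND NOT."""
    return f"{term} AND NOT {unwanted}"


def prefix(stem: str) -> str:
    """Append the * wildcard for partial (prefix) matches."""
    return f"{stem}*"


# Hypothetical helper calls reproducing the examples in the list above.
queries = [
    any_of("fly", "drosophila"),
    phrase("dna binding"),
    prefix("dros"),
    excluding("fly", "embryo"),
]
for q in queries:
    print(q)
```

Each printed line is a query string in the same form as the examples in the list; whether the site accepts more deeply nested combinations is not stated on this page.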

Search results 2401 to 2500 out of 38750 for *
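As a quick check of where this listing sits in the full result set, the page position can be derived from the range in the header. This is a sketch under the assumption that every page holds the same number of results; `page_info` is a hypothetical helper, not something the site provides.

```python
import math


def page_info(first: int, last: int, total: int) -> tuple[int, int, int]:
    """Return (page size, current page number, total pages) for a 1-based result range."""
    page_size = last - first + 1                 # results shown on this page
    page = (first - 1) // page_size + 1          # 1-based index of this page
    pages = math.ceil(total / page_size)         # the last page may be partial
    return page_size, page, pages


# Values taken from the header line: results 2401 to 2500 out of 38750.
print(page_info(2401, 2500, 38750))  # (100, 25, 388)
```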

Category restricted to ProteinDomain

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Defective-in-cullin neddylation protein
Type: Family
Description: The eukaryotic defective in cullin neddylation (DCN) protein family contributes to neddylation of cullin components of SCF-type E3 ubiquitin ligase complexes. These multi-protein complexes are required for polyubiquitination and subsequent degradation of target proteins by the 26S proteasome. Proteins in the DCN family include yeast DCN1, vertebrate DCN1-like proteins 1-5, and plant Defective in cullin neddylation protein AAR3. DCN family proteins all contain a Potentiating neddylation (PONY) domain, which contains a cullin-binding surface within its C-terminal region and is sufficient to promote neddylation. The N-terminal region of the protein often contains a UBA-like domain.
Protein Domain
Name: Protein of unknown function wound-induced
Type: Family
Description: This family of proteins is found in eukaryotes. Proteins in this family are typically between 81 and 97 amino acids in length. They are often annotated as wound-induced proteins; however, there is little accompanying literature to confirm this.
Protein Domain
Name: Ribosomal protein/NADH dehydrogenase domain
Type: Domain
Description: Proteins containing this domain are located in the mitochondrion and include ribosomal proteins L51 and S25. This domain is also found in the mitochondrial NADH-ubiquinone oxidoreductase B8 subunit (CI-B8). It is not known whether all members of this family form part of the NADH-ubiquinone oxidoreductase, or whether they are also all ribosomal proteins.
Protein Domain
Name: NRAMP family
Type: Family
Description: The natural resistance-associated macrophage protein (NRAMP) family consists of animal NRAMP1 and NRAMP2, yeast proteins Smf1 and Smf2, and bacterial homologues. The NRAMP family includes functionally related proteins defined by a conserved hydrophobic core of ten transmembrane domains. These membrane proteins are divalent cation transporters with a high degree of sequence conservation; in particular, the residues contributing to ion interaction are strongly conserved (DPNG and MPH motifs). NRAMP1 is an integral membrane protein expressed exclusively in cells of the immune system and is recruited to the membrane of a phagosome upon phagocytosis, where it plays an essential role in host defense against pathogens. Mutations in NRAMP1 may genetically predispose an individual to diseases including leprosy and tuberculosis. NRAMP2 (DMT1) is a multiple divalent cation transporter broadly expressed in the duodenum, kidney, brain, testis and placenta. It transports Fe2+, Mn2+ and Cd2+, whereas Zn2+ is a poor substrate. Ca2+ and Mg2+ are not transported, which is important because their high concentrations in the duodenum, where NRAMP2 is expressed at high levels, would interfere with the absorption of Fe2+. It is the major transferrin-independent iron uptake system in mammals. NRAMP-related members of this family have substrate specificity for Mn2+ and/or Mg2+, such as the yeast proteins Smf1 and Smf2 and a group of bacterial transporters (NrmT, for Nramp-related magnesium transporter).
Protein Domain
Name: H/ACA ribonucleoprotein complex, subunit Gar1/Naf1
Type: Family
Description: H/ACA ribonucleoprotein particles (RNPs) are a family of RNA pseudouridine synthases that specify modification sites through guide RNAs. The function of these H/ACA RNPs is essential for biogenesis of the ribosome, splicing of precursor mRNAs (pre-mRNAs), maintenance of telomeres and probably for additional cellular processes. All H/ACA RNPs contain a specific RNA component (snoRNA or scaRNA) and at least four proteins common to all such particles: Cbf5, Gar1, Nhp2 and Nop10. These proteins are highly conserved from yeast to mammals, and homologues are also present in archaea. The H/ACA protein complex contains a stable core composed of Cbf5 and Nop10, to which Gar1 and Nhp2 subsequently bind. Naf1 is an RNA-binding protein required for the maturation of the box H/ACA snoRNP complex and for ribosome biogenesis. During assembly of the H/ACA snoRNP complex, Naf1 associates with the complex; it is lost during maturation and replaced by Gar1 to yield the mature H/ACA snoRNP complex. The core domain of Naf1 is homologous to the core domain of Gar1, suggesting that they share a common Cbf5 binding surface.
Protein Domain
Name: SAND-like domain superfamily
Type: Homologous_superfamily
Description: The SAND domain (named after Sp100, AIRE-1, NucP41/75, DEAF-1) is a conserved ~80 residue region found in a number of nuclear proteins, many of which function in chromatin-dependent transcriptional control. These include proteins linked to various human diseases, such as the Sp100 (Speckled protein 100kDa), NUDR (Nuclear DEAF-1 related), GMEB (Glucocorticoid Modulatory Element Binding) proteins and AIRE-1 (Autoimmune regulator 1) proteins. Proteins containing the SAND domain have a modular structure; the SAND domain can be associated with a number of other modules, including the bromodomain, the PHD finger and the MYND finger. Because no SAND domain has been found in yeast, it is thought that the SAND domain could be restricted to animal phyla. Many SAND domain-containing proteins, including NUDR, DEAF-1 (Deformed epidermal autoregulatory factor-1) and GMEB, have been shown to bind DNA sequences specifically. The SAND domain has been proposed to mediate the DNA binding activity of these proteins. Structurally, the SAND domain consists of a novel alpha/beta fold, which has a core of three short helices packed against a barrel-like β-sheet; it is structurally similar to the SH3-like fold. Other proteins display domains that are structurally similar to the SAND domain. One such example is the SMAD4-binding domain of the oncoprotein Ski, which is stabilised by a bound zinc atom and resembles a SAND domain, in which the corresponding I loop is responsible for DNA binding. Ski is able to disrupt the formation of a functional complex between the Co- and R-SMADs, leading to the repression of TGF-beta, Activin and BMP responses, resulting in the repression of TGF-beta signalling.
Protein Domain
Name: SAND domain
Type: Domain
Description: The SAND domain (named after Sp100, AIRE-1, NucP41/75, DEAF-1) is a conserved ~80 residue region found in a number of nuclear proteins, many of which function in chromatin-dependent transcriptional control. These include proteins linked to various human diseases, such as the Sp100 (Speckled protein 100kDa), NUDR (Nuclear DEAF-1 related), GMEB (Glucocorticoid Modulatory Element Binding) proteins and AIRE-1 (Autoimmune regulator 1) proteins. Proteins containing the SAND domain have a modular structure; the SAND domain can be associated with a number of other modules, including the bromodomain, the PHD finger and the MYND finger. Because no SAND domain has been found in yeast, it is thought that the SAND domain could be restricted to animal phyla. Many SAND domain-containing proteins, including NUDR, DEAF-1 (Deformed epidermal autoregulatory factor-1) and GMEB, have been shown to bind DNA sequences specifically. The SAND domain has been proposed to mediate the DNA binding activity of these proteins. The resolution of the 3D structure of the SAND domain from Sp100b has revealed that it consists of a novel alpha/beta fold. The SAND domain adopts a compact fold consisting of a strongly twisted, five-stranded antiparallel β-sheet with four α-helices packing against one side of the β-sheet. The opposite side of the β-sheet is solvent exposed. The β-sheet and α-helical parts of the structure form two distinct regions. Multiple hydrophobic residues pack between these regions to form a structural core. A conserved KDWK sequence motif is found within the α-helical, positively charged surface patch. The DNA binding surface has been mapped to the α-helical region encompassing the KDWK motif.
Protein Domain
Name: CRIB domain
Type: Domain
Description: This entry represents the CRIB domain. Many putative downstream effectors of the small GTPases Cdc42 and Rac contain a GTPase binding domain (GBD), also called p21 binding domain (PBD), which has been shown to specifically bind the GTP-bound form of Cdc42 or Rac, with a preference for Cdc42. The most conserved region of GBD/PBD domains is the N-terminal Cdc42/Rac interactive binding motif (CRIB), which consists of about 16 amino acids with the consensus sequence I-S-x-P-x(2,4)-F-x-H-x(2)-H-V-G. Although the CRIB motif is necessary for binding to Cdc42 and Rac, it is not sufficient to give high-affinity binding. A less well conserved inhibitory switch (IS) domain, responsible for maintaining the proteins in a basal (autoinhibited) state, is located C-terminally to the CRIB motif. GBD domains can adopt related but distinct folds depending on context. Although GBD domains are largely unstructured in the free state, in the autoinhibited state the IS domain forms an N-terminal β-hairpin that immediately follows the conserved CRIB motif and a central bundle of three α-helices. The interaction between GBD domains and their respective G proteins leads to the formation of a high-affinity complex in which unstructured regions of both the effector and the G protein become rigid. CRIB motifs from various GBD domains interact with Cdc42 in a similar manner, forming an intermolecular β-sheet with strand β-2 of Cdc42. Outside the CRIB motif, the C-terminal regions of the various GBD domains are very divergent and show variation in their mode of binding to Cdc42, perhaps determining the specificity of the interaction. Binding of Cdc42 or Rac to the GBD domain causes a dramatic conformational change, refolding part of the IS domain and unfolding the rest.
Some proteins known to contain a CRIB domain are listed below:
  • Mammalian activated Cdc42-associated kinases (ACKs), nonreceptor tyrosine kinases implicated in integrin-coupled pathways.
  • Mammalian p21-activated kinases (PAK1 to PAK4), serine/threonine kinases that modulate cytoskeletal assembly and activate MAP-kinase pathways.
  • Mammalian Actin nucleation-promoting factor WAS (also known as Wiskott-Aldrich syndrome proteins, WASPs), non-kinase proteins involved in the organisation of the actin cytoskeleton.
  • Yeast STE20 and CLA4, the homologues of mammalian PAKs. STE20 is involved in the mating/pheromone MAP kinase cascade.
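The CRIB consensus quoted in this entry, I-S-x-P-x(2,4)-F-x-H-x(2)-H-V-G, is written in PROSITE-style pattern notation, where x stands for any residue and x(2,4) for two to four of them. The sketch below shows one way to turn such a pattern into a regular expression for scanning sequences. It is an illustration only: `prosite_to_regex` is a hypothetical helper that handles just the subset of the notation used here (single residues, x, and repeat counts), and the test peptide is invented for the example rather than taken from a real protein.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern (residues, x, x(n), x(n,m)) to a regex."""
    parts = []
    for element in pattern.split("-"):
        # One letter, optionally followed by (n) or (n,m).
        m = re.fullmatch(r"([A-Za-z])(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = m.groups()
        token = "." if residue == "x" else residue.upper()  # x means any residue
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(token)
    return "".join(parts)


CRIB_CONSENSUS = "I-S-x-P-x(2,4)-F-x-H-x(2)-H-V-G"  # consensus as quoted in the entry
crib_regex = prosite_to_regex(CRIB_CONSENSUS)
print(crib_regex)  # IS.P.{2,4}F.H.{2}HVG

# Invented peptide for illustration only (not a real protein sequence).
hypothetical_peptide = "MKISLPSDFEHTIHVGAA"
match = re.search(crib_regex, hypothetical_peptide)
print(match.group(0) if match else "no match")  # ISLPSDFEHTIHVG
```

A strict consensus match like this will miss divergent CRIB motifs; curated signatures such as the one behind this entry are typically more tolerant than a single regular expression.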
Protein Domain
Name: Histidine phosphatase superfamily, clade-2
Type: Family
Description: The histidine phosphatase superfamily is so named because catalysis centres on a conserved His residue that is transiently phosphorylated during the catalytic cycle. Other conserved residues contribute to a 'phosphate pocket' and interact with the phospho group of substrate before, during and after its transfer to the His residue. Structure and sequence analyses show that different families contribute different additional residues to the 'phosphate pocket' and, more surprisingly, differ in the position, in sequence and in three dimensions, of a catalytically essential acidic residue. The superfamily may be divided into two main branches. The relationship between the two branches is not evident by (PSI-)BLAST but is clear from more sensitive sequence searches and structural comparisons. The smaller clade-2 is composed mainly of acid phosphatases and phytases. Acid phosphatases are a heterogeneous group of proteins that hydrolyse phosphate esters, optimally at low pH. The catalytic functions of these proteins include phytase, glucose-1-phosphatase and multiple inositol polyphosphate phosphatase. Fungal phytases are histidine acid phosphatases that catalyse the hydrolysis of phytate (myo-inositol hexakisphosphate) to myo-inositol and inorganic phosphate.
Included in this group are:
  • Escherichia coli pH 2.5 acid phosphatase (gene appA).
  • E. coli glucose-1-phosphatase (gene agp).
  • Yeast constitutive and repressible acid phosphatases (genes PHO3 and PHO5).
  • Schizosaccharomyces pombe acid phosphatase (gene pho1).
  • Aspergillus awamori phytases A and B (genes phyA and phyB).
  • Mammalian lysosomal and prostatic acid phosphatase.
  • Several Caenorhabditis elegans hypothetical proteins.
Protein Domain
Name: Domain of unknown function DUF569
Type: Domain
Description: This domain is found in a family of hypothetical proteins. Some family members contain two copies of the domain.
Protein Domain
Name: Late embryogenesis abundant protein, LEA-18
Type: Family
Description: This is a family of late embryogenesis-abundant (LEA) proteins. There is high accumulation of this protein in dry seeds, and in the roots of full-grown plants in response to dehydration and abscisic acid (ABA) application treatments. This LEA protein disappears after germination. It accumulates in growing regions of well-irrigated hypocotyls and meristems, suggesting a role in seedling growth resumption on rehydration. As a group, the LEA proteins are highly hydrophilic, contain a high percentage of glycine residues, lack Cys and Trp residues and do not coagulate upon exposure to high temperature; for these reasons they are considered to be members of a group of proteins called hydrophilins. Expression of the protein is negatively regulated during etiolating growth, particularly in roots, in contrast to its expression patterns during normal growth.
Protein Domain
Name: IPT domain
Type: Domain
Description: The IPT (Ig-like, plexins, transcription factors) domain has an immunoglobulin-like fold. These domains are found in cell surface receptors such as Met and Ron, as well as in intracellular transcription factors, where they are involved in DNA binding. The Ron tyrosine kinase receptor shares with the members of its subfamily (Met and Sea) a unique functional feature: the control of cell dissociation, motility, and invasion of extracellular matrices (scattering).
Protein Domain
Name: DNA mismatch repair protein MutS, N-terminal
Type: Homologous_superfamily
Description: Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.
MutS is a modular protein with a complex structure, and is composed of:
  • N-terminal mismatch-recognition domain, which is similar in structure to tRNA endonuclease.
  • Connector domain, which is similar in structure to Holliday junction resolvase RuvC.
  • Core domain, which is composed of two separate subdomains that join together to form a helical bundle; from within the core domain, two helices act as levers that extend towards (but do not touch) the DNA.
  • Clamp domain, which is inserted between the two subdomains of the core domain at the top of the lever helices; the clamp domain has a β-sheet structure.
  • ATPase domain (connected to the core domain), which has a classical Walker A motif.
  • HTH (helix-turn-helix) domain, which is involved in dimer contacts.
The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair. Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts. This entry represents the N-terminal domain of proteins in the MutS family of DNA mismatch repair proteins. The N-terminal domain of MutS is responsible for mismatch recognition and forms a 6-stranded mixed β-sheet surrounded by three α-helices, which is similar to the structure of tRNA endonuclease.
Protein Domain
Name: Porin, eukaryotic type
Type: Family
Description: Eukaryotic mitochondrial porins are voltage-dependent anion-selective channels (VDAC) that behave as general diffusion pores for small hydrophilic molecules. The channel adopts an open conformation at low or zero membrane potential and a closed conformation at potentials above 30-40 mV. The proteins are composed of between 12 and 16 β-strands that span the mitochondrial outer membrane. Yeast contains two members of this family (genes POR1 and POR2); vertebrates have at least three members (genes VDAC1, VDAC2 and VDAC3).
Protein Domain
Name: KDPG/KHG aldolase
Type: Family
Description: 4-Hydroxy-2-oxoglutarate aldolase (KHG-aldolase) catalyzes the interconversion of 4-hydroxy-2-oxoglutarate into pyruvate and glyoxylate. Phospho-2-dehydro-3-deoxygluconate aldolase (KDPG-aldolase) catalyzes the interconversion of 6-phospho-2-dehydro-3-deoxy-D-gluconate into pyruvate and glyceraldehyde 3-phosphate. E. coli Eda is a KDPG-aldolase that has also been found to have a role in the degradation of 2-keto-4-hydroxyglutarate (KHG) to pyruvate and glyoxylate. In fact, KHG-aldolase and Eda are the same enzyme. This family consists of KHG/KDPG aldolases, and also includes 2-keto-3-deoxy-6-phosphogalactonate (KDPGal) aldolase. KDPGal-aldolase catalyzes an identical reaction to KDPG-aldolase, differing in substrate specificity only in the configuration of a single stereocentre.
Protein Domain
Name: ATPase, V0 complex, subunit d
Type: Family
Description: Transmembrane ATPases are membrane-bound enzyme complexes/ion transporters that use ATP hydrolysis to drive the transport of protons across a membrane. Some transmembrane ATPases also work in reverse, harnessing the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. There are several different types of transmembrane ATPases, which can differ in function (ATP hydrolysis and/or synthesis), structure (e.g., F-, V- and A-ATPases, which contain rotary motors) and in the type of ions they transport. The different types include:
  • F-ATPases (ATP synthases, F1F0-ATPases), which are found in mitochondria, chloroplasts and bacterial plasma membranes where they are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts).
  • V-ATPases (V1V0-ATPases), which are primarily found in eukaryotes, where they function as proton pumps that acidify intracellular compartments and, in some cases, transport protons across the plasma membrane. They are also found in bacteria.
  • A-ATPases (A1A0-ATPases), which are found in Archaea and function like F-ATPases, though with respect to their structure and some inhibitor responses, A-ATPases are more closely related to the V-ATPases.
  • P-ATPases (E1E2-ATPases), which are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes.
  • E-ATPases, which are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP.
V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor-mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins. The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases. This entry represents subunit d from the V0 complex of V-ATPases, which are involved in the translocation of protons across a membrane. There is more than one type of d subunit in V-ATPases, where the D1 subunit is ubiquitous, while the D2 subunit has limited tissue expressivity, possibly to account for differential functions, targeting or regulation of V-ATPase activity.
Protein Domain
Name: ATPase, V0 complex, c/d subunit
Type: Family
Description: Transmembrane ATPases are membrane-bound enzyme complexes/ion transporters that use ATP hydrolysis to drive the transport of protons across a membrane. Some transmembrane ATPases also work in reverse, harnessing the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. There are several different types of transmembrane ATPases, which can differ in function (ATP hydrolysis and/or synthesis), structure (e.g., F-, V- and A-ATPases, which contain rotary motors) and in the type of ions they transport. The different types include:
  • F-ATPases (ATP synthases, F1F0-ATPases), which are found in mitochondria, chloroplasts and bacterial plasma membranes where they are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts).
  • V-ATPases (V1V0-ATPases), which are primarily found in eukaryotes, where they function as proton pumps that acidify intracellular compartments and, in some cases, transport protons across the plasma membrane. They are also found in bacteria.
  • A-ATPases (A1A0-ATPases), which are found in Archaea and function like F-ATPases, though with respect to their structure and some inhibitor responses, A-ATPases are more closely related to the V-ATPases.
  • P-ATPases (E1E2-ATPases), which are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes.
  • E-ATPases, which are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP.
The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex forms the membrane-spanning pore. The folding, localisation and stability of the V-ATPase are made possible through the formation of a luminal glycan coat by the glycolipids and the glycosylated V0 subunits. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases. This entry represents subunits c and d from the V0 complex of V-ATPases. There is more than one type of d subunit in V-ATPases, where the D1 subunit is ubiquitous, while the D2 subunit has limited tissue expressivity, possibly to account for differential functions, targeting or regulation of V-ATPase activity.
Protein Domain
Name: NUDIX hydrolase
Type: Domain
Description: The generic name 'NUDIX hydrolases' (NUcleoside DIphosphate linked to some other moiety X) has been coined for this domain family. The family can be divided into a number of subgroups, of which MutT anti-mutagenic activity represents only one type; most of the rest hydrolyse diverse nucleoside diphosphate derivatives (including ADP-ribose, GDP-mannose, TDP-glucose, NADH, UDP-sugars, dNTP and NTP).
Protein Domain
Name: SAC3/GANP/THP3, conserved domain
Type: Domain
Description: This domain contains one highly conserved negatively charged residue and one highly conserved positively charged residue that are probably important for the function of these proteins. Proteins containing this domain include the yeast nuclear export factor Sac3, and mammalian GANP/MCM3-associated protein, which facilitates the nuclear localisation of MCM3, a protein that associates with chromatin in the G1 phase of the cell cycle. It also includes yeast Thp3 (THO-related protein 3), which may have a role in transcription elongation.
Protein Domain
Name: Mitochondrial protein C2orf69
Type: Family
Description: This family includes mitochondrial protein C2orf69 from human, previously known as UPF0565, which is an important regulator of human mitochondrial function and may play a role in the respiratory chain and other metabolic pathways.
Protein Domain
Name: Flavin monooxygenase FMO
Type: Family
Description: Flavin-containing monooxygenases (FMOs) constitute a family of xenobiotic-metabolising enzymes. Using an NADPH cofactor and FAD prosthetic group, these microsomal proteins catalyse the oxygenation of nucleophilic nitrogen, sulphur, phosphorus and selenium atoms in a range of structurally diverse compounds. FMOs have been implicated in the metabolism of a number of pharmaceuticals, pesticides and toxicants. In man, lack of hepatic FMO-catalysed trimethylamine metabolism results in trimethylaminuria (fish odour syndrome). Five mammalian forms of FMO are known and have been designated FMO1-FMO5. This nomenclature is based on comparison of amino acid sequences, and was introduced in an attempt to eliminate the confusion inherent in multiple, laboratory-specific designations and tissue-based classifications. Following the determination of the complete nucleotide sequence of Saccharomyces cerevisiae (Baker's yeast), a novel gene was found to encode a protein with similarity to mammalian monooxygenases. In Aspergillus, flavin-containing monooxygenases ustF1 and ustF2 are components in the biosynthesis of the antimitotic tetrapeptide ustiloxin B, a secondary metabolite. The monooxygenases modify the side chain of the intermediate S-deoxyustiloxin H.
Protein Domain
Name: Flavin monooxygenase-like
Type: Family
Description: Flavin-containing monooxygenases (FMOs) constitute a family of xenobiotic-metabolising enzymes. Using an NADPH cofactor and FAD prosthetic group, these microsomal proteins catalyse the oxygenation of nucleophilic nitrogen, sulphur, phosphorus and selenium atoms in a range of structurally diverse compounds. FMOs have been implicated in the metabolism of a number of pharmaceuticals, pesticides and toxicants. In man, lack of hepatic FMO-catalysed trimethylamine metabolism results in trimethylaminuria (fish odour syndrome). Five mammalian forms of FMO are known and have been designated FMO1-FMO5. This nomenclature is based on comparison of amino acid sequences, and was introduced in an attempt to eliminate the confusion inherent in multiple, laboratory-specific designations and tissue-based classifications. Following the determination of the complete nucleotide sequence of Saccharomyces cerevisiae (Baker's yeast), a novel gene was found to encode a protein with similarity to mammalian monooxygenases. In Aspergillus, flavin-containing monooxygenases ustF1 and ustF2 are components in the biosynthesis of the antimitotic tetrapeptide ustiloxin B, a secondary metabolite. The monooxygenases modify the side chain of the intermediate S-deoxyustiloxin H.
Protein Domain
Name: Nucleolar GTP-binding protein 2, circularly permuted GTPase motif
Type: Domain
Description: This entry represents the circularly permuted GTPase motif of Nucleolar GTP-binding protein 2 (GNL2, also known as NGP-1) from animals and fungi. GNL2 is a GTPase that associates with pre-60S ribosomal subunits in the nucleolus and is required for their nuclear export and maturation. This group of proteins also includes Nuclear/nucleolar GTPase 2 (NUG2) from plants.
Protein Domain
Name: Calmodulin-binding domain, plant
Type: Domain
Description: This domain is found repeated in a number of plant calmodulin-binding proteins. It is thought to represent a calmodulin-binding domain. Binding of the proteins to calmodulin depends on the presence of calcium ions. Proteins containing this domain are thought to be involved in various processes, such as plant defence responses and stolonisation or tuberization.
Protein Domain
Name: Decaprenyl diphosphate synthase-like
Type: Family
Description: In prokaryotes, undecaprenyl diphosphate synthase (UPP synthase, di-trans-poly-cis-decaprenylcistransferase or ditrans,polycis-undecaprenyl-diphosphate synthase) catalyzes the formation of the carrier lipid undecaprenyl pyrophosphate (UPP) from isopentenyl pyrophosphate (IPP) in bacterial cell wall peptidoglycan biosynthesis. Cis (Z)-isoprenyl diphosphate synthase (cis-IPPS) catalyzes the successive 1'-4 condensation of the IPP molecule to trans,trans-farnesyl diphosphate (FPP) or to cis,trans-FPP to form long-chain polyprenyl diphosphates. A few can also catalyze the condensation of IPP to trans-geranyl diphosphate to form the short-chain cis,trans-FPP. cis-IPPS form homodimers and are mechanistically and structurally distinct from trans-IPPS; they lack the DDXXD motifs, yet require Mg2+ for activity. Homologues are also found in archaebacteria and include a number of uncharacterised proteins, including some from yeasts. This entry also matches related enzymes that transfer alkyl groups, such as dehydrodolichyl diphosphate synthase from eukaryotes, which catalyzes the formation of the polyisoprenoid glycosyl carrier lipid dolichyl monophosphate.
Protein Domain
Name: Di-trans-poly-cis-decaprenylcistransferase-like, conserved site
Type: Conserved_site
Description: Synonym(s): Di-trans-poly-cis-undecaprenyl-diphosphate synthase, Undecaprenyl pyrophosphate synthetase, Undecaprenyl pyrophosphate synthase, UPP synthetase. Di-trans-poly-cis-decaprenylcistransferase (UPP synthetase) generates undecaprenyl pyrophosphate (UPP) from isopentenyl pyrophosphate (IPP). This bacterial enzyme is also found in archaebacteria and in a number of uncharacterised proteins including some from yeasts. This conserved pattern is found towards the C-terminal region in many of the proteins in this entry. The proteins are of about 26 to 40 kDa and their central region is well conserved. This pattern also hits related enzymes such as dehydrodolichyl diphosphate synthase.
Protein Domain
Name: Glycoside hydrolase, family 79
Type: Family
Description: O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. This is a family of endo-beta-N-glucuronidase, or heparanase, belonging to glycoside hydrolase family 79. Heparan sulphate proteoglycans (HSPGs) play a key role in the self-assembly, insolubility and barrier properties of basement membranes and extracellular matrices. Hence, cleavage of heparan sulphate (HS) affects the integrity and functional state of tissues and thereby fundamental normal and pathological phenomena involving cell migration and response to changes in the extracellular microenvironment. Heparanase degrades HS at specific intrachain sites. The enzyme is synthesized as a latent approximately 65 kDa protein that is processed at the N terminus into a highly active approximately 50 kDa form. Experimental evidence suggests that heparanase may facilitate both tumor cell invasion and neovascularization, both critical steps in cancer progression. The enzyme is also involved in cell migration associated with inflammation and autoimmunity.
Protein Domain
Name: Powdery mildew resistance protein, RPW8 domain
Type: Domain
Description: This entry represents the RPW8 domain found in several broad-spectrum mildew resistance proteins from Arabidopsis thaliana and other dicots. Plant disease resistance (R) genes control the recognition of specific pathogens and activate subsequent defence responses. The R protein-mediated defences typically involve a rapid, localised necrosis, or hypersensitive response (HR), at the site of infection, and the localised formation of antimicrobial chemicals and proteins that restrict growth of the pathogen. The A. thaliana locus Resistance to Powdery Mildew 8 (RPW8) contains two naturally polymorphic, dominant R genes: RPW8.1 and RPW8.2, which individually control resistance to a broad range of powdery mildew pathogens. They induce localised, salicylic acid-dependent defences similar to those induced by R genes that control specific resistance. Apparently, broad-spectrum resistance mediated by RPW8 uses the same mechanisms as specific resistance. RPW8.1 and RPW8.2 share similarity with an ~150 amino acid module forming the N terminus of a group of disease resistance proteins, which have a nucleotide-binding site (NBS) and leucine-rich repeats (LRRs). The RPW8 domain sequences contain a predicted N-terminal transmembrane (TM) region or possibly a signal peptide, and a coiled-coil (CC) motif.
Protein Domain
Name: HNH nuclease
Type: Domain
Description: This domain is found in the HNH family of nucleases, which includes yeast intron 1 protein, human DNA annealing helicase and endonuclease ZRANB3, bacterial CRISPR-associated endonuclease Cas9, colicins, pyocins and endonuclease HphI. Members of this group are found in all domains of life.
Protein Domain
Name: HNH endonuclease
Type: Domain
Description: HNH endonuclease is found in bacteria and viruses. This family includes pyocins, colicins and anaredoxins.
Protein Domain
Name: Translation elongation factor, IF5A C-terminal
Type: Domain
Description: A five-stranded β-barrel was first noted as a common structure among four proteins binding single-stranded nucleic acids (staphylococcal nuclease and aspartyl-tRNA synthetase) or oligosaccharides (B subunits of enterotoxin and verotoxin-1), and has been termed the oligonucleotide/oligosaccharide binding motif, or OB fold, a five-stranded β-sheet coiled to form a closed β-barrel capped by an alpha helix located between the third and fourth strands. Two ribosomal proteins, S17 and S1, are members of this class, and have different variations of the OB fold theme. Comparisons with other OB fold nucleic acid binding proteins suggest somewhat different mechanisms of nucleic acid recognition in each case. There are many nucleic acid-binding proteins that contain domains with this OB-fold structure, including anticodon-binding tRNA synthetases, ssDNA-binding proteins (CDC13, telomere-end binding proteins), phage ssDNA-binding proteins (gp32, gp2.5, gpV), cold shock proteins, DNA ligases, RNA-capping enzymes, DNA replication initiators and RNA polymerase subunit RBP8. This entry represents the RNA-binding domain of translation elongation factor IF5A.
Protein Domain
Name: Translation elongation factor IF5A-like
Type: Family
Description: Eukaryotic eIF-5A was initially thought to function as a translation initiation factor, based on its ability to stimulate methionyl-puromycin synthesis. However, subsequent work revealed a role for eIF-5A in translation elongation. Depletion or inactivation of eIF-5A in the yeast Saccharomyces cerevisiae (Baker's yeast) resulted in the accumulation of polysomes and an increase in ribosomal transit times. Addition of recombinant eIF-5A from yeast, but not a derivative lacking hypusine, enhanced the rate of tripeptide synthesis in vitro. Moreover, inactivation of eIF-5A mimicked the effects of the eEF2 inhibitor sordarin, indicating that eIF-5A might function together with eEF2 to promote ribosomal translocation. Finally, it was shown that eIF-5A is specifically required to promote peptide-bond formation between consecutive proline residues. It has been proposed to stimulate the peptidyl-transferase activity of the ribosome and facilitate the reactivity of poor substrates like proline. eIF-5A is a cofactor for the Rev and Rex transactivator proteins of human immunodeficiency virus-1 and T-cell leukaemia virus I, respectively. IF-5A is the sole protein in eukaryotes and archaea to contain the unusual amino acid hypusine (Nε-(4-amino-2-hydroxybutyl)lysine), which is an absolute functional requirement. The first step in the post-translational modification of lysine to hypusine is catalyzed by the enzyme deoxyhypusine synthase, the structure of which has been reported. The archaeal IF-5A proteins have not been studied as comprehensively as their eukaryotic homologues, though the crystal structure of the Pyrobaculum aerophilum protein has been determined. Unmodified P. aerophilum IF-5A is found to be a beta structure with two domains and three separate hydrophobic cores. The lysine (Lys42) that is post-translationally modified by deoxyhypusine synthase is found at one end of the IF-5A molecule in a turn between beta strands beta4 and beta5; this lysine residue is freely solvent accessible. The C-terminal domain is found to be homologous to the cold-shock protein CspA of E. coli, which has a well characterised RNA-binding fold, suggesting that IF-5A is involved in RNA binding. This family also includes the Woronin body major protein Hex1, whose sequence and structure are similar to eukaryotic initiation factor 5A (eIF-5A), suggesting they share a common ancestor. Woronin bodies are important for stress resistance and virulence.
Protein Domain
Name: Translation elongation factor, IF5A, hypusine site
Type: PTM
Description: Translation initiation factor 5A (IF-5A) was previously reported to be involved in the first step of peptide bond formation in translation; however, more recent work implicates it as a universally conserved translation elongation factor. eIF-5A is a cofactor for the Rev and Rex transactivator proteins of human immunodeficiency virus-1 and T-cell leukaemia virus I, respectively. IF-5A is the sole protein in eukaryotes and archaea to contain the unusual amino acid hypusine (Nε-(4-amino-2-hydroxybutyl)lysine), which is an absolute functional requirement. The first step in the post-translational modification of lysine to hypusine is catalyzed by the enzyme deoxyhypusine synthase, the structure of which has been reported. Hypusine is derived from lysine by the post-translational addition of a butylamino group (from spermidine) to the ε-amino group of lysine. The hypusine group is essential to the function of eIF-5A. A hypusine-containing protein has been found in archaebacteria such as Sulfolobus acidocaldarius or Methanocaldococcus jannaschii (Methanococcus jannaschii); this protein is highly similar to eIF-5A and could play a similar role in protein biosynthesis. The signature for eIF-5A is centred on the hypusine residue. The crystal structure of IF-5A from the archaeon Pyrobaculum aerophilum has been determined to 1.75 Å. Unmodified P. aerophilum IF-5A is found to be a beta structure with two domains and three separate hydrophobic cores. The lysine (Lys42) that is post-translationally modified by deoxyhypusine synthase is found at one end of the IF-5A molecule in a turn between beta strands beta4 and beta5; this lysine residue is freely solvent accessible. The C-terminal domain is found to be homologous to the cold-shock protein CspA of Escherichia coli, which has a well characterised RNA-binding fold, suggesting that IF-5A is involved in RNA binding.
Protein Domain
Name: Carbohydrate binding module family 25
Type: Domain
Description: A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classifications of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with Roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. The family represented by this entry has been shown to bind starch.
Protein Domain
Name: Glyoxalase/fosfomycin resistance/dioxygenase domain
Type: Domain
Description: Glyoxalase I (lactoylglutathione lyase) catalyzes the first step of the glyoxal pathway. S-lactoylglutathione is then converted by glyoxalase II to lactic acid. Glyoxalase I is a ubiquitous enzyme which binds one mole of zinc per subunit. The bacterial and yeast enzymes are monomeric while the mammalian one is homodimeric. The sequence of glyoxalase I is well conserved. The domain represented by this entry is found in glyoxalase I and in other related proteins, including fosfomycin resistance proteins FosB, FosA and FosX, and dioxygenases (e.g. 4-hydroxyphenylpyruvate dioxygenase).
Protein Domain
Name: DNA endonuclease activator Ctp1, C-terminal
Type: Domain
Description: This entry represents the C-terminal domain of the fission yeast Ctip (Ctp1) protein. Proteins containing this domain include DNA endonuclease RBBP8 (also known as CtBP-interacting protein, CtIP) from animals, protein gamma response 1 (GR1) from Arabidopsis and SAE2 from S. cerevisiae. SAE2 is a protein involved in repairing meiotic and mitotic double-strand breaks in DNA. Although proteins containing this domain were described as endonucleases, it is now known that they actually function as endonuclease activators that cooperate with the MRE11-RAD50-NBN (MRN) complex in processing meiotic and mitotic double-strand breaks (DSBs) by ensuring both resection and intrachromosomal association of the broken ends. This domain contains highly conserved residues at its 15-residue extreme that are indispensable for MRN (Mre11-Rad50-Nbs1) complex activation, through the stimulation of Mre11 endonuclease activity.
Protein Domain
Name: Neurotransmitter-gated ion-channel, conserved site
Type: Conserved_site
Description: Neurotransmitter ligand-gated ion channels are transmembrane receptor-ion channel complexes that open transiently upon binding of specific ligands, allowing rapid transmission of signals at chemical synapses. Five of these ion channel receptor families have been shown to form a sequence-related superfamily:
  • Nicotinic acetylcholine receptor (AchR), an excitatory cation channel in vertebrates and invertebrates; in vertebrate motor endplates it is composed of alpha, beta, gamma and delta/epsilon subunits; in neurons it is composed of alpha and non-alpha (or beta) subunits.
  • Glycine receptor, an inhibitory chloride ion channel composed of alpha and beta subunits.
  • Gamma-aminobutyric acid (GABA) receptor, an inhibitory chloride ion channel; at least four types of subunits (alpha, beta, gamma and delta) are known.
  • Serotonin 5HT3 receptor, of which there are seven major types (5HT3-5HT7).
  • Glutamate receptor, an excitatory cation channel of which at least three types have been described (kainate, N-methyl-D-aspartate (NMDA) and quisqualate).
These receptors possess a pentameric structure (made up of varying subunits), surrounding a central pore. All known sequences of subunits from neurotransmitter-gated ion-channels are structurally related. They are composed of a large extracellular glycosylated N-terminal ligand-binding domain, followed by three hydrophobic transmembrane regions which form the ionic channel, followed by an intracellular region of variable length. A fourth hydrophobic region is found at the C terminus of the sequence. This entry represents a conserved site based around two highly conserved cysteines in the N-terminal region of AchR/GABA/5HT3/Gly receptors. The residues between these two cysteines are also well conserved. In AchR, these cysteine residues have been shown to form a disulphide bond essential to the tertiary structure of the receptor.
Protein Domain
Name: Zinc finger, double-stranded RNA binding
Type: Domain
Description: This zinc finger is found in archaea and eukaryotes, and is approximately 30 amino acids in length. The mammalian members of this group occur multiple times along the protein, joined by flexible linkers, and are referred to as JAZ - dsRNA-binding ZF protein - zinc-fingers. The JAZ proteins are expressed in all tissues tested and localise in the nucleus, particularly the nucleolus. JAZ preferentially binds to double-stranded (ds) RNA or RNA/DNA hybrids rather than DNA. In addition to binding double-stranded RNA, these zinc-fingers are required for nucleolar localisation. This entry represents the multiple-adjacent-C2H2 zinc finger, JAZ.
Protein Domain
Name: Matrin/U1-C-like, C2H2-type zinc finger
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short β hairpin and an α helix (β/β/α structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved β/β/α structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short α-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets. This entry represents U1-type zinc finger domains, a family of C2H2-type zinc fingers present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins.
Protein Domain
Name: Proline dehydrogenase domain
Type: Domain
Description: The proline oxidase/dehydrogenase is responsible for the first step in the conversion of proline to glutamate for use as a carbon and nitrogen source. The enzyme requires FAD as a cofactor, and is induced by proline.
Protein Domain
Name: Proline oxidase family
Type: Family
Description: This entry includes a group of proline dehydrogenases (proline oxidases) found in bacteria, archaea and eukaryotes (mitochondria). This entry includes FadM from Bacillus subtilis subsp. natto and slgA from Drosophila melanogaster. They convert proline to delta-1-pyrroline-5-carboxylate. This entry also includes mammalian hydroxyproline dehydrogenase, which converts trans-4-L-hydroxyproline to delta-1-pyrroline-3-hydroxy-5-carboxylate.
Protein Domain
Name: Peptidase M17, leucine aminopeptidase/peptidase B
Type: Family
Description: The majority of members of this family are zinc-dependent exopeptidases belonging to MEROPS peptidase family M17 (leucyl aminopeptidase, clan MF). Leucyl aminopeptidases (LAPs) selectively release N-terminal amino acid residues from polypeptides and proteins; in general they are involved in the processing, catabolism and degradation of intracellular proteins. Leucyl aminopeptidase forms a homohexamer containing two trimers stacked on top of one another. Each monomer binds two zinc ions. The zinc-binding and catalytic sites are located within the C-terminal catalytic domain. Leucine aminopeptidase has been shown to be identical with prolyl aminopeptidase in mammals. Interestingly, members of this group are also implicated in transcriptional regulation and are thought to combine catalytic and regulatory properties. The N-terminal domain of these proteins has been shown in Escherichia coli PepA to function as a DNA-binding protein in Xer site-specific recombination and in transcriptional control of the carAB operon. It is not well conserved and in some members can be found only by PSI-BLAST (after 4-6 iterations). It is not clear if the DNA binding function is preserved in all or even in most of the members.
Protein Domain
Name: Ubiquitin/SUMO-activating enzyme ubiquitin-like domain
Type: Domain
Description: This is the C-terminal domain of ubiquitin-activating enzyme and SUMO-activating enzyme 2. It is structurally similar to ubiquitin. This domain is involved in E1-SUMO-thioester transfer to the SUMO E2 conjugating protein.
Protein Domain
Name: Ubiquitin-activating enzyme, SCCH domain
Type: Domain
Description: Ubiquitin-activating enzyme (E1 enzyme) activates ubiquitin by first adenylating with ATP its C-terminal glycine residue and thereafter linking this residue to the side chain of a cysteine residue in E1, yielding a ubiquitin-E1 thiolester and free AMP. Later the ubiquitin moiety is transferred to a cysteine residue on one of the many forms of ubiquitin-conjugating enzymes (E2). This domain carries the last of five conserved cysteines that is part of the active site of the enzyme, responsible for ubiquitin thiolester complex formation, the active site being represented by the sequence motif PICTLKNFP. Not all proteins in this entry contain a functional active site. The catalytic cysteine domain contains the E1 active site cysteine, and is divided in two half-domains, FCCH and SCCH, for 'first' and 'second' catalytic cysteine half-domain, respectively. This domain represents the domain 5 found in Ub-activating enzyme E1, the SCCH in which resides the catalytic cysteine. This domain has an α-helical structure and is likely to exist in an equilibrium of open (adenylation active) and closed (thioester bond formation active) conformations. SCCH, FCCH (the first catalytic cysteine half-domain) and UFD (ubiquitin fold domain) are connected to the AAD (active adenylation domain) through flexible loops that allow the conformational changes and rotations of these domains essential for catalysis of Ub activation and transfer of activated Ub from E1 to E2.
Protein Domain
Name: Rab-GTPase-TBC domain
Type: Domain
Description: The ~200 amino acid TBC/rab GTPase-activating protein (GAP) domain is well conserved across species and has been found in a wide range of different proteins from plant adhesion molecules to mammalian oncogenes. The name TBC derives from the name of the murine protein Tbc1 in which this domain was first identified based on its similarity to sequences in the tre-2 oncogene, and the yeast regulators of mitosis, BUB2 and cdc16. The connection of this domain with rab GTPase activation stems from subsequent in-depth sequence analyses and alignments and recent work demonstrating that it appears to contain the catalytic activities of the yeast rab GAPs, GYP1 and GYP7. The TBC/rab GAP domain has also been named PTM after three proteins known to contain it: the Drosophila pollux, the human oncoprotein TRE17 (oncoTRE17), and a myeloid cell line-expressed protein. The TBC/rab GAP domain contains six conserved motifs named A to F. A conserved arginine residue in the sequence motif B has been shown to be critical for the full GAP activity. Resolution of the 3D structure of the TBC/rab GAP domain of GYP1 has shown that it is a fully α-helical V-shaped molecule. The conserved arginine residue is positioned at the side of the narrow cleft on the concave side of the V-shaped molecule. It has been proposed that this cleft is the binding site for the GTPase. The conserved arginine residue probably functions as a catalytic arginine finger analogous to that seen in ras and Rho-GAPs. The two key features of the arginine finger activation mechanism appear to be (i) the positioning of the catalytically essential GTPase glutamine side chain via a hydrogen bonding interaction between the glutamine carbamoyl-NH2 group and the main chain carbonyl group of the GAP arginine, and (ii) the polarization of the gamma-phosphate group or the stabilization of charge on it via the interaction of the positively charged side chain guanidinoyl group of the GAP arginine.
Protein Domain
Name: Protein kinase C-like, phorbol ester/diacylglycerol-binding domain
Type: Domain
Description: Diacylglycerol (DAG) is an important second messenger. Phorbol esters (PE) are analogues of DAG and potent tumour promoters that cause a variety of physiological changes when administered to both cells and tissues. DAG activates a family of serine/threonine protein kinases, collectively known as protein kinase C (PKC). Phorbol esters can directly stimulate PKC. The N-terminal region of PKC, known as C1, has been shown to bind PE and DAG in a phospholipid- and zinc-dependent fashion. The C1 region contains one or two copies (depending on the isozyme of PKC) of a cysteine-rich domain, which is about 50 amino-acid residues long, and which is essential for DAG/PE-binding. The DAG/PE-binding domain binds two zinc ions; the ligands of these metal ions are probably the six cysteines and two histidines that are conserved in this domain.
Protein Domain
Name: Diacylglycerol kinase, accessory domain
Type: Domain
Description: Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The DAG kinase domain is assumed to be an accessory domain. Upon cell stimulation, DAG kinase converts DAG into phosphatidate, initiating the resynthesis of phosphatidylinositols and attenuating protein kinase C activity. It catalyses the reaction: ATP + 1,2-diacylglycerol = ADP + 1,2-diacylglycerol 3-phosphate. The enzyme is stimulated by calcium and phosphatidylserine and phosphorylated by protein kinase C. This domain is always associated with . Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process. Protein kinases fall into three broad classes, characterised with respect to substrate specificity:
  • Serine/threonine-protein kinases
  • Tyrosine-protein kinases
  • Dual specificity protein kinases (e.g. MEK - phosphorylates both Thr and Tyr on target proteins)
Protein kinase function is evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
Protein Domain
Name: Transmembrane protein TPRA1/CAND2/CAND8
Type: Family
Description: This family of membrane proteins is conserved from plants to humans and includes CAND2 and CAND8 from Arabidopsis. CAND2 and CAND8 are predicted G-protein coupled receptors. CAND2 plays a role in plant-microbe interactions and acts as a phytomelatonin receptor that regulates stomatal closure through Galpha subunit-mediated H2O2 production and Ca2+ flux dynamics.
Protein Domain
Name: Yif1 family
Type: Family
Description: Yif1 (Yip1 interacting factor) is an integral membrane protein required for membrane fusion of ER-derived vesicles. It also plays a role in the biogenesis of ER-derived COPII transport vesicles.
Protein Domain
Name: Tryptophan synthase, beta chain-like
Type: Family
Description: These sequences represent a family of pyridoxal-phosphate dependent enzymes that are closely related to the beta subunit of tryptophan synthase.
Protein Domain
Name: Tryptophan synthase beta chain/beta chain-like
Type: Family
Description: Tryptophan synthase catalyses the last step in the biosynthesis of tryptophan:
L-serine + 1-(indol-3-yl)glycerol 3-phosphate = L-tryptophan + glyceraldehyde 3-phosphate + H2O
It has two functional domains, each found in bacteria and plants on a separate subunit: the alpha chain is for the aldol cleavage of indoleglycerol phosphate to indole and glyceraldehyde 3-phosphate, and the beta chain is for the synthesis of tryptophan from indole and serine. In fungi the two domains are fused together on a single multifunctional protein. The beta chain of the enzyme, represented here, requires pyridoxal-phosphate as a cofactor. The pyridoxal-phosphate group is attached to a lysine residue. The region around this lysine residue also contains two histidine residues which are part of the pyridoxal-phosphate binding site.
Protein Domain
Name: Biogenesis of lysosome-related organelles complex-1, subunit 2
Type: Family
Description: This entry represents a family of proteins that play a role in cellular proliferation, as well as in the biogenesis of specialised organelles of the endosomal-lysosomal system.
Protein Domain
Name: Rieske iron-sulphur protein
Type: Family
Description: The Rieske subunit can be found in the Ubiquinol-cytochrome c reductase (bc1 complex or complex III) or the cytochrome b6f complex. The Rieske subunit acts by binding either a ubiquinol or plastoquinol anion, transferring an electron to the 2Fe-2S cluster, then releasing the electron to the cytochrome c or cytochrome f haem iron. The 2Fe-2S cluster is bound in the highly conserved C-terminal region of the Rieske subunit. Ubiquinol-cytochrome c reductase (bc1 complex or complex III) is an enzyme complex of bacterial and mitochondrial oxidative phosphorylation systems. It catalyses the oxidoreduction of the mobile redox components ubiquinol and cytochrome c, generating an electrochemical potential which is linked to ATP synthesis. The complex consists of three subunits in most bacteria, and nine in mitochondria: both bacterial and mitochondrial complexes contain cytochrome b and cytochrome c1 subunits, and an iron-sulphur 'Rieske' subunit, which contains a high potential 2Fe-2S cluster. The mitochondrial form also includes six other subunits that do not possess redox centres. The plant cytochrome b6f is located in the thylakoid membrane and functions in both linear and cyclic electron transport, providing ATP and NADPH for photosynthetic carbon fixation. The cytochrome b6f complex has eight different subunits, six being encoded in the chloroplast genome (PetA [cyt f], PetB [cyt b6], PetD, PetG, PetL, and PetN) and two in the nucleus (PetC [Rieske FeS] and PetM). The complex functions as a dimer. In cyanobacteria, the cytochrome b6f complex contains four large subunits, including cytochrome f, cytochrome b6, the Rieske iron-sulfur protein (ISP), and subunit IV; as well as four small hydrophobic subunits, PetG, PetL, PetM, and PetN. Proteins in this entry also include arsenite oxidase subunit AioB from Alcaligenes faecalis. It is involved in the detoxification of arsenic.
Protein Domain
Name: Rieske iron-sulphur protein, C-terminal
Type: Domain
Description: Ubiquinol-cytochrome c reductase (bc1 complex or complex III) is an enzyme complex of bacterial and mitochondrial oxidative phosphorylation systems. It catalyses the oxidoreduction of the mobile redox components ubiquinol and cytochrome c, generating an electrochemical potential, which is linked to ATP synthesis [ , ]. The complex consists of three subunits in most bacteria, and nine in mitochondria: both bacterial and mitochondrial complexes contain cytochrome b and cytochrome c1 subunits, and an iron-sulphur 'Rieske' subunit, which contains a high potential 2Fe-2S cluster [ ]. The mitochondrial form also includes six other subunits that do not possess redox centres. Plastoquinone-plastocyanin reductase (b6f complex), found in cyanobacteria and the chloroplasts of plants, catalyses the oxidoreduction of plastoquinol and cytochrome f. This complex, which is functionally similar to ubiquinol-cytochrome c reductase, comprises cytochrome b6, cytochrome f and Rieske subunits []. The Rieske subunit acts by binding either a ubiquinol or plastoquinol anion, transferring an electron to the 2Fe-2S cluster, then releasing the electron to the cytochrome c or cytochrome f haem iron [ , ]. The 2Fe-2S cluster is bound in the highly conserved C-terminal region of the Rieske subunit.
Protein Domain
Name: Rieske [2Fe-2S] iron-sulphur domain
Type: Domain
Description: There are multiple types of iron-sulphur clusters which are grouped into three main categories based on their atomic content: [2Fe-2S], [3Fe-4S], [4Fe-4S] (see ), and other hybrid or mixed metal types. Two general types of [2Fe-2S] clusters are known and they differ in their coordinating residues. The ferredoxin-type [2Fe-2S] clusters are coordinated to the protein by four cysteine residues (see ). The Rieske-type [2Fe-2S] cluster is coordinated to its protein by two cysteine residues and two histidine residues [, ]. The structure of several Rieske domains has been solved [ ]. It contains three layers of antiparallel beta sheets forming two beta sandwiches. Both beta sandwiches share the central sheet 2. The metal-binding site is at the top of the beta sandwich formed by the sheets 2 and 3. The Fe1 iron of the Rieske cluster is coordinated by two cysteines while the other iron Fe2 is coordinated by two histidines. Two inorganic sulphide ions bridge the two iron ions forming a flat, rhombic cluster. Rieske-type iron-sulphur clusters are common to electron transfer chains of mitochondria and chloroplast and to non-haem iron oxygenase systems. Examples include: The Rieske protein of the Ubiquinol-cytochrome c reductase ( ) (also known as the bc1 complex or complex III), a complex of the electron transport chains of mitochondria and of some aerobic prokaryotes; it catalyses the oxidoreduction of ubiquinol and cytochrome c. The Rieske protein of chloroplastic plastoquinone-plastocyanin reductase ( ) (also known as the b6f complex); it is functionally similar to the bc1 complex and catalyses the oxidoreduction of plastoquinol and cytochrome f. Bacterial naphthalene 1,2-dioxygenase subunit alpha, a component of the naphthalene dioxygenase (NDO) multicomponent enzyme system which catalyses the incorporation of both atoms of molecular oxygen into naphthalene to form cis-naphthalene dihydrodiol. Bacterial 3-phenylpropionate dioxygenase ferredoxin subunit. Bacterial toluene monooxygenase. Bacterial biphenyl dioxygenase.
Protein Domain
Name: Folate-sensitive fragile site protein Fra10Ac1
Type: Family
Description: This entry represents the full-length proteins in which, in higher eukaryotes, the nested domain EDSLL lies. Fra10Ac1 is a highly conserved nuclear protein of unknown function that is highly expressed in brain tissue [ ].
Protein Domain
Name: CCAAT-binding factor, conserved site
Type: Conserved_site
Description: The CCAAT-binding factor (CBFB/NF-YA) is a mammalian transcription factor that binds to a CCAAT motif in the promoters of a wide variety of genes, including type I collagen and albumin [ ]. The factor is a heteromeric complex of A and B subunits, both of which are required for DNA-binding [ ]. The subunits can interact in the absence of DNA-binding, conserved regions in each being important in mediating this interaction. The B subunit contains a region of similarity with the yeast protein HAP2 []. For the B subunit it has been suggested that the N-terminal portion of the conserved region is involved in subunit interaction and the C-terminal region involved in DNA-binding [].
Protein Domain
Name: Nuclear transcription factor Y subunit A
Type: Family
Description: Diverse DNA binding proteins are known to bind the CCAAT box, a common cis-acting element found in the promoter and enhancer regions of a large number of genes in eukaryotes. Amongst these proteins is one known as the CCAAT-binding factor (CBF) or nuclear transcription factor Y (NF-Y) []. CBF is a heteromeric transcription factor that consists of two different components both needed for DNA-binding. The HAP protein complex of yeast binds to the upstream activation site of cytochrome C iso-1 gene (CYC1) as well as other genes involved in mitochondrial electron transport and activates their expression. It also recognises the sequence CCAAT and is structurally and evolutionarily related to CBF. The first subunit of CBF is known as CBF-A or NF-YB in vertebrates, and HAP3 in budding yeast. The second subunit is known as CBF-B or NF-YA in vertebrates and HAP2 in budding yeast. It is a protein of 265 to 350 amino-acid residues which contains a highly conserved region of about 60 residues. This region, called the 'essential core' [ ], seems to consist of two subdomains: an N-terminal subunit-association domain and a C-terminal DNA recognition domain. This entry represents the NF-YA subunit.
Protein Domain
Name: Protein Thf1
Type: Family
Description: Thf1 protein (also known as Psb29) is found in Cyanobacteria and in the plastids of vascular plants. It may function in the biogenesis of Photosystem II complexes [ ]. In Synechocystis it was isolated and partially sequenced from purified photosystem II (PSII). Deletion of psb29 in Synechocystis 6803 results in slower growth rates under high light intensities, increased light sensitivity, and lower PSII efficiency, without affecting the PSII core electron transfer activities [ ]. In plants Thf1 is localised to the outer plastid membrane and the stroma. Thf1 has a role in sugar signalling. Thf1 is also thought to have a role in chloroplast and leaf development. Thf1 has been shown to play a crucial role in vesicle-mediated thylakoid membrane biogenesis [ , ].
Protein Domain
Name: Translation initiation factor IF-1
Type: Family
Description: This family consists of translation initiation factor IF-1 as found in bacteria and chloroplasts. This protein, about 70 residues in length, consists largely of an S1 RNA binding domain ( ) [ ]. Translation initiation includes a number of interrelated steps preceding the formation of the first peptide bond. In Escherichia coli, the initiation mechanism requires, in addition to mRNA, fMet-tRNA, and ribosomal subunits, the presence of three additional proteins (initiation factors IF1, IF2, and IF3) and at least one GTP molecule. The three initiation factors influence both the kinetics and the stability of ternary complex formation. IF1 is the smallest of the three factors. IF1 enhances the rate of 70S ribosome subunit association and dissociation and the interaction of 30S ribosomal subunit with IF2 and IF3. It stimulates 30S complex formation. In addition, by binding to the A-site of the 30S ribosomal subunit, IF1 may contribute to the fidelity of the selection of the initiation site of the mRNA [ , , , , ].
Protein Domain
Name: RNA-binding domain, S1, IF1 type
Type: Domain
Description: The S1 domain of around 70 amino acids, originally identified in ribosomal protein S1, is found in a large number of RNA-associated proteins. It has been shown that S1 proteins bind RNA through their S1 domains with some degree of sequence specificity. This type of S1 domain is found in translation initiation factor 1. The solution structure of one S1 RNA-binding domain from Escherichia coli polynucleotide phosphorylase has been determined [ ]. It displays some similarity with the cold shock domain (CSD) (). Both the S1 and the CSD domain consist of an antiparallel beta barrel of the same topology with 5 beta strands. This fold is also shared by many other proteins of unrelated function and is known as the OB fold. However, the S1 and CSD fold can be distinguished from the other OB folds by the presence of a short 3(10) helix at the end of strand 3. This unique feature is likely to form a part of the DNA/RNA-binding site. This entry is specific for bacterial, chloroplastic and eukaryotic IF-1 type S1 domains.
Protein Domain
Name: Bystin
Type: Family
Description: Trophinin and tastin form a cell adhesion molecule complex that potentially mediates an initial attachment of the blastocyst to uterine epithelial cells at the time of implantation. Trophinin and tastin bind to an intermediary cytoplasmic protein called bystin. Bystin may be involved in implantation and trophoblast invasion because bystin is found with trophinin and tastin in the cells at human implantation sites and also in the intermediate trophoblasts at the invasion front in the placenta from early pregnancy [ ]. This family also includes the Saccharomyces cerevisiae protein ENP1. ENP1 is an essential protein in S. cerevisiae and is localised in the nucleus [ ]. It is thought that ENP1 plays a direct role in the early steps of rRNA processing, as enp1-defective S. cerevisiae cannot synthesise 20S pre-rRNA and hence 18S rRNA, which leads to reduced formation of 40S ribosomal subunits [ ].
Protein Domain
Name: COBRA, plant
Type: Family
Description: This entry represents the COBRA family proteins. In Arabidopsis thaliana, members of the family are all extracellular glycosyl-phosphatidyl inositol-anchored proteins (GPI-linked) []. COBRA is involved in determining the orientation of cell expansion, probably by playing an important role in cellulose deposition. It may act by recruiting cellulose synthesizing complexes to discrete positions on the cell surface. Some members of this family are annotated as phytochelatin synthase, but these annotations are incorrect [].
Protein Domain
Name: Histidine triad (HIT) protein
Type: Family
Description: The Histidine Triad (HIT) motif, His-x-His-x-His-x-x (x, a hydrophobic amino acid) was identified as being highly conserved in a variety of organisms [, ]. On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins are classified into three branches: the Hint branch, which consists of adenosine 5'-monophosphoramide hydrolases, the FHIT branch, which consists of diadenosine polyphosphate hydrolases, and the GalT branch, which consists of specific nucleoside monophosphate transferases [, ]. In budding yeast, Hnt1 has been shown to have adenosine monophosphoramidase activity and to function as a positive regulator of Cdk7/Kin28 in vivo [ , ]. FHIT plays a very important role in the development of tumours. In fact, FHIT deletions are among the earliest and most frequent genetic alterations in the development of tumours [, ]. The third branch of the HIT superfamily, which includes GalT homologues, contains a related His-X-His-X-Gln motif and transfers nucleoside monophosphate moieties to phosphorylated second substrates rather than hydrolysing them [ ].
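The two motifs quoted in this description can be written as regular expressions for a quick scan of a protein sequence. The following is a minimal sketch, not the method used to build this entry: the set of residues treated as "hydrophobic" (HYDROPHOBIC) and the function name classify_hit_motifs are assumptions made for illustration, and a regex hit is only a candidate, not evidence of membership in the HIT superfamily.

```python
import re

# Assumption for illustration: which residues count as "hydrophobic".
HYDROPHOBIC = "AVILMFWC"

# His-x-His-x-His-x-x, with every x restricted to the hydrophobic set.
HIT_MOTIF = re.compile(
    f"H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]{{2}}"
)

# Related GalT-branch motif His-X-His-X-Gln; X is left unrestricted here.
GALT_MOTIF = re.compile(r"H.H.Q")


def classify_hit_motifs(sequence):
    """Return sorted (start, label, matched text) tuples for both motifs."""
    seq = sequence.upper()
    hits = [(m.start(), "HIT", m.group()) for m in HIT_MOTIF.finditer(seq)]
    hits += [(m.start(), "GalT-like", m.group()) for m in GALT_MOTIF.finditer(seq)]
    return sorted(hits)


if __name__ == "__main__":
    # Synthetic sequence, made up for this example.
    for start, label, text in classify_hit_motifs("GGHVHLHVVPRAAHPHGQ"):
        print(start, label, text)
```

Positions are 0-based. Changing HYDROPHOBIC is the simplest way to make the scan stricter or more permissive.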
Protein Domain
Name: Histidine triad, conserved site
Type: Conserved_site
Description: The Histidine Triad (HIT) motif, His-x-His-x-His-x-x (x, a hydrophobic amino acid) was identified as being highly conserved in a variety of organisms []. Crystal structure of rabbit Hint, purified as an adenosine and AMP-binding protein, showed that proteins in the HIT superfamily are conserved as nucleotide-binding proteins and that Hint homologues, which are found in all forms of life, are structurally related to Fhit homologues and GalT-related enzymes, which have more restricted phylogenetic profiles [ ]. Hint homologues including rabbit Hint and yeast Hnt1 hydrolyse adenosine 5' monophosphoramide substrates such as AMP-NH2 and AMP-lysine to AMP plus the amine product and function as positive regulators of Cdk7/Kin28 in vivo [ ]. Fhit homologues are diadenosine polyphosphate hydrolases [] and function as tumour suppressors in human and mouse [] though the tumour suppressing function of Fhit does not depend on ApppA hydrolysis []. The third branch of the HIT superfamily, which includes GalT homologues, contains a related His-X-His-X-Gln motif and transfers nucleoside monophosphate moieties to phosphorylated second substrates rather than hydrolysing them [ ]. The bovine protein kinase C inhibitor, PKCI-1, is an inhibitor protein that binds zinc without the use of zinc-finger motifs [ ]. Each protein molecule binds one zinc ion via a novel binding site containing 3 closely-spaced histidine residues []. This region, referred to as the histidine triad (HIT) [], has been identified in various prokaryotic and eukaryotic proteins of uncertain function []. The signature pattern used in this entry contains the region of the histidine triad and includes the three conserved histidine residues which are thought to bind the zinc ion.
Protein Domain
Name: Replication factor-A protein 1, N-terminal
Type: Domain
Description: Replication factor-A protein 1 (RPA1) forms a multiprotein complex with RPA2 and RPA3 that binds single-stranded DNA and functions in the recognition of DNA damage for nucleotide excision repair. The complex binds to single-stranded DNA sequences participating in DNA replication in addition to those mediating transcriptional repression and activation, and stimulates the activity of cognate strand exchange protein Sep1. It cooperates with T-AG and DNA topoisomerase I to unwind template DNA containing the Simian Virus 40 origin of replication [ ].
Protein Domain
Name: Ethylene-responsive binding factor-associated repression
Type: Domain
Description: The EAR motif is the ethylene-responsive element binding factor-associated amphiphilic repression motif. This motif binds to the Groucho/Tup1-type co-repressor TOPLESS (TPL) and TPL-related proteins. The motif is frequently found at the N terminus of NINJA, or Novel INteractor of JAZ, proteins [ ]. The EAR motif, defined by the consensus sequence patterns of either LxLxL or DLNxxP, is the most predominant form of transcriptional repression motif so far identified in plants. It is highly conserved in transcriptional regulators that are known to function as negative regulators in a broad range of developmental and physiological processes across evolutionarily diverse plant species []. This family is closely related to family AUX_IAA () which also has an LxLxL signature.
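As an illustration of the two consensus patterns named above, a sequence can be scanned for LxLxL and DLNxxP with a single regular expression. This is a minimal sketch under stated assumptions: "x" is taken to mean any residue, the function name find_ear_motifs is hypothetical, and short patterns such as LxLxL occur by chance in many proteins, so a match only flags a candidate site.

```python
import re

# The two consensus patterns quoted above; "x" is treated as any residue.
# The lookahead lets overlapping occurrences (e.g. LxLxLxL) all be reported.
EAR_REGEX = re.compile(r"(?=(L.L.L|DLN..P))")


def find_ear_motifs(sequence):
    """Return (0-based start, matched text) for each EAR-like motif."""
    return [(m.start(), m.group(1)) for m in EAR_REGEX.finditer(sequence.upper())]


if __name__ == "__main__":
    # Synthetic sequence, made up for this example.
    for start, text in find_ear_motifs("MDLNAAPKKLELKLG"):
        print(start, text)
```

The lookahead form is a design choice: a plain alternation would skip a second LxLxL that begins inside the first one.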
Protein Domain
Name: Ribosomal protein L14P
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ]. Ribosomal protein L14 is one of the proteins from the large ribosomal subunit. In eubacteria, L14 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins, which have been grouped on the basis of sequence similarities. Based on amino-acid sequence homology, it is predicted that ribosomal protein L14 is a member of a recently identified family of structurally related RNA-binding proteins []. L14 is a protein of 119 to 137 amino-acid residues.
Protein Domain
Name: GLABROUS1 enhancer-binding protein family
Type: Family
Description: This family of plant transcription factors includes GLABROUS1 enhancer-binding protein (GeBP) and GeBP-like proteins, and storekeeper and storekeeper-like (STKL) transcription factors. GeBP and GeBP-like proteins play a redundant role in cytokinin hormone pathway regulation [ ]. Storekeeper was identified as a B-box motif binding factor that regulates expression of patatin, a storage protein in potato []. Storekeeper-like transcription factors STKL1 and STKL2 function as transcription factors in the glucose signalling pathway [].
Protein Domain
Name: Peptidase A22A, presenilin
Type: Family
Description: This group of aspartic peptidases belongs to MEROPS peptidase family A22 (presenilin family), subfamily A22A, the type example being presenilin 1 from Homo sapiens (Human). Presenilins are polytopic transmembrane (TM) proteins, mutations in which are associated with the occurrence of early-onset familial Alzheimer's disease, a rare form of the disease that results from a single-gene mutation [, ]. Alzheimer's disease is associated with the formation of extracellular deposits of amyloid, which contain aggregates of the amyloid-beta peptide. The β-peptides are released from the Alzheimer's amyloid precursor protein (APP) by the action of two peptidase activities: "beta-secretase" cleaves at the N terminus of the peptide, and "gamma-secretase" cleaves at the C terminus. The gamma-secretase cleavage occurs in a transmembrane segment of APP. Presenilin, which exists in a complex with nicastrin, APH-1 and PEN-2, has been identified as gamma-secretase from its deficiency [ ] and mutation of its active site residues [], but proteolytic activity has only been directly demonstrated on a peptide derived from APP []. Presenilin-1 is also known to process notch proteins [ ] and syndecan-3 []. Presenilin has nine transmembrane regions with the active site aspartic acid residues located on TM6, within a Tyr-Asp motif, and TM7, within a Gly-Xaa-Gly-Asp motif [ ]. The protein autoprocesses to form an amino-terminal fragment (TMs 1-6) and a C-terminal fragment (TMs 7-9) []. The tertiary structure of the human gamma-secretase complex has been solved []. Nicastrin is extracellular, whereas presenilin-1, APH-1 and PEN-2 are all transmembrane proteins. The transmembrane regions of all three proteins form a horseshoe shape. Aspartic peptidases, also known as aspartyl proteases (EC 3.4.23.-), are widely distributed proteolytic enzymes [, , ] known to exist in vertebrates, fungi, plants, protozoa, bacteria, archaea, retroviruses and some plant viruses. 
All known aspartic peptidases are endopeptidases. A water molecule, activated by two aspartic acid residues, acts as the nucleophile in catalysis. Aspartic peptidases can be grouped into five clans, each of which shows a unique structural fold []. Peptidases in clan AA are either bilobed (family A1 or the pepsin family) or are a homodimer (all other families in the clan, including retropepsin from HIV-1/AIDS) [ ]. Each lobe consists of a single domain with a closed β-barrel and each lobe contributes one Asp to form the active site. Most peptidases in the clan are inhibited by the naturally occurring small-molecule inhibitor pepstatin []. Clan AC contains the single family A8: the signal peptidase 2 family. Members of the family are found in all bacteria. Signal peptidase 2 processes the premurein precursor, removing the signal peptide. The peptidase has four transmembrane domains and the active site is on the periplasmic side of the cell membrane. Cleavage occurs on the amino side of a cysteine where the thiol group has been substituted by a diacylglyceryl group. Site-directed mutagenesis has identified two essential aspartic acid residues which occur in the motifs GNXXDRX and FNXAD (where X is a hydrophobic residue) [ ]. No tertiary structures have been solved for any member of the family, but because of the intramembrane location, the structure is assumed not to be pepsin-like. Clan AD contains two families of transmembrane endopeptidases: A22 and A24. These are also known as "GXGD peptidases" because of a common GXGD motif which includes one of the pair of catalytic aspartic acid residues. Structures are known for members of both families and show a unique, common fold with up to nine transmembrane regions [ ]. The active site aspartic acids are located within a large cavity in the membrane into which water can gain access []. Clan AE contains two families, A25 and A31. 
Tertiary structures have been solved for members of both families and show a common fold consisting of an alpha-beta-alpha sandwich, in which the beta sheet is five stranded [ , ]. Clan AF contains the single family A26. Members of the clan are membrane proteins with a unique fold. Homologues are known only from bacteria. The structure of omptin (also known as OmpT) shows a cylindrical barrel containing ten beta strands inserted in the membrane with the active site residues on the outer surface [ ]. There are two families of aspartic peptidases for which neither structure nor active site residues are known and these are not assigned to clans. Family A5 includes thermopsin, an endopeptidase found only in thermophilic archaea. Family A36 contains sporulation factor SpoIIGA, which is known to process and activate sigma factor E, one of the transcription factors that controls sporulation in bacteria [ ].
Protein Domain
Name: BSD domain
Type: Domain
Description: The BSD domain is a domain of about 60 residues named after the BTF2-like transcription factors, Synapse-associated proteins and DOS2-like proteins in which it is found. It is also found in several hypothetical proteins. The BSD domain occurs in one or two copies in a variety of species ranging from primitive protozoa to humans. It can be found associated with other domains such as the BTB domain (see ) or the U-box in multidomain proteins. The function of the BSD domain is unknown [ ]. Secondary structure prediction indicates the presence of three predicted alpha helices, which probably form a three-helical bundle in small domains. The third predicted helix contains neighbouring phenylalanine and tryptophan residues - less common amino acids that are invariant in all the BSD domains identified and that are the most striking sequence features of the domain [ ]. Some proteins known to contain one or two BSD domains are listed below: Mammalian TFIIH basal transcription factor complex p62 subunit (GTF2H1). Yeast RNA polymerase II transcription factor B 73kDa subunit (TFB1), the homologue of BTF2. Yeast DOS2 protein, which is involved in single-copy DNA replication and ubiquitination. Drosophila synapse-associated protein SAP47. Mammalian SYAP1. Arabidopsis thaliana (Mouse-ear cress) TFB1-1 (TFB1A) and TFB1-3 (TFB1C).
Protein Domain
Name: Oligosaccharyl transferase complex, subunit OST3/OST6
Type: Family
Description: During N-linked glycosylation of proteins, oligosaccharide chains are assembled on the carrier molecule dolichyl pyrophosphate in the following order: 2 molecules of N-acetylglucosamine (GlcNAc), 9 molecules of mannose, and 3 molecules of glucose. These 14-residue oligosaccharide cores are then transferred to asparagine residues on nascent polypeptide chains in the endoplasmic reticulum (ER). As proteins progress through the Golgi apparatus, the oligosaccharide cores are modified by trimming and extension to generate a diverse array of glycosylated proteins [ , ]. The oligosaccharyl transferase complex (OST complex) transfers 14-sugar branched oligosaccharides from dolichyl pyrophosphate to asparagine residues [ ]. The complex contains nine protein subunits: Ost1p, Ost2p, Ost3p, Ost4p, Ost5p, Ost6p, Stt3p, Swp1p, and Wbp1p, all of which are integral membrane proteins of the ER. The OST complex interacts with the Sec61p pore complex [] involved in protein import into the ER. This entry represents subunits OST3 and OST6. OST3 is homologous to OST6 [ ], and several lines of evidence indicate that they are alternative members of the OST complex. Disruption of both OST3 and OST6 causes severe underglycosylation of soluble and membrane-bound glycoproteins and a defect in the assembly of the complex. Hence, the function of these genes seems to be essential for recruiting a fully active complex necessary for efficient N-glycosylation []. This entry also includes the magnesium transporter protein 1, also known as OST3 homologue B, which might be involved in N-glycosylation through its association with the oligosaccharyl transferase (OST) complex.
Protein Domain
Name: PDZ-binding protein, CRIPT
Type: Family
Description: The CRIPT protein is a cytoskeletal protein involved in microtubule production. This C-terminal domain is essential for binding to the PDZ3 domain of the SAP90 protein, one of a super-family of PDZ-containing proteins that play an important role in coupling the membrane ion channels with their signalling partners [ ].
Protein Domain
Name: Thiamine thiazole synthase
Type: Family
Description: This entry represents the thiamine thiazole synthase, which is involved in biosynthesis of the thiamine precursor thiazole in fungi and in chloroplasts. It catalyses the conversion of NAD and glycine to adenosine diphosphate 5-(2-hydroxyethyl)-4-methylthiazole-2-carboxylic acid (ADT), an adenylated thiazole intermediate. The reaction includes an iron-dependent sulfide transfer from a conserved cysteine residue of the protein to a thiazole intermediate. The enzyme can only undergo a single turnover, which suggests it is a suicide enzyme. It may have additional roles in adaptation to various stress conditions and in DNA damage tolerance [ , ].
Protein Domain
Name: Thiazole biosynthetic enzyme Thi4 family
Type: Family
Description: Thiamine (vitamin B1) can be synthesised de novo in prokaryotes, plants and fungi. In eukaryotes, THI4 is involved in the biosynthesis of the thiamine precursor thiazole, and is repressed by thiamine [ ]. Archaea harbour structural homologues of both the bacterial (ThiS-ThiF) and eukaryotic (THI4) proteins for thiazole synthesis. Most archaea have homologues that cluster to the THI4 family of proteins, but lack the conserved cysteine residue of the yeast THI4p that is required for sulfur transfer in formation of the thiazole ring. Instead, they have a histidine residue that is well conserved. Initially, these archaeal histidine-containing THI4 homologues were reported to convert ribose-1,5-bisphosphate (R15P) into ribulose-1,5-bisphosphate, the substrate of ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO) [ ], but this may not be correct and their function remains unknown []. On the other hand, archaeal THI4 homologues with a conserved catalytic cysteine are linked to thiamine biosynthesis []. This family is represented by THI4 and also includes chloroplastic thiamine thiazole synthase Thi1 [ , ] and prokaryotic proteins.
Protein Domain
Name: Succinate dehydrogenase/fumarate reductase type B, transmembrane subunit
Type: Family
Description: Succinate dehydrogenase (SDH) is a membrane-bound complex of two main components: a membrane-extrinsic component composed of an FAD-binding flavoprotein and an iron-sulphur protein, and a hydrophobic component composed of a cytochrome b and a membrane anchor protein. The cytochrome b component is a mono-haem transmembrane protein [ , , ] belonging to a family that includes: Cytochrome b-556 from bacterial SDH (gene sdhC). Cytochrome b560 from the mammalian mitochondrial SDH complex, which is encoded in the mitochondrial genome of some algae and in the plant Marchantia polymorpha. Cytochrome b from yeast mitochondrial SDH complex (gene SDH3 or CYB3). Protein cyt-1 from Caenorhabditis elegans. These cytochromes are proteins of about 130 residues that comprise three transmembrane regions. There are two conserved histidines which may be involved in binding the haem group. This family also includes the subunit C (the cytochrome B subunit) of type B fumarate reductases [ ].
Protein Domain
Name: IBR domain
Type: Domain
Description: The IBR (In Between Ring fingers) domain is often found to occur between pairs of RING fingers. This domain has also been called the C6HC domain and DRIL (for double RING finger linked) domain [ ]. Proteins that contain two RING fingers and an IBR domain (these proteins are also termed RBR family proteins) are thought to exist in all eukaryotic organisms. RBR family members play roles in protein quality control and can indirectly regulate transcription []. Evidence suggests that RBR proteins are often parts of cullin-containing ubiquitin ligase complexes. The ubiquitin ligase Parkin is an RBR family protein whose mutations are involved in forms of familial Parkinson's disease []. The IBR domain is a cysteine-rich (C6HC) zinc finger domain that is present in Triad1, and which is conserved in other proteins encoded by various eukaryotes. The C6HC consensus pattern is: C-x(4)-C-x(14-30)-C-x(1-4)-C-x(4)-C-x(2)-C-x(4)-H-x(4)-C. The C6HC zinc finger motif is the fourth family member of the zinc-binding RING, LIM, and LAP/PHD fingers. Strikingly, in most of the proteins the C6HC domain is flanked by two RING finger structures. The novel C6HC motif has been called DRIL (double RING finger linked). The strong conservation of the larger tripartite TRIAD (two RING fingers and DRIL) structure indicates that the three subdomains are functionally linked and identifies a novel class of proteins [ ].
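The C6HC consensus pattern quoted above uses PROSITE-like notation, which translates mechanically into a regular expression. The sketch below is illustrative only: the helper names prosite_to_regex and find_c6hc are hypothetical, and the translator handles just the two element kinds that appear in this pattern (a literal residue, and a wildcard run written x(n) or x(m-n)).

```python
import re

# Consensus pattern as quoted in the description above.
C6HC_PATTERN = "C-x(4)-C-x(14-30)-C-x(1-4)-C-x(4)-C-x(2)-C-x(4)-H-x(4)-C"


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string.

    Supports only literal residues (e.g. "C") and wildcard runs
    "x(n)" / "x(m-n)"; other PROSITE syntax is not handled.
    """
    parts = []
    tokens = re.findall(r"x\((\d+)(?:-(\d+))?\)|([A-Z])", pattern)
    for low, high, residue in tokens:
        if residue:
            parts.append(residue)            # literal residue
        elif high:
            parts.append(f".{{{low},{high}}}")  # x(m-n) -> .{m,n}
        else:
            parts.append(f".{{{low}}}")      # x(n) -> .{n}
    return "".join(parts)


C6HC_REGEX = re.compile(prosite_to_regex(C6HC_PATTERN))


def find_c6hc(sequence):
    """Return (0-based start, matched text) for each C6HC-like match."""
    return [(m.start(), m.group()) for m in C6HC_REGEX.finditer(sequence.upper())]


if __name__ == "__main__":
    print(prosite_to_regex(C6HC_PATTERN))
```

A regex hit indicates only that the cysteine and histidine spacing fits the consensus; it does not by itself show that a region folds as an IBR domain.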
Protein Domain
Name: Small GTPase Tem1/Spg1
Type: Family
Description: Small GTPases form an independent superfamily within the larger class of regulatory GTP hydrolases. This superfamily contains proteins that control a vast number of important processes and possess a common, structurally preserved GTP-binding domain [ , ]. Sequence comparisons of small G proteins from various species have revealed that they are conserved in primary structures at the level of 30-55% similarity []. Crystallographic analysis of various small G proteins revealed the presence of a 20kDa catalytic domain that is unique for the whole superfamily [ , ]. The domain is built of five alpha helices (A1-A5), six beta strands (B1-B6) and five polypeptide loops (G1-G5). A structural comparison of the GTP- and GDP-bound forms allows one to distinguish two functional loop regions, switch I and switch II, that surround the gamma-phosphate group of the nucleotide. The G1 loop (also called the P-loop) that connects the B1 strand and the A1 helix is responsible for the binding of the phosphate groups. The G3 loop provides residues for Mg2+ and phosphate binding and is located at the N terminus of the A2 helix. The G1 and G3 loops are sequentially similar to Walker A and Walker B boxes that are found in other nucleotide binding motifs. The G2 loop connects the A1 helix and the B2 strand and contains a conserved Thr residue responsible for Mg2+ binding. The guanine base is recognised by the G4 and G5 loops. The consensus sequence NKXD of the G4 loop contains Lys and Asp residues directly interacting with the nucleotide. Part of the G5 loop located between B6 and A5 acts as a recognition site for the guanine base []. The small GTPase superfamily can be divided into at least 8 different families, including: Arf small GTPases. GTP-binding proteins involved in protein trafficking by modulating vesicle budding and uncoating within the Golgi apparatus. Ran small GTPases. GTP-binding proteins involved in nucleocytoplasmic transport. 
Required for the import of proteins into the nucleus and also for RNA export. Rab small GTPases. GTP-binding proteins involved in vesicular traffic. Rho small GTPases. GTP-binding proteins that control cytoskeleton reorganisation. Ras small GTPases. GTP-binding proteins involved in signalling pathways. Sar1 small GTPases. Small GTPase component of the coat protein complex II (COPII) which promotes the formation of transport vesicles from the endoplasmic reticulum (ER). Mitochondrial Rho (Miro). Small GTPase domain found in mitochondrial proteins involved in mitochondrial trafficking. Roc small GTPase domain. Small GTPase domain always found associated with the COR domain. This entry includes Tem1 from budding yeasts and Spg1 from fission yeasts. They are GTPases involved in the regulation of the cell cycle. In Schizosaccharomyces pombe, Spg1 is required for the localisation of Cdc7 (part of the septation initiation network) to the spindle pole body (SPB). It is regulated negatively by a GTPase-activating protein (GAP) comprising two subunits, Byr4 and Cdc16. In anaphase B, Spg1 is localised on the new SPB. In Saccharomyces cerevisiae, Tem1 is associated with the mitotic exit network (MEN). It is involved in termination of M phase of the cell cycle.
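The G4-loop consensus NKXD quoted in the Small GTPase Tem1/Spg1 description can be scanned for with a short regular expression. This is a minimal sketch, not part of the database: the function name and the test fragment are hypothetical, and a sequence-level hit is only a candidate G4 loop, since the description defines the loop structurally.

```python
import re

# G4-loop consensus from the description: Asn-Lys-any residue-Asp.
# "X" in the consensus means any residue, written "." in the regex.
G4_MOTIF = re.compile(r"NK.D")


def find_g4_candidates(sequence):
    """Return (1-based start position, matched tetrapeptide) for each NKXD hit."""
    return [(m.start() + 1, m.group()) for m in G4_MOTIF.finditer(sequence.upper())]


# Hypothetical fragment for illustration only; not taken from a real protein.
fragment = "GGAVLVGNKCDLPSR"
print(find_g4_candidates(fragment))
```

Positions are reported 1-based to match the usual residue numbering in protein annotation.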
Protein Domain
Name: Putative pyruvate, phosphate dikinase regulatory protein
Type: Family
Description: This is a family of proteins which are putative bifunctional serine/threonine kinase/phosphorylases involved in the regulation of pyruvate, phosphate dikinase (PPDK) by catalysing its phosphorylation/dephosphorylation. In plants, the pyruvate, phosphate dikinase regulatory protein 1 (RP1) is a bifunctional serine/threonine kinase and phosphorylase involved in the dark/light-mediated regulation of PPDK by catalysing its phosphorylation/dephosphorylation. In the dark, RP1 phosphorylates the catalytic intermediate of PPDK (PPDK-HisP), inactivating it. Light exposure induces the phosphorolysis reaction that reactivates PPDK.
Protein Domain
Name: Bifunctional kinase-pyrophosphorylase
Type: Family
Description: This family of regulatory proteins has ADP-dependent kinase and inorganic phosphate-dependent pyrophosphorylase activity.
Protein Domain
Name: Vacuolar (H+)-ATPase G subunit
Type: Family
Description: This family represents the eukaryotic vacuolar (H+)-ATPase (V-ATPase) G subunit. V-ATPases generate an acidic environment in several intracellular compartments. Correspondingly, they are found as membrane-attached proteins in several organelles. They are also found in the plasma membranes of some specialised cells. V-ATPases consist of peripheral (V1) and membrane integral (V0) heteromultimeric complexes. The G subunit is part of the V1 complex, but is also thought to be strongly attached to the V0 complex. It may be involved in the coupling of ATP degradation to H+ translocation.
Protein Domain
Name: NADPH-dependent FMN reductase-like
Type: Domain
Description: This domain is found in several flavoproteins such as FMN-dependent NADPH-azoreductase, which catalyses the reductive cleavage of the azo bond in aromatic azo compounds to the corresponding amines, and NAD(P)H:quinone oxidoreductase, which reduces quinones to the hydroquinone state to prevent interaction of the semiquinone with O2 and production of superoxide. In Arabidopsis, NADPH:quinone oxidoreductase is involved in detoxification pathways. NAD(P)H:quinone oxidoreductase prefers NADH over NADPH, while FMN-dependent NADPH-azoreductase requires NADPH, but not NADH, as an electron donor for its activity. Other proteins with this domain include iron-sulfur flavoproteins and chromate reductase.
Protein Domain
Name: CRC domain
Type: Domain
Description: The following proteins of the tesmin/TSO1 family contain two cysteine-rich repeats with the consensus C-X-C-X(4)-C-X(3)-Y-C-X-C-X(6)-C-X(3)-C-X-C-X(2)-C, separated by a region of variable length containing the short conserved sequence R-N-P-X-A-F-X-P-K: Animal tesmin or MTL5, originally identified by its specific expression in testes but subsequently also detected at specific stages of ovary development. Animal tesmin-like (tesl) or LIN54. Drosophila melanogaster tombola (tomb), a meiotic arrest protein which is expressed specifically in testis. Arabidopsis thaliana TSO1; 'tso' means 'ugly' in Chinese and refers to the appearance of tso1 mutant flowers. Arabidopsis thaliana TSO1-like 1 and 2 (SOL1 and SOL2). Legume Cysteine-rich Polycomb-like Protein 1 (CPP1), a DNA-binding protein acting as a negative regulator of the leghemoglobin gene. This domain has been named the CRC domain (C1-RNPXAFXPK-C2). It binds zinc and is able to bind DNA. The CRC domain shows some similarity to the CXC domain found in the E(z)-type of Polycomb group proteins. However, a clear distinction can be made, since the CXC domain lacks the RNPXAFXPK motif.
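The cysteine-rich repeat consensus in the CRC domain description is written in a PROSITE-like notation, where X is any residue and X(n) is a run of n residues. As a rough sketch of how such a consensus could be turned into a searchable pattern, the snippet below translates it into a Python regular expression. The helper names and the toy sequence are hypothetical and only handle the two element kinds that appear in this consensus; this is not the method used by the database.

```python
import re
from typing import List

# Consensus for one cysteine-rich repeat, as quoted in the CRC domain description.
CRC_CONSENSUS = "C-X-C-X(4)-C-X(3)-Y-C-X-C-X(6)-C-X(3)-C-X-C-X(2)-C"


def consensus_to_regex(consensus: str) -> str:
    """Translate the consensus into a regex string.

    Handles only literal residues (e.g. "C") and X or X(n), meaning any n residues.
    """
    parts = []
    for element in consensus.split("-"):
        wildcard = re.fullmatch(r"X(?:\((\d+)\))?", element)
        if wildcard:
            count = wildcard.group(1)
            parts.append("." if count is None else ".{%s}" % count)
        else:
            parts.append(re.escape(element))
    return "".join(parts)


CRC_REGEX = re.compile(consensus_to_regex(CRC_CONSENSUS))


def find_crc_repeats(sequence: str) -> List[int]:
    """Return 0-based start positions of consensus matches in a protein sequence."""
    return [m.start() for m in CRC_REGEX.finditer(sequence.upper())]


# Hypothetical toy repeat assembled to satisfy the consensus; not a real protein.
toy_repeat = ("C" + "A" + "C" + "A" * 4 + "C" + "A" * 3 + "Y" + "C" + "A" + "C"
              + "A" * 6 + "C" + "A" * 3 + "C" + "A" + "C" + "A" * 2 + "C")
print(consensus_to_regex(CRC_CONSENSUS))
print(find_crc_repeats("MA" + toy_repeat + "GG"))
```

A full CRC domain would need two such matches with an R-N-P-X-A-F-X-P-K spacer between them; that check is left out here to keep the sketch short.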
Protein Domain
Name: Transcription factor MYC/MYB N-terminal
Type: Domain
Description: This is the N-terminal region of a family of MYB and MYC transcription factors. The DNA-binding HLH domain is further downstream. Members of the MYB and MYC family regulate the biosynthesis of phenylpropanoids in several plant species.
Protein Domain
Name: ATPase, AFG1-like
Type: Family
Description: This P-loop motif-containing family of proteins includes AFG1, LACE1 and ZapE. ATPase family gene 1 (AFG1) is a 377 amino acid yeast protein with an ATPase motif typical of the family. AFG1-like ATPase (also known as lactation elevated 1 or LACE1), the mammalian homologue of AFG1, is a mitochondrial integral membrane protein that is essential for maintenance of fused mitochondrial reticulum and lamellar cristae morphology. It has also been demonstrated that LACE1 mediates degradation of nuclear-encoded complex IV subunits COX4 (cytochrome c oxidase 4), COX5A and COX6A, and is required for normal activity of complexes III and IV of the respiratory chain. ZapE is a cell division protein found in Gram-negative bacteria. The bacterial cell division process relies on the assembly, positioning, and constriction of the FtsZ ring (the so-called Z-ring), a ring-like network that marks the future site of the septum of bacterial cell division. ZapE is a Z-ring associated protein required for cell division under low-oxygen conditions. It is an ATPase that appears at the constricting Z-ring late in cell division. It reduces the stability of FtsZ polymers in the presence of ATP in vitro.
Protein Domain
Name: Ribosomal protein L32p
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein L32p is part of the 50S ribosomal subunit. This family is found in both prokaryotes and eukaryotes. Ribosomal protein L32 of yeast binds to and regulates the splicing and the translation of the transcript of its own gene.
Protein Domain
Name: Ribonuclease T2-like
Type: Family
Description: Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far. This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases). Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen. Generally, RNases have two distinct binding sites: the primary site (B1 site) and the subsite (B2 site), for nucleotides located at the 5'- and 3'-terminal ends of the scissile bond, respectively. The fungal ribonucleases T2 from Aspergillus oryzae, M from Aspergillus saitoi and Rh from Rhizopus niveus are structurally and functionally related 30 kDa glycoproteins that cleave the 3'-5' internucleotide linkage of RNA via a nucleotide 2',3'-cyclic phosphate intermediate. Two histidine residues have been shown to be involved in the catalytic mechanism of RNase T2 and Rh. These residues and the region around them are highly conserved in a number of other RNases that have been found to be evolutionarily related to these fungal enzymes.
Protein Domain
Name: Ribonuclease T2, His active site 1
Type: Active_site
Description: The fungal ribonucleases T2 from Aspergillus oryzae, M from Aspergillus saitoi and Rh from Rhizopus niveus are structurally and functionally related 30 kDa glycoproteins that cleave the 3'-5' internucleotide linkage of RNA via a nucleotide 2',3'-cyclic phosphate intermediate. Two histidine residues have been shown to be involved in the catalytic mechanism of RNase T2 and Rh. These residues and the region around them are highly conserved. This entry represents the conserved region containing the first His active site.
Protein Domain
Name: PHP domain
Type: Domain
Description: The PHP (Polymerase and Histidinol Phosphatase) domain has four conserved sequence motifs that contain invariant histidine and aspartate residues implicated in metal ion coordination. It is found in the alpha subunit of bacterial DNA polymerase III and in family X DNA polymerases, in addition to histidinol phosphatase. As part of DNA polymerases, the PHP domain was suggested to hydrolyse pyrophosphate. This family is often associated with an N-terminal region. This domain has a distorted (α/β)7 barrel fold.
Protein Domain
Name: Polymerase/histidinol phosphatase, N-terminal
Type: Domain
Description: This domain is associated with the N terminus of members of the PHP superfamily, which includes: the alpha subunit of bacterial DNA polymerase III, eukaryotic DNA polymerase, the X-family of DNA polymerases, histidinol phosphatases, and a number of uncharacterised protein families. Common to all PHP proteins is the presence of four conserved sequence motifs that contain invariant histidine and aspartate residues implicated in metal ion coordination. As part of DNA polymerases, the PHP domain was suggested to hydrolyse pyrophosphate and thereby shift the reaction equilibrium toward nucleotide polymerisation. However, it cannot be ruled out that the PHP domain possesses a nuclease activity, particularly in the repair polymerases of the X-family. No functional information is available for standalone proteins that belong to the PHP superfamily. The crystal structure of the YcdX protein from Escherichia coli has been determined to 1.6 Å resolution. YcdX has an unusual α7-β7 barrel topology, compared with the more common α8-β8 (TIM) barrel. The C-terminal helix caps the barrel on the N-terminal side. The deep cleft at the C-terminal side of the barrel contains the three zinc-binding residues. These residues are invariant in the YcdX family, confirming their functional importance. Only four proteins with known structures have a similar trinuclear zinc catalytic site. All four (nuclease P1, endonuclease IV, alkaline phosphatase, and phospholipase C) hydrolyse the phosphoester bond. This finding suggests a similar activity for YcdX. YcdX is among the genes significantly induced in response to DNA damage, indicating that members of the YcdX family may be involved in DNA repair.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom