Search our database by keyword

Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, pathways, authors, ontology terms, etc. (e.g. eve, embryo, zen, allele)
  • Use OR to search for either of two terms (e.g. fly OR drosophila) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean search syntax is supported: e.g. dros* for partial matches or fly AND NOT embryo to exclude a term
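
The operators listed above can be illustrated with a small sketch. This is not the site's search engine; it is a minimal illustration, under the assumption that terms are matched case-insensitively against whitespace-separated words. The function name term_matches and the sample record are hypothetical.

```python
from fnmatch import fnmatchcase


def term_matches(term: str, text: str) -> bool:
    """Return True if one search term matches the text (case-insensitive)."""
    text_lower = text.lower()
    term = term.lower()
    if len(term) >= 2 and term.startswith('"') and term.endswith('"'):
        # Quoted phrase: the words must occur together, in order.
        return term[1:-1] in text_lower
    # Bare word, optionally with a * wildcard: compare against each word.
    words = text_lower.replace(",", " ").replace(".", " ").split()
    return any(fnmatchcase(word, term) for word in words)


# Hypothetical record text, used only for illustration.
record = "Drosophila melanogaster embryo DNA binding protein"

# fly OR drosophila
either = term_matches("fly", record) or term_matches("drosophila", record)
# "dna binding"
phrase = term_matches('"dna binding"', record)
# dros*
partial = term_matches("dros*", record)
# fly AND NOT embryo
excluded = term_matches("fly", record) and not term_matches("embryo", record)
```

Combining the per-term results with Python's own or, and, and not mirrors how OR, AND and NOT combine terms in a query.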

Search results 2601 to 2700 out of 38750 for *

Category restricted to ProteinDomain

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Putative RNA methyltransferase
Type: Family
Description: This family has a TIM barrel-like fold with a deep C-terminal trefoil knot. The arrangement of its hydrophilic and hydrophobic surfaces is opposite to that of the classic TIM barrel proteins. It is likely to bind RNA, and may function as a methyltransferase.
Protein Domain
Name: The fantastic four family
Type: Family
Description: This entry represents a plant-specific family called "fantastic four" (FAF). Members of this family include FAF1/2/3/4 from Arabidopsis thaliana. They may be modulators of the genetic circuit that regulates the meristem. This family also includes Protein SUGARY ENHANCER 1 from maize, which has a FAF domain and is involved in starch metabolism in endosperm.
Protein Domain
Name: Malate dehydrogenase, active site
Type: Active_site
Description: Malate dehydrogenase (MDH) catalyzes the interconversion of malate to oxaloacetate utilizing the NAD/NADH cofactor system. The enzyme participates in the citric acid cycle and exists in all aerobic organisms. While prokaryotic organisms contain a single form of MDH, in eukaryotic cells there are two isozymes: one located in the mitochondrial matrix and the other in the cytoplasm. Fungi and plants also harbor a glyoxysomal form which functions in the glyoxylate pathway. In plant chloroplasts there is an additional NADP-dependent form of MDH which is essential for both the universal C3 photosynthesis (Calvin) cycle and the more specialised C4 cycle. The pattern for this enzyme includes two residues involved in the catalytic mechanism: an aspartic acid which is involved in a proton relay mechanism, and an arginine which binds the substrate.
Protein Domain
Name: Alpha-N-methyltransferase NTM1
Type: Family
Description: All fungal and animal N-terminally methylated proteins contain a unique N-terminal motif, Met-(Ala/Pro/Ser)-Pro-Lys. Alpha-N-methyltransferase methylates the N terminus of target proteins containing the N-terminal motif [Ala/Pro/Ser]-Pro-Lys when the initiator Met is cleaved. It catalyses mono-, di- or tri-methylation of the exposed alpha-amino group of Ala or Ser residue in the [Ala/Ser]-Pro-Lys motif and mono- or di-methylation of Pro in the Pro-Pro-Lys motif [ , , ]. Some of the substrates may be primed by NTMT2-mediated monomethylation [].
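
The motif rule in the description above can be sketched as a pattern check. This is a minimal sketch, assuming protein sequences are given as one-letter amino acid strings; the function name and the test sequences used with it are hypothetical and not taken from any database.

```python
import re

# After removal of the initiator Met, the exposed N terminus should read
# [Ala/Pro/Ser]-Pro-Lys. The optional leading M allows uncleaved input.
NTM1_MOTIF = re.compile(r"^M?([APS])PK")


def ntm1_methylation_options(sequence: str):
    """Return the methylation states the description allows, or None if no motif."""
    match = NTM1_MOTIF.match(sequence.upper())
    if match is None:
        return None
    first_residue = match.group(1)
    # Per the description: Ala or Ser may be mono-, di- or tri-methylated,
    # while Pro (in Pro-Pro-Lys) may be mono- or di-methylated.
    if first_residue == "P":
        return ["mono", "di"]
    return ["mono", "di", "tri"]
```

A sequence beginning MSPK or APK would be reported as a candidate substrate, while one beginning MGPK would not.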
Protein Domain
Name: Geranylgeranyl reductase family
Type: Family
Description: This entry includes geranylgeranyl reductases involved in chlorophyll and bacteriochlorophyll biosynthesis as well as other related enzymes which may also act on geranylgeranyl groups or related substrates.
Protein Domain
Name: Geranylgeranyl reductase, plant/cyanobacteria
Type: Family
Description: This entry represents the reductase which reduces the geranylgeranyl group to the phytyl group in the side chain of chlorophyll. It is unclear whether the enzyme has a preference for acting before or after the attachment of the side chain to chlorophyllide a by chlorophyll synthase. This clade is restricted to plants and cyanobacteria to separate it from the homologues which act in the biosynthesis of bacteriochlorophyll.
Protein Domain
Name: Geranylgeranyl reductase, plant/prokaryotic
Type: Family
Description: This entry represents a group of geranylgeranyl reductases specific for the biosyntheses of bacteriochlorophyll and chlorophyll [ ]. It is unclear whether the processes of isoprenoid ligation to the chlorin ring and reduction of the geranylgeranyl chain to a phytyl chain are necessarily ordered the same way in all species (see introduction to []).
Protein Domain
Name: Isocitrate lyase/phosphorylmutase, conserved site
Type: Conserved_site
Description: Isocitrate lyase is an enzyme that catalyzes the conversion of isocitrate to succinate and glyoxylate. This is the first step in the glyoxylate bypass, an alternative to the tricarboxylic acid cycle in bacteria, fungi and plants. A cysteine, a histidine and a glutamate or aspartate have been found to be important for the enzyme's catalytic activity. Only one cysteine residue is conserved between the sequences of the fungal, plant and bacterial enzymes; it is located in the middle of a conserved hexapeptide. Other enzymes also belong to this family, including carboxyvinyl-carboxyphosphonate phosphorylmutase, which catalyses the conversion of 1-carboxyvinyl carboxyphosphonate to 3-(hydrohydroxyphosphoryl)pyruvate and carbon dioxide, and phosphoenolpyruvate mutase, which is involved in the biosynthesis of phosphinothricin tripeptide antibiotics.
Protein Domain
Name: Isocitrate lyase
Type: Family
Description: Isocitrate lyase ( ) [ , ] is an enzyme that catalyzes the conversion of isocitrate to succinate and glyoxylate. This is the first step in the glyoxylate bypass, an alternative to the tricarboxylic acid cycle (also known as the TCA cycle) in bacteria, fungi and plants. A cysteine, a histidine and a glutamate or aspartate have been found to be important for the enzyme's catalytic activity. Only one cysteine residue is conserved between the sequences of the fungal, plant and bacterial enzymes; it is located in the middle of a conserved hexapeptide. Mitochondrial 2-methylisocitrate lyase ICL2 from the yeast Saccharomyces cerevisiae does not act on isocitrate but on 2-methylisocitrate. It catalyses the formation of pyruvate and succinate during the metabolism of endogenous propionyl-CoA [ ]. Methylisocitrate lyase, mitochondrial from the filamentous fungus Aspergillus nidulans is responsible for the same reaction [].
Protein Domain
Name: Carnosine N-methyltransferase
Type: Family
Description: This family includes Carnosine N-methyltransferase ( ), conserved from yeast to human, that catalyses the formation of anserine (beta-alanyl-N(Pi)-methyl-L-histidine) from carnosine. Anserine, a methylated derivative of carnosine (beta-alanyl-L-histidine), is an abundant constituent of vertebrate skeletal muscles. It also methylates other L-histidine-containing di- and tripeptides such as Gly-Gly-His, Gly-His and homocarnosine (GABA-His) [, , ].
Protein Domain
Name: Heat shock protein 70, conserved site
Type: Conserved_site
Description: Hsp70 heat shock proteins are chaperones that help to fold many proteins. Hsp70-assisted folding involves repeated cycles of substrate binding and release. Hsp70 activity is ATP dependent. Hsp70 proteins are made up of two regions: the amino terminus is the ATPase domain and the carboxyl terminus is the substrate binding region. Hsp70 proteins have an average molecular weight of 70 kDa. In most species, there are many proteins that belong to the Hsp70 family. Some of these are only expressed under stress conditions (strictly inducible), while some are present in cells under normal growth conditions and are not heat-inducible (constitutive or cognate). Hsp70 proteins can be found in different cellular compartments (nuclear, cytosolic, mitochondrial, endoplasmic reticulum, for example). This entry represents three conserved sites of the heat shock 70 protein family.
Protein Domain
Name: DNA polymerase delta, subunit 4
Type: Family
Description: DNA polymerase delta (Pol delta) is responsible for effective DNA replication, playing a key role in the elongation of both the leading and the lagging strands of DNA and the maturation of Okazaki fragments. It consists of four subunits: the catalytic and largest subunit p125, p50 that interacts with p125 to form the core enzyme, p68 which interacts with p50, and a fourth subunit, p12, that bridges p125 and p50, stabilising their interaction. This entry represents the p12 subunit (also called subunit 4), which increases the rate of DNA synthesis and decreases fidelity by regulating POLD1 polymerase and proofreading 3' to 5' exonuclease activity in the Pol delta4 tetramer complex. p12 is a PCNA-binding protein, as it contains an N-terminal PCNA-binding motif. Under conditions of DNA replication stress, it is required for the repair of broken replication forks through break-induced replication (BIR), a mechanism that may induce segmental genomic duplications of up to 200 kb. This subunit is involved in Pol delta4 translesion synthesis (TLS) of templates carrying O6-methylguanine or abasic sites. p12 plays a major role in Pol delta4 catalytic functions, while its degradation is required for the conversion of Pol delta4 to Pol delta3 in the cellular response to DNA damage, as Pol delta3 has an enhanced proofreading activity.
Protein Domain
Name: Gdt1 family
Type: Family
Description: Budding yeast Gdt1 is a Golgi-localized calcium transporter required for stress-induced calcium signalling and protein glycosylation. Its human homologue, TMEM165, may be a Golgi Ca(2+)/H(+) antiporter. Defects in the human protein TMEM165 cause a subtype of Congenital Disorders of Glycosylation. In Arabidopsis, this protein is variously known as CCHA1 (a chloroplast-localized potential Ca(2+)/H(+) antiporter), chloroplastic PAM71 (photosynthesis affected mutant 71), and GDT1-like protein 1, chloroplastic. It has been reported to be a putative chloroplast-localized Ca(2+)/H(+) antiporter with critical functions in the regulation of PSII and in chloroplast Ca(2+) and pH homeostasis. It has also been suggested that it may function in Mn(2+) uptake into thylakoids, ensuring optimal PSII performance.
Protein Domain
Name: Atypical dual-specificity phosphatase Siw14-like
Type: Family
Description: This group of atypical dual-specificity phosphatases is predominantly from fungi, plants and bacteria. This entry includes budding yeast Siw14 (also known as Oca3) and related proteins. Siw14 is an inositol pyrophosphate phosphatase that modulates inositol pyrophosphate metabolism by dephosphorylating the IP7 isoform 5PP-IP5 to IP6. This entry also includes budding yeast Oca1/2/4/6 and Arabidopsis DSP1/2/3/4/5. All DSPs tested (AtPFA-DSP1, -2, -3, and -5) displayed phosphatase activity toward PI(3,5)P2, with AtPFA-DSP2 showing a higher activity. Oca1/2/4/6 are putative phosphatases associated with the caffeine-sensitivity stress pathway in S. cerevisiae.
Protein Domain
Name: Atypical dual-specificity phosphatase Siw14-like, plant and fungi
Type: Family
Description: This entry represents a group of atypical dual-specificity phosphatases from plants and fungi. Proteins in this entry include Siw14 and related proteins. Siw14 is an inositol pyrophosphate phosphatase that modulates inositol pyrophosphate metabolism by dephosphorylating the IP7 isoform 5PP-IP5 to IP6. This entry also includes DSP1/2/3/4/5 from Arabidopsis. All DSPs tested (AtPFA-DSP1, -2, -3, and -5) displayed phosphatase activity toward PI(3,5)P2, with AtPFA-DSP2 showing a higher activity.
Protein Domain
Name: Photosystem I PsaL, reaction centre subunit XI
Type: Domain
Description: The trimeric photosystem I of the cyanobacterium Synechococcus elongatus comprises 11 protein subunits. Subunit XI, PsaL, from plants and bacteria is one of the smaller subunits with only two transmembrane alpha helices. PsaL interacts closely with PsaI.
Protein Domain
Name: Isopentenyl-diphosphate delta-isomerase, type 1
Type: Family
Description: This entry represents type 1 of two non-homologous families of the enzyme isopentenyl-diphosphate delta-isomerase (IPP isomerase). IPP isomerase is a member of the Nudix hydrolase superfamily, and is a key enzyme in the isoprenoid biosynthetic pathway. It catalyses the interconversion of isopentenyl diphosphate (IPP) and dimethylallyl diphosphate. Dimethylallyl diphosphate is the initial substrate for the biosynthesis of carotenoids and other long chain isoprenoids. IPP is an essential building block for many compounds, including enzyme cofactors, sterols, and prenyl groups. The enzyme requires one Mn2+ or Mg2+ ion in its active site to fold into an active conformation and also contains the Nudix motif, a highly conserved 23-residue block (GX5EX7REUXEEXGU, where U = I, L or V), that functions as a metal binding and catalytic site. The metal binding site is present within the active site and plays structural and catalytic roles. IPP isomerase is well represented in several bacteria, archaebacteria and eukaryotes, including fungi, mammals and plants. Despite sequence variations (mainly at the N terminus), the core structure is highly conserved.
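
The Nudix motif quoted above (GX5EX7REUXEEXGU, where U = I, L or V) translates directly into a regular expression. The following is a minimal sketch, assuming one-letter amino acid strings; the function name is hypothetical and the sequence is synthetic, built only to satisfy the pattern.

```python
import re

# G, 5 any residues, E, 7 any residues, R, E, U, any, E, E, any, G, U
# with U restricted to Ile, Leu or Val (23 residues in total).
NUDIX_MOTIF = re.compile(r"G.{5}E.{7}RE[ILV].EE.G[ILV]")


def find_nudix_motifs(sequence: str):
    """Return (start_index, matched_block) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in NUDIX_MOTIF.finditer(sequence.upper())]


# Synthetic sequence (not a real protein): 3 flanking residues, one motif, 3 more.
synthetic_motif = "G" + "A" * 5 + "E" + "A" * 7 + "RE" + "I" + "A" + "EE" + "A" + "G" + "L"
synthetic_sequence = "MKT" + synthetic_motif + "QQQ"
hits = find_nudix_motifs(synthetic_sequence)
```

Each hit is 23 residues long, which agrees with the block length given in the description.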
Protein Domain
Name: Ribonuclease E inhibitor RraA/RraA-like protein
Type: Family
Description: This entry represents the regulator of ribonuclease E activity A (RraA). These proteins contain a swivelling 3-layer beta/beta/alpha domain that appears to be mobile in most multi-domain proteins known to contain it. These proteins are structurally similar, and may have distant homology, to the phosphohistidine domain of pyruvate phosphate dikinase. The RraA fold is an ancient platform that has been adapted for a wide range of functions. RraA had been identified as a putative demethylmenaquinone methyltransferase and was annotated as MenG, but further analysis showed that RraA lacked the structural motifs usually required for methylases. The Escherichia coli protein regulator RraA acts as a trans-acting modulator of RNA turnover, binding essential endonuclease RNase E and inhibiting RNA processing. RNase E forms the core of a large RNA-catalysis machine termed the degradosome. RraA (and RraB) causes remodelling of degradosome composition, which is associated with alterations in RNA decay and global transcript abundance and as such is a bacterial mechanism for the regulation of RNA cleavage. This fold is also found in 4-hydroxy-4-methyl-2-oxoglutarate aldolase, also known as RraA-like protein, and at the C terminus of the DlpA protein.
Protein Domain
Name: Seipin family
Type: Family
Description: Seipin is a cell-autonomous regulator of lipolysis essential for adipocyte differentiation. In humans, Seipin is predicted to contain two transmembrane domains at residues 28-49 and 237-258, and a third transmembrane domain might be present at residues 155-173. Mutations in the Seipin gene underlie human congenital generalized lipodystrophy. Seipin may also be implicated in Silver spastic paraplegia syndrome and distal hereditary motor neuropathy type V. This entry also includes Seipin homologues from fission yeasts and plants. There are three SEIPIN homologues in Arabidopsis thaliana, designated SEIPIN1, SEIPIN2, and SEIPIN3. Similar to their animal homologues, plant and yeast Seipins also play roles in lipid droplet (LD) biogenesis.
Protein Domain
Name: Regulator of ribonuclease activity A
Type: Family
Description: The regulator of ribonuclease activity A (RraA) family includes a number of closely related sequences from bacteria and plants. The Escherichia coli member has been characterised, and its crystal structure determined. It acts as a regulator of the endonuclease RNase E by binding to it and inhibiting RNA processing.
Protein Domain
Name: NADAR
Type: Domain
Description: This family contains E. coli swarming motility protein YbiA. Mutations in YbiA cause defects in Escherichia coli swarming, but not necessarily in motility. This family was predicted to be involved in NAD-utilizing pathways, likely acting on ADP-ribose derivatives, and was named NADAR (NAD and ADP-ribose). More recently, YbiA has been shown to be involved in the disposal of riboflavin intermediates. It catalyzes the hydrolysis of the N-glycosidic bond in the first two intermediates of riboflavin biosynthesis, which are highly reactive metabolites, yielding relatively innocuous products.
Protein Domain
Name: Bacterial bifunctional deaminase-reductase, C-terminal
Type: Domain
Description: This domain is found in the C terminus of the bifunctional deaminase-reductase of Escherichia coli, Bacillus subtilis and other bacteria in combination with that catalyses the second and third steps in the biosynthesis of riboflavin, i.e., the deamination of 2,5-diamino-6-ribosylamino-4(3H)-pyrimidinone 5'-phosphate (deaminase) and the subsequent reduction of the ribosyl side chain (reductase) [ ]. The domain is also present in some HTP reductases from archaea and fungi.
Protein Domain
Name: Dihydrofolate reductase-like domain superfamily
Type: Homologous_superfamily
Description: Dihydrofolate reductase (DHFR) catalyses the NADPH-dependent reduction of dihydrofolate to tetrahydrofolate, an essential step in de novo synthesis both of glycine and of purines and deoxythymidine phosphate (the precursors of DNA synthesis), and important also in the conversion of deoxyuridine monophosphate to deoxythymidine monophosphate. This superfamily represents a structural domain found in dihydrofolate reductases, as well as in riboflavin biosynthesis proteins (usually as a C-terminal domain) and related enzymes.
Protein Domain
Name: Protein of unknown function DUF2358
Type: Family
Description: This entry represents a family of conserved proteins. The function is unknown.
Protein Domain
Name: Glucosamine-fructose-6-phosphate aminotransferase, isomerising
Type: Family
Description: Glucosamine:fructose-6-phosphate aminotransferase (GFAT or GlmS) catalyses the formation of glucosamine 6-phosphate and is the first and rate-limiting enzyme of the hexosamine biosynthetic pathway. The final product of the hexosamine pathway, UDP-N-acetyl glucosamine, is an active precursor of numerous macromolecules containing amino sugars. This family of sequences belongs to the MEROPS peptidase family C44 (clan PB(C)), and its members are classified as non-peptidase homologues.
Protein Domain
Name: SIS domain
Type: Domain
Description: The sugar isomerase (SIS) domain is a phosphosugar-binding module that is found in a variety of eubacterial, archaebacterial and eukaryotic proteins that have a role in phosphosugar isomerization or regulation. In enzymes, the SIS domain can have a catalytic function as an isomerase and bind to phosphorylated sugars. In bacterial transcriptional regulators of the rpiR family, the domain seems to bind substrates implicated in the genes for sugar metabolism that are controlled by the regulator. The SIS domain is found in one or two copies and can be linked to additional domains, such as helix-turn-helix (HTH), CBS, glutamine amidotransferases type 2, or phosphopantetheine-attachment. The SIS domain has an α-β structure and is dominated by a five-stranded parallel β-sheet flanked on either side by α-helices forming a three-layer α-β-α sandwich. The fold shows similarities to that of glucose-6-phosphate isomerase.
Protein Domain
Name: Phosphopantetheine attachment site
Type: PTM
Description: Phosphopantetheine (or pantetheine 4' phosphate) is the prosthetic group of acyl carrier proteins (ACP) in some multienzyme complexes where it serves as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. The amino-terminal region of the ACP proteins is well defined and consists of four α helices arranged in a right-handed bundle held together by interhelical hydrophobic interactions. The Asp-Ser-Leu (DSL) motif is conserved in all of the ACP sequences, and the 4'-PP prosthetic group is covalently linked via a phosphodiester bond to the serine residue. The DSL sequence is present at the amino terminus of helix II, a domain of the protein referred to as the recognition helix and which is responsible for the interaction of ACPs with the enzymes of type II fatty acid synthesis.
Protein Domain
Name: Calreticulin/calnexin
Type: Family
Description: The calreticulin family is a family of calcium-binding ER chaperones that includes calreticulin, calnexin and calmegin. Calreticulin (calregulin) is a high-capacity calcium-binding protein which is present in most tissues and located at the periphery of the endoplasmic (ER) and the sarcoplasmic reticulum (SR) membranes. It probably plays a role in the storage of calcium in the lumen of the ER and SR and it may well have other important functions. Structurally, calreticulin is a protein of about 400 amino acid residues consisting of three domains: an N-terminal, probably globular, domain of about 180 amino acid residues (N-domain); a central domain of about 70 residues (P-domain), which contains three repeats of an acidic 17 amino acid motif and binds calcium with low capacity but high affinity; and a C-terminal domain rich in acidic residues and in lysine (C-domain), which binds calcium with high capacity but low affinity. Calreticulin is evolutionarily related to several other calcium-binding proteins, including Onchocerca volvulus antigen RAL-1, calnexin and calmegin.
Protein Domain
Name: Calreticulin/calnexin, conserved site
Type: Conserved_site
Description: Synonym(s): Calregulin, CRP55, HACBP. Calreticulin is a high-capacity calcium-binding protein which is present in most tissues and located at the periphery of the endoplasmic (ER) and the sarcoplasmic reticulum (SR) membranes. It probably plays a role in the storage of calcium in the lumen of the ER and SR and it may well have other important functions. Structurally, calreticulin is a protein of about 400 amino acid residues consisting of three domains: an N-terminal, probably globular, domain of about 180 amino acid residues (N-domain); a central domain of about 70 residues (P-domain), which contains three repeats of an acidic 17 amino acid motif and binds calcium with low capacity but high affinity; and a C-terminal domain rich in acidic residues and in lysine (C-domain), which binds calcium with high capacity but low affinity. Calreticulin is evolutionarily related to several other calcium-binding proteins, including Onchocerca volvulus antigen RAL-1, calnexin and calmegin.
Protein Domain
Name: Calreticulin/calnexin, P domain superfamily
Type: Homologous_superfamily
Description: The type-I integral membrane protein calnexin (CNX) and its soluble paralog calreticulin (CRT) are members of a family of molecular chaperones that function in the endoplasmic reticulum (ER) of eukaryotic cells. These calcium-binding proteins are lectins that bind newly synthesised N-linked glycoproteins to help promote efficient folding and oligomeric assembly. The chaperones act to retain the glycoproteins in the ER while they are still incompletely folded, ensuring that the ER quality control machinery can dispose of misfolded glycoproteins. The family of molecular chaperones are conserved among plants, fungi, and animals. The P domain contains a high-affinity calcium-binding site and is thought to be involved in either substrate binding or protein-protein interactions. The P domain forms part of the lumenal region in CNX. In both CRT and CNX the P domain forms a protrusion, or arm, extending from the core protein. The amino acid sequence of the P domain is highly conserved and is characteristic for this family of lectins. The structure of the P domain consists of a non-globular proline-rich hairpin fold. The P domain is composed of multiple copies of two types of proline-rich repeat sequences, a 17 amino acid type 1 motif and a 14 amino acid type 2 motif, with the arrangement 111222 in CRT and 11112222 in CNX [ , ].
Protein Domain
Name: Ovate protein family, C-terminal
Type: Domain
Description: This domain is found towards the C terminus in the Ovate family of transcriptional repressors. These proteins are important regulators of growth and development in plants.
Protein Domain
Name: Late embryogenesis abundant protein, SMP subgroup domain
Type: Domain
Description: LEA (late embryogenesis abundant) proteins were first identified in land plants. Plant LEA proteins have been found to accumulate to high levels during the last stage of seed formation (when a natural desiccation of the seed tissues takes place) and during periods of water deficit in vegetative organs. Later, LEA homologues were also found in various other species. They have been classified into several subgroups in Pfam and according to Bray and Dure. This entry represents Pfam SMP, or D-34 from Dure, or group 6 from Bray.
Protein Domain
Name: Protein of unknown function DUF1677, plant
Type: Family
Description: The sequences found in this family are all derived from hypothetical plant proteins of unknown function. The region features a number of highly conserved cysteine residues.
Protein Domain
Name: Chromatin assembly factor 1 subunit A
Type: Family
Description: The CAF-1 or chromatin assembly factor-1 consists of three subunits, and this is the first, or A [ ]. The A domain is uniquely required for the progression of S phase in mouse cells [], independent of its ability to promote histone deposition [] but dependent on its ability to interact with HP1 - heterochromatin protein 1-rich heterochromatin domains next to centromeres that are crucial for chromosome segregation during mitosis. This HP1-CAF-1 interaction module functions as a built-in replication control for heterochromatin, which, like a control barrier, has an impact on S-phase progression in addition to DNA-based checkpoints [].
Protein Domain
Name: HAD-superfamily hydrolase, subfamily IIA
Type: Family
Description: These sequences form one of the structural subclasses of the Haloacid Dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The superfamily is defined by the presence of three short catalytic motifs. The classes are defined based on the location and the observed or predicted fold of a so-called "capping domain", or the absence of such a domain. Class I consists of sequences in which the capping domain is found between the first and second catalytic motifs. Class II consists of sequences in which the capping domain is found between the second and third motifs. Class III sequences have no capping domain in either of these positions. The Class IIA capping domain is predicted to consist of a mixed α-β fold with the basic pattern: Helix-Helix-Helix-Sheet-Helix-Loop-Sheet-Helix-Sheet-Helix. Presently, this subfamily covers the eukaryotic phosphoglycolate phosphatase, as well as four further subfamilies covering closely related sequences in eukaryotes, in Gram-positive bacteria and in Gram-negative bacteria. The Escherichia coli NagD gene and the Bacillus subtilis AraL gene are members of this subfamily but are not members of any of the presently defined equivalogs within it. NagD is part of the NAG operon responsible for N-acetylglucosamine metabolism. Genes from several organisms have been annotated as NagD, or NagD-like. However, without data on the presence of other members of this pathway (such as in the case of Yersinia pestis), these assignments should not be given great weight. The AraL gene is similar and is part of the L-arabinose operon. A gene from Halobacterium has been annotated as AraL, but no other Ara operon genes have been annotated. Many of the genes in this subfamily have been annotated as "pNPPase", "4-nitrophenyl phosphatase" or "NPPase". These all refer to the same activity against a common lab test compound used to determine phosphatase activity. There is no evidence that this activity is physiologically relevant. Dihydroxyacetone phosphatase from Corynebacterium glutamicum is also a member of this superfamily, and catalyses the dephosphorylation of dihydroxyacetone phosphate to produce 1,3-dihydroxyacetone. The structure of NagD from Escherichia coli (strain K12) has been reported and its activity against various substrates determined. It has high specificity for nucleotide monophosphates, and in particular UMP and GMP. In the context of its occurrence in the NAG operon, it may well be involved in the recycling of cell wall metabolites.
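
The class definitions in the description above amount to a simple decision rule on where the capping domain sits relative to the three catalytic motifs. A minimal sketch follows; the function name and the integer encoding of the capping-domain position are assumptions made for illustration only.

```python
from typing import Optional


def had_class(capping_after_motif: Optional[int]) -> str:
    """Assign a HAD structural class from the capping-domain position.

    capping_after_motif (hypothetical encoding):
      1    -> capping domain lies between catalytic motifs 1 and 2
      2    -> capping domain lies between catalytic motifs 2 and 3
      None -> no capping domain in either position
    """
    if capping_after_motif is None:
        return "Class III"
    if capping_after_motif == 1:
        return "Class I"
    if capping_after_motif == 2:
        return "Class II"
    raise ValueError("capping_after_motif must be 1, 2 or None")
```

Under this encoding, the subfamily described in this entry (IIA) falls under had_class(2); the further split of Class II into subfamilies depends on the capping-domain fold, which this sketch does not model.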
Protein Domain
Name: Rad21/Rec8-like protein, C-terminal, eukaryotic
Type: Domain
Description: This domain represents a conserved C-terminal region found in eukaryotic cohesins of the Rad21, Rec8 and Scc1 families. Rad21/Rec8 like proteins mediate sister chromatid cohesion during mitosis and meiosis, as part of the cohesin complex [ ]. Cohesion is necessary for homologous recombination (including double-strand break repair) and correct chromatid segregation. These proteins may also be involved in chromosome condensation. Dissociation at the metaphase to anaphase transition causes loss of cohesion and chromatid segregation [].
Protein Domain
Name: ScpA-like, C-terminal
Type: Homologous_superfamily
Description: This superfamily represents a conserved C-terminal region found in bacterial segregation and condensation protein A (ScpA) as well as in eukaryotic cohesins of the Rad21 and Scc1 families. ScpA participates in chromosomal partition during cell division. It may act via the formation of a condensin-like complex containing Smc and ScpB that pulls DNA away from mid-cell into both cell halves.
Protein Domain
Name: Rad21/Rec8-like protein, N-terminal
Type: Domain
Description: This domain represents a conserved N-terminal region found in eukaryotic cohesins of the Rad21, Rec8 and Scc1 families. Rad21/Rec8 like proteins mediate sister chromatid cohesion during mitosis and meiosis, as part of the cohesin complex [ ]. Cohesion is necessary for homologous recombination (including double-strand break repair) and correct chromatid segregation. These proteins may also be involved in chromosome condensation. Dissociation at the metaphase to anaphase transition causes loss of cohesion and chromatid segregation [].
Protein Domain
Name: 26S proteasome non-ATPase regulatory subunit Rpn12
Type: Family
Description: Intracellular proteins, including short-lived proteins such as cyclin, Mos, Myc, p53, NF-kappaB, and IkappaB, are degraded by the ubiquitin-proteasome system. The 26S proteasome is a self-compartmentalising protease responsible for the regulated degradation of intracellular proteins in eukaryotes. This giant intracellular protease is formed by several subunits arranged into two 19S polar caps, where protein recognition and ATP-dependent unfolding occur, flanking a 20S central barrel-shaped structure with an inner proteolytic chamber. This overall structure is highly conserved among eukaryotes and is essential for cell viability. Proteins targeted to the 26S proteasome are conjugated with a polyubiquitin chain by an enzymatic cascade before delivery to the 26S proteasome for degradation into oligopeptides. The 26S proteasome can be divided into two subcomplexes: the 19S regulatory particle (RP) and the 20S core particle (CP). The 19S component is divided into a "base" subunit containing six ATPases (Rpt proteins) and two non-ATPases (Rpn1, Rpn2), and a "lid" subunit composed of eight stoichiometric proteins (Rpn3, Rpn5, Rpn6, Rpn7, Rpn8, Rpn9, Rpn11, Rpn12). Additional non-essential and species-specific proteins may also be present. The 19S unit performs several essential functions including binding the specific protein substrates, unfolding them, cleaving the attached ubiquitin chains, opening the 20S subunit, and driving the unfolded polypeptide into the proteolytic chamber for degradation. The 26S proteasome and 19S regulator are of medical interest due to their involvement in burn rehabilitation. This entry represents Rpn12 (also often annotated as 26S proteasome non-ATPase regulatory subunit 8). This protein has been shown to be important for the transition from metaphase to anaphase and the activation of Cdc28p kinase in yeast.
Protein Domain
Name: PDCD5-like
Type: Family
Description: This protein family is found in archaea and eukaryota. Proteins in this family contain a predicted DNA-binding domain [ ] and may function as DNA-binding proteins. Methanobacterium thermoautotrophicum MTH1615 was predicted to bind DNA based on structural proteomics data, and this was confirmed by the demonstration that it can interact non-specifically with a randomly chosen 20-mer of double-stranded DNA []. This suggests that the human protein may be involved in nucleic acid binding or metabolism. The human programmed cell death protein 5 (PDCD5, also known as TFAR19) encodes a protein which shares significant homology to the corresponding proteins of species ranging from yeast to mice. PDCD5 exhibits a ubiquitous expression pattern and its expression is up-regulated in tumour cells undergoing apoptosis. PDCD5 may play a general role in the apoptotic process [ , ].
Protein Domain
Name: Two pore domain potassium channel
Type: Family
Description: Potassium channels are the most diverse group of the ion channel family [ , ]. They are important in shaping the action potential, and in neuronal excitability and plasticity []. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups []: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group. These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers [ ]. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes [ ]. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis []. All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore.
The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK) [ ]. The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels. 2P-domain channels influence the resting membrane potential and as a result can control cell excitability. In addition, they pass K+ in response to changes in membrane potential, and are also tightly regulated by molecular oxygen, GABA (gamma-aminobutyric acid), noradrenaline and serotonin. The first member of this family (TOK1), cloned from Saccharomyces cerevisiae [ ], is predicted to have eight potential transmembrane (TM) helices. However, subsequently cloned two P-domain family members from Drosophila and mammalian species are predicted to have only four TM segments. They are usually referred to as TWIK-related channels (Tandem of P-domains in a Weakly Inward rectifying K+ channel) [ , , , ]. Functional characterisation of these channels has revealed a diversity of properties in that they may show inward or outward rectification, their activity may be modulated in different directions by protein phosphorylation, and their sensitivity to changes in intracellular or extracellular pH varies. Despite these disparate properties, they are all thought to share the same topology of four TM segments, including two P-domains.
That TWIK-related K+ channels all produce instantaneous and non-inactivating K+ currents, which do not display a voltage-dependent activation threshold, suggests that they are background (leak) K+ channels involved in the generation and modulation of the resting membrane potential in various cell types. Further studies have revealed that they may be found in many species, including plants, invertebrates and mammals.
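The K+ selectivity sequence quoted above (T/SxxTxGxG) can be read as a simple sequence pattern. The following is a minimal sketch, assuming `x` stands for any single residue; the function name and the example sequences are made up for illustration and are not taken from any real protein or from this database.

```python
import re

# T/S-x-x-T-x-G-x-G, with "x" assumed to mean any single residue.
# The lookahead lets overlapping occurrences be reported as well.
K_SELECTIVITY = re.compile(r"(?=[TS]..T.G.G)")

def find_selectivity_motifs(sequence):
    """Return 0-based start positions of every T/SxxTxGxG match (hypothetical helper)."""
    return [m.start() for m in K_SELECTIVITY.finditer(sequence.upper())]

# Made-up sequences: one copy of the motif, then two copies as in a 2P-domain subunit.
one_copy = "MK" + "TAATTGYG" + "LL"
two_copies = "TAATTGYG" + "AAAA" + "SVVTIGFG"
print(find_selectivity_motifs(one_copy))
print(find_selectivity_motifs(two_copies))
```

A pattern match of this kind only flags candidate P-domains; it says nothing about membrane topology or whether the channel is functional.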
Protein Domain
Name: Potassium channel domain
Type: Domain
Description: This domain is found in a variety of potassium channel proteins, including the two membrane helix type ion channels found in bacteria [ ].
Protein Domain
Name: Cas1p 10 TM acyl transferase domain
Type: Domain
Description: Cas1p protein of Cryptococcus neoformans is required for the synthesis of O-acetylated glucuronoxylomannans, a constituent of the capsule, and is critical for its virulence [ ]. The multi TM domain of the Cas1p was unified with the 10 TM Sugar Acyltransferase superfamily []. This superfamily is comprised of members from the OatA, MdoC, OpgC, NolL and GumG families in addition to the Cas1p family []. The Cas1p protein has an N-terminal PC-esterase domain with the opposing acyl esterase activity [].
Protein Domain
Name: 3'-RNA ribose 2'-O-methyltransferase, Hen1
Type: Family
Description: This entry consists of the eukaryal and bacterial 3'-RNA ribose 2'-O-methyltransferase, Hen1. Eukaryal Hen1 is a methyltransferase that adds a 2'-O-methyl group at the 3'-end of small RNAs (sRNAs), protecting the 3'-end from uridylation activity and subsequent degradation [ ]. In bacteria, Hen1 catalyses the same chemical reaction, but forms a Pnkp/Hen1 heterotetramer and is involved in RNA repair [, ].
Protein Domain
Name: Stress up-regulated Nod 19
Type: Family
Description: This family of plant proteins has been implicated in nodule development [ ] in the legume Medicago truncatula (Barrel medic). MtN-19 was shown by Northern blot to be induced during nodulation []. The molecular function of these proteins is unknown.
Protein Domain
Name: Germin, manganese binding site
Type: Binding_site
Description: Germins and germin-like proteins [ ] are a family of hexameric ubiquitous plant glycoproteins. They are not restricted to germinating grains, as was initially thought when they were named 'germins', but exist in all organs and developmental stages. They are all partly associated with the extracellular matrix. A wide range of functions have been uncovered for germins and germin-like proteins: some act as oxalate oxidases ( ) or as superoxide dismutases ( ), while others seem to be structural proteins or receptors for auxins or other proteins. Germins and germin-like proteins are highly similar to slime mold spherulins 1a and 1b, which are proteins that accumulate specifically during spherulation, a process induced by various forms of environmental stress which leads to encystment and dormancy. The signature pattern for this entry is located in the central region and contains three residues, two histidines and a glutamate, which are implicated in the binding of a manganese ion [ ].
Protein Domain
Name: Germin
Type: Family
Description: Germins (also known as oxalate oxidases) and germin-like proteins [ ] are a family of hexameric ubiquitous plant glycoproteins. They are not restricted to germinating grains, as was initially thought when they were named 'germins', but exist in all organs and developmental stages. All are at least partly associated with the extracellular matrix. A wide range of functions have been uncovered for germins and germin-like proteins: some act as oxalate oxidases ( ) or as superoxide dismutases ( ), while others seem to be structural proteins or receptors for auxins or proteins. Germin catalyses the manganese-dependent oxidative decarboxylation of oxalate to carbon dioxide and hydrogen peroxide (H2O2) [ , , ]. It is widespread in fungi and various plant tissues and may play a role in plant signaling and defense. Germins and germin-like proteins are highly similar to slime mold spherulins 1a and 1b, which are proteins that accumulate specifically during spherulation, a process induced by various forms of environmental stress which leads to encystment and dormancy.
Protein Domain
Name: Putative deacetylase LmbE-like domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents a domain which consists of a 3-layer (alpha/beta/alpha) sandwich, and it is found in the enzymes 1D-myo-inositol 2-acetamido-2-deoxy-alpha-D-glucopyranoside deacetylase and N-acetylglucosaminyl-phosphatidylinositol de-N-acetylase, as well as in uncharacterised putative deacetylases. 1D-myo-inositol 2-acetamido-2-deoxy-alpha-D-glucopyranoside deacetylase ( ) catalyzes the deacetylation of 1D-myo-inositol 2-acetamido-2-deoxy-alpha-D-glucopyranoside (GlcNAc-Ins) in the mycothiol biosynthesis pathway [ ]. N-acetylglucosaminyl-phosphatidylinositol de-N-acetylase ( ) catalyses the second step in glycosylphosphatidylinositol (GPI) biosynthesis [ ].
Protein Domain
Name: N-acetylglucosaminyl phosphatidylinositol deacetylase-related
Type: Family
Description: Members of this family include N-acetylglucosaminyl-phosphatidylinositol de-N-acetylase ( ) that catalyses the second step in glycosylphosphatidylinositol (GPI) biosynthesis [ , ] and 1D-myo-inositol 2-acetamido-2-deoxy-alpha-D-glucopyranoside deacetylase (), involved in the biosynthesis of mycothiol (an unusual thiol compound found in Actinobacteria) [ ].
Protein Domain
Name: NPK1-activating kinesin-like protein, C-terminal
Type: Domain
Description: This domain is found at the C terminus of plant kinesin-like proteins NACK1 and NACK2. NACK1 (NPK1-activating kinesin-like protein 1) is a key regulator of plant cytokinesis [ ]. NACK2 (TETRASPORE) is required for male meiotic cytokinesis [].
Protein Domain
Name: Beta-ketoacyl synthase, N-terminal
Type: Domain
Description: Beta-ketoacyl-ACP synthase (KAS) [ ] is the enzyme that catalyses the condensation of malonyl-ACP with the growing fatty acid chain. It is found as a component of a number of enzymatic systems, including fatty acid synthetase (FAS), which catalyzes the formation of long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH; the multi-functional 6-methylsalicylic acid synthase (MSAS) from Penicillium patulum [ ], which is involved in the biosynthesis of a polyketide antibiotic; polyketide antibiotic synthase enzyme systems; Emericella nidulans multifunctional protein Wa, which is involved in the biosynthesis of conidial green pigment; Rhizobium nodulation protein nodE, which probably acts as a beta-ketoacyl synthase in the synthesis of the nodulation Nod factor fatty acyl chain; and yeast mitochondrial protein CEM1. The condensation reaction is a two-step process: first the acyl component of an activated acyl primer is transferred to a cysteine residue of the enzyme, and it is then condensed with an activated malonyl donor with the concomitant release of carbon dioxide. This entry represents the N-terminal domain of beta-ketoacyl-ACP synthases.
Protein Domain
Name: Mini-chromosome maintenance, conserved site
Type: Conserved_site
Description: MCM proteins are DNA-dependent ATPases required for the initiation of eukaryotic DNA replication [, , ]. In eukaryotes there is a family of six proteins, MCM2 to MCM7. They were first identified in yeast where most of them have a direct role in the initiation of chromosomal DNA replication by interacting directly with autonomously replicating sequences (ARS). They were thus called minichromosome maintenance proteins, MCM proteins [ ]. This family is also present in the archaebacteria in 1 to 4 copies. Methanocaldococcus jannaschii (Methanococcus jannaschii) has four members, MJ0363, MJ0961, MJ1489 and MJECL13. The "MCM motif" contains Walker-A and Walker-B type nucleotide binding motifs. The diagnostic sequence defining the MCMs is IDEFDKM. Only Mcm2 (aka Cdc19 or Nda1) has been subjected to mutational analysis in this region, and most mutations abolish its activity [ ]. The presence of a putative ATP-binding domain implies that these proteins may be involved in an ATP-consuming step in the initiation of DNA replication in eukaryotes. The MCM proteins bind together in a large complex [ ]. Within this complex, individual subunits associate with different affinities, and there is a tightly associated core of Mcm4 (Cdc21), Mcm6 (Mis5) and Mcm7 [ ]. This core complex in human MCMs has been associated with helicase activity in vitro [ ], leading to the suggestion that the MCM proteins are the eukaryotic replicative helicase. Schizosaccharomyces pombe (Fission yeast) MCMs, like those in metazoans, are found in the nucleus throughout the cell cycle. This is in contrast to Saccharomyces cerevisiae (Baker's yeast), in which MCM proteins move in and out of the nucleus during each cell cycle. The assembly of the MCM complex in S. pombe is required for MCM localisation, ensuring that only intact MCM complexes remain in the nucleus [ ]. The signature pattern used in this entry represents a perfectly conserved region that is a special version of the B motif found in ATP-binding proteins.
Protein Domain
Name: MCM N-terminal domain
Type: Domain
Description: This entry represents the N-terminal region of MCM proteins. This region is composed of three structural domains: a four-helical bundle, a zinc-binding motif and an OB-like fold [ ]. MCM proteins are DNA-dependent ATPases required for the initiation of eukaryotic DNA replication [, , ]. In eukaryotes there is a family of six proteins, MCM2 to MCM7. They were first identified in yeast where most of them have a direct role in the initiation of chromosomal DNA replication by interacting directly with autonomously replicating sequences (ARS). They were thus called minichromosome maintenance proteins, MCM proteins [ ].
Protein Domain
Name: DNA replication licensing factor Mcm3
Type: Family
Description: The MCM2-7 complex consists of six closely related proteins that are highly conserved throughout the eukaryotic kingdom. In eukaryotes, Mcm3 is a component of the MCM2-7 complex (MCM complex), which consists of six sequence-related AAA+ type ATPases/helicases that form a hetero-hexameric ring [ ]. The MCM2-7 complex is part of the pre-replication complex (pre-RC). In G1 phase, the inactive MCM2-7 complex is loaded onto origins of DNA replication [, , ]. During G1-S phase, the MCM2-7 complex is activated to unwind the double-stranded DNA and plays an important role in replication fork elongation []. The components of the MCM2-7 complex are: DNA replication licensing factor MCM2, DNA replication licensing factor MCM3, DNA replication licensing factor MCM4, DNA replication licensing factor MCM5, DNA replication licensing factor MCM6 and DNA replication licensing factor MCM7.
Protein Domain
Name: Uncharacterised protein family Ycf23
Type: Family
Description: Proteins in this entry are of unknown function and are found in cyanobacteria and the chloroplasts of algae. As the family is exclusively found in phototrophic organisms, it may play a role in photosynthesis.
Protein Domain
Name: Ribosomal protein S4/S9, eukaryotic/archaeal
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ]. The S4 domain is a small domain consisting of 60-65 amino acid residues that probably mediates binding to RNA. This model finds eukaryotic ribosomal protein S9 as well as eukaryotic and archaeal ribosomal protein S4.
Protein Domain
Name: Ribosomal protein S4/S9, N-terminal
Type: Domain
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ]. Ribosomal protein S4 is one of the proteins from the small ribosomal subunit. S4 is known to bind directly to 16S ribosomal RNA. The crystal structure of a bacterial S4 protein revealed a two-domain molecule. The first domain is composed of four helices in the known structure. The second domain is an insertion within domain 1 and displays some structural homology with the ETS DNA binding domain []. This entry represents the domain found at the N terminus of small ribosomal subunit proteins S4 and S9.
Protein Domain
Name: Ribosomal protein S4, conserved site
Type: Conserved_site
Description: Ribosomal protein S4 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S4 is known to bind directly to 16S ribosomal RNA. Mutations in S4 have been shown to increase translational error frequencies. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities [ ], groups: eubacterial S4; algal and plant chloroplast S4; cyanelle S4; archaebacterial S4; mammalian S9; yeast YS11 (SUP46); Marchantia polymorpha (Liverwort) mitochondrial S4; Dictyostelium discoideum (Slime mold) rp1024; and yeast protein NAM9 [ ]. NAM9 has been characterised as a suppressor for ochre mutations in mitochondrial DNA. It could be a ribosomal protein that acts as a suppressor by decreasing translation accuracy. S4 is a protein of 171 to 205 amino-acid residues (except for NAM9, which is much larger).
Protein Domain
Name: TB2/DP1/HVA22-related protein
Type: Family
Description: This family includes members from a wide variety of eukaryotes. It includes the TB2/DP1 (deleted in polyposis) protein, which in human is deleted in severe forms of familial adenomatous polyposis, an autosomal dominant oncological inherited disease [ ]. The REEP/DP1/Yop1 family of proteins is involved in the control of endoplasmic reticulum organisation, and mutations in some members of this family are associated with inherited diseases such as hereditary spastic paraplegias (HSPs) []. The family also includes the plant protein of known similarity to TB2/DP1, the HVA22 abscisic acid-induced protein, which is thought to be a regulatory protein [ ].
Protein Domain
Name: NLE
Type: Domain
Description: This domain is located N-terminal to WD40 repeats ( ). It is found in the microtubule-associated protein [ ].
Protein Domain
Name: SLIDE domain
Type: Domain
Description: The SLIDE domain adopts a secondary structure comprising a main core of three α-helices. It has a role in DNA binding, contacting DNA target sites similar to c-Myb repeats or homeodomains [ , ].
Protein Domain
Name: ISWI, HAND domain
Type: Domain
Description: Nucleosome remodelling is an energy-dependent process that alters histone-DNA interactions within nucleosomes, thereby rendering nucleosomal DNA accessible to regulatory factors. The ATPases involved belong to the SWI2/SNF2 subfamily of DEAD/H-helicases, which contain a conserved ATPase domain characterised by seven motifs. Proteins within this family differ with regard to domain organisation, their associated proteins and the remodelling complex in which they reside. The ATPase ISWI is a member of this family. ISWI can be divided into two regions: an N-terminal region that contains the SWI2/SNF2 ATPase domain, and a C-terminal region that is responsible for substrate recognition. The C-terminal region contains 12 α-helices and can be divided into three domains and a spacer region: a HAND domain (named because its 4-helical structure resembles an open hand), a SANT domain (c-Myb DNA-binding like), a spacer helix, and a SLIDE domain (SANT-like but with several insertions) [ , ]. This entry represents the HAND domain, which adopts a secondary structure consisting of four alpha helices, three of which (H2, H3, H4) form an L-like configuration. Helix H2 runs antiparallel to helices H3 and H4, packing closely against helix H4, whilst helix H1 reposes in the concave surface formed by these three helices and runs perpendicular to them. This domain confers DNA and nucleosome binding properties to the protein [ ].
Protein Domain
Name: DBINO domain
Type: Domain
Description: Proteins belonging to the SNF2 family of DNA-dependent ATPases are important members of the chromatin remodelling complexes that are implicated in epigenetic control of gene expression. Members of the SNF2 family of proteins have been identified in organisms ranging from Escherichia coli to Homo sapiens (Human). All of them contain the conserved SNF2 domain, which is defined by the existence of seven motifs (I, Ia, and II-VI) with sequence similarity to those motifs found in DNA and RNA helicases. SNF2-like family members can be further subdivided into several subfamilies according to the presence of protein motifs outside of the ATPase region. The DBINO (DNA binding domain of INO80) domain is characteristic of the INO80 subfamily and is a DNA-binding domain [ , ]. The DBINO domain is a 126 amino acid long peptide located near the N terminus, approximately 100 residues upstream of the SNF2 helicase domain. The presence of this domain in all the INO80 subfamily proteins from yeast to humans suggests its conserved function in evolution [, ].
Protein Domain
Name: Regulatory factor, effector binding domain superfamily
Type: Homologous_superfamily
Description: The effector domain is found in bacterial regulatory proteins, such as transcription factors. The effector domain consists of a duplication of a beta/alpha/beta(2) motif, where the antiparallel beta sheets form a barrel structure. Several proteins contain this domain, such as the multidrug-binding domain of the transcription factor BmrR, which transcriptionally regulates multidrug transporters as well as acting as a multidrug-binding protein [ ], the C-terminal domain of the Rob transcription factor, which belongs to the AraC/XylS protein family that regulates genes involved in resistance to antibiotics, organic solvents and heavy metals [], and the gyrase inhibitory protein GyrI (SbmC, TeeB), which is induced by DNA damaging agents to suppress cell proliferation by inhibiting bacterial gyrase activity [].
Protein Domain
Name: SOUL haem-binding protein
Type: Family
Description: This family represents a group of putative haem-binding proteins [ ]. It includes archaeal and bacterial homologues.
Protein Domain
Name: Myosin head, motor domain
Type: Domain
Description: Muscle contraction is caused by sliding between the thick and thin filaments of the myofibril. Myosin is a major component of thick filaments and exists as a hexamer of 2 heavy chains [ ], 2 alkali light chains, and 2 regulatory light chains. The heavy chain can be subdivided into the N-terminal globular head and the C-terminal coiled-coil rod-like tail, although some forms have a globular region at their C terminus. There are many cell-specific isoforms of myosin heavy chains, coded for by a multi-gene family []. Myosin interacts with actin to convert chemical energy, in the form of ATP, to mechanical energy []. The 3-D structure of the head portion of myosin has been determined [] and a model for the actin-myosin complex has been constructed []. The globular head is well conserved, with some highly conserved regions possibly relating to functional and structural domains [ ]. The rod-like tail starts with an invariant proline residue, and contains many repeats of a 28 residue region, interrupted at 4 regularly-spaced points known as skip residues. Although the sequence of the tail is not well conserved, its chemical character is, with hydrophobic, charged and skip residues occurring in a highly ordered and repeated fashion [].
Protein Domain
Name: Domain of unknown function DUF3444
Type: Domain
Description: This entry represents an uncharacterised domain. This domain is found in DnaJ, cytosine-specific methyltransferases, and members from the zinc finger, C3HC4 type family.
Protein Domain
Name: Transketolase, N-terminal
Type: Domain
Description: Transketolase (TK) catalyzes the reversible transfer of a two-carbon ketol unit from xylulose 5-phosphate to an aldose acceptor, such as ribose 5-phosphate, to form sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate. This enzyme, together with transaldolase, provides a link between the glycolytic and pentose-phosphate pathways. TK requires thiamine pyrophosphate as a cofactor. In most sources where TK has been purified, it is a homodimer of approximately 70 kDa subunits. TK sequences from a variety of eukaryotic and prokaryotic sources [, ] show that the enzyme has been evolutionarily conserved. In the peroxisomes of methylotrophic yeast Pichia angusta (Yeast) (Hansenula polymorpha), there is a highly related enzyme, dihydroxy-acetone synthase (DHAS) (also known as formaldehyde transketolase), which exhibits a very unusual specificity by including formaldehyde amongst its substrates. 1-deoxyxylulose-5-phosphate synthase (DXP synthase) [] is an enzyme so far found in bacteria (gene dxs) and plants (gene CLA1) which catalyzes the thiamine pyrophosphate-dependent acyloin condensation reaction between carbon atoms 2 and 3 of pyruvate and glyceraldehyde 3-phosphate to yield 1-deoxy-D-xylulose-5-phosphate (dxp), a precursor in the biosynthetic pathway to isoprenoids, thiamine (vitamin B1), and pyridoxol (vitamin B6). DXP synthase is evolutionarily related to TK. The N-terminal section contains a histidine residue which appears to function in proton transfer during catalysis [ ]. In the central section there are conserved acidic residues that are part of the active cleft and may participate in substrate-binding []. This family includes transketolase enzymes and also partially matches to 2-oxoisovalerate dehydrogenase beta subunit. Both these enzymes utilise thiamine pyrophosphate as a cofactor, suggesting there may be common aspects in their mechanism of catalysis.
Protein Domain
Name: Deoxyxylulose-5-phosphate synthase
Type: Family
Description: 1-Deoxy-D-xylulose-5-phosphate synthase (DXP synthase) is a regulatory enzyme of the mevalonate-independent pathway involved in terpenoid biosynthesis. Terpenoids are plant natural products with important pharmaceutical activity. DXP synthase is a thiamine diphosphate-dependent enzyme related to transketolase and the pyruvate dehydrogenase E1-beta subunit. DXP synthase is found in bacteria (gene dxs) and plants (gene CLA1) and catalyzes the thiamine pyrophosphate-dependent acyloin condensation reaction between carbon atoms 2 and 3 of pyruvate and glyceraldehyde 3-phosphate to yield 1-deoxy-D-xylulose-5-phosphate (dxp), a precursor in the biosynthetic pathway to isoprenoids, thiamine (vitamin B1), and pyridoxol (vitamin B6). DXP synthase is evolutionarily related to TK. The N-terminal section contains a histidine residue which appears to function in proton transfer during catalysis []. In the central section there are conserved acidic residues that are part of the active cleft and may participate in substrate-binding []. This family includes transketolase enzymes, and also partially matches to 2-oxoisovalerate dehydrogenase beta subunit P37941. Both these enzymes utilise thiamine pyrophosphate as a cofactor, suggesting there may be common aspects in their mechanism of catalysis.
Protein Domain
Name: Allergen Ole e 1, conserved site
Type: Conserved_site
Description: Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an Arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation. The allergens in this family include allergens with the following designations: Ole e 1. A number of plant pollen proteins, whose biological function is not yet known, are structurally related []. These proteins are most probably secreted and consist of about 145 residues. There are six cysteines which are conserved in the sequence of these proteins. They seem to be involved in disulphide bonds.
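The designation rule described above (first three letters of the genus, first letter of the species name, then a number) can be sketched in a few lines. This is a minimal illustration only: the function name is hypothetical, and the disambiguation step for clashing designations (adding extra letters) is deliberately not implemented.

```python
def allergen_designation(genus, species, number):
    """Build a designation such as 'Ole e 1' from genus, species and a number.

    Hypothetical helper: follows only the basic rule quoted in the description
    and does not handle the case where two species share a designation.
    """
    genus_part = genus.strip()[:3].capitalize()   # first three letters of the genus
    species_part = species.strip()[0].lower()     # first letter of the species name
    return f"{genus_part} {species_part} {number}"

# Olea europaea (olive) is the source of the Ole e 1 allergen named in this entry.
print(allergen_designation("Olea", "europaea", 1))
```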
Protein Domain
Name: NHL repeat
Type: Repeat
Description: The NHL repeat, named after NCL-1, HT2A and Lin-41, is a conserved structural motif present in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family which catalyse the C-terminal alpha-amidation of biological peptides and in a large family of growth regulators. Many NHL-containing proteins have additional domains such as a RING finger, a B-box zinc finger or a coiled-coil motif. In many, it occurs in tandem arrays, for example in the RING finger, B-box, coiled-coil (RBCC) eukaryotic growth regulators or the 'Brain Tumor' protein (Brat). NHL repeats are also found in serine/threonine protein kinases (STPKs) in a diverse range of pathogenic bacteria. These STPKs are transmembrane receptors with an intracellular N-terminal kinase domain and an extracellular C-terminal sensor domain. In the STPK PknD from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed β-propeller composed of NHL repeats with a flexible tether to the transmembrane domain. The NHL domain is a six-bladed β-propeller, with the blades arrayed in a radial fashion around a central axis, and each blade composed of a highly twisted four-stranded antiparallel β-sheet. The innermost strand of each blade is labeled 'a' and the outermost strand, 'd'. As in other β-propellers, the sequence repeats are offset with respect to the blades of the propeller, such that any given 40-residue NHL repeat spans strands 'b-d' of one propeller blade and strand 'a' of the subsequent blade. This offset ensures circularization of the propeller because the last strand of the final sequence repeat acts as the innermost strand 'a' of the blade that harbors strands 'b-d' from the first sequence repeat. According to structural model analysis, the NHL domain could be involved in protein-protein interaction.
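The offset between sequence repeats and propeller blades described above can be made concrete with a small Python sketch. It is an illustration only: the function name blade_composition and the zero-based numbering of repeats and blades are assumptions made for this example, not part of the record.

```python
def blade_composition(n_blades: int = 6) -> dict[int, dict[str, int]]:
    """Map each blade to the sequence repeat that supplies each of its strands."""
    blades: dict[int, dict[str, int]] = {blade: {} for blade in range(n_blades)}
    for repeat in range(n_blades):
        # Strands 'b', 'c' and 'd' of a repeat sit in the blade with the same index.
        for strand in ("b", "c", "d"):
            blades[repeat][strand] = repeat
        # The repeat's last strand becomes strand 'a' of the next blade;
        # the modulo wraps the final repeat back onto the first blade.
        blades[(repeat + 1) % n_blades]["a"] = repeat
    return blades


layout = blade_composition()
print(layout[0]["a"], layout[0]["b"])
```

With six blades, the innermost strand 'a' of blade 0 comes from repeat 5 while its strands 'b-d' come from repeat 0, which is the wrap-around that closes the propeller.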
Protein Domain
Name: Carbohydrate kinase, thermoresistant glucokinase
Type: Family
Description: This family of proteins includes thermoresistant and thermosensitive isozymes of gluconate kinase (gluconokinase) in E. coli and other related proteins; members of this family are often named by similarity to the thermostable isozyme. These proteins show homology to shikimate kinases and adenylate kinases but not to gluconate kinases from the FGGY family of carbohydrate kinases. The structure of the E. coli thermoresistant gluconate kinase GntK has been determined. GntK catalyzes the phosphoryl transfer from ATP to gluconate. The resulting product, gluconate-6-phosphate, is an important precursor of gluconate metabolism. GntK acts as a dimer composed of two identical subunits.
Protein Domain
Name: Shikimate kinase/Threonine synthase-like 1
Type: Family
Description: Shikimate kinase catalyses the fifth step in the biosynthesis of aromatic amino acids from chorismate (the so-called shikimate pathway). The enzyme catalyses the following reaction: ATP + shikimate = ADP + shikimate-3-phosphate. The protein is found in bacteria (gene aroK or aroL), plants and fungi (where it is part of a multifunctional enzyme that catalyses five consecutive steps in this pathway). In 1994, the 3D structure of shikimate kinase was predicted to be very close to that of adenylate kinase, suggesting a functional similarity as well as an evolutionary relationship. This prediction has since been confirmed experimentally. The protein is reported to possess an alpha/beta fold, consisting of a central sheet of five parallel β-strands flanked by α-helices. Such a topology is very similar to that of adenylate kinase. The N terminus of threonine synthase-like 1 from metazoans shares protein sequence similarity with shikimate kinase and is included in this entry. However, their functions may be different.
Protein Domain
Name: Ribosomal protein S19/S15
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. The small subunit ribosomal proteins can be categorised as: primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S19 contains 88-144 amino acid residues. In Escherichia coli, S19 is known to form a complex with S13 that binds strongly to 16S ribosomal RNA. Experimental evidence has revealed that S19 is moderately exposed on the ribosomal surface, and is designated a secondary rRNA binding protein. S19 belongs to a family of ribosomal proteins that includes: eubacterial S19; algal and plant chloroplast S19; cyanelle S19; archaebacterial S19; plant mitochondrial S19; and eukaryotic S15 ('rig' protein).
Protein Domain
Name: Impact family
Type: Family
Description: The Impact protein is a translational regulator that ensures constant high levels of translation under amino acid starvation. It acts by interacting with Gcn1/Gcn1L1, thereby preventing activation of Gcn2 protein kinases (EIF2AK1 to 4) and subsequent down-regulation of protein synthesis. It is evolutionarily conserved from eukaryotes to archaea.
Protein Domain
Name: Impact, N-terminal
Type: Domain
Description: The Impact protein is a translational regulator that ensures constant high levels of translation under amino acid starvation. It acts by interacting with Gcn1/Gcn1L1, thereby preventing activation of Gcn2 protein kinases (EIF2AK1 to 4) and subsequent down-regulation of protein synthesis. It is evolutionarily conserved from eukaryotes to archaea. This entry represents the N-terminal domain of the Impact proteins.
Protein Domain
Name: Pyrophosphate-dependent phosphofructokinase TP0108-type
Type: Family
Description: Phosphofructokinase (PFK) catalyses the phosphorylation of fructose-6-phosphate to fructose-1,6-bisphosphate, which then enters the Embden-Meyerhof pathway. PFK is a key regulatory enzyme in glycolysis. This family consists of a group of plant and bacterial pyrophosphate-dependent phosphofructokinases related to TP0108 from Treponema. The bacterial versions are non-allosteric dimers, while the plant versions are allosteric heterotetramers. They belong to the PFK domain superfamily of proteins, which also includes prokaryotic and eukaryotic ATP-dependent PFKs. The membership of this group largely resembles group B1 PFKs.
Protein Domain
Name: Phosphofructokinase domain
Type: Domain
Description: The enzyme-catalysed transfer of a phosphoryl group from ATP is an important reaction in a wide variety of biological processes. One enzyme that utilises this reaction is phosphofructokinase (PFK), which catalyses the phosphorylation of fructose-6-phosphate to fructose-1,6-bisphosphate, a key regulatory step in the glycolytic pathway. PFK exists as a homotetramer in bacteria and mammals (where each monomer possesses 2 similar domains), and as an octamer in yeast (where there are 4 alpha- (PFK1) and 4 beta-chains (PFK2), the latter, like the mammalian monomers, possessing 2 similar domains). PFK is ~300 amino acids in length, and structural studies of the bacterial enzyme have shown it comprises two similar (alpha/beta) lobes: one involved in ATP binding and the other housing both the substrate-binding site and the allosteric site (a regulatory binding site distinct from the active site, but that affects enzyme activity). The identical tetramer subunits adopt 2 different conformations: in a 'closed' state, the bound magnesium ion bridges the phosphoryl groups of the enzyme products (ADP and fructose-1,6-bisphosphate); and in an 'open' state, the magnesium ion binds only the ADP, as the 2 products are now further apart. These conformations are thought to be successive stages of a reaction pathway that requires subunit closure to bring the 2 molecules sufficiently close to react. Deficiency in PFK leads to glycogenosis type VII (Tarui's disease), an autosomal recessive disorder characterised by severe nausea, vomiting, muscle cramps and myoglobinuria in response to bursts of intense or vigorous exercise. Sufferers are usually able to lead a reasonably ordinary life by learning to adjust activity levels.
Protein Domain
Name: ATP-dependent 6-phosphofructokinase
Type: Family
Description: The enzyme-catalysed transfer of a phosphoryl group from ATP is an important reaction in a wide variety of biological processes. One enzyme that utilises this reaction is phosphofructokinase (PFK), which catalyses the phosphorylation of fructose-6-phosphate to fructose-1,6-bisphosphate, a key regulatory step in the glycolytic pathway. PFK exists as a homotetramer in bacteria and mammals (where each monomer possesses 2 similar domains), and as an octamer in yeast (where there are 4 alpha- (PFK1) and 4 beta-chains (PFK2), the latter, like the mammalian monomers, possessing 2 similar domains). PFK is ~300 amino acids in length, and structural studies of the bacterial enzyme have shown it comprises two similar (alpha/beta) lobes: one involved in ATP binding and the other housing both the substrate-binding site and the allosteric site (a regulatory binding site distinct from the active site, but that affects enzyme activity). The identical tetramer subunits adopt 2 different conformations: in a 'closed' state, the bound magnesium ion bridges the phosphoryl groups of the enzyme products (ADP and fructose-1,6-bisphosphate); and in an 'open' state, the magnesium ion binds only the ADP, as the 2 products are now further apart. These conformations are thought to be successive stages of a reaction pathway that requires subunit closure to bring the 2 molecules sufficiently close to react. Deficiency in PFK leads to glycogenosis type VII (Tarui's disease), an autosomal recessive disorder characterised by severe nausea, vomiting, muscle cramps and myoglobinuria in response to bursts of intense or vigorous exercise. Sufferers are usually able to lead a reasonably ordinary life by learning to adjust activity levels.
Protein Domain
Name: Dedicator of cytokinesis
Type: Family
Description: DOCK family members are evolutionarily conserved guanine nucleotide exchange factors (GEFs) for Rho-family GTPases. DOCK proteins are required during several cellular processes, such as cell motility and phagocytosis. The N-terminal SH3 domain of the DOCK proteins functions as an inhibitor of GEF activity, which can be relieved upon its binding to the ELMO1-3 adaptor proteins, after their binding to active RhoG at the plasma membrane. DOCK family proteins are categorised into four subfamilies based on their sequence homology: DOCK-A subfamily (DOCK1/180, 2, 5), DOCK-B subfamily (DOCK3, 4), DOCK-C subfamily (DOCK6, 7, 8), DOCK-D subfamily (DOCK9, 10, 11). All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker).
Protein Domain
Name: BURP domain
Type: Domain
Description: The BURP domain was named after the proteins in which it was first identified: BNM2, USP, RD22, and PG1beta. It is found in the C terminus of a number of plant cell wall proteins, which are defined not only by the BURP domain, but also by the overall similarity in their modular construction. The BURP domain-containing proteins consist of either three or four modules: (i) an N-terminal hydrophobic domain - a presumptive transit peptide, joined to (ii) a short conserved segment or other short segment, (iii) an optional segment consisting of repeated units which is unique to each member, and (iv) the C-terminal BURP domain. Although the BURP domain proteins share primary structural features, their expression patterns and the conditions under which they are expressed differ. The presence of the conserved BURP domain in diverse plant proteins suggests an important and fundamental functional role for this domain. It is possible that the BURP domain represents a general motif for localization of proteins within the cell wall matrix. The other structural domains associated with the BURP domain may specify other target sites for intermolecular interactions. Some proteins known to contain a BURP domain are listed below: Brassica protein BNM2, which is expressed during the induction of microspore embryogenesis. Field bean USPs, abundant non-storage seed proteins with unknown function. Soybean USP-like proteins ADR6 (or SALI5-4A), an auxin-repressible, aluminium-inducible protein, and SALI3-2, a protein that is up-regulated by aluminium. Soybean seed coat BURP-domain protein 1 (SCB1), which might play a role in the differentiation of the seed coat parenchyma cells. Arabidopsis RD22 drought-induced protein. Maize ZRP2, a protein of unknown function in cortex parenchyma. Tomato PG1beta, the beta-subunit of polygalacturonase isozyme 1 (PG1), which is expressed in ripening fruits. Cereal RAFTIN, which is essential specifically for the maturation phase of pollen development.
Protein Domain
Name: Dihydroxy-acid/6-phosphogluconate dehydratase
Type: Family
Description: Two dehydratases, dihydroxy-acid dehydratase (gene ilvD or ILV3) and 6-phosphogluconate dehydratase (gene edd), have been shown to be evolutionarily related. Dihydroxy-acid dehydratase catalyses the fourth step in the biosynthesis of isoleucine and valine, the dehydration of 2,3-dihydroxy-isovaleric acid into alpha-ketoisovaleric acid. 6-Phosphogluconate dehydratase catalyses the first step in the Entner-Doudoroff pathway, the dehydration of 6-phospho-D-gluconate into 6-phospho-2-dehydro-3-deoxy-D-gluconate. Another protein containing this signature is Escherichia coli YjhG, which has been identified as a D-xylonate dehydratase. The N-terminal part of the proteins contains a cysteine that could be involved in the binding of a 2Fe-2S iron-sulphur cluster.
Protein Domain
Name: Dihydroxy-acid/6-phosphogluconate dehydratase, conserved site
Type: Conserved_site
Description: Two dehydratases, dihydroxy-acid dehydratase (gene ilvD or ILV3) and 6-phosphogluconate dehydratase (gene edd), have been shown to be evolutionarily related. Dihydroxy-acid dehydratase catalyses the fourth step in the biosynthesis of isoleucine and valine, the dehydration of 2,3-dihydroxy-isovaleric acid into alpha-ketoisovaleric acid. 6-Phosphogluconate dehydratase catalyses the first step in the Entner-Doudoroff pathway, the dehydration of 6-phospho-D-gluconate into 6-phospho-2-dehydro-3-deoxy-D-gluconate. Another protein containing this signature is Escherichia coli YjhG, which has been identified as a D-xylonate dehydratase. The N-terminal part of the proteins contains a cysteine that could be involved in the binding of a 2Fe-2S iron-sulphur cluster. This entry contains two signature patterns: the first pattern is located in the N-terminal half and contains a cysteine that could be involved in the binding of a 2Fe-2S iron-sulphur cluster; the second pattern is located in the C-terminal half.
Protein Domain
Name: Dihydroxy-acid dehydratase
Type: Family
Description: Two dehydratases, dihydroxy-acid dehydratase (gene ilvD or ILV3) and 6-phosphogluconate dehydratase (gene edd), have been shown to be evolutionarily related. Dihydroxy-acid dehydratase catalyzes the fourth step in the biosynthesis of isoleucine and valine, the dehydration of 2,3-dihydroxy-isovaleric acid into alpha-ketoisovaleric acid. 6-Phosphogluconate dehydratase catalyzes the first step in the Entner-Doudoroff pathway, the dehydration of 6-phospho-D-gluconate into 6-phospho-2-dehydro-3-deoxy-D-gluconate. Another protein containing this signature is the Escherichia coli hypothetical protein yjhG. The N-terminal part of the proteins contains a cysteine that could be involved in the binding of a 2Fe-2S iron-sulphur cluster. This family represents dihydroxy-acid dehydratase (DAD). It contains a catalytically essential [4Fe-4S] cluster and catalyses the fourth step in valine and isoleucine biosynthesis.
Protein Domain
Name: Aconitase/3-isopropylmalate dehydratase, swivel
Type: Homologous_superfamily
Description: Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Eukaryotic cAcn enzyme balances the amount of citrate and isocitrate in the cytoplasm, which in turn creates a balance between the amount of NADPH generated from isocitrate by isocitrate dehydrogenase with the amount of acetyl-CoA generated from citrate by citrate lyase. Fatty acid synthesis requires both NADPH and acetyl-CoA, as do other metabolic processes, including the need for NADPH to combat oxidative stress.
The enzymatic form of cAcn predominates when iron levels are normal, but if they drop sufficiently to cause the disassembly of the [4Fe-4S]-cluster, then cAcn undergoes a conformational change from a compact enzyme to a more open L-shaped protein known as iron regulatory protein 1 (IRP1; or IRE-binding protein 1, IREBP1). As IRP1, the catalytic site and the [4Fe-4S]-cluster are lost, and two new RNA-binding sites appear. IRP1 functions in the post-transcriptional regulation of genes involved in iron metabolism - it binds to mRNA iron-responsive elements (IRE), 30-nucleotide stem-loop structures at the 3' or 5' end of specific transcripts. Transcripts containing an IRE include ferritin L and H subunits (iron storage), transferrin (iron plasma chaperone), transferrin receptor (iron uptake into cells), ferroportin (iron exporter), mAcn, succinate dehydrogenase, erythroid aminolevulinic acid synthetase (tetrapyrrole biosynthesis), among others. If the IRE is in the 5'-UTR of the transcript (e.g. in ferritin mRNA), then IRP1-binding prevents its translation by blocking the transcript from binding to the ribosome. If the IRE is in the 3'-UTR of the transcript (e.g. transferrin receptor), then IRP1-binding protects it from endonuclease degradation, thereby prolonging the half-life of the transcript and enabling it to be translated. IRP2 is another IRE-binding protein that binds to the same transcripts as IRP1. However, since IRP1 is predominantly in the enzymatic cAcn form, it is IRP2 that acts as the major metabolic regulator that maintains iron homeostasis. Although IRP2 is homologous to IRP1, IRP2 lacks aconitase activity, and is known only to have a single function in the post-transcriptional regulation of iron metabolism genes. In iron-replete cells, IRP2 activity is regulated primarily by iron-dependent degradation through the ubiquitin-proteasomal system. Bacterial AcnB is also known to be multi-functional.
In addition to its role in the TCA cycle, AcnB was shown to be a post-transcriptional regulator of gene expression in Escherichia coli and Salmonella enterica. In S. enterica, AcnB initiates a regulatory cascade controlling flagella biosynthesis through an interaction with the ftsH transcript, an alternative RNA polymerase sigma factor. This binding lowers the intracellular concentration of FtsH protease, which in turn enhances the amount of RNA polymerase sigma32 factor (normally degraded by FtsH protease), and sigma32 then increases the synthesis of chaperone DnaK, which in turn promotes the synthesis of the flagellar protein FliC. AcnB regulates the synthesis of other proteins as well, such as superoxide dismutase (SodA) and other enzymes involved in oxidative stress. 3-Isopropylmalate dehydratase (or isopropylmalate isomerase) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of isopropylmalate dehydratase are related and are part of the larger aconitase family. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively. LeuC does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydratases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis.
Homoaconitase has been identified in higher fungi (mitochondria), several archaea and one thermophilic species of bacteria, Thermus thermophilus. It is also found in the higher plant Arabidopsis thaliana, where it is targeted to the chloroplast. This superfamily represents the 'swivel' domain found at the C terminus of eukaryotic mAcn, cAcn/IRP1 and IRP2, and bacterial AcnA, but in the N-terminal region following the HEAT-like domain in bacterial AcnB. This domain has a three-layer beta/beta/alpha structure, and in cytosolic Acn is known to rotate between the cAcn and IRP1 forms of the enzyme. This domain is also found in the small subunit of isopropylmalate dehydratase (LeuD).
Protein Domain
Name: Transmembrane protein TMEM64
Type: Family
Description: This entry contains a group of transmembrane proteins, including TMEM64. The yeast member of this family, Tvp38, localises with the t-SNARE Tlg2.
Protein Domain
Name: Redoxin
Type: Domain
Description: This redoxin domain is found in peroxiredoxin, thioredoxin and glutaredoxin proteins. Peroxiredoxins (Prxs) constitute a family of thiol peroxidases that reduce hydrogen peroxide, peroxynitrite, and hydroperoxides using a strictly conserved cysteine. Chloroplast thioredoxin systems in plants regulate the enzymes involved in photosynthetic carbon assimilation. It is thought that redoxins have a large role to play in anti-oxidant defence. Cadmium-sensitive proteins are also regulated via thioredoxin and glutaredoxin thiol redox systems.
Protein Domain
Name: Adenosine deaminase domain
Type: Domain
Description: Adenosine deaminases are monomeric zinc-dependent enzymes involved in purine metabolism. They are required for the breakdown of adenosine from food and for the turnover of nucleic acids in tissues. Adenosine deaminases convert adenosine to the related nucleoside inosine by the substitution of the amino group for a hydroxyl group. These enzymes are distantly related to AMP deaminases as they share three regions of sequence similarity; these regions are centred on residues which are proposed to play an important role in the catalytic mechanism of these two enzymes. This entry also includes a group of adenine deaminases more related to the bacterial and eukaryotic adenosine deaminases than to Bacillus subtilis adenine deaminase and its bacterial homologues. Adenine deaminase catalyses the hydrolytic deamination of adenine to hypoxanthine. This entry represents the main structural domain of adenosine deaminase proteins.
Protein Domain
Name: Pseudouridine synthase, RsuA/RluA-like
Type: Domain
Description: Pseudouridine synthases catalyse the isomerisation of uridine to pseudouridine (Psi) in a variety of RNA molecules, and may function as RNA chaperones. Pseudouridine is the most abundant modified nucleotide found in all cellular RNAs. There are four distinct families of pseudouridine synthases that share no global sequence similarity, but which do share the same fold of their catalytic domain(s) and uracil-binding site and are descended from a common molecular ancestor. The catalytic domain consists of two subdomains, each of which has an α+β structure that has some similarity to the ferredoxin-like fold (note: some pseudouridine synthases contain additional domains). The active site is the most conserved structural region of the superfamily and is located between the two homologous domains. These families are: Pseudouridine synthase I, TruA. Pseudouridine synthase II, TruB, which contains an additional C-terminal PUA domain. Pseudouridine synthase RsuA; RluB, RluE and RluF are also part of this family. Pseudouridine synthase RluA; TruC, RluC and RluD belong to this family. Pseudouridine synthase TruD, which has a natural circular permutation in the catalytic domain, as well as an insertion of a family-specific α+β subdomain. This entry represents several different pseudouridine synthases from family 3, including: RsuA (acts on small ribosomal subunit), RluA, RluB, RluC, RluD, RluE and RluF (act on large ribosomal subunit) and TruC. RsuA from Escherichia coli catalyses formation of pseudouridine at position 516 in 16S rRNA during assembly of the 30S ribosomal subunit. RsuA consists of an N-terminal domain connected by an extended linker to the central and C-terminal domains. Uracil and UMP bind in a cleft between the central and C-terminal domains near the catalytic residue Asp 102. The N-terminal domain shows structural similarity to the ribosomal protein S4.
Despite only 15% amino acid identity, the other two domains are structurally similar to those of the tRNA-specific psi-synthase TruA, including the position of the catalytic Asp. These results suggest that all four families of pseudouridine synthases share the same fold of their catalytic domain(s) and uracil-binding site. RluB, RluC, RluD, RluE and RluF are homologous enzymes which each convert specific uridine bases in E. coli ribosomal 23S RNA to pseudouridine: RluB modifies uracil-2605. RluC modifies uracil-955, U-2504, and U-2580. RluD modifies uracil-1911, U-1915, and U-1917. RluE modifies uracil-2457. RluF modifies uracil-2604, and to a lesser extent U-2605. RluD also possesses a second function related to proper assembly of the 50S ribosomal subunit that is independent of Psi-synthesis. Both RluC and RluD have an N-terminal S4 RNA binding domain. Despite the conserved topology shared by RluC and RluD, the surface shape and charge distribution are very different.
Protein Domain
Name: Mss4-like superfamily
Type: Homologous_superfamily
Description: This superfamily represents a structural domain with a complex fold consisting of several coiled β-sheets. This domain exists as a duplication, consisting of a tandem repeat of two similar structural motifs. These domains can be found in: Mss4, which contains a zinc-binding site. Translationally controlled tumour-associated protein TCTP, which contains an insertion of an α-helix hairpin, and which lacks a zinc-binding site. The C-terminal MsrB domain of peptide methionine sulphoxide reductase. Mss4 is a conserved accessory factor for Rab GTPases, which function as ubiquitous regulators of intracellular membrane trafficking. Mss4 acts to promote nucleotide release from exocytic but not endocytic Rab GTPases. Mss4 has a complex fold made of several coiled β-sheets, and consists of a duplication of tandem repeats of two similar structural motifs. It contains a zinc-binding site. Other proteins that show structural similarity to Mss4 include the translationally controlled tumour-associated proteins (TCTPs), which contain an insertion of an alpha-helical hairpin, and lack the zinc-binding site. TCTPs are a highly conserved and abundantly expressed family of eukaryotic proteins that are implicated in both cell growth and the human acute allergic response. The C-terminal MsrB domain of peptide methionine sulphoxide reductase PilB is structurally similar to Mss4. Methionine sulphoxide reductases protect against oxidative damage that can contribute to cell death. The tandem Msr domains (MsrA and MsrB) of the pilB protein from Neisseria gonorrhoeae each reduce different epimeric forms of methionine sulphoxide.
Protein Domain
Name: Peptide methionine sulphoxide reductase MrsB domain
Type: Domain
Description: Peptide methionine sulphoxide reductase (Msr) reverses the inactivation of many proteins due to the oxidation of critical methionine residues by reducing methionine sulphoxide, Met(O), to methionine. It is present in most living organisms, and the cognate structural gene belongs to the so-called minimum gene set. The two domains, MsrA and MsrB, reduce different epimeric forms of methionine sulphoxide. This group represents MsrB, the crystal structure of which has been determined to 1.8 Å. The overall structure shows no resemblance to the structures of MsrA from other organisms, though the active sites show approximate mirror symmetry. In each case, conserved amino acid motifs mediate the stereo-specific recognition and reduction of the substrate. Unlike the MsrA domain, the MsrB domain activates the cysteine or selenocysteine nucleophile through a unique Cys-Arg-Asp/Glu catalytic triad. The collapse of the reaction intermediate most likely results in the formation of a sulphenic or selenenic acid moiety. Regeneration of the active site occurs through a series of thiol-disulphide exchange steps involving another active site Cys residue and thioredoxin. In a number of pathogenic bacteria, including Neisseria gonorrhoeae, the MsrA and MsrB domains are fused, with MsrA being N-terminal to MsrB. This arrangement is reversed in Treponema pallidum. In N. gonorrhoeae and Neisseria meningitidis, a thioredoxin domain is fused to the N terminus. This may function to reduce the active sites of the downstream MsrA and MsrB domains.
Protein Domain
Name: S-adenosylmethionine decarboxylase, conserved site
Type: Conserved_site
Description: S-adenosylmethionine decarboxylase (AdoMetDC) catalyzes the removal of the carboxylate group of S-adenosylmethionine to form S-adenosyl-5'-3-methylpropylamine, which then acts as the n-propylamine group donor in the synthesis of the polyamines spermidine and spermine from putrescine. The catalytic mechanism of AdoMetDC involves a covalently bound pyruvoyl group. This group is post-translationally generated by a self-catalyzed intramolecular proteolytic cleavage reaction between a glutamate and a serine. This cleavage generates two chains, beta (N-terminal) and alpha (C-terminal). The N-terminal serine residue of the alpha chain is then converted by nonhydrolytic serinolysis into a pyruvoyl group.
Protein Domain
Name: Mo25-like
Type: Family
Description: Mo25-like proteins are involved in both polarised growth and cytokinesis. In fission yeast, Mo25 is localised alternately to the spindle pole body and to the site of cell division in a cell cycle-dependent manner.
Protein Domain
Name: ShKT domain
Type: Domain
Description: BgK, a 37-residue peptide toxin from the sea anemone Bunodosoma granulifera, and ShK, a 35-residue peptide toxin from the sea anemone Stichodactyla helianthus, are potent inhibitors of K+ channels. There is a large superfamily of proteins that contains domains (referred to as ShKT domains) resembling these two toxins. Many of these proteins are metallopeptidases, whereas others are prolyl-4-hydroxylases, tyrosinases, peroxidases, oxidoreductases, or proteins containing epidermal growth factor-like domains, thrombospondin-type repeats, or trypsin-like serine protease domains. The ShKT domain has also been called the NC6 (nematode six-cysteine) domain, the SXC (six-cysteine) domain and the ICR (ion channel regulator) domain. The ShKT domain is short (36 to 42 amino acids), with six conserved cysteines and a number of other conserved residues. The fold adopted by the ShKT domain contains two nearly perpendicular stretches of helices, with no additional canonical secondary structures. The globular architecture of the ShKT domain is stabilised by three disulfides, one of them linking the two helices. In venomous creatures, the ShKT domain may have been modified to give rise to potent ion channel blockers, whereas the incorporation of this domain into plant oxidoreductases and prolyl hydroxylases and into worm astacin-like metalloproteases and trypsin-like serine proteases produced enzymes with potential channel-modulatory activity. Some proteins known to contain a ShKT domain are: Caribbean sea anemone ShK, a potassium channel toxin; sea anemone BgK, a potassium channel toxin; the Toxocara canis family of secreted mucins Tc-MUC-1 to -5, which are implicated in immune evasion and combine two evolutionarily distinct modules, the mucin and ShKT domains; some Caenorhabditis elegans astacin-like proteins (nematode astacins, NAS), which are metalloproteases; vertebrate cysteine-rich secretory proteins (Crisp); mammalian microfibrillar-associated protein 2 (MFAP2 or MAGP1), a matrix protein; and plant prolyl 4-hydroxylase.
Protein Domain
Name: Signal transduction histidine kinase, phosphotransfer (Hpt) domain
Type: Domain
Description: Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more. Two-component systems comprise a sensor histidine kinase (HK) and its cognate response regulator (RR). The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on the RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HKs are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK. A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response. Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation, and CheA, which plays a central role in the chemotaxis system. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present.
The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily. HKs can be roughly divided into two classes: orthodox and hybrid kinases. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound; for example, the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain. This entry represents a domain present at the N terminus in proteins which undergo autophosphorylation. The group includes the gliding motility regulatory protein from Myxococcus xanthus and a number of bacterial chemotaxis proteins.
The Legume Information System (LIS) is a research project of the USDA-ARS: Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom