Search results 1 to 100 out of 38750 for *

Category restricted to ProteinDomain

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Nucleotide-binding alpha-beta plait domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents nucleotide-binding domains with an α-β plait structure, which consists of either a ferredoxin-like (β-α-β)2 fold, such as that found in the RNA-binding domains of various ribonucleoproteins or in viral DNA-binding domains, or a β-(α)-β-α-β(2) fold, such as that found in the ribosomal protein L23.
Protein Domain
Name: ACT domain
Type: Domain
Description: The ACT domain is found in a variety of contexts and is proposed to be a conserved regulatory binding fold. ACT domains are linked to a wide range of metabolic enzymes that are regulated by amino acid concentration. The archetypical ACT domain is the C-terminal regulatory domain of 3-phosphoglycerate dehydrogenase (3PGDH), which folds with a ferredoxin-like topology. A pair of ACT domains forms an eight-stranded antiparallel sheet with two molecules of the allosteric inhibitor serine bound at the interface. Biochemical exploration of a few other proteins containing ACT domains supports the suggestion that these domains contain the archetypical ACT structure. Most of the proteins in which it is found are involved in amino acid and purine metabolism: aspartokinases; chorismate mutases; prephenate dehydrogenases (TyrA); prephenate dehydratases; homoserine dehydrogenases; malate dehydrogenases; phosphoglycerate dehydrogenases; phenylalanine and tryptophan-4-monooxygenases; phosphoribosylformylglycinamidine synthase (PurQ); uridylyl transferase and removing enzyme (GlnD); GTP pyrophosphokinase/phosphohydrolase (SpoT/RelA); tyrosine and phenol metabolism operon regulators (TyrR); and several uncharacterised proteins from archaea, bacteria and plants that contain from one to four copies of this domain.
Protein Domain
Name: Protein kinase-like domain superfamily
Type: Homologous_superfamily
Description: Protein kinases modify other proteins by chemically adding phosphate groups to them. This process is fundamental to most signalling and regulatory processes in the eukaryotic cell. The protein kinases contain a catalytic core that is common to both serine/threonine and tyrosine protein kinases. The catalytic domain contains the nucleotide-binding site and the catalytic apparatus in an inter-lobe cleft. It shares functional and structural similarities with the ATP-grasp fold, which is found in enzymes that catalyse the formation of an amide bond. The three-dimensional fold of the protein kinase catalytic domain is similar to domains found in several other proteins. These include: the catalytic domain of phosphoinositide-3-kinase (PI3K), which phosphorylates phosphoinositides and, as such, is involved in a number of fundamental cellular processes such as apoptosis, proliferation, motility and adhesion; choline kinase, which catalyses the ATP-dependent phosphorylation of choline during the biosynthesis of phosphatidylcholine; and 3',5'-aminoglycoside phosphotransferase type IIIa, a bacterial enzyme that confers resistance to a range of aminoglycoside antibiotics. This superfamily represents the protein-kinase domain and other related domains that share a similar structure.
Protein Domain
Name: Serine-threonine/tyrosine-protein kinase, catalytic domain
Type: Domain
Description: Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process. Protein kinases fall into three broad classes, characterised with respect to substrate specificity: serine/threonine-protein kinases; tyrosine-protein kinases; and dual specificity protein kinases (e.g. MEK, which phosphorylates both Thr and Tyr on target proteins). Protein kinase function is evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases. This entry represents the catalytic domain found in a number of serine/threonine- and tyrosine-protein kinases. It does not include the catalytic domain of dual specificity kinases.
Protein Domain
Name: Choline kinase, N-terminal
Type: Domain
Description: This domain is found N-terminal to choline/ethanolamine kinase regions in some plant and fungal choline kinase enzymes. This region is only found in some members of the choline kinase family, and is therefore unlikely to contribute to catalysis.
Protein Domain
Name: BRCA2, OB1
Type: Domain
Description: BRCA2 participates in homologous recombination-mediated repair of double-strand DNA breaks. It stimulates the displacement of Replication protein A (RPA), the most abundant eukaryotic ssDNA binding protein. Mutations that map throughout the BRCA2 protein are associated with breast cancer susceptibility. BRCA2 is a large nuclear protein and its most conserved region is the C-terminal BRCA2DBD. BRCA2DBD binds ssDNA in vitro, and is composed of five structural domains, three of which are OB folds (OB1, OB2, and OB3). BRCA2DBD OB2 and OB3 are arranged in tandem, and their mode of binding can be considered qualitatively similar to two OB folds of RPA1, DBD-A and DBD-B (the major DBDs of RPA). This entry represents OB1, which consists of a highly curved five-stranded β-sheet that closes on itself to form a β-barrel. OB1 has a shallow groove formed by one face of the curved sheet and is demarcated by two loops, one between beta 1 and beta 2 and another between beta 4 and beta 5, which allows for weak single-stranded DNA binding. The domain also binds the 70-amino acid DSS1 (deleted in split-hand/split foot syndrome) protein, which was originally identified as one of three genes that map to a 1.5-Mb locus deleted in an inherited developmental malformation syndrome.
Protein Domain
Name: BRCA2 repeat
Type: Repeat
Description: The breast cancer type 2 susceptibility protein contains several 39-amino acid repeats that are critical for binding to RAD51 (a key protein in DNA recombinational repair) and for resistance to methyl methanesulphonate treatment. BRCA2 is a breast tumour suppressor with a potential function in the cellular response to DNA damage. At the cellular level, expression is regulated in a cell-cycle dependent manner and peak expression of BRCA2 mRNA is found in S phase, suggesting BRCA2 may participate in regulating cell proliferation. There are eight repeats in BRCA2, designated BRC1 to BRC8. BRC1, BRC2, BRC3, BRC4, BRC7, and BRC8 are highly conserved and bind to Rad51, whereas BRC5 and BRC6 are less well conserved and do not bind to Rad51. It has been suggested that BRCA2 plays a role in positioning Rad51 at the site of DNA repair or in removing Rad51 from DNA once repair has been completed.
Protein Domain
Name: Breast cancer type 2 susceptibility protein
Type: Family
Description: The breast cancer type 2 susceptibility protein (BRCA2) is a breast tumour suppressor involved in double-strand break repair and/or homologous recombination. BRCA2 gene expression is regulated in a cell-cycle dependent manner, with peak expression of BRCA2 mRNA occurring in S phase, suggesting BRCA2 may participate in regulating cell proliferation. BRCA2 and the related protein BRCA1 have transcriptional activation potential, and the two proteins are associated with the activation of double-strand break repair and/or homologous recombination. The two proteins have been shown to coexist and colocalize in a biochemical complex. BRCA2 contains several 39-amino acid repeats that are critical for binding to RAD51 (a key protein in DNA recombinational repair) and for resistance to methyl methanesulphonate treatment. There are eight repeats in BRCA2, designated BRC1 to BRC8. BRC1, BRC2, BRC3, BRC4, BRC7, and BRC8 have high sequence identity and bind to Rad51, whereas BRC5 and BRC6 are less well conserved and are unable to bind Rad51. It has been suggested that BRCA2 plays a role in positioning Rad51 at the site of DNA repair or in removing Rad51 from DNA once repair has been completed. Mutations in BRCA1 and BRCA2 have been linked to an elevated risk of young-onset breast cancer and confer a high risk of the disease in a dominantly inherited fashion. BRCA2 mutations are typically microdeletions. Homologues exist in plants: the BRCA2A and BRCA2B proteins from Arabidopsis thaliana are required for repair of breaks in double-stranded DNA and for homologous recombination, and in the prophase stage of meiosis they are required for formation of RAD51 and DMC1 foci in males.
Protein Domain
Name: Breast cancer type 2 susceptibility protein, helical domain
Type: Domain
Description: This entry represents a domain found in BRCA2 proteins. This domain adopts a helical structure, consisting of a four-helix cluster core (alpha 1, alpha 8, alpha 9, alpha 10) and two successive β-hairpins (beta 1 to beta 4). An approximately 50-amino acid segment that contains four short helices (alpha 2 to alpha 4) meanders around the surface of the core structure. In BRCA2, the alpha 9 and alpha 10 helices pack with BRCA-2_OB1 through van der Waals contacts involving hydrophobic and aromatic residues, and also through side-chain and backbone hydrogen bonds. This domain binds the 70-amino acid DSS1 (deleted in split-hand/split foot syndrome) protein, which was originally identified as one of three genes that map to a 1.5-Mb locus deleted in an inherited developmental malformation syndrome.
Protein Domain
Name: Nucleic acid-binding, OB-fold
Type: Homologous_superfamily
Description: A five-stranded β-barrel was first noted as a common structure among four proteins binding single-stranded nucleic acids (staphylococcal nuclease and aspartyl-tRNA synthetase) or oligosaccharides (B subunits of enterotoxin and verotoxin-1). It has been termed the oligonucleotide/oligosaccharide binding motif, or OB fold: a five-stranded β-sheet coiled to form a closed β-barrel capped by an alpha helix located between the third and fourth strands. Two ribosomal proteins, S17 and S1, are members of this class, and have different variations of the OB fold theme. Comparisons with other OB fold nucleic acid binding proteins suggest somewhat different mechanisms of nucleic acid recognition in each case. There are many nucleic acid-binding proteins that contain domains with this OB-fold structure, including anticodon-binding tRNA synthetases, ssDNA-binding proteins (CDC13, telomere-end binding proteins), phage ssDNA-binding proteins (gp32, gp2.5, gpV), cold shock proteins, DNA ligases, RNA-capping enzymes, DNA replication initiators and RNA polymerase subunit RPB8.
Protein Domain
Name: SANT/Myb domain
Type: Domain
Description: The retroviral oncogene v-myb, and its cellular counterpart c-myb, encode nuclear DNA-binding proteins. In myb, one of the most conserved regions, consisting of three tandem repeats, has been shown to be involved in DNA binding. The SANT domain is present in nuclear receptor co-repressors and in the subunits of many chromatin-remodelling complexes. It has a strong structural similarity to the DNA-binding domain of Myb-related proteins. Both consist of tandem repeats of three α-helices that are arranged in a helix-turn-helix motif, each α-helix containing a bulky aromatic residue. Despite the overall similarity, there are differences that indicate that the SANT domain is functionally divergent from the canonical Myb DNA-binding domain. The myb/SANT domains can be classified into three groups: the myb-type HTH domain, which binds DNA; the SANT domain, which is a protein-protein interaction module; and the myb-like domain, which can be involved in either of these functions. This entry represents a myb-like domain.
Protein Domain
Name: Homeobox-like domain superfamily
Type: Homologous_superfamily
Description: Homeobox domain (also known as homeodomain) proteins are transcription factors that share a related DNA binding homeodomain. The homeodomain was first identified in a number of Drosophila homeotic and segmentation proteins, but is now known to be well conserved in many other animals, including vertebrates. The domain binds DNA through a helix-turn-helix (HTH) structure. The HTH motif is characterised by two α-helices, which make intimate contacts with the DNA and are joined by a short turn. The second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions, which occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. The first helix helps to stabilise the structure. Many proteins contain homeodomains, including Drosophila Engrailed, yeast mating type proteins, hepatocyte nuclear factor 1a and HOX proteins. The homeodomain motif is very similar in sequence and structure to domains in a wide range of DNA-binding proteins, including recombinases, Myb proteins, GARP response regulators, human telomeric proteins (hTRF1), paired domain proteins (PAX), yeast RAP1, centromere-binding proteins CENP-B and ABP-1, transcriptional regulators (TyrR), AraC-type transcriptional activators, and tetracycline repressor-like proteins (TetR, QacR, YcdC).
Protein Domain
Name: SANT domain
Type: Domain
Description: The SANT domain is a motif of ~50 amino acids present in proteins involved in chromatin remodelling and transcription regulation. This eukaryotic domain was identified in nuclear receptor co-repressors and named after switching-defective protein 3 (Swi3), adaptor 2 (Ada2), nuclear receptor co-repressor (N-CoR) and transcription factor (TF)IIIB. Although SANT domains show remarkable sequence and structural similarity to the DNA-binding helix-turn-helix (HTH) domain of the myb-like tandem repeat, their function is not DNA binding. Instead, SANT domains are protein-protein interaction modules and some can bind to histone tails (e.g. in Ada2 and SMRT). The SANT domain has been proposed to function as a histone-interaction module that couples histone-tail binding to enzyme catalysis for the remodelling of nucleosomes. SANT domains are found in combination with other domains, such as the SWIRM domain, the ZZ-type zinc finger, the C2H2-type zinc finger, the GATA-type zinc finger, the MPN domain and the DEAH ATP-helicase domain. The three-dimensional structure of the SANT domain forms three alpha helices, similar to the DNA-binding myb-type HTH domain. Because of the strong resemblance, the SANT domain can also be detected as a myb-like "DNA-binding" domain. Most SANT domains have acidic amino acids at the start of helix 2 and in helix 3, while myb-like DNA-binding domains have more positively charged residues, in particular in their third 'recognition' helix. The bulky aromatic and hydrophobic residues in the centre of helix 3, which are incompatible with the DNA contacts of myb-like DNA-binding domains, form another distinguishing property of SANT domains.
Protein Domain
Name: AMP-dependent synthetase/ligase domain
Type: Domain
Description: A number of prokaryotic and eukaryotic enzymes, which appear to act via an ATP-dependent covalent binding of AMP to their substrate, share a region of sequence similarity. This region is a Ser/Thr/Gly-rich domain that is further characterised by a conserved Pro-Lys-Gly triplet. This group of enzymes includes luciferase, long-chain fatty acid CoA ligase, long-chain fatty acid transport proteins that also function as acyl-CoA ligases, acetyl-CoA synthetase and various other closely related synthetases.
Protein Domain
Name: AMP-binding, conserved site
Type: Conserved_site
Description: A number of prokaryotic and eukaryotic enzymes, all of which probably act via an ATP-dependent covalent binding of AMP to their substrate, have been shown to share a region of sequence similarity. These enzymes are:
  • Insect luciferase (luciferin 4-monooxygenase). Luciferase produces light by catalysing the oxidation of luciferin in the presence of ATP and molecular oxygen.
  • Alpha-aminoadipate reductase from yeast (gene LYS2). This enzyme catalyses the activation of alpha-aminoadipate by ATP-dependent adenylation and the reduction of activated alpha-aminoadipate by NADPH.
  • Acetate--CoA ligase (acetyl-CoA synthetase), an enzyme that catalyses the formation of acetyl-CoA from acetate and CoA.
  • Long-chain-fatty-acid--CoA ligase, an enzyme that activates long-chain fatty acids for both the synthesis of cellular lipids and their degradation via beta-oxidation.
  • 4-coumarate--CoA ligase (4CL), a plant enzyme that catalyses the formation of 4-coumarate-CoA from 4-coumarate and coenzyme A, the branchpoint reaction between general phenylpropanoid metabolism and the pathways leading to various specific end products.
  • O-succinylbenzoic acid--CoA ligase (OSB-CoA synthetase) (gene menE), a bacterial enzyme involved in the biosynthesis of menaquinone (vitamin K2).
  • 4-Chlorobenzoate--CoA ligase (4-CBA--CoA ligase), a Pseudomonas enzyme involved in the degradation of 4-CBA.
  • Indoleacetate--lysine ligase (IAA-lysine synthetase), an enzyme from Pseudomonas syringae that converts indoleacetate to IAA-lysine.
  • Bile acid-CoA ligase (gene baiB) from Eubacterium sp. (strain VPI 12708). This enzyme catalyses the ATP-dependent formation of a variety of C-24 bile acid-CoA.
  • Crotonobetaine/carnitine-CoA ligase from Escherichia coli (gene caiC).
  • L-(alpha-aminoadipyl)-L-cysteinyl-D-valine synthetase (ACV synthetase) from various fungi (gene acvA or pcbAB). This enzyme catalyses the first step in the biosynthesis of penicillin and cephalosporin, the formation of ACV from the constituent amino acids. The amino acids seem to be activated by adenylation. It is a protein of around 3700 amino acids that contains three related domains of about 1000 amino acids.
  • Gramicidin S synthetase I (gene grsA) from Brevibacillus brevis (Bacillus brevis). This enzyme catalyses the first step in the biosynthesis of the cyclic antibiotic gramicidin S, the ATP-dependent racemization of phenylalanine.
  • Tyrocidine synthetase I (gene tycA) from B. brevis. The reaction carried out by TycA is identical to that catalysed by GrsA.
  • Gramicidin S synthetase II (gene grsB) from B. brevis. This enzyme is a multifunctional protein that activates and polymerises proline, valine, ornithine and leucine. GrsB consists of four related domains.
  • Enterobactin synthetase components E (gene entE) and F (gene entF) from E. coli. These two enzymes are involved in the ATP-dependent activation of 2,3-dihydroxybenzoate and serine, respectively, during enterobactin (enterochelin) biosynthesis.
  • Cyclic peptide antibiotic surfactin synthase subunits 1, 2 and 3 from Bacillus subtilis. Subunits 1 and 2 each contain three related domains, while subunit 3 contains only a single domain.
  • HC-toxin synthetase (gene HTS1) from Cochliobolus carbonum (Bipolaris zeicola). This enzyme activates the four amino acids (Pro, L-Ala, D-Ala and 2-amino-9,10-epoxy-8-oxodecanoic acid) that make up HC-toxin, a cyclic tetrapeptide. HTS1 consists of four related domains.
There are also some proteins whose exact function is not yet known but which are very probably also AMP-binding enzymes. These proteins are:
  • ORA (octapeptide-repeat antigen), a Plasmodium falciparum protein whose function is not known but which shows a high degree of similarity with the above proteins.
  • AngR, a Vibrio anguillarum (Listonella anguillarum) protein. AngR is thought to be a transcriptional activator that modulates the anguibactin (an iron-binding siderophore) biosynthesis gene cluster operon. We believe, however, that AngR is not a DNA-binding protein but rather an enzyme involved in the biosynthesis of anguibactin. This conclusion is based on three facts: the presence of the AMP-binding domain; the size of AngR (1048 residues), which is far bigger than any bacterial transcriptional protein; and the presence of a probable S-acyl thioesterase immediately downstream of the gene for AngR.
  • A hypothetical protein in the MmsB 3' region in Pseudomonas aeruginosa.
  • E. coli hypothetical protein YdiD.
  • Yeast hypothetical protein YBR041w.
  • Yeast hypothetical protein YBR222c.
  • Yeast hypothetical protein YER147c.
All these proteins contain a highly conserved region very rich in glycine, serine and threonine, which is followed by a conserved lysine. A parallel can be drawn between this type of domain and the G-x(4)-G-K-[ST] ATP-/GTP-binding 'P-loop' domain or the protein kinase G-x-G-x(2)-[SG]-x(10,20)-K ATP-binding domains.
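The two signatures quoted at the end of the description above, G-x(4)-G-K-[ST] and G-x-G-x(2)-[SG]-x(10,20)-K, are written in PROSITE-style pattern notation. As a rough illustration of how such a pattern can be scanned against a sequence, the sketch below translates the notation into a Python regular expression. The function name prosite_to_regex and the example peptide are hypothetical and are not part of this database; only the subset of the notation used above (plus {..} exclusions) is handled.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern such as 'G-x(4)-G-K-[ST]' into a regex.

    Hypothetical helper for illustration; it covers residues, 'x' wildcards,
    [..] alternatives, {..} exclusions and (n) / (n,m) repeat counts only.
    """
    parts = []
    for element in pattern.split("-"):
        # Separate an optional repeat suffix, e.g. 'x(10,20)' -> 'x' and '10,20'.
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        residue, repeat = match.group(1), match.group(2)
        if residue == "x":
            residue = "."                          # any amino acid
        elif residue.startswith("{"):
            residue = "[^" + residue[1:-1] + "]"   # excluded residues
        parts.append(residue + ("{" + repeat + "}" if repeat else ""))
    return "".join(parts)

p_loop = prosite_to_regex("G-x(4)-G-K-[ST]")
kinase = prosite_to_regex("G-x-G-x(2)-[SG]-x(10,20)-K")

# Made-up peptide used only to exercise the P-loop pattern.
example_peptide = "MAAGSPESGKSTLL"
hit = re.search(p_loop, example_peptide)
print(p_loop, kinase, hit.group() if hit else None)
```

A plain regular expression is enough here because both signatures are fixed-order motifs with bounded gaps; profile-based signatures would need a different approach.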
Protein Domain
Name: Cardiolipin synthase N-terminal
Type: Domain
Description: This domain is often found at the very N terminus of proteins from the phospholipase D family, cardiolipin synthase subfamily. However, this domain is also found on its own in a large number of cases in proteins described as uncharacterised.
Protein Domain
Name: BRCT domain
Type: Domain
Description: The breast cancer susceptibility gene contains at its C terminus two copies of a conserved domain that was named BRCT for BRCA1 C terminus. This domain of about 95 amino acids is found in a large variety of proteins involved in DNA repair, recombination and cell cycle control. The BRCT domain is not limited to the C terminus of protein sequences and can be found in multiple copies or in a single copy, as in RAP1 and TdT. BRCT domains are often found as tandem-repeat pairs. Some data indicate that the BRCT domain functions as a protein-protein interaction module. The structure of the first of the two C-terminal BRCT domains of the human DNA repair protein XRCC1 has been determined by X-ray crystallography. Structures of the BRCA1 BRCT domains revealed a basis for a widely utilised head-to-tail BRCT-BRCT oligomerization mode. This conserved tandem BRCT architecture facilitates formation of the canonical BRCT phospho-peptide interaction cleft at a groove between the BRCT domains. BRCT domains disrupt peptide binding by directly occluding this peptide binding groove, or by disrupting key conserved BRCT core folding determinants.
Protein Domain
Name: DNA-binding pseudobarrel domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents a DNA recognition domain found in the restriction endonuclease EcoRII and numerous transcription factors. The EcoRII structure has been studied and forms an eight-stranded β-sheet with the strands in the order b2, b5, b4, b3, b7, b6, b1 and b8. The strands are mostly antiparallel to each other, except that b3 is parallel to b7. Alternatively, it may be viewed as consisting of two mini β-sheets of four antiparallel β-strands, sheet I from strands b2, b5, b4, b3 and sheet II from strands b7, b6, b1, b8, folded into an open mixed β-barrel with a novel topology. Sheet I has a simple Greek key motif while sheet II does not.
Protein Domain
Name: Strictosidine synthase, conserved region
Type: Domain
Description: This entry represents a conserved region found in strictosidine synthase, a key enzyme in alkaloid biosynthesis. It catalyses the stereospecific Pictet-Spengler condensation of tryptamine with secologanin to form strictosidine. The structure of the native enzyme from the Indian medicinal plant Rauvolfia serpentina (serpentwood, devilpepper) represents the first example of a six-bladed four-stranded β-propeller fold from the plant kingdom.
Protein Domain
Name: Six-bladed beta-propeller, TolB-like
Type: Homologous_superfamily
Description: This superfamily represents a six-bladed β-propeller domain consisting of six 4-stranded β-sheet motifs. This domain can be found in TolB proteins (C-terminal), in soluble quinoprotein glucose dehydrogenase, in calcium-dependent phosphotriesterases, in the low density lipoprotein (LDL) receptor YWTD domain, in nidogen, and in the serine/threonine-protein kinase (PknD) NHL repeat domain. TolB is a periplasmic protein from Escherichia coli that is part of the Tol-dependent translocation system, which group A and E colicins use to penetrate and kill cells. TolB has two domains: an α-helical N-terminal domain that shares structural similarity with the C-terminal domain of transfer RNA ligases, and a β-propeller C-terminal domain that shares structural similarity with numerous members of the prolyl oligopeptidase family and, to a lesser extent, with class B metallo-beta-lactamases (although it does not necessarily occur at the C terminus in these proteins). The C-terminal domain of TolB may mediate protein-protein interactions with colicins.
Protein Domain
Name: Purple acid phosphatase, N-terminal
Type: Domain
Description: Purple acid phosphatases (PAPs) are ubiquitous binuclear metal-containing acid hydrolases characterised by their acidic pH optima and their intense purple colour due to a TyrO-to-FeIII charge-transfer transition. The amino acid residues coordinating the metal ions are conserved in all PAPs. Active PAPs contain an FeIII ion coordinated to a Tyr O, a His N, and an Asp O2, in addition to a divalent metal ion (Fe, Zn, or Mn) coordinated by a His N, a His N, and an Asn O. A hydroxide ion and an Asp O2 bridge the two metal ions. These enzymes share a high degree of homology within their N-termini.
Protein Domain
Name: Purple acid phosphatase-like, N-terminal
Type: Homologous_superfamily
Description: Purple acid phosphatases (PAPs) are ubiquitous binuclear metal-containing acid hydrolases characterised by their acidic pH optima and their intense purple colour due to a TyrO-to-FeIII charge-transfer transition. The amino acid residues coordinating the metal ions are conserved in all PAPs. Active PAPs contain an FeIII ion coordinated to a Tyr O, a His N, and an Asp O2, in addition to a divalent metal ion (Fe, Zn, or Mn) coordinated by a His N, a His N, and an Asn O. A hydroxide ion and an Asp O2 bridge the two metal ions. These enzymes share a high degree of homology within their N-termini. This entry also includes phospholipase D, the structure of which has been solved from Bacillus subtilis and shows a purple acid phosphatase-like fold.
Protein Domain
Name: YABBY protein
Type: Family
Description: YABBY proteins are a group of plant-specific transcription factors involved in diverse aspects of leaf, shoot and flower development.
Protein Domain
Name: High mobility group box domain
Type: Domain
Description: High mobility group (HMG) box domains are involved in binding DNA, and may be involved in protein-protein interactions as well. The structure of the HMG-box domain consists of three helices in an irregular array. HMG-box domains are found in one or more copies in HMG-box proteins, which form a large, diverse family involved in the regulation of DNA-dependent processes such as transcription, replication, and strand repair, all of which require the bending and unwinding of chromatin. Many of these proteins are regulators of gene expression. HMG-box proteins are found in a variety of eukaryotic organisms, and can be broadly divided into two groups, based on sequence-dependent and sequence-independent DNA recognition; the former usually contain one HMG-box motif, while the latter can contain multiple HMG-box motifs. HMG-box domains can be found in single or multiple copies in the following protein classes: HMG1 and HMG2 non-histone components of chromatin; SRY (sex determining region Y protein) involved in differential gonadogenesis; the SOX family of transcription factors; sequence-specific LEF1 (lymphoid enhancer binding factor 1) and TCF-1 (T-cell factor 1) involved in regulation of organogenesis and thymocyte differentiation; structure-specific recognition protein SSRP involved in transcription and replication; MTF1 mitochondrial transcription factor; nucleolar transcription factors UBF 1/2 (upstream binding factor) involved in transcription by RNA polymerase I; Abf2 yeast ARS-binding factor; yeast transcription factors Ixr1, Rox1, Nhp6b and Spp41; mating type proteins (MAT) involved in the sexual reproduction of fungi; and the YABBY plant-specific transcription factors.
Protein Domain
Name: P-loop containing nucleoside triphosphate hydrolase
Type: Homologous_superfamily
Description: The P-loop NTPase fold is the most prevalent of the several distinct nucleotide-binding protein folds. The most common reaction catalysed by enzymes of the P-loop NTPase fold is the hydrolysis of the beta-gamma phosphate bond of a bound nucleoside triphosphate (NTP). The energy from NTP hydrolysis is typically utilised to induce conformational changes in other molecules, which constitutes the basis of the biological functions of most P-loop NTPases. P-loop NTPases show substantial substrate preference for either ATP or GTP. P-loop NTPases are characterised by two conserved sequence signatures, the Walker A motif (the P-loop proper) and the Walker B motif, which bind, respectively, the beta and gamma phosphate moieties of the bound NTP and a Mg2+ cation. P-loop ATPase domains belong to one of two major divisions: the kinase-GTPase (KG) division, which includes the kinases and GTPases, and the ASCE division, which is characterised by an additional strand in the core sheet located between the P-loop strand and the Walker B strand. Most members of the ASCE division utilise ATP; members of this group include the AAA+, ABC, PilT, HerA-FtsK, superfamily 1/2 (SF1/2) helicase, and RecA/ATP-synthase superfamilies of ATPases.
Protein Domain
Name: Translational (tr)-type GTP-binding domain
Type: Domain
Description: Translational GTPases (trGTPases) are a family of proteins in which GTPase activity is stimulated by the large ribosomal subunit. This family includes translation initiation, elongation, and release factors and contains four subfamilies that are widespread, if not ubiquitous, in all three superkingdoms. The trGTPase family members include the bacterial elongation factors EFTu and EFG and the initiation factor IF2, together with their archaeal homologues EF1, EF2, aeIF5b and aeIF2. They all contain two homologous N-terminal domains: a GTPase or G-domain, followed by an OB-domain. The G-domains of these translational proteins are both structurally and functionally related to a larger family of GTPase G proteins. This entry represents the G-domain of the trGTPases. The basic topology of the tr-type G domain consists of a six-stranded central β-sheet surrounded by five α-helices. Helices alpha2, alpha3 and alpha4 are on one side of the sheet, whereas alpha1 and alpha5 are on the other. GTP is bound by the CTF-type G domain in a way common for G domains, involving five conserved sequence motifs termed G1-G5. The base is in contact with the NKxD (G4) and SAx (G5) motifs, and the phosphates of the nucleotide are stabilized by main- and side-chain interactions with the P loop GxxxxGKT (G1). The most severe conformational changes are observed for the two switch regions, which contain the xT/Sx (G2) and DxxG (G3) motifs that function as sensors for the presence of the gamma-phosphate. A Mg(2+) ion is coordinated by six oxygen ligands with octahedral coordination geometry; two of the ligands are water molecules, two come from the beta- and gamma-phosphates, and two are provided by the side chains of the G1 and G2 threonines.
In both prokaryotes and eukaryotes, there are three distinct types of elongation factors: EF-1alpha (EF-Tu), which binds GTP and an aminoacyl-tRNA and delivers the latter to the A site of ribosomes; EF-1beta (EF-Ts), which interacts with EF-1alpha/EF-Tu to displace GDP and thus allows the regeneration of GTP-EF-1alpha; and EF-2 (EF-G), which binds GTP and peptidyl-tRNA and translocates the latter from the A site to the P site. In EF-1alpha, a specific region has been shown to be involved in a conformational change mediated by the hydrolysis of GTP to GDP. This region is conserved in both EF-1alpha/EF-Tu and EF-2/EF-G and thus seems typical for GTP-dependent proteins which bind non-initiator tRNAs to the ribosome. The GTP-binding protein synthesis factor family also includes the eukaryotic peptide chain release factor GTP-binding subunits and prokaryotic peptide chain release factor 3 (RF-3); the prokaryotic GTP-binding protein lepA and its homologues in yeast (GUF1) and Caenorhabditis elegans (ZK1236.1); yeast HBS1; rat statin S1; and the prokaryotic selenocysteine-specific elongation factor selB.
Protein Domain
Name: Translation initiation factor IF-2
Type: Family
Description: Initiation factor 2 (IF-2) is one of the three factors required for the initiation of protein biosynthesis in bacteria. IF-2 promotes the GTP-dependent binding of the initiator tRNA to the small subunit of the ribosome. IF-2 is a protein of about 70 to 95 kDa that contains a central GTP-binding domain flanked by a highly variable N-terminal domain and a more conserved C-terminal domain. Some members of this group undergo protein self-splicing, which involves post-translational excision of an intein followed by peptide ligation. The function of IF-2 in facilitating the proper binding of initiator methionyl-tRNA to the ribosomal P site appears to be universally conserved.
Protein Domain
Name: Translation protein, beta-barrel domain superfamily
Type: Homologous_superfamily
Description: A β-barrel of circularly permuted topology is found in many translation proteins, including initiation and elongation factors, and also in some ribosomal proteins, although in these cases the fold is elaborated with additional structures. The β-barrel domain is represented by domain 2 of the elongation factors EF-Tu and eEF1A, both of which function to recognise and transport aminoacyl-tRNA to the acceptor (A) site of the ribosome during the elongation process, and of EF-G, which functions in translocating the peptidyl-tRNA from the A site to the peptidyl (P) site. This domain is also present in initiation factors, in domain 2 of the eIF2 gamma subunit and domains 2 and 4 of IF2/eIF5B, both of which function to transport the initiator methionyl-tRNA to the ribosome. This β-barrel domain may be involved in interactions with the switch 2 region to stabilise the relative orientations of the domains, which undergo functionally important conformational changes between GTP- and GDP-bound states.
Protein Domain
Name: Translation elongation factor EFTu-like, domain 2
Type: Domain
Description: Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation, and both EF1A and EF2 are remarkably conserved throughout evolution. EF1A (also known as EF-1alpha or EF-Tu) is a G-protein. It forms a ternary complex of EF1A-GTP-aminoacyl-tRNA. The binding of aminoacyl-tRNA stimulates GTP hydrolysis by EF1A, causing a conformational change in EF1A that causes EF1A-GDP to detach from the ribosome, leaving the aminoacyl-tRNA attached at the A-site. Only the cognate aminoacyl-tRNA can induce the required conformational change in EF1A through its tight anticodon-codon binding. EF1A-GDP is returned to its active state, EF1A-GTP, through the action of another elongation factor, EF1B (also known as EF-Ts or EF-1beta/gamma/delta). EF1A consists of three structural domains. This entry represents domain 2 of EF1A, which adopts a β-barrel structure and is involved in binding charged tRNA. This domain is structurally related to the C-terminal domain of EF2, to which it displays weak sequence matches. This domain is also found in other proteins such as translation initiation factor IF-2 and tetracycline-resistance proteins.
Protein Domain
Name: Small GTP-binding protein domain
Type: Domain
Description: Proteins with a small GTP-binding domain include Ras, RhoA, Rab11, translation elongation factor G, translation initiation factor IF-2, tetracycline resistance protein TetM, CDC42, Era, ADP-ribosylation factors, tdhF, and many others. In some proteins the domain occurs more than once. Among them there is a large number of small GTP-binding proteins and related domains in larger proteins. Note that the alpha chains of heterotrimeric G proteins are larger proteins in which the NKXD motif is separated from the GxxxxGK[ST] motif (P-loop) by a long insert and are not easily detected by this model.
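The two motifs named in the description above can be written as plain regular expressions. The snippet below is only a naive illustration, assuming Python: it is not the model used for this entry (which, as noted, does not rely on simple pattern matching), and the sequence TOY_SEQ and the function find_motifs are hypothetical names made up for the example.

```python
import re

# GxxxxGK[ST] (P-loop) and NKXD, where x/X stands for any residue.
P_LOOP = re.compile(r"G.{4}GK[ST]")
NKXD = re.compile(r"NK.D")

# Hypothetical toy sequence, invented for illustration only.
TOY_SEQ = "MKVGAGGVGKSAAAAANKCDLL"


def find_motifs(seq):
    """Return two lists of (0-based start, matched text): P-loop hits, NKXD hits."""
    seq = seq.upper()
    p_loops = [(m.start(), m.group()) for m in P_LOOP.finditer(seq)]
    nkxds = [(m.start(), m.group()) for m in NKXD.finditer(seq)]
    return p_loops, nkxds


if __name__ == "__main__":
    p_loops, nkxds = find_motifs(TOY_SEQ)
    print("P-loop hits:", p_loops)
    print("NKXD hits:", nkxds)
```

A pattern scan like this reports every literal match, so it will both over-call (short motifs occur by chance) and miss diverged sites; it is useful mainly for locating the motifs in a sequence already known to contain the domain.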
Protein Domain
Name: Ccc1 family
Type: Family
Description: This entry represents the Ccc1 family, which consists of a group of putative vacuolar ion transporters. Proteins in this family include yeast Ccc1, which has a role in calcium and manganese homeostasis, and Arabidopsis VIT1, which serves as a vacuolar Fe2+ uptake transporter.
Protein Domain
Name: Basic-leucine zipper domain
Type: Domain
Description: The basic-leucine zipper (bZIP) domain transcription factors of eukaryotes are proteins that contain a basic region mediating sequence-specific DNA binding, followed by a leucine zipper region required for dimerisation. Several structures of bZIP domains have been solved. The basic region and the leucine zipper form a contiguous α-helix where the four hydrophobic residues of the leucine zipper are oriented on one side. This conformation allows dimerisation in parallel and bends the helices so that the newly functional dimer forms a flexible fork where the basic domains, at the N-terminal open end, can then interact with DNA. The two leucine zippers are therefore oriented perpendicular to the DNA.
Protein Domain
Name: Fructose-1,6-bisphosphatase class 1
Type: Family
Description: This entry represents the fructose-1,6-bisphosphatase (FBPase) class 1 family. FBPase is a critical regulatory enzyme in gluconeogenesis that catalyses the removal of 1-phosphate from fructose 1,6-bisphosphate to form fructose 6-phosphate. It is involved in many different metabolic pathways and found in most organisms. FBPase requires metal ions for catalysis (Mg2+ and Mn2+ being preferred) and the enzyme is potently inhibited by Li+. The fold of fructose-1,6-bisphosphatase was noted to be identical to that of inositol-1-phosphatase (IMPase). Inositol polyphosphate 1-phosphatase (IPPase), IMPase and FBPase share a sequence motif (Asp-Pro-Ile/Leu-Asp-Gly/Ser-Thr/Ser) which has been shown to bind metal ions and participate in catalysis. This motif is also found in the distantly-related fungal, bacterial and yeast IMPase homologues. It has been suggested that these proteins define an ancient structurally conserved family involved in diverse metabolic pathways, including inositol signalling, gluconeogenesis, sulphate assimilation and possibly quinone metabolism. This entry also includes sedoheptulose-1,7-bisphosphatase, which is a member of the FBPase class 1 family.
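As a small worked example, the shared motif quoted above (Asp-Pro-Ile/Leu-Asp-Gly/Ser-Thr/Ser) can be converted to one-letter code and used as a search pattern. This is a minimal sketch, assuming Python; the helper motif_to_regex and the test sequences are hypothetical and introduced only for illustration, and a literal pattern match is a much cruder test than the family model itself.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Motif exactly as written in the description above.
FBPASE_MOTIF = "Asp-Pro-Ile/Leu-Asp-Gly/Ser-Thr/Ser"


def motif_to_regex(motif):
    """Turn 'Aaa-Bbb/Ccc-...' into a regex; '/' marks alternatives at one position."""
    parts = []
    for position in motif.split("-"):
        letters = "".join(THREE_TO_ONE[res] for res in position.split("/"))
        parts.append(letters if len(letters) == 1 else "[" + letters + "]")
    return "".join(parts)


PATTERN = re.compile(motif_to_regex(FBPASE_MOTIF))

if __name__ == "__main__":
    print(motif_to_regex(FBPASE_MOTIF))
    # Hypothetical toy sequence, invented for illustration only.
    match = PATTERN.search("AAADPIDGTKK")
    print(match.start(), match.group())
```

Writing the converter rather than hard-coding the pattern keeps the one-letter form traceable to the three-letter motif as it appears in the text.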
Protein Domain
Name: Ubiquitin interacting motif
Type: Conserved_site
Description: The Ubiquitin Interacting Motif (UIM), or 'LALAL-motif', is a stretch of about 20 amino acid residues, which was first described in the 26S proteasome subunit PSD4/RPN-10 that is known to recognise ubiquitin. In addition, the UIM is found, often in tandem or triplet arrays, in a variety of proteins either involved in ubiquitination and ubiquitin metabolism, or known to interact with ubiquitin-like modifiers. Among the UIM proteins are two different subgroups of the UBP (ubiquitin carboxy-terminal hydrolase) family of deubiquitinating enzymes, one F-box protein, one family of HECT-containing ubiquitin-ligases (E3s) from plants, and several proteins containing ubiquitin-associated UBA and/or UBX domains. In most of these proteins, the UIM occurs in multiple copies and in association with other domains such as UBA, UBX, ENTH, EH, VHS, SH3, HECT, VWFA, EF-hand calcium-binding, WD-40, F-box, LIM, protein kinase, ankyrin, PX, phosphatidylinositol 3- and 4-kinase, C2, OTU, dnaJ, RING-finger or FYVE-finger. UIMs have been shown to bind ubiquitin and to serve as a specific targeting signal important for monoubiquitination. Thus, UIMs may have several functions in ubiquitin metabolism, each of which may require different numbers of UIMs. The UIM is unlikely to form an independent folding domain. Instead, based on the spacing of the conserved residues, the motif probably forms a short α-helix that can be embedded into different protein folds. Some proteins known to contain a UIM are listed below: Eukaryotic PSD4/RPN-10/S5, a multi-ubiquitin binding subunit of the 26S proteasome. Vertebrate Machado-Joseph disease protein 1 (Ataxin-3), which acts as a histone-binding protein that regulates transcription; defects in Ataxin-3 cause the neurodegenerative disorder Machado-Joseph disease (MJD). Vertebrate epsin and epsin2.
Vertebrate hepatocyte growth factor-regulated tyrosine kinase substrate (HRS). Mammalian epidermal growth factor receptor substrate 15 (EPS15), which is involved in cell growth regulation. Mammalian epidermal growth factor receptor substrate EPS15R. Drosophila melanogaster (Fruit fly) liquid facets (lqf), an epsin. Yeast VPS27 vacuolar sorting protein, which is required for membrane traffic to the vacuole.
Protein Domain
Name: Plant organelle RNA recognition domain
Type: Domain
Description: The plant organelle RNA recognition (PORR) domain, previously known as DUF860, is a component of group II intron ribonucleoprotein particles in maize chloroplasts. It is required for the splicing of the introns with which it associates, and promotes splicing in the context of a heterodimer with the RNase III-domain protein RNC1. Proteins containing this domain are predicted to localise to mitochondria or chloroplasts. It seems likely that most PORR proteins function in organellar RNA metabolism.
Protein Domain      
Protein Domain
Name: Fructose-1,6-bisphosphatase, active site
Type: Active_site
Description: This entry represents fructose-1,6-bisphosphatase (FBPase), a critical regulatory enzyme in gluconeogenesis that catalyses the removal of 1-phosphate from fructose 1,6-bisphosphate to form fructose 6-phosphate. It is involved in many different metabolic pathways and found in most organisms. FBPase requires metal ions for catalysis (Mg2+ and Mn2+ being preferred) and the enzyme is potently inhibited by Li+. The fold of fructose-1,6-bisphosphatase was noted to be identical to that of inositol-1-phosphatase (IMPase). Inositol polyphosphate 1-phosphatase (IPPase), IMPase and FBPase share a sequence motif (Asp-Pro-Ile/Leu-Asp-Gly/Ser-Thr/Ser) which has been shown to bind metal ions and participate in catalysis. This motif is also found in the distantly-related fungal, bacterial and yeast IMPase homologues. It has been suggested that these proteins define an ancient structurally conserved family involved in diverse metabolic pathways, including inositol signalling, gluconeogenesis, sulphate assimilation and possibly quinone metabolism. In mammalian FBPase, a lysine residue has been shown to be involved in the catalytic mechanism. The region around this residue is highly conserved and can be used as a signature pattern for FBPase and sedoheptulose-1,7-bisphosphatase (SBPase), an enzyme found in plant chloroplasts and in photosynthetic bacteria that is functionally and structurally related to FBPase. SBPase catalyses the hydrolysis of sedoheptulose 1,7-bisphosphate to sedoheptulose 7-phosphate, a step in the Calvin cycle (the reductive pentose phosphate cycle). This signature contains the active site lysine; note, however, that in some bacterial FBPase sequences the active site lysine is replaced by an arginine.
Protein Domain
Name: Alpha/beta hydrolase fold-3
Type: Domain
Description: The α/β hydrolase fold is common to a number of hydrolytic enzymes of widely differing phylogenetic origin and catalytic function. The core of each enzyme is an α/β-sheet (rather than a barrel), containing 8 strands connected by helices. The enzymes are believed to have diverged from a common ancestor, preserving the arrangement of the catalytic residues. All have a catalytic triad, the elements of which are borne on loops, which are the best conserved structural features of the fold. Esterase (EST) from Pseudomonas putida is a member of the α/β hydrolase fold superfamily of enzymes. In most of the family members the β-strands are parallel, but some have an inversion of the first strands, which gives them an antiparallel orientation. The catalytic triad residues are presented on loops. One of these is the nucleophile elbow and is the most conserved feature of the fold. Some other members lack one or all of the catalytic residues. Some members are therefore inactive but others are involved in surface recognition. The ESTHER database gathers and annotates all the published information related to gene and protein sequences of this superfamily. This entry represents the catalytic domain fold-3 of α/β hydrolase.
Protein Domain
Name: FAS1 domain
Type: Domain
Description: The FAS1 (fasciclin-like) domain is an extracellular module of about 140 amino acid residues. It has been suggested that the FAS1 domain represents an ancient cell adhesion domain common to plants and animals; related FAS1 domains are also found in bacteria. The crystal structure of FAS1 domains 3 and 4 of fasciclin I from Drosophila melanogaster (Fruit fly) has been determined, revealing a novel domain fold consisting of a seven-stranded beta wedge and at least five alpha helices; two well-ordered N-acetylglucosamine groups attached to a conserved asparagine are located in the interface region between the two FAS1 domains. Fasciclin I is an insect neural cell adhesion molecule involved in axonal guidance that is attached to the membrane by a GPI anchor. FAS1 domains are present in many secreted and membrane-anchored proteins. These proteins are usually GPI anchored and consist of: (i) a single FAS1 domain, (ii) a tandem array of FAS1 domains, or (iii) FAS1 domain(s) interspersed with other domains. Proteins known to contain a FAS1 domain include: Fasciclin I (4 FAS1 domains). Human TGF-beta induced Ig-H3 (BIgH3) protein (4 FAS1 domains), where the FAS1 domains mediate cell adhesion through an interaction with alpha3/beta1 integrin; mutations in the FAS1 domains result in corneal dystrophy. Volvox major cell adhesion protein (2 FAS1 domains). Arabidopsis fasciclin-like arabinogalactan proteins (2 FAS1 domains). Mammalian stabilin protein, a family of fasciclin-like hyaluronan receptor homologues (7 FAS1 domains). Human extracellular matrix protein periostin (4 FAS1 domains). Bacterial immunogenic protein MPT70 (1 FAS1 domain). The FAS1 domains of both human periostin and BIgH3 proteins were found to contain vitamin K-dependent gamma-carboxyglutamate residues.
Gamma-carboxyglutamate residues are more commonly associated with GLA domains, where they occur through post-translational modification catalysed by the vitamin K-dependent enzyme gamma-glutamylcarboxylase.
Protein Domain
Name: RNA recognition motif domain
Type: Domain
Description: Many eukaryotic proteins containing one or more copies of a putative RNA-binding domain of about 90 amino acids are known to bind single-stranded RNAs. The largest group of single-strand RNA-binding proteins is the eukaryotic RNA recognition motif (RRM) family, which contains an eight amino acid RNP-1 consensus sequence. RRM proteins have a variety of RNA binding preferences and functions, and include heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing (SR, U2AF, Sxl), protein components of small nuclear ribonucleoproteins (U1 and U2 snRNPs), and proteins that regulate RNA stability and translation (PABP, La, Hu). The RRM in heterodimeric splicing factor U2 snRNP auxiliary factor (U2AF) appears to have two RRM-like domains with specialised features for protein recognition. The motif also appears in a few single-stranded DNA binding proteins. The typical RRM consists of four anti-parallel β-strands and two α-helices arranged in a β-α-β-β-α-β fold with side chains that stack with RNA bases. Specificity of RNA binding is determined by multiple contacts with surrounding amino acids. A third helix is present during RNA binding in some cases. The RRM is reviewed in a number of publications. This entry also includes some bacterial putative RNA-binding proteins.
Protein Domain
Name: Condensin complex subunit 2/barren
Type: Family
Description: This entry represents eukaryotic condensin complex subunit 2 proteins. Included in this group are Barren protein homologues from several eukaryotic organisms. In Drosophila, Barren (barr) is required for sister-chromatid segregation in mitosis. barr encodes a novel protein that is present in proliferating cells and has homologues in yeast and human. Mitotic defects in barr embryos become apparent during cycle 16, resulting in a loss of PNS and CNS neurons. Centromeres move apart at the metaphase-anaphase transition and Cyclin B is degraded, but sister chromatids remain connected, resulting in chromatin bridging. Barren protein localises to chromatin throughout mitosis. Colocalisation and biochemical experiments indicate that Barren associates with Topoisomerase II throughout mitosis and alters the activity of Topoisomerase II. It has been suggested that this association is required for proper chromosomal segregation by facilitating the decatenation of chromatids at anaphase.
Protein Domain
Name: Tetratricopeptide repeat
Type: Repeat
Description: The tetratricopeptide repeat (TPR) is a structural motif present in a wide range of proteins. It mediates protein-protein interactions and the assembly of multiprotein complexes. The TPR motif consists of 3-16 tandem repeats of 34 amino acid residues, although individual TPR motifs can be dispersed in the protein sequence. Sequence alignment of the TPR domains reveals a consensus sequence defined by a pattern of small and large amino acids. TPR motifs have been identified in various different organisms, ranging from bacteria to humans. Proteins containing TPRs are involved in a variety of biological processes, such as cell cycle regulation, transcriptional control, mitochondrial and peroxisomal protein transport, neurogenesis and protein folding. The X-ray structure of a domain containing three TPRs from protein phosphatase 5 revealed that TPR adopts a helix-turn-helix arrangement, with adjacent TPR motifs packing in a parallel fashion, resulting in a spiral of repeating anti-parallel α-helices. The two helices are denoted helix A and helix B. The packing angle between helix A and helix B is ~24 degrees within a single TPR and generates a right-handed superhelical shape. Helix A interacts with helix B and with helix A' of the next TPR. Two protein surfaces are generated: the inner concave surface is contributed mainly by residues on helices A, and the other surface presents residues from both helices A and B.
Protein Domain
Name: Peptidyl-prolyl cis-trans isomerase Fpr3/Fpr4-like
Type: Family
Description: FK506-binding proteins (FKBPs) are a particular class of peptidyl-prolyl cis-trans isomerases (PPIases). This entry represents a group of nuclear FK506-binding proteins that act as histone chaperones. They each contain an extended acidic domain in addition to the conserved FK506-binding/peptidylprolyl isomerase (PPIase) domain. The PPIase domain has been shown to regulate histone H3 methylation, while the acidic domain has been shown to facilitate histone deposition and may regulate rDNA silencing. This entry includes Fpr3 and its paralogue, Fpr4, from budding yeasts. They have been shown to affect genome-wide gene transcription. This entry also includes AtFKBP53 from Arabidopsis and SpFkbp39p from Schizosaccharomyces pombe. AtFKBP53 possesses histone chaperone activity and is required for repressing ribosomal gene expression in Arabidopsis. SpFkbp39p is a histone chaperone regulating rDNA silencing.
Protein Domain
Name: Tetratricopeptide-like helical domain superfamily
Type: Homologous_superfamily
Description: The tetratricopeptide repeat (TPR) is a structural motif present in a wide range of proteins. It mediates protein-protein interactions and the assembly of multiprotein complexes. The TPR motif consists of 3-16 tandem repeats of 34 amino acid residues, although individual TPR motifs can be dispersed in the protein sequence. Sequence alignment of the TPR domains reveals a consensus sequence defined by a pattern of small and large amino acids. TPR motifs have been identified in various different organisms, ranging from bacteria to humans. Proteins containing TPRs are involved in a variety of biological processes, such as cell cycle regulation, transcriptional control, mitochondrial and peroxisomal protein transport, neurogenesis and protein folding. The X-ray structure of a domain containing three TPRs from protein phosphatase 5 revealed that TPR adopts a helix-turn-helix arrangement, with adjacent TPR motifs packing in a parallel fashion, resulting in a spiral of repeating anti-parallel α-helices. The two helices are denoted helix A and helix B. The packing angle between helix A and helix B is ~24 degrees within a single TPR and generates a right-handed superhelical shape. Helix A interacts with helix B and with helix A' of the next TPR. Two protein surfaces are generated: the inner concave surface is contributed mainly by residues on helices A, and the other surface presents residues from both helices A and B. The domain represented in this superfamily consists of a multi-helical fold comprising two curved layers of α-helices arranged in a regular right-handed superhelix, where the repeats that make up this structure are arranged about a common axis. These superhelical structures present an extensive solvent-accessible surface that is well suited to binding large substrates such as proteins and nucleic acids.
The TPR is likely to be an ancient repeat, since it is found in eukaryotes, bacteria and archaea, whereas the PPR repeat is found predominantly in higher plants. The superhelix formed from these repeats can bind ligands at a number of different regions, and has the ability to acquire multiple functional roles.
Protein Domain      
Protein Domain
Name: HhH-GPD domain
Type: Domain
Description: The HhH-GPD superfamily gets its name from its hallmark helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate. This domain is found in a diverse range of structurally related DNA repair proteins that include endonuclease III and DNA glycosylase MutY, an A/G-specific adenine glycosylase. Both of these enzymes have a C-terminal iron-sulphur cluster loop (FCL). The methyl-CpG-binding protein MBD4 also contains a related domain that is a thymine DNA glycosylase. The family also includes DNA-3-methyladenine glycosylase II, 8-oxoguanine DNA glycosylases and other members of the AlkA family.
Protein Domain
Name: DNA glycosylase
Type: Homologous_superfamily
Description: DNA glycosylases act to repair oxidative damage in DNA. These proteins are redundant, as there are several different types of DNA glycosylases that are able to compensate for one another. Examples include the endonuclease III subfamily, the mismatch glycosylases subfamily, the 3-methyladenine DNA glycosylases I subfamily, and the DNA repair glycosylases subfamily.
Protein Domain
Name: Dienelactone hydrolase
Type: Domain
Description: Dienelactone hydrolases play a crucial role in chlorocatechol degradation via the modified ortho cleavage pathway. Enzymes induced in 4-fluorobenzoate-utilizing bacteria have been classified into three groups on the basis of their specificity towards cis- and trans-dienelactone. Some proteins contain repeated small fragments of this domain (for example rat kan-1 protein).
Protein Domain
Name: Protein of unknown function DUF1068
Type: Family
Description: This family consists of several hypothetical plant proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown.
Protein Domain
Name: Cellulose synthase
Type: Family
Description: Cellulose, an aggregate of unbranched polymers of beta-1,4-linked glucose residues, is the major component of wood and thus paper, and is synthesized by plants, most algae, some bacteria and fungi, and even some animals. The genes that synthesize cellulose in higher plants differ greatly from the well-characterised genes found in Acetobacter and Agrobacterium spp. More correctly designated as "cellulose synthase catalytic subunits", plant cellulose synthase (CesA) proteins are integral membrane proteins, approximately 1,000 amino acids in length. There are a number of highly conserved residues, including several motifs shown to be necessary for processive glycosyltransferase activity.
Protein Domain
Name: CCT domain
Type: Domain
Description: The CCT (CONSTANS, CO-like, and TOC1) domain is a highly conserved basic module of ~43 amino acids, which is found near the C terminus of plant proteins often involved in light signal transduction. The CCT domain is found in association with other domains, such as the B-box zinc finger, the GATA-type zinc finger, the ZIM motif or the response regulatory domain. The CCT domain contains a putative nuclear localisation signal within the second half of the CCT motif; it has been shown to be involved in nuclear localisation and probably also has a role in protein-protein interaction.
Protein Domain
Name: B-box-type zinc finger
Type: Domain
Description: This entry represents B-box-type zinc finger domains, which are around 40 residues in length. B-box zinc fingers can be divided into two groups, where types 1 and 2 B-box domains differ in their consensus sequence and in the spacing of the 7-8 zinc-binding residues. Several proteins contain both types 1 and 2 B-boxes, suggesting some level of cooperativity between these two domains. B-box domains are found in over 1500 proteins from a variety of organisms. They are found in TRIM (tripartite motif) proteins that consist of an N-terminal RING finger (originally called an A-box), followed by 1-2 B-box domains and a coiled-coil domain (also called RBCC for Ring, B-box, Coiled-Coil). TRIM proteins contain a type 2 B-box domain, and may also contain a type 1 B-box. In proteins that do not contain RING or coiled-coil domains, the B-box domain is primarily type 2. Many type 2 B-box proteins are involved in ubiquitination. Proteins containing a B-box zinc finger domain include transcription factors, ribonucleoproteins and proto-oncoproteins; for example, MID1, MID2, TRIM9, TNL, TRIM36, TRIM63, TRIFIC, NCL1 and CONSTANS-like proteins. The microtubule-associated E3 ligase MID1 contains a type 1 B-box zinc finger domain. MID1 specifically binds Alpha-4, which in turn recruits the catalytic subunit of phosphatase 2A (PP2Ac). This complex is required for targeting of PP2Ac for proteasome-mediated degradation. The MID1 B-box coordinates two zinc ions and adopts a β/β/α cross-brace structure similar to that of ZZ, PHD, RING and FYVE zinc fingers.
Protein Domain
Name: Serine/threonine-protein kinase, active site
Type: Active_site
Description: Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process. Protein kinases fall into three broad classes, characterised with respect to substrate specificity: serine/threonine-protein kinases; tyrosine-protein kinases; and dual specificity protein kinases (e.g. MEK, which phosphorylates both Thr and Tyr on target proteins). Protein kinase function is evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases. Eukaryotic protein kinases are enzymes that belong to a very extensive family of proteins which share a conserved catalytic core common to both serine/threonine and tyrosine protein kinases. There are a number of conserved regions in the catalytic domain of protein kinases. In the N-terminal extremity of the catalytic domain there is a glycine-rich stretch of residues in the vicinity of a lysine residue, which has been shown to be involved in ATP binding. In the central part of the catalytic domain there is a conserved aspartic acid residue, which is important for the catalytic activity of the enzyme.
This signature contains the active site aspartate residue.
Protein Domain      
Protein Domain
Name: Protein kinase domain
Type: Domain
Description: Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process. Protein kinases fall into three broad classes, characterised with respect to substrate specificity: serine/threonine-protein kinases; tyrosine-protein kinases; and dual specificity protein kinases (e.g. MEK, which phosphorylates both Thr and Tyr on target proteins). Protein kinase function is evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases. Eukaryotic protein kinases are enzymes that belong to a very extensive family of proteins which share a conserved catalytic core common to both serine/threonine and tyrosine protein kinases. There are a number of conserved regions in the catalytic domain of protein kinases. In the N-terminal extremity of the catalytic domain there is a glycine-rich stretch of residues in the vicinity of a lysine residue, which has been shown to be involved in ATP binding.
In the central part of the catalytic domain there is a conserved aspartic acid residue which is important for the catalytic activity of the enzyme. This entry represents the protein kinase domain containing the catalytic function of protein kinases. This domain is found in serine/threonine-protein kinases, tyrosine-protein kinases and dual specificity protein kinases.
Protein Domain
Name: Protein kinase, ATP binding site
Type: Binding_site
Description: Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process. Protein kinases fall into three broad classes, characterised with respect to substrate specificity:
  • Serine/threonine-protein kinases
  • Tyrosine-protein kinases
  • Dual specificity protein kinases (e.g. MEK, which phosphorylates both Thr and Tyr on target proteins)
Protein kinase function is evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases.
Eukaryotic protein kinases are enzymes that belong to a very extensive family of proteins which share a conserved catalytic core common to both serine/threonine and tyrosine protein kinases. There are a number of conserved regions in the catalytic domain of protein kinases. This entry represents a conserved site, which is located in the N-terminal extremity of the catalytic domain, where there is a glycine-rich stretch of residues in the vicinity of a lysine residue. It is this lysine residue that has been shown to be involved in ATP binding.
Protein Domain
Name: Aminoacyl-tRNA synthetase, class II
Type: Domain
Description: The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group), belong to class II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c. This entry recognises all class II enzymes except for heterodimeric glycyl-tRNA synthetases and alanyl-tRNA synthetases.
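As a rough illustration of the class assignments listed in this description, the lookup below transcribes them into Python. The names CLASS_I, CLASS_II, BOTH, PREFERRED_HYDROXYL and synthetase_classes are illustrative assumptions and not part of any database API. Lysine is returned for both classes because the description assigns some lysine synthetases to each.

```python
# Class assignments transcribed from the description (three-letter codes).
CLASS_I = {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
CLASS_II = {"Ala", "Asn", "Asp", "Gly", "His", "Phe", "Pro", "Ser", "Thr"}
BOTH = {"Lys"}  # some lysine synthetases fall in class I, others in class II

# tRNA hydroxyl to which the aminoacyl group is coupled, per class.
PREFERRED_HYDROXYL = {"I": "2'-OH", "II": "3'-OH"}


def synthetase_classes(amino_acid):
    """Return the set of synthetase classes for a three-letter amino acid code."""
    code = amino_acid.capitalize()
    classes = set()
    if code in CLASS_I or code in BOTH:
        classes.add("I")
    if code in CLASS_II or code in BOTH:
        classes.add("II")
    if not classes:
        raise ValueError(f"unknown amino acid code: {amino_acid!r}")
    return classes


if __name__ == "__main__":
    for code in ("Thr", "Arg", "Lys"):
        found = sorted(synthetase_classes(code))
        print(code, found, [PREFERRED_HYDROXYL[c] for c in found])
```

The two sets plus lysine cover all 20 standard amino acids, which is a quick consistency check on the transcription.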
Protein Domain
Name: Anticodon-binding
Type: Domain
Description: tRNA synthetases, or tRNA ligases are involved in protein synthesis. This domain is found in histidyl, glycyl, threonyl and prolyl tRNA synthetases [ ]. It is probably the anticodon binding domain [].
Protein Domain
Name: Threonyl/alanyl tRNA synthetase, SAD
Type: Domain
Description: The catalytically active form of threonyl/alanyl tRNA synthetase is a dimer. Within the tRNA synthetase class II dimer, the bound tRNA interacts with both monomers making specific interactions with the catalytic domain, the C-terminal domain, and this SAD domain (the second additional domain). The second additional domain is comprised of a pair of perpendicularly orientated antiparallel beta sheets, of four and three strands, respectively, that surround a central alpha helix that forms the core of the domain [ ].
Protein Domain
Name: Threonyl/alanyl tRNA synthetase, class II-like, putative editing domain superfamily
Type: Homologous_superfamily
Description: The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction [ , ]. These proteins differ widely in size and oligomeric state, and have limited sequence homology [ ]. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossman fold catalytic domain and are mostly monomeric []. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices [], and are mostly dimeric or multimeric, containing at least three conserved regions [, , ]. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group), belong to class-II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c [].This superfamily represents a structural domain containing a two-layer core alpha/beta structure: α-β(2)-α-β(2). 
This domain is thought to be a putative editing domain found in the N-terminal part of threonyl-tRNA synthetase (ThrRS), the C-terminal of alanyl-tRNA synthetase (AlaRS), and as the stand-alone hypothetical protein from the archaea Pyrococcus horikoshii [ ]; probable circular permutation of LuxS [, , ].
Protein Domain
Name: Aminoacyl-tRNA synthetase, class II (G/ P/ S/T)
Type: Domain
Description: The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction [ , ]. These proteins differ widely in size and oligomeric state, and have limited sequence homology []. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossman fold catalytic domain and are mostly monomeric []. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices [], and are mostly dimeric or multimeric, containing at least three conserved regions [, , ]. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group), belong to class-II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c [].This domain is the core catalytic domain of tRNA synthetases and includes glycyl, prolyl, seryl and threonyl tRNA synthetases.
Protein Domain
Name: Threonine-tRNA ligase, class IIa
Type: Family
Description: Threonine-tRNA ligase (also known as Threonyl-tRNA synthetase) ( ) exists as a monomer and belongs to class IIa. The enzyme from Escherichia coli represses the translation of its own mRNA. The crystal structure of the complex between tRNA(Thr) and ThrRS show structural features that reveal novel strategies for providing specificity in tRNA selection. These include an amino-terminal domain containing a novel protein fold that makes minor groove contacts with the tRNA acceptor stem. The enzyme induces a large deformation of the anticodon loop, resulting in an interaction between two adjacent anticodon bases, which accounts for their prominent role in tRNA identity and translational regulation. A zinc ion found in the active site is implicated in amino acid recognition/discrimination []. The zinc ion may act to ensure that only amino acids that possess a hydroxyl group attached to the β-position are activated [].The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction [, ]. These proteins differ widely in size and oligomeric state, and have limited sequence homology []. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossman fold catalytic domain and are mostly monomeric []. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices [], and are mostly dimeric or multimeric, containing at least three conserved regions [, , ]. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. 
The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group), belong to class-II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c [].
Protein Domain
Name: Pentatricopeptide repeat
Type: Repeat
Description: This entry represents the PPR repeat. Pentatricopeptide repeat (PPR) proteins are characterised by tandem repeats of a degenerate 35 amino acid motif. PPR proteins are sequence-specific RNA-binding proteins that are involved in multiple aspects of RNA metabolism. They can bind a diversity of sequences, which accounts for the variability in their functions. Most have roles in mitochondria or plastids. PPR repeats were discovered while screening Arabidopsis proteins for those predicted to be targeted to mitochondria or chloroplasts. Some of these proteins have been shown to play a role in post-transcriptional processes within organelles. Plant genomes have between one hundred and five hundred PPR genes per genome, whereas non-plant genomes encode two to six PPR proteins. The plant PPR protein family has been divided into two subfamilies on the basis of their motif content and organisation. The crystal structure of maize chloroplast PPR10 has been reported. The nineteen repeats of PPR10 are assembled into a right-handed superhelical spiral. PPR10 forms a homodimer and exhibits considerable conformational changes upon binding to its ssRNA target, with six nucleotides being specifically recognized by six corresponding PPR10 repeats.
Protein Domain
Name: Domain of unknown function DUF4283
Type: Domain
Description: This domain is found in plant proteins including the uncharacterized protein At4g02000. Considering the very diverse range of other domains it is associated with, it is possible that this domain is a binding/guiding region. It contains two highly conserved tryptophan residues.
Protein Domain
Name: Acyl-CoA N-acyltransferase
Type: Homologous_superfamily
Description: This entry represents a structural domain found in several acyl-CoA acyltransferase enzymes. This domain has a 3-layer α/β/α structure that contains mixed β-sheets, and can be found in the following proteins:N-acetyl transferase (NAT) family members, including aminoglycoside N-acetyltransferases [ ], the histone acetyltransferase domain of P300/CBP associating factor PCAF [], the catalytic domain of GCN5 histone acetyltransferase [], and diamine acetyltransferase 1 [].Autoinducer synthetases, such as protein LasI [ ] and acyl-homoserinelactone synthase EsaI [].Leucyl/phenylalanyl-tRNA-protein transferase (LFTR), a close relative of the non-ribosomal peptidyltransferases; there is a deletion of the N-terminal half of the N-terminal NAT-like domain after the domain duplication/swapping events [ ].Ornithine decarboxylase antizyme, which may have evolved a different function for this domain, although the putative active site maps to the same location in the common fold.Arginine N-succinyltransferase, alpha chain, AstA, which contains an extra C-terminal domain that is similar to the double-psi β-barrel fold domain (missing one strand and untangled ψ-loops).Several proteins carry a duplication of this domain, which consists of two NAT-like domains swapped with the C-terminal strands, including:N-myristoyl transferase (NMT) [ ].FemXAB non ribosomal peptidyl transferases, including methicillin-resistance protein FemA (transfer glycyl residue from tRNA-Gly) [ ] and peptidyl transferase FemX [].Hypothetical protein cg14615-pa from Drosophila melanogaster (Fruit fly).
Protein Domain
Name: GNAT domain
Type: Domain
Description: The N-acetyltransferases (NAT) ([intenz:2.3.1.-]) are enzymes that use acetyl coenzyme A (CoA) to transfer an acetyl group to a substrate, a reaction implicated in various functions from bacterial antibiotic resistance to mammalian circadian rhythm and chromatin remodelling. The Gcn5-related N-acetyltransferases (GNAT) catalyse the transfer of the acetyl from the CoA donor to a primary amine of the acceptor. The GNAT proteins share a domain composed of four conserved sequence motifs A-D [, ]. This GNAT domain is named after yeast GCN5 (from General Control Nonrepressed) and related histone acetyltransferases (HATs) like Hat1 and PCAF. HATs acetylate lysine residues of N-terminal histone tails, resulting in transcription activation. Another category of GNAT, the aminoglycoside N-acetyltransferases, confer antibiotic resistance by catalysing the acetylation of amino groups in aminoglycoside antibiotics []. GNAT proteins can also have anabolic and catabolic functions in both prokaryotes and eukaryotes [, , , , ].The acetyltransferase/GNAT domain forms a structurally conserved fold of 6 to 7 β-strands (B) and 4 helices (H) in the topology B1-H1-H2-B2-B3-B4-H3-B5-H4-B6, followed by a C-terminal strand which may be from the same monomer or contributed by another [ , ]. Motifs D (B2-B3), A (B4-H3) and B (B5-H4) are collectively called the HAT core [, , ], while the N-terminal motif C (B1-H1) is less conserved.Some proteins known to contain a GNAT domain:Actinobacterial mycothiol acetyltransferase (MshD), which catalyses the transfer of acetyl from acetyl-CoA to desacetylmycothiol to form mycothiol. 
Yeast GCN5 and Hat1, which are histone acetyltransferases (EC 2.3.1.48). Human PCAF, a histone acetyltransferase. Mammalian serotonin N-acetyltransferase (SNAT) or arylalkylamine NAT (AANAT), which acetylates serotonin into a circadian neurohormone that may participate in light-dark rhythms, and human mood and behaviour. Mammalian glucosamine 6-phosphate N-acetyltransferase (GNA1) (EC 2.3.1.4). Escherichia coli RimI and RimJ, which acetylate the N-terminal alanine of ribosomal proteins S18 and S5, respectively (EC 2.3.1.128). Mycobacterium tuberculosis aminoglycoside 2'-N-acetyltransferase (Aac), which acetylates the 2' hydroxyl or amino group of a broad spectrum of aminoglycoside antibiotics. Bacillus subtilis BltD and PaiA, which acetylate spermine and spermidine. This entry represents the entire GNAT domain.
Protein Domain
Name: Myb/SANT-like domain
Type: Domain
Description: This domain, found in L10-interacting MYB domain-containing protein (LIMYB) and At2g29880 from Arabidopsis, is related to Myb/SANT-like DNA binding domains. LIMYB is a transcription factor that associates with ribosomal protein promoters. It is involved in global translation suppression as an antiviral immunity strategy in plants [ ].
Protein Domain
Name: Zinc knuckle CX2CX4HX4C
Type: Domain
Description: This zinc knuckle is a zinc binding motif composed of the sequence CX2CX4HX4C where X can be any amino acid.
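The motif notation above maps directly onto a regular expression. The sketch below is one way to scan a protein sequence for it. The function name find_zinc_knuckles and the toy sequence are hypothetical, and a plain pattern match is only a first-pass filter, not evidence that a region actually binds zinc.

```python
import re
from typing import List, Tuple

# CX2CX4HX4C: Cys, any 2 residues, Cys, any 4 residues, His, any 4 residues, Cys.
ZINC_KNUCKLE = re.compile(r"C.{2}C.{4}H.{4}C")


def find_zinc_knuckles(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in ZINC_KNUCKLE.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real protein.
    toy = "MKACAACGKEGHIARDCPQ"
    print(find_zinc_knuckles(toy))
```

Note that `finditer` reports non-overlapping matches only, so closely spaced knuckles that share residues would need a lookahead-based pattern instead.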
Protein Domain
Name: Histone-fold
Type: Homologous_superfamily
Description: Histones mediate DNA organisation and play a dominant role in regulating eukaryotic transcription. The histone-fold consists of a core of three helices, where the long middle helix is flanked at each end by shorter ones. The histone fold is a structural element that facilitates heterodimerisation. Proteins displaying this structure include the nucleosome core histones, which form octamers composed of two copies of each of the four histones, H2A, H2B, H3 and H4; archaeal histone, which possesses only the core domain part of eukaryotic histone; and the TATA-box binding protein (TBP)-associated factors (TAF), where the histone fold is a common motif for mediating TAF-TAF interactions. TAF proteins include TAF(II)18 and TAF(II)28, which form a heterodimer, TAF(II)42 and TAF(II)62, which form a heterotetramer similar to (H3-H4)2, and the negative cofactor 2 (NC2) alpha and beta chains, which form a heterodimer. The TAF proteins are a component of transcription factor IID (TFIID), along with the TBP protein. TFIID forms part of the pre-initiation complex on core promoter elements required for RNA polymerase II-dependent transcription. The TAF subunits of TFIID mediate transcriptional activation of subsets of eukaryotic genes. The NC2 complex mediates the inhibition of TATA-dependent transcription through interactions with TBP.
Protein Domain
Name: Transcription factor CBF/NF-Y/archaeal histone domain
Type: Domain
Description: This domain is found in archaebacterial histones and histone-like transcription factors from eukaryotes.
Protein Domain
Name: Reverse transcriptase zinc-binding domain
Type: Domain
Description: This domain would appear to be a zinc-binding region of a putative reverse transcriptase.
Protein Domain
Name: Zinc finger, RING/FYVE/PHD-type
Type: Homologous_superfamily
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [ , , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This superfamily represents RING-, PHD-, and FYVE-type zinc finger domains, which share a common dimetal (zinc)-bound α/β structural fold, as well as the non-zinc-containing U-box domain, which is similar to the RING zinc finger only lacking the metal ion-binding residues (U-box associated with multi-ubiquitination).
Protein Domain
Name: Zinc finger, RING-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents RING-type zinc finger domains. The RING-finger is a specialised type of Zn-finger of 40 to 60 residues that binds two atoms of zinc, and is probably involved in mediating protein-protein interactions [ , , ]. There are two different variants, the C3HC4-type and a C3H2C3-type, which are clearly related despite the different cysteine/histidine pattern. The latter type is sometimes referred to as 'RING-H2 finger'. 
The RING domain is a protein interaction domain that has been implicated in a range of diverse biological processes. E3 ubiquitin-protein ligase activity is intrinsic to the RING domain of c-Cbl and is likely to be a general function of this domain. E3 ubiquitin-protein ligases determine the substrate specificity for ubiquitylation and have been classified into HECT and RING-finger families. More recently, however, U-box proteins, which contain a domain (the U box) of about 70 amino acids that is conserved from yeast to humans, have been identified as a new type of E3. Various RING fingers also exhibit binding to E2 ubiquitin-conjugating enzymes (Ubc's). Several 3D-structures for RING-fingers are known. The 3D structure of the zinc ligation system is unique to the RING domain and is referred to as the 'cross-brace' motif. The spacing of the cysteines in such a domain is C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C. Metal ligand pairs one and three co-ordinate to bind one zinc ion, whilst pairs two and four bind the second. Note that in the older literature, some RING-fingers are denoted as LIM-domains. The LIM-domain Zn-finger is a fundamentally different family, albeit with similar Cys-spacing.
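The ligand spacing quoted in this description can be transcribed into a regular expression, and the cross-brace arrangement then falls out of which ligand pairs are grouped together. The sketch below assumes that a direct transcription of the quoted spacing is adequate for illustration. The function name ring_ligands and the synthetic sequence are hypothetical, and the pattern is not a substitute for the curated signature behind this entry.

```python
import re

# Direct transcription of C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C.
# Each capture group is one metal ligand, so groups 1-8 form ligand pairs 1-4.
RING_SPACING = re.compile(
    r"(C).{2}(C)"    # ligand pair 1
    r".{9,39}"
    r"(C).{1,3}(H)"  # ligand pair 2
    r".{2,3}"
    r"(C).{2}(C)"    # ligand pair 3
    r".{4,48}"
    r"(C).{2}(C)"    # ligand pair 4
)


def ring_ligands(sequence):
    """Return 0-based ligand positions grouped by zinc site, or None if no match."""
    match = RING_SPACING.search(sequence.upper())
    if match is None:
        return None
    pos = [match.start(i) for i in range(1, 9)]
    return {
        "zinc_site_1": pos[0:2] + pos[4:6],  # pairs one and three
        "zinc_site_2": pos[2:4] + pos[6:8],  # pairs two and four
    }


# Synthetic sequence built to satisfy the minimum spacings; not a real protein.
TOY = "MG" + "CAAC" + "A" * 9 + "CAH" + "AA" + "CAAC" + "A" * 4 + "CAAC"

if __name__ == "__main__":
    print(ring_ligands(TOY))
```

Grouping pairs one and three into one site and pairs two and four into the other is what makes the ligands interleave along the sequence, which is the 'cross-brace' arrangement described in the text.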
Protein Domain
Name: EamA domain
Type: Domain
Description: EamA (named after the O-acetyl-serine/cysteine export gene in E. coli) domain is found in a wide range of proteins including the Erwinia chrysanthemi PecM protein, which is involved in pectinase, cellulase and blue pigment regulation, the Salmonella typhimurium PagO protein (function unknown), and some members of the solute carrier family group 35 (SLC35) nucleoside-sugar transporters. Many members of this family are classed as drug/metabolite transporters and have no known function. They are predicted to be integral membrane proteins and many of the proteins contain two copies of this domain [ ].
Protein Domain
Name: NB-ARC
Type: Domain
Description: This is the NB-ARC domain, a novel signalling motif found in bacteria and eukaryotes, shared by plant resistance gene products and regulators of cell death in animals. This domain has been structurally characterised in the human protein apoptotic protease-activating factor 1 (Apaf-1). It contains the three-layered α-β fold and subsequent short α-helical region characteristic of the AAA+ ATPase domain superfamily. While this domain is thought to bind and hydrolyse ATP, only ADP binding has been experimentally verified. It is proposed that binding and hydrolysis of ATP by this domain induces conformational changes in the overall protein, leading to formation of the apoptosome.
Protein Domain
Name: Pyruvate kinase
Type: Family
Description: Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP:
ADP + phosphoenolpyruvate = ATP + pyruvate
The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have one form, except for certain bacteria, such as Escherichia coli, that have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.
PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (gluconeogenic precursor); therefore, when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which triggers covalent modification of the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.
The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three to four domains: a small N-terminal helical domain (absent in bacterial PK), a β/α-barrel domain, a β-barrel domain (inserted within the β/α-barrel domain), and a 3-layer α/β/α sandwich domain.
Protein Domain
Name: Pyruvate kinase, barrel
Type: Domain
Description: Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP:
ADP + phosphoenolpyruvate = ATP + pyruvate
The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have one form, except for certain bacteria, such as Escherichia coli, that have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.
PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (gluconeogenic precursor); therefore, when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which triggers covalent modification of the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.
The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three to four domains: a small N-terminal helical domain (absent in bacterial PK), a β/α-barrel domain, a β-barrel domain (inserted within the β/α-barrel domain), and a 3-layer α/β/α sandwich domain. This entry represents the two barrel domains: the β/α-barrel and the β-barrel inserted within it.
Protein Domain
Name: Pyruvate/Phosphoenolpyruvate kinase-like domain superfamily
Type: Homologous_superfamily
Description: Pyruvate kinase controls the exit from the glycolysis pathway, catalysing the transfer of phosphate from phosphoenolpyruvate (PEP) to ADP. Mammalian pyruvate kinase is a homotetramer, where each polypeptide subunit consists of four domains: N-terminal, A domain, B domain and C-terminal. Activation of the enzyme is believed to occur via the clamping down of the B domain onto the A domain to dehydrate the active site cleft. The N- and C-terminal domains are situated at inter-subunit contact sites, and could be involved in assembly and communication within the complex. The N-terminal domain has a TIM β/α-barrel structure. Homologous TIM-barrel domains are found in the following proteins:
  • N-terminal of pyruvate kinase, which is interrupted by an all-beta domain.
  • C-terminal of pyruvate phosphate dikinase, which has a similar mode of substrate binding to pyruvate kinase.
  • Phosphoenolpyruvate carboxylase; this domain has additional helices.
  • Phosphoenolpyruvate mutase/Isocitrate lyase, where it forms a swapped dimer.
  • HpcH/HpaI aldolases, such as the beta subunit of citrate lyase, where it forms a swapped dimer, and contains a pyruvate kinase-type metal binding site.
  • Ketopantoate hydroxymethyltransferase PanB, where a C-terminal helix exchange is observed in some enzymes.
Protein Domain
Name: Pyruvate kinase, C-terminal
Type: Domain
Description: Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP:
ADP + phosphoenolpyruvate = ATP + pyruvate
The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have one form, except for certain bacteria, such as Escherichia coli, that have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.
PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (gluconeogenic precursor); therefore, when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which triggers covalent modification of the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.
The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three to four domains: a small N-terminal helical domain (absent in bacterial PK), a β/α-barrel domain, a β-barrel domain (inserted within the β/α-barrel domain), and a 3-layer α/β/α sandwich domain. This entry represents the 3-layer α/β/α sandwich domain, which contains the FBP (fructose 1,6-bisphosphate) binding site.
This domain has a similar topology to the archaeal hypothetical protein, MTH1675 from Methanobacterium thermoautotrophicum.
Protein Domain
Name: Pyruvate kinase-like, insert domain superfamily
Type: Homologous_superfamily
Description: Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP: ADP + phosphoenolpyruvate = ATP + pyruvate. The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have one form; certain bacteria, such as Escherichia coli, have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues. PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (a gluconeogenic precursor); therefore, when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which covalently modifies the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver. The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three to four domains: a small N-terminal helical domain (absent in bacterial PK), a β/α-barrel domain, a β-barrel domain (inserted within the β/α-barrel domain), and a 3-layer α/β/α sandwich domain. This superfamily represents the β-barrel domain (note: it does not include the β/α-barrel it is inserted into).
This domain has a similar topology to the β-strand-rich C-terminal domain of molybdenum cofactor (MOCO) sulphurase (MOSC domain). MOSC domains are found alone in bacterial YiiM proteins, or fused to other domains, such as a NifS-like catalytic domain in MOCO sulphurase. The MOSC domain is predicted to be a sulphur-carrier domain that receives sulphur abstracted from pyridoxal phosphate-dependent NifS-like enzymes, using it for the formation of diverse sulphur-metal clusters [ ].
Protein Domain
Name: Pyruvate kinase, active site
Type: Active_site
Description: Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP: ADP + phosphoenolpyruvate = ATP + pyruvate. The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have one form; certain bacteria, such as Escherichia coli, have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues. PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (a gluconeogenic precursor); therefore, when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which covalently modifies the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver. The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three to four domains: a small N-terminal helical domain (absent in bacterial PK), a β/α-barrel domain, a β-barrel domain (inserted within the β/α-barrel domain), and a 3-layer α/β/α sandwich domain. This entry represents an active site that includes a lysine residue, which seems to be the acid/base catalyst responsible for the interconversion of pyruvate and enolpyruvate, and a glutamic acid residue implicated in the binding of the magnesium ion.
Protein Domain
Name: Pyruvate kinase, insert domain superfamily
Type: Homologous_superfamily
Description: Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP: ADP + phosphoenolpyruvate = ATP + pyruvate. The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have one form; certain bacteria, such as Escherichia coli, have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues. PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (a gluconeogenic precursor); therefore, when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which covalently modifies the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver. The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three to four domains: a small N-terminal helical domain (absent in bacterial PK), a β/α-barrel domain, a β-barrel domain (inserted within the β/α-barrel domain), and a 3-layer α/β/α sandwich domain. This superfamily represents the β-barrel domain (note: it does not include the β/α-barrel it is inserted into).
Protein Domain
Name: Archaeal Rpo6/eukaryotic RPB6 RNA polymerase subunit
Type: Family
Description: DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. In bacteria, the core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I, located in the nucleoli, synthesises precursors of most ribosomal RNAs; RNA polymerase II, which occurs in the nucleoplasm, synthesises mRNA precursors; RNA polymerase III, which also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs. Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. An RNAP component of 14 to 18 kDa is shared by all three forms of eukaryotic RNA polymerases and has been sequenced in budding yeast (gene rpb6 or rpo26), in fission yeast (gene rpb6 or rpo15), in human and in African swine fever virus (ASFV). It is evolutionarily related to the archaeal Rpo6 subunit (gene rpoK or rpo6). The archaeal protein is colinear with the C-terminal part of the eukaryotic subunit. This family includes both the eukaryotic Rpb6 and the archaeal Rpo6 subunit (also known as subunit K), as well as the evolutionarily related RPB6 homologue from ASFV.
Protein Domain
Name: RNA polymerase, subunit omega/Rpo6/RPB6
Type: Family
Description: In eukaryotes, there are three different forms of DNA-dependent RNA polymerases transcribing different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. In archaebacteria, there is generally a single form of RNA polymerase, which also consists of an oligomeric assemblage of 10 to 13 polypeptides. A component of 14 to 18 kDa is shared by all three forms of eukaryotic RNA polymerases and has been sequenced in budding yeast (gene RPB6 or RPO26), in Schizosaccharomyces pombe (fission yeast) (gene rpb6 or rpo15), in human and in African swine fever virus (ASFV). It is evolutionarily related to the archaebacterial subunit Rpo6 (also known as subunit K). The archaebacterial protein is colinear with the C-terminal part of the eukaryotic subunit. The structures of the omega subunit and RPB6, and the structures of the omega/beta' and RPB6/RPB1 interfaces, suggest a molecular mechanism for the function of omega and RPB6 in promoting RNAP assembly and/or stability. The conserved regions of omega and RPB6 form a compact structural domain that interacts simultaneously with conserved regions of the largest RNAP subunit and with the C-terminal tail following a conserved region of the largest RNAP subunit. The second half of the conserved region of omega and RPB6 forms an arc that projects away from the remainder of the structural domain and wraps over and around the C-terminal tail of the largest RNAP subunit, clamping it in a crevice, and threading the C-terminal tail of the largest RNAP subunit through the narrow gap between omega and RPB6.
Protein Domain
Name: DNA-directed RNA polymerase, 14-18kDa subunit, conserved site
Type: Conserved_site
Description: DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. In bacteria, the core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I, located in the nucleoli, synthesises precursors of most ribosomal RNAs; RNA polymerase II, which occurs in the nucleoplasm, synthesises mRNA precursors; RNA polymerase III, which also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs. Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. A component of 14 to 18 kDa is shared by all three forms of eukaryotic RNA polymerases and has been sequenced in budding yeast (gene RPB6 or RPO26), in fission yeast (gene rpb6 or rpo15), in human and in African swine fever virus (ASFV). It is evolutionarily related to archaeal subunit K (gene rpoK). The archaeal protein is colinear with the C-terminal part of the eukaryotic subunit.
Protein Domain      
Protein Domain
Name: ABC transporter type 1, transmembrane domain
Type: Domain
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1). In certain bacterial transporters, the ABC and TMD regions are found on different polypeptides. The integral inner-membrane protein translocates the substrate across the membrane and also takes part in substrate recognition. This entry represents the transmembrane domain in cases where the TMD and ABC region are found in the same protein, and corresponds to ABC type 1 from the Transporter Classification Database (http://www.tcdb.org/tcdb/index.php?tc=3.A.1).
Protein Domain
Name: ABC transporter-like, ATP-binding domain
Type: Domain
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium binding site (Walker B). Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ), specific to the ABC transporter; and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification. On the basis of sequence similarities a family of related ATP-binding proteins has been characterised. The proteins belonging to this family also contain one or two copies of the 'A' consensus sequence or the 'P-loop'.
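The motifs named in the description above can be located in a sequence with a simple pattern scan. The following is a minimal sketch, not a curated signature: the LSGGQ signature is taken from the description, while the Walker A pattern G-x(4)-G-K-[ST] is an assumption (the commonly cited P-loop consensus, which the description does not spell out). The function name scan_abc_motifs and the toy peptide are hypothetical.

```python
import re

# Patterns for two ABC-cassette motifs.
MOTIFS = {
    # Assumption: commonly cited P-loop consensus G-x(4)-G-K-[ST];
    # the description names Walker A but does not give its sequence.
    "walker_a": re.compile(r"G.{4}GK[ST]"),
    # Signature motif as quoted in the description.
    "signature": re.compile(r"LSGGQ"),
}


def scan_abc_motifs(sequence):
    """Return 0-based start positions of each motif in a protein sequence."""
    seq = sequence.upper()
    return {
        name: [match.start() for match in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


# Made-up peptide for illustration only; it is not a real transporter.
toy_peptide = "MAGPSGSGKSTLLAAALSGGQKQR"
print(scan_abc_motifs(toy_peptide))
```

A hit from a pattern this short is only suggestive; real domain assignment relies on profile-based signatures rather than a literal motif match.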
Protein Domain
Name: ABC transporter-like, conserved site
Type: Conserved_site
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium binding site (Walker B). Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ), specific to the ABC transporter; and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification. On the basis of sequence similarities a family of related ATP-binding proteins has been characterised. The proteins belonging to this family also contain one or two copies of the 'A' consensus sequence or the 'P-loop'.
Protein Domain
Name: AAA+ ATPase domain
Type: Domain
Description: The AAA+ superfamily of ATPases is found in all kingdoms of living organisms, where its members participate in diverse cellular processes including membrane fusion, proteolysis and DNA replication. Although the terms AAA+ and AAA are often used loosely and interchangeably, the classical AAA family members are distinguished by their possession of the SRH region in the AAA module. Many AAA+ proteins are involved in similar processes to those of AAA proteins (facilitation of protein folding and unfolding, assembly or disassembly of protein complexes, protein transport and degradation), but others function in replication, recombination, repair and transcription. The proteins in this superfamily are characterised by the structural conservation of a central ATPase domain of about 250 amino acids called the AAA+ module. Typically, the AAA+ domain can be divided into two structural subdomains, an N-terminal P-loop NTPase α-β-α subdomain that is connected to a smaller C-terminal all-α subdomain. The α-β-α subdomain adopts a Rossmann fold and contains several motifs involved in ATP binding and hydrolysis, including the classical motifs Walker A and Walker B. The all-α subdomain is much less conserved across AAA+ proteins. This entry represents the AAA+ ATPase domain found in a range of proteins, including Holliday junction ATP-dependent DNA helicase RuvB from Mycobacterium sp.
Protein Domain      
Protein Domain
Name: Glycosyl hydrolases family 18 (GH18) active site
Type: Active_site
Description: The glycosyl hydrolases family 18 (GH18) is widely distributed in all domains of life. The GH18 family contains hydrolytic enzymes with chitinase or endo-N-acetyl-beta-D-glucosaminidase (ENGase) activity as well as chitinase-like lectins/proteins (chi-lectins, CLPs). Chitinases (EC 3.2.1.14) are hydrolytic enzymes that cleave the beta-1,4-bond, releasing oligomeric, dimeric (chitobiose) or monomeric (N-acetylglucosamine, GlcNAc) products. ENGases hydrolyse the beta-1,4 linkage in the chitobiose core of N-linked glycans from glycoproteins, leaving one GlcNAc residue on the substrate. CLPs do not display chitinase activity, but some of them have been reported to have specific functions and carbohydrate-binding properties. The catalytic domain of GH18s may be connected to one or several substrate binding modules (CBMs), which enhance binding of enzymes to insoluble substrates. Certain GH18s also contain peptide signals for localization, such as an N-terminal secretion peptide, a C-terminal glycosyl-phosphatidylinositol (GPI) anchor signal for attachment to the plasma membrane, or N- or O-linked glycosylation sites for oligosaccharide modifications. The catalytic domain of GH18s has a common (beta/alpha)8 triosephosphate isomerase (TIM)-barrel structure, which consists of a barrel-like framework made from eight internal parallel β-strands that are alternately connected by eight exterior α-helices. The active site motif DxxDxDxE is essential for the activity of the GH18 catalytic domain. The Glu (E) in this motif acts as the catalytic proton donor, and the last Asp (D(3)) is thought to contribute to the stabilization of the essential distortion of the substrate. This entry represents the active site of GH18 members predominantly found in bacteria and eukaryota.
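The DxxDxDxE motif quoted above translates directly into a regular expression. The sketch below assumes that "x" stands for any single residue; the function name find_gh18_motif and the example peptide are hypothetical, and a match is only a hint that a sequence may carry the GH18 active site, not a replacement for the curated signature.

```python
import re

# DxxDxDxE with "x" read as any residue. The lookahead lets the scan
# report overlapping occurrences as well as disjoint ones.
GH18_MOTIF = re.compile(r"(?=(D..D.D.E))")


def find_gh18_motif(sequence):
    """Return (0-based start, matched residues) for each motif occurrence."""
    seq = sequence.upper()
    return [(match.start(), match.group(1)) for match in GH18_MOTIF.finditer(seq)]


# Made-up peptide for illustration only; it is not a database sequence.
example_peptide = "MKTLDGIDFDWEYPAAS"
for start, residues in find_gh18_motif(example_peptide):
    # The final residue of the match is the Glu described as the proton donor.
    print(start, residues, "catalytic Glu at index", start + 7)
```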
Protein Domain
Name: Glycoside hydrolase family 18, catalytic domain
Type: Domain
Description: O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. The glycosyl hydrolases family 18 (GH18) is widely distributed in all kingdoms and contains hydrolytic enzymes with chitinase or endo-N-acetyl-beta-D-glucosaminidase (ENGase) activity as well as chitinase-like lectins/proteins (chi-lectins, CLPs). Chitinases are hydrolytic enzymes that cleave the beta-1,4-bond, releasing oligomeric, dimeric (chitobiose) or monomeric (N-acetylglucosamine, GlcNAc) products. ENGases hydrolyse the beta-1,4 linkage in the chitobiose core of N-linked glycans from glycoproteins, leaving one GlcNAc residue on the substrate. CLPs do not display chitinase activity, but some of them have been reported to have specific functions and carbohydrate-binding properties. This family also includes glycoproteins from mammals, such as oviduct-specific glycoproteins. The catalytic domain of GH18s has a common (beta/alpha)8 triosephosphate isomerase (TIM)-barrel structure, which consists of a barrel-like framework made from eight internal parallel β-strands that are alternately connected by eight exterior α-helices. The active site motif DxxDxDxE is essential for the activity of the GH18 catalytic domain.
Protein Domain
Name: Glycoside hydrolase superfamily
Type: Homologous_superfamily
Description: O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. This entry represents the catalytic TIM β/α barrel common to many different families of glycosyl hydrolases found in all groups of organisms, including viruses and Gene Transfer Agents (GTA). In the GTA of Rhodobacter capsulatus (Rhodopseudomonas capsulata), a glycosyl hydrolase domain is associated with ORFg15 (RCAP_rcc01698). Structures have been determined for several proteins containing this glycosyl hydrolase domain, including family 13 glycosyl hydrolases (such as alpha-amylase), beta-glycanases, family 1 glycosyl hydrolases (such as beta-glucosidase), type II chitinases, 1,4-beta-N-acetylmuraminidases, and beta-N-acetylhexosaminidases.
Protein Domain      
Protein Domain
Name: Concanavalin A-like lectin/glucanase domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents the concanavalin A-like domain, which has a sandwich structure of 12-14 β-strands in two sheets with a complex topology. Proteins containing this domain include: Legume lectins; Glycosyl hydrolases family 16; Galectin (animal S-lectin); Laminin G-like module; Pentraxin; Clostridium neurotoxins; Exotoxin A; Vibrio cholera sialidase; Leech intramolecular trans-sialidase; Glycosyl hydrolase family 7; Xylanase/endoglucanase; Calnexin/calreticulin; Lectin leg-like; vp4 sialic acid binding protein; Trypanosoma sialidase; Thrombospondin; Hypothetical protein YesU; Alginate lyase; Glycosyl hydrolases family 32; Peptidase A4; Alpha-L-arabinofuranosidase B; SPRY domain containing proteins; Beta-D-xylosidase; SO2946-like; MAM domain containing proteins.
Protein Domain
Name: Legume lectin domain
Type: Domain
Description: Lectins are carbohydrate-binding proteins. Leguminous lectins form one of the largest lectin families and resemble each other in their physicochemical properties, though they differ in their carbohydrate specificities. They bind either glucose/mannose or galactose. Carbohydrate-binding activity depends on the simultaneous presence of both a calcium and a transition metal ion. The exact function of legume lectins is not known, but they may be involved in the attachment of nitrogen-fixing bacteria to legumes and in the protection against pathogens. Some legume lectins are proteolytically processed to produce two chains, beta (which corresponds to the N-terminal) and alpha (C-terminal). The lectin concanavalin A (conA) from jack bean is exceptional in that the two chains are transposed and ligated (by formation of a new peptide bond). The N terminus of mature conA thus corresponds to that of the alpha chain and the C terminus to the beta chain. Though the legume lectin monomer is structurally well conserved, the quaternary structures vary widely.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom