Search our database by keyword

Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, pathways, authors, ontology terms, etc. (e.g. eve, embryo, zen, allele)
  • Use OR to search for either of two terms (e.g. fly OR drosophila) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean search syntax is supported: e.g. dros* for partial matches or fly AND NOT embryo to exclude a term
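
The query syntax described above can be sketched as a few string helpers. This is only an illustration of how the documented operators (OR, quotation marks, the * wildcard, AND NOT) combine; the function names are hypothetical and are not part of this site or of any client library.

```python
# Hypothetical helpers that only build query strings using the operators
# documented in the examples above. They do not contact the website.

def any_of(*terms: str) -> str:
    """Match either of the given terms, e.g. fly OR drosophila."""
    return " OR ".join(terms)


def phrase(text: str) -> str:
    """Match an exact phrase by wrapping it in quotation marks."""
    return f'"{text}"'


def partial(stem: str) -> str:
    """Match any term that starts with the given stem, e.g. dros*."""
    return f"{stem}*"


def excluding(term: str, unwanted: str) -> str:
    """Match a term while excluding another, e.g. fly AND NOT embryo."""
    return f"{term} AND NOT {unwanted}"


if __name__ == "__main__":
    for query in (
        any_of("fly", "drosophila"),
        phrase("dna binding"),
        partial("dros"),
        excluding("fly", "embryo"),
    ):
        print(query)
```

Each helper returns plain text that could be pasted into the search box; how the site parses more complex combinations is not stated here, so nesting these helpers is left untested.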

Search results 501 to 600 out of 38750 for *

Category restricted to ProteinDomain

Category: ProteinDomain
Type Details Score
Protein Domain
Name: CRAL/TRIO, N-terminal domain
Type: Domain
Description: This all-alpha domain is often found N-terminal to CRAL-TRIO domains. Its function is unknown.
Protein Domain
Name: CRAL-TRIO lipid binding domain
Type: Domain
Description: The CRAL-TRIO domain is a protein structural domain that binds small lipophilic molecules. The domain is named after cellular retinaldehyde-binding protein (CRALBP) and TRIO guanine exchange factor. The CRAL-TRIO domain is found in GTPase-activating proteins (GAPs), guanine nucleotide exchange factors (GEFs) and a family of hydrophobic ligand binding proteins, including the yeast SEC14 protein and mammalian retinaldehyde- and alpha-tocopherol-binding proteins. The domain may either constitute all of the protein or only part of it. The structure of the domain in SEC14 proteins has been determined. The structure contains several alpha helices as well as a beta sheet composed of 6 strands. Strands 2, 3, 4 and 5 form a parallel beta sheet with strands 1 and 6 being anti-parallel. The structure also identified a hydrophobic binding pocket for lipid binding.
Protein Domain
Name: Ribosomal biogenesis regulatory protein
Type: Family
Description: This is a family of eukaryotic ribosomal biogenesis regulatory proteins.
Protein Domain
Name: Peptidase C13, legumain
Type: Family
Description: Asparaginyl endopeptidase, also known as legumain, is a family of cysteine proteases found in many organisms. This group of cysteine peptidases belongs to the MEROPS peptidase family C13 (legumain family, clan CD). A type example is legumain from Canavalia ensiformis (Jack bean, Horse bean). Although legumains (also known as vacuolar processing enzymes) were first described from beans, homologues have been identified in plants, protozoa, vertebrates, and helminths. In blood-feeding helminths, asparaginyl endopeptidases (sometimes described as hemoglobinases) have been located in the gut and are considered to be involved in host hemoglobin digestion. Also included in the family C13 of cysteine peptidases are GPI-anchor transamidases, which share significant homology with legumains. GPI-anchor transamidases mediate glycosylphosphatidylinositol (GPI) anchoring in the endoplasmic reticulum. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families.
Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One subdomain is a bundle of helices, and the other a β-barrel.
The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: DNA (cytosine-5)-methyltransferase 1, replication foci domain
Type: Domain
Description: This domain is part of a cytosine-specific DNA methyltransferase enzyme (DNMT). It functions non-catalytically to target the protein towards replication foci. This allows the DNMT1 protein to methylate the correct residues. This domain targets DMAP1 and HDAC2 to the replication foci during the S phase of the cell cycle. They are thought to have some importance in conversion of critical histone lysine moieties. A structure exists for the human cytosine-specific DNA methyltransferase replication foci domain.
Protein Domain
Name: SWIB/MDM2 domain
Type: Domain
Description: The SWI/SNF family of complexes, which are conserved from yeast to humans, are ATP-dependent chromatin-remodelling proteins that facilitate transcription activation. The mammalian complexes are made up of 9-12 proteins called BAFs (BRG1-associated factors). The BAF60 family has at least three members: BAF60a, which is ubiquitous, BAF60b and BAF60c, which are expressed in muscle and pancreatic tissues, respectively. BAF60b is present in alternative forms of the SWI/SNF complex, including complex B (SWIB), which lacks BAF60a. The SWIB domain is a conserved region found within the BAF60b proteins, and can be found fused to the C terminus of DNA topoisomerase in Chlamydia. This domain is also found in the Saccharomyces cerevisiae SNF12 protein, the eukaryotic initiation factor 2 (eIF2) and the Arabidopsis thaliana At1g31760 protein. MDM2 is an oncoprotein that acts as a cellular inhibitor of the p53 tumour suppressor by binding to the transactivation domain of p53 and suppressing its ability to activate transcription. p53 acts in response to DNA damage, inducing cell cycle arrest and apoptosis. Inactivation of p53 is a common occurrence in neoplastic transformations. The core of MDM2 folds into an open bundle of four helices, which is capped by two small 3-stranded β-sheets. It consists of a duplication of two structural repeats. MDM2 has a deep hydrophobic cleft on which the p53 α-helix binds; p53 residues involved in transactivation are buried deep within the cleft of MDM2, thereby concealing the p53 transactivation domain. The SWIB and MDM2 domains are homologous and share a common fold.
Protein Domain
Name: DEK, C-terminal
Type: Domain
Description: DEK is a chromatin-associated protein that is linked with cancers and autoimmune disease. This domain is found at the C terminus of DEK and is of clinical importance since it can reverse the characteristic abnormal DNA-mutagen sensitivity in fibroblasts from ataxia-telangiectasia (A-T) patients. The structure of this domain shows it to be homologous to the E2F/DP transcription factor family. This domain is also found in chitin synthase proteins and in protein phosphatases.
Protein Domain
Name: SWIB domain
Type: Domain
Description: The SWI/SNF family of complexes, which are conserved from yeast to humans, are ATP-dependent chromatin-remodelling proteins that facilitate transcription activation. The mammalian complexes are made up of 9-12 proteins called BAFs (BRG1-associated factors). The BAF60 family has at least three members: BAF60a, which is ubiquitous, BAF60b and BAF60c, which are expressed in muscle and pancreatic tissues, respectively. BAF60b is present in alternative forms of the SWI/SNF complex, including complex B (SWIB), which lacks BAF60a. The SWIB domain is a conserved region found within the BAF60b proteins, and can be found fused to the C terminus of DNA topoisomerase in Chlamydia. This domain is also found in the Saccharomyces cerevisiae SNF12 protein, the eukaryotic initiation factor 2 (eIF2) and the Arabidopsis thaliana At1g31760 protein.
Protein Domain
Name: PRONE domain
Type: Domain
Description: In plants, the small GTP-binding proteins called Rops work as signalling switches that control growth, development and plant responses to various environmental stimuli. Rop proteins (Rho of plants, Rac-like and atRac in Arabidopsis thaliana) belong to the Rho family of Ras-related GTP-binding proteins that turn on signalling pathways by switching from a GDP-bound inactive to a GTP-bound active conformation. Activation depends on guanine nucleotide exchange factors (GEFs) that catalyse the otherwise slow GDP dissociation for subsequent GTP binding. The plant-specific RopGEFs represent a unique family of exchange factors that display no homology to any known RhoGEFs from animals and fungi. They comprise a highly conserved catalytic domain termed PRONE (plant-specific Rop nucleotide exchanger) with exclusive substrate specificity for members of the Rop family. The PRONE domain has been shown to be necessary and sufficient to promote nucleotide release from Rop. The PRONE domain can be divided into three highly conserved subdomains separated by two short stretches of variable amino acid residues. It is approximately 370 residues in length and displays an almost all α-helical structure except for a β-turn that protrudes from the main body of the molecule. The overall structure of the PRONE domain can be divided into two subdomains, the first one including helices alpha1-5 and alpha13, the second alpha6-12.
Protein Domain
Name: Cation/H+ exchanger, CPA1 family
Type: Family
Description: Sodium proton exchangers (NHEs) constitute a large family of integral membrane protein transporters that are responsible for the counter-transport of protons and sodium ions across lipid bilayers. These proteins are found in organisms across all domains of life. In archaea, bacteria, yeast and plants, these exchangers provide increased salt tolerance by removing sodium in exchange for extracellular protons. In mammals they participate in the regulation of cell pH, volume, and intracellular sodium concentration, as well as in the reabsorption of NaCl across renal, intestinal, and other epithelia. Human NHE is also involved in heart disease, cell growth and cell differentiation. The removal of intracellular protons in exchange for extracellular sodium effectively eliminates excess acid from actively metabolising cells. In mammalian cells, NHE activity is found in both the plasma membrane and inner mitochondrial membrane. To date, nine mammalian isoforms have been identified (designated NHE1-NHE9). These exchangers are highly regulated (glyco)phosphoproteins, which, based on their primary structure, appear to contain 10-12 membrane-spanning regions (M) at the N terminus and a large cytoplasmic region at the C terminus. The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved and are therefore thought to be involved in the transport of sodium and hydrogen ions. The cytoplasmic region has little similarity throughout the family. There is some evidence that the exchangers may exist in the cell membrane as homodimers, but little is currently known about the mechanism of their antiport. This entry represents the cation:proton antiporter family 1 (CPA1), which includes Na+/H+ exchangers, K+/H+ exchangers and Na+(K+,Li+,Rb+)/H+ exchangers.
Protein Domain
Name: Cation/H+ exchanger
Type: Domain
Description: Sodium proton exchangers (NHEs) constitute a large family of integral membrane protein transporters that are responsible for the counter-transport of protons and sodium ions across lipid bilayers. These proteins are found in organisms across all domains of life. In archaea, bacteria, yeast and plants, these exchangers provide increased salt tolerance by removing sodium in exchange for extracellular protons. In mammals they participate in the regulation of cell pH, volume, and intracellular sodium concentration, as well as in the reabsorption of NaCl across renal, intestinal, and other epithelia. Human NHE is also involved in heart disease, cell growth and cell differentiation. The removal of intracellular protons in exchange for extracellular sodium effectively eliminates excess acid from actively metabolising cells. In mammalian cells, NHE activity is found in both the plasma membrane and inner mitochondrial membrane. To date, nine mammalian isoforms have been identified (designated NHE1-NHE9). These exchangers are highly regulated (glyco)phosphoproteins, which, based on their primary structure, appear to contain 10-12 membrane-spanning regions (M) at the N terminus and a large cytoplasmic region at the C terminus. The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved and are therefore thought to be involved in the transport of sodium and hydrogen ions. The cytoplasmic region has little similarity throughout the family. There is some evidence that the exchangers may exist in the cell membrane as homodimers, but little is currently known about the mechanism of their antiport. This entry represents a number of cation/proton exchangers, including Na+/H+ exchangers, K+/H+ exchangers and Na+(K+,Li+,Rb+)/H+ exchangers.
Protein Domain
Name: PLC-like phosphodiesterase, TIM beta/alpha-barrel domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents a structural domain consisting of a TIM beta/alpha-barrel. These domains are found in several phospholipase C (PLC) like phosphodiesterases, including: mammalian phospholipase C isozyme D1; bacterial phosphatidylinositol-specific phospholipase C; and glycerophosphodiester phosphodiesterases, such as GlpQ from Escherichia coli and UgpQ from Agrobacterium tumefaciens. Phospholipase C (PLC) isozymes are directly activated by heterotrimeric G proteins and Ras-like GTPases to hydrolyse phosphatidylinositol 4,5-bisphosphate into the second messengers diacylglycerol and inositol 1,4,5-trisphosphate. PLC enzymes often play central roles in various signalling cascades.
Protein Domain
Name: CobW/HypB/UreG, nucleotide-binding domain
Type: Domain
Description: This domain is found in HypB, a hydrogenase expression/formation protein, and urease accessory protein UreG. Both these proteins contain a P-loop nucleotide binding motif. HypB has GTPase activity and is a guanine nucleotide binding protein. UreG is a GTPase in charge of nucleotide hydrolysis required for activation of the urease enzyme. Both GTPases are involved in nickel binding. HypB can store nickel and is required for nickel-dependent hydrogenase expression. UreG is required for functional incorporation of the urease nickel metallocentre. GTP hydrolysis may be required by these proteins for nickel incorporation into other nickel proteins. Other proteins containing this domain include P47K, a Pseudomonas chlororaphis protein needed for nitrile hydratase expression, CobW, which may be involved in cobalamin biosynthesis in Pseudomonas denitrificans, and YjiA. Both CobW and YjiA are members of the metal homeostasis-associated COG0523 family of GTPases.
Protein Domain
Name: Zinc chaperone CobW-like, C-terminal
Type: Domain
Description: Cobalamin (vitamin B12) is a structurally complex cofactor, consisting of a modified tetrapyrrole with a centrally chelated cobalt. Cobalamin is usually found in one of two biologically active forms: methylcobalamin and adenosylcobalamin. Most prokaryotes, as well as animals, have cobalamin-dependent enzymes, whereas plants and fungi do not appear to use it. In bacteria and archaea, these include methionine synthase, ribonucleotide reductase, glutamate and methylmalonyl-CoA mutases, ethanolamine ammonia lyase, and diol dehydratase. In mammals, cobalamin is obtained through the diet, and is required for methionine synthase and methylmalonyl-CoA mutase. There are at least two distinct cobalamin biosynthetic pathways in bacteria: an aerobic pathway that requires oxygen and in which cobalt is inserted late in the pathway, found in Pseudomonas denitrificans and Rhodobacter capsulatus; and an anaerobic pathway in which cobalt insertion is the first committed step towards cobalamin synthesis, found in Salmonella typhimurium, Bacillus megaterium, and Propionibacterium freudenreichii subsp. shermanii. Either pathway can be divided into two parts: (1) corrin ring synthesis (differs in aerobic and anaerobic pathways) and (2) adenosylation of corrin ring, attachment of aminopropanol arm, and assembly of the nucleotide loop (common to both pathways). There are about 30 enzymes involved in either pathway, where those involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Several of these enzymes are pathway-specific: CbiD, CbiG, and CbiK are specific to the anaerobic route of S. typhimurium, whereas CobE, CobF, CobG, CobN, CobS, CobT, and CobW are unique to the aerobic pathway of P. denitrificans. CobW proteins are generally found proximal to the trimeric cobaltochelatase subunit CobN, which is essential for vitamin B12 (cobalamin) biosynthesis.
They contain a P-loop nucleotide-binding motif in the N-terminal domain and a histidine-rich region in the C-terminal portion, suggesting a role in metal binding, possibly as an intermediary between the cobalt transport and chelation systems. CobW might be involved in cobalt reduction leading to cobalt(I) corrinoids. This entry represents the C-terminal domain found in CobW, as well as in P47K, a Pseudomonas chlororaphis protein needed for nitrile hydratase expression.
Protein Domain
Name: DNA methylase, C-5 cytosine-specific, active site
Type: Active_site
Description: C-5 cytosine-specific DNA methylases (C5 Mtase) are enzymes that specifically methylate the C-5 carbon of cytosines in DNA. Such enzymes are found in two contexts. First, they are a component of type II restriction-modification systems in prokaryotes and some bacteriophages. Such enzymes recognise a specific DNA sequence where they methylate a cytosine. In doing so, they protect DNA from cleavage by type II restriction enzymes that recognise the same sequence. The sequences of a large number of type II C-5 Mtases are known. Second, in vertebrates, there are a number of C-5 Mtases that methylate CpG dinucleotides. The sequence of the mammalian enzyme is known. C-5 Mtases share a number of short conserved regions. One of these regions contains a conserved Pro-Cys dipeptide in which the cysteine has been shown to be involved in the catalytic mechanism; it appears to form a covalent intermediate with the C6 position of cytosine.
Protein Domain
Name: C-5 cytosine methyltransferase
Type: Family
Description: C-5 cytosine-specific DNA methylases (C5 Mtase) are enzymes that specifically methylate the C-5 carbon of cytosines in DNA to produce C5-methylcytosine. In mammalian cells, cytosine-specific methyltransferases methylate certain CpG sequences, which are believed to modulate gene expression and cell differentiation. In bacteria, these enzymes are a component of restriction-modification systems and serve as valuable tools for the manipulation of DNA. The structure of HhaI methyltransferase (M.HhaI) has been resolved to 2.5 Å: the molecule folds into 2 domains, a larger catalytic domain containing catalytic and cofactor binding sites, and a smaller DNA recognition domain. This entry also includes tRNA (cytosine(38)-C(5))-methyltransferase, also known as DNMT2 (DNA (cytosine-5)-methyltransferase-like protein 2), that specifically methylates cytosine 38 in the anticodon loop of tRNA(Asp).
Protein Domain
Name: Bromo adjacent homology (BAH) domain
Type: Domain
Description: The BAH (bromo-adjacent homology) domain is commonly found in chromatin-associated proteins. It is found in proteins such as eukaryotic DNA (cytosine-5) methyltransferases, the origin recognition complex 1 (Orc1) proteins, as well as several proteins involved in transcriptional regulation. The BAH domain appears to act as a protein-protein interaction module specialised in gene silencing, as suggested for example by its interaction within yeast Orc1p with the silent information regulator Sir1p. The BAH module might therefore play an important role by linking DNA methylation, replication and transcriptional regulation.
Protein Domain
Name: BES1/BZR1 plant transcription factor, N-terminal
Type: Domain
Description: This entry represents the N-terminal regions of several plant transcription factors. It is classified as BES1/BZR1, a plant-specific transcription factor that cooperates with transcription factors such as BIM1 to regulate brassinosteroid-induced genes. Proteins containing this domain are transcriptional repressors involved in controlling the response to brassinosteroids (BRs). BRs are plant hormones that play essential roles in growth and development. BZR1 binds directly to DNA, repressing the expression of genes involved in BR synthesis. Phosphorylation of BZR1 by BIN1 targets BZR1 to the 20S proteasome, while dephosphorylation leads to nuclear accumulation of BZR1.
Protein Domain
Name: Glycoside hydrolase, family 14
Type: Family
Description: O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. Glycoside hydrolase family 14 comprises enzymes with only one known activity: beta-amylase. A Glu residue has been proposed as a catalytic residue, but it is not known if it is the nucleophile or the proton donor. Beta-amylase is an enzyme that hydrolyses 1,4-alpha-glucosidic linkages in starch-type polysaccharide substrates so as to remove successive maltose units from the non-reducing ends of the chains. Beta-amylase is present in certain bacteria as well as in plants. Three highly conserved sequence regions are found in all known beta-amylases. The first of these regions is located in the N-terminal section of the enzymes and contains an aspartate which is known to be involved in the catalytic mechanism. The second, located in a more central location, is centred around a glutamate which is also involved in the catalytic mechanism. The 3D structure of a complex of soybean beta-amylase with an inhibitor (alpha-cyclodextrin) has been determined to 3.0 Å resolution by X-ray diffraction. The enzyme folds into large and small domains: the large domain has a (beta alpha)8 super-secondary structural core, while the smaller is formed from two long loops extending from the beta-3 and beta-4 strands of the (beta alpha)8 fold. The interface of the two domains, together with shorter loops from the (beta alpha)8 core, forms a deep cleft, in which the inhibitor binds. Two maltose molecules also bind in the cleft, one sharing a binding site with alpha-cyclodextrin, and the other sitting more deeply in the cleft.
Protein Domain
Name: Alpha-ketoglutarate-dependent dioxygenase AlkB-like
Type: Domain
Description: AlkB is a DNA repair enzyme that removes methyl adducts and some larger alkylation lesions from endocyclic positions on purine and pyrimidine bases. It is a dioxygenase that requires molecular oxygen, alpha-ketoglutarate and iron. The catalytic core of AlkB is homologous to other Fe-2OG dioxygenases. Two discrete global conformations have been observed for AlkB, differing in accessibility of a putative oxygen-diffusion tunnel.
Protein Domain
Name: RmlC-like cupin domain superfamily
Type: Homologous_superfamily
Description: RmlC (deoxythymidine diphosphate (dTDP)-4-dehydrorhamnose 3,5-epimerase) is a dTDP-sugar isomerase enzyme involved in the synthesis of L-rhamnose, a saccharide required for the virulence of some pathogenic bacteria. RmlC is a dimer, each monomer being formed from two β-sheets arranged in a β-sandwich, where the substrate-binding site is located between the two sheets of both monomers. Other protein families contain domains that share this fold, including glucose-6-phosphate isomerase; germin, a metal-binding protein with oxalate oxidase and superoxide dismutase activities; auxin-binding protein; seed storage protein 7S; acireductone dioxygenase; as well as three proteins that have metal-binding sites similar to that of germin, namely quercetin 2,3-dioxygenase, phosphomannose isomerase and homogentisate dioxygenase, the last three sharing a 2-domain fold with storage protein 7S.
Protein Domain
Name: RmlC-like jelly roll fold
Type: Homologous_superfamily
Description: RmlC (deoxythymidine diphosphate (dTDP)-4-dehydrorhamnose 3,5-epimerase) is a mainly beta class protein with a jelly roll-like topology. It is a dTDP-sugar isomerase enzyme involved in the synthesis of L-rhamnose, a saccharide required for the virulence of some pathogenic bacteria. This entry represents the domain with the jelly roll-like fold. Other protein families containing this domain include glucose-6-phosphate isomerase; germin, a metal-binding protein with oxalate oxidase and superoxide dismutase activities; auxin-binding protein; seed storage protein 7S; acireductone dioxygenase; as well as three proteins that have metal-binding sites similar to that of germin, namely quercetin 2,3-dioxygenase, phosphomannose isomerase and homogentisate dioxygenase, the last three sharing a 2-domain fold with storage protein 7S. The cAMP-binding domains found in the cAMP receptor protein (CRP) family display a similar β-roll architecture consisting of eight antiparallel β-strands and three helical segments. These proteins include CooA, a CO-sensing haem protein that functions as a transcription activator, and the CnbD (cyclic nucleotide binding domain) of the HCN cation channel in which cAMP binding modulates gating of the channel.
Protein Domain
Name: Pirin, N-terminal domain
Type: Domain
Description: Eukaryotic pirins are highly conserved nuclear proteins that may function as transcriptional regulators with a role in apoptosis. Prokaryotic homologues have also been identified. Both bacterial and human pirins have been shown to possess quercetinase activity, although this is not universally true for all family members: YhaK, for example, displays no such enzymatic activity. Pirin is composed of two structurally similar domains arranged face to face. Although the two domains are similar, the C-terminal domain of pirin differs from the N-terminal domain as it does not contain a metal binding site and its sequence does not contain the conserved metal-coordinating residues. Pirin is considered a member of the cupin superfamily on the basis of primary sequence and structural similarity. The presence of a metal binding site in the N-terminal β-barrel of pirin may be significant in its interaction with Bcl-3 and nuclear factor I (NFI) and in its role in regulating NF-kappaB transcription factor activity. This entry represents the Pirin N-terminal domain.
Protein Domain
Name: Pirin
Type: Family
Description: Eukaryotic pirins are highly conserved nuclear proteins that may function as transcriptional regulators with a role in apoptosis. Prokaryotic homologues have also been identified. Both bacterial and human pirins have been shown to possess quercetinase activity, although this is not universally true for all family members: YhaK, for example, displays no such enzymatic activity. Pirin is composed of two structurally similar domains arranged face to face. Although the two domains are similar, the C-terminal domain of pirin differs from the N-terminal domain as it does not contain a metal binding site and its sequence does not contain the conserved metal-coordinating residues. Pirin is considered a member of the cupin superfamily on the basis of primary sequence and structural similarity. The presence of a metal binding site in the N-terminal β-barrel of pirin may be significant in its interaction with Bcl-3 and nuclear factor I (NFI) and in its role in regulating NF-kappaB transcription factor activity.
Protein Domain
Name: Pirin, C-terminal domain
Type: Domain
Description: Eukaryotic pirins are highly conserved nuclear proteins that may function as transcriptional regulators with a role in apoptosis. Prokaryotic homologues have also been identified. Both bacterial and human pirins have been shown to possess quercetinase activity, although this is not universally true for all family members: YhaK, for example, displays no such enzymatic activity. Pirin is composed of two structurally similar domains arranged face to face. Although the two domains are similar, the C-terminal domain of pirin differs from the N-terminal domain as it does not contain a metal binding site and its sequence does not contain the conserved metal-coordinating residues. Pirin is considered a member of the cupin superfamily on the basis of primary sequence and structural similarity. The presence of a metal binding site in the N-terminal β-barrel of pirin may be significant in its interaction with Bcl-3 and nuclear factor I (NFI) and in its role in regulating NF-kappaB transcription factor activity. This entry represents the Pirin C-terminal domain.
Protein Domain
Name: Glycosyltransferase 61
Type: Family
Description: Glycosyltransferase 61 family members are further processed into a mature form. Proteins in this family include O-linked-mannose beta-1,4-N-acetylglucosaminyltransferase 2 (POMGnT2, also known as EOGTL) and EGF domain-specific O-linked N-acetylglucosamine transferase (EOGT). This entry also includes plant beta-(1,2)-xylosyltransferase.
Protein Domain
Name: Protein of unknown function DUF716
Type: Family
Description: These sequences are a family of uncharacterised hypothetical proteins restricted to eukaryotes. One member is a sequence from Nicotiana tabacum (Common tobacco) that is up-regulated in response to TMV infection.
Protein Domain
Name: Pectinacetylesterase/NOTUM
Type: Family
Description: This family includes protein Notum from animals and pectinacetylesterase (PAE) from plants. Notum is a carboxylesterase that removes an essential palmitoleate moiety from Wnt proteins. Notum constitutes the first known extracellular protein deacylase. PAEs catalyse the deacetylation of pectin, a major component of primary cell walls.
Protein Domain
Name: Ribosomal protein S8e, conserved site
Type: Conserved_site
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. A number of eukaryotic and archaeal ribosomal proteins have been grouped based on sequence similarities. One of these families, S8e, consists of a number of proteins with either about 220 amino acids (in eukaryotes) or about 125 amino acids (in archaea).
Protein Domain
Name: Ribosomal protein S8e
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. A number of eukaryotic and archaeal ribosomal proteins have been grouped based on sequence similarities. One of these families, S8e, consists of a number of proteins with either about 220 amino acids (in eukaryotes) or about 125 amino acids (in archaea).
Protein Domain
Name: Ribosomal protein S8e/ribosomal biogenesis NSA2
Type: Family
Description: A number of eukaryotic and archaeal ribosomal proteins have been grouped based on sequence similarities. One of these families, S8e, consists of a number of proteins with either about 220 amino acids (in eukaryotes) or about 125 amino acids (in archaea). This entry also contains proteins annotated as NSA2, which are thought to be involved in biogenesis of the 60S ribosomal subunit, with a role in the quality control of pre-60S particles. They are a component of the pre-66S ribosomal particle.
Protein Domain
Name: Cytokinin riboside 5'-monophosphate phosphoribohydrolase LOG
Type: Family
Description: This entry represents a cytokinin-activating enzyme working in the direct activation pathway. It is a phosphoribohydrolase that converts inactive cytokinin nucleotides to the biologically active free-base forms. The proteins in this entry belong to the LOG family of proteins.
Protein Domain
Name: Glycosyl transferase, family 1
Type: Domain
Description: The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates (EC 2.4.1.-) and related proteins into distinct sequence-based families has been described. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. Proteins containing this domain transfer UDP, ADP, GDP or CMP linked sugars to a variety of substrates, including glycogen, fructose-6-phosphate and lipopolysaccharides. The bacterial enzymes are involved in various biosynthetic processes that include exopolysaccharide biosynthesis, lipopolysaccharide core biosynthesis and the biosynthesis of the slime polysaccharide colanic acid. Mutations in this domain of the human N-acetylglucosaminyl-phosphatidylinositol biosynthetic protein are the cause of paroxysmal nocturnal hemoglobinuria (PNH), an acquired hemolytic blood disorder characterised by venous thrombosis, erythrocyte hemolysis, infections and defective hematopoiesis.
Protein Domain
Name: Chloramphenicol acetyltransferase-like domain superfamily
Type: Homologous_superfamily
Description: Chloramphenicol acetyltransferase (CAT) catalyses the acetyl-CoA dependent acetylation of chloramphenicol, resulting in the inactivation of the antibiotic. The crystal structure of the type III enzyme from Escherichia coli with chloramphenicol bound has been determined. CAT is a trimer of identical subunits (monomer Mr 25,000) and the trimeric structure is stabilised by a number of hydrogen bonds, some of which result in the extension of a β-sheet across the subunit interface. Chloramphenicol binds in a deep pocket located at the boundary between adjacent subunits of the trimer, such that the majority of residues forming the binding pocket belong to one subunit while the catalytically essential histidine belongs to the adjacent subunit. His195 is appropriately positioned to act as a general base catalyst in the reaction, and the required tautomeric stabilisation is provided by an unusual interaction with a main-chain carbonyl oxygen. This superfamily represents a domain characteristic of trimeric enzymes whose active sites are located between subunits, including chloramphenicol acetyltransferase.
Protein Domain
Name: Nodulin-like
Type: Domain
Description: This entry represents a conserved region within plant nodulin-like proteins and a number of uncharacterised proteins.
Protein Domain
Name: Mechanosensitive ion channel MscS
Type: Family
Description: This entry represents a family of small conductance mechanosensitive channels (MscS). Mechanosensitive (MS) channels provide protection against hypo-osmotic shock, responding both to stretching of the cell membrane and to membrane depolarisation. They are present in the membranes of organisms from the three domains of life: bacteria, archaea, and eukarya. There are two families of MS channels: large-conductance MS channels (MscL) and small-conductance MS channels (MscS or YGGB). The pressure threshold for MscS opening is 50% that of MscL. The MscS family is much larger and more variable in size and sequence than the MscL family. Much of the diversity in MscS proteins occurs in the size of the transmembrane regions, which ranges from three to eleven transmembrane helices, although the three C-terminal helices are conserved. MscS folds as a homo-heptamer with a cylindrical shape, and can be divided into transmembrane and extramembrane regions: an N-terminal periplasmic region, a transmembrane region, and a C-terminal cytoplasmic region (middle and C-terminal domains). The transmembrane region forms a channel through the membrane that opens into a chamber enclosed by the extramembrane portion, the latter connecting to the cytoplasm through distinct portals. In the fission yeast Schizosaccharomyces pombe the mechanosensitive ion channel proteins are known as msy1 and msy2.
Protein Domain
Name: LSM domain superfamily
Type: Homologous_superfamily
Description: This domain superfamily is found as the core structure in Lsm (like-Sm) proteins and bacterial Lsm-related Hfq proteins, and as the middle domain of the mechanosensitive channel protein MscS. In each case, the domain adopts a core structure consisting of an open β-barrel with an SH3-like topology. Lsm proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. These snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Other snRNPs, such as U7 snRNP, can contain different Lsm proteins. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus, suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA. Hfq forms an Lsm-like fold; however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring. The middle domain of the mechanosensitive channel of small conductance protein (MscS or YggB) structurally resembles an Lsm protein. MscS is a mechanosensitive channel present in the membrane of bacteria, archaea and eukarya that responds both to stretching of the cell membrane and to membrane depolarisation. MscS folds as a homo-heptamer with a cylindrical shape, and can be divided into transmembrane and extramembrane regions: an N-terminal periplasmic region, a transmembrane region, and a C-terminal cytoplasmic region.
The C-terminal cytoplasmic region can be further divided into middle and C-terminal domains, which together create a framework that connects to the cytoplasm through distinct openings. The middle domain exhibits an Lsm-like structure, consisting of five β-strands that pack together with those of other subunits to form a barrel-like sheet extending around the entire protein.
Protein Domain
Name: Ribosomal protein L13e
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. The ribosomal protein L13e is widely found in vertebrates, Drosophila melanogaster, plants and yeast, amongst others.
Protein Domain
Name: Ribosomal protein L13e, conserved site
Type: Conserved_site
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. The ribosomal protein L13e is widely found in vertebrates, Drosophila melanogaster, plants and yeast, amongst others.
Protein Domain
Name: Vacuolar protein sorting-associated protein 62
Type: Family
Description: Vps62 is a vacuolar protein sorting (VPS) protein required for cytoplasm to vacuole targeting of proteins.
Protein Domain
Name: K Homology domain, type 1
Type: Domain
Description: The K homology (KH) domain was first identified in the human heterogeneous nuclear ribonucleoprotein (hnRNP) K. It is a domain of around 70 amino acids that is present in a wide variety of quite diverse nucleic acid-binding proteins. It has been shown to bind RNA. Like many other RNA-binding motifs, KH motifs are found in one or multiple copies (14 copies in chicken vigilin) and, at least for hnRNP K (three copies) and FMR-1 (two copies), each motif is necessary for in vitro RNA binding activity, suggesting that they may function cooperatively or, in the case of single KH motif proteins (for example, Mer1p), independently. According to structural analysis, the KH domain can be separated into two groups. The first group, type 1, has a β-α-α-β-β-α structure, whereas in type 2 the last two β-strands are located in the N-terminal part of the domain (α-β-β-α-α-β). Sequence similarity between these two folds is limited to a short region (VIGXXGXXI) in the RNA binding motif. This motif is located between helices 1 and 2 in type 1 and between helices 2 and 3 in type 2. Proteins known to contain a type-1 KH domain include bacterial polyribonucleotide nucleotidyltransferase; vertebrate Fragile X messenger ribonucleoprotein 1 (FMR1); eukaryotic heterogeneous nuclear ribonucleoprotein K (hnRNP K), one of at least 20 major proteins that are part of hnRNP particles in mammalian cells; mammalian poly(rC) binding proteins; Artemia salina glycine-rich protein GRP33; yeast PAB1-binding protein 2 (PBP2); vertebrate vigilin; and human high-density lipoprotein binding protein (HDL-binding protein).
Protein Domain
Name: K Homology domain
Type: Domain
Description: The K homology (KH) domain was first identified in the human heterogeneous nuclear ribonucleoprotein (hnRNP) K. An evolutionarily conserved sequence of around 70 amino acids, the KH domain is present in a wide variety of nucleic acid-binding proteins. The KH domain binds RNA, and can function in RNA recognition. It is found in multiple copies in several proteins, where the copies can function cooperatively or independently. For example, in the AU-rich element RNA-binding protein KSRP, which has 4 KH domains, KH domains 3 and 4 behave as independent binding modules to interact with different regions of the AU-rich RNA targets. The solution structures of the first KH domain of FMR1 and of the C-terminal KH domain of hnRNP K, determined by nuclear magnetic resonance (NMR), revealed a β-α-α-β-β-α structure. Proteins containing KH domains include: bacterial and organelle PNPases; archaeal and eukaryotic exosome subunits; eukaryotic and prokaryotic RS3 ribosomal proteins; vertebrate Fragile X messenger ribonucleoprotein 1 (FMR1); vigilin, which has 14 KH domains; AU-rich element RNA-binding protein KSRP; hnRNP K, which contains 3 KH domains; and human onconeural ventral antigen-1 (NOVA-1). According to structural analyses, the KH domain can be separated into two groups: type 1 and type 2.
Protein Domain
Name: Domain of unknown function DUF3447
Type: Domain
Description: This presumed domain is functionally uncharacterised. This domain is found in eukaryotes. This domain is about 80 amino acids in length. This domain is found associated with [ ]. This domain has a conserved SHN sequence motif. It seems likely that this region represents divergent Ankyrin repeats.
Protein Domain
Name: Ankyrin repeat
Type: Repeat
Description: The ankyrin repeat is one of the most common protein-protein interaction motifs in nature. Ankyrin repeats are tandemly repeated modules of about 33 amino acids. They occur in a large number of functionally diverse proteins, mainly from eukaryotes. The few known examples from prokaryotes and viruses may be the result of horizontal gene transfers. The repeat has been found in proteins of diverse function, such as transcriptional initiators, cell-cycle regulators, cytoskeletal proteins, ion transporters and signal transducers. The ankyrin fold appears to be defined by its structure rather than its function, since there is no specific sequence or structure which is universally recognised by it. The conserved fold of the ankyrin repeat unit is known from several crystal and solution structures. Each repeat folds into a helix-loop-helix structure with a β-hairpin/loop region projecting out from the helices at a 90° angle. The repeats stack together to form an L-shaped structure.
Protein Domain
Name: Kinesin motor domain
Type: Domain
Description: Kinesin is a microtubule-associated force-producing protein that may play a role in organelle transport. The kinesin motor activity is directed toward the microtubule's plus end. Kinesin is an oligomeric complex composed of two heavy chains and two light chains. The maintenance of the quaternary structure does not require interchain disulphide bonds. The heavy chain is composed of three structural domains: a large globular N-terminal domain which is responsible for the motor activity of kinesin (it is known to hydrolyse ATP, and to bind and move on microtubules); a central α-helical coiled coil domain that mediates the heavy chain dimerisation; and a small globular C-terminal domain which interacts with other proteins (such as the kinesin light chains), vesicles and membranous organelles. The kinesin motor domain comprises five motifs, namely N1 (P-loop), N2 (Switch I), N3 (Switch II), N4 and L2 (KVD finger). It has a mixed eight-stranded β-sheet core with flanking solvent-exposed α-helices and a small three-stranded antiparallel β-sheet in the N-terminal region.
A number of proteins have been found that contain a domain similar to the kinesin 'motor' domain: Drosophila melanogaster claret segregational protein (ncd). Ncd is required for normal chromosomal segregation in meiosis, in females, and in early mitotic divisions of the embryo. The ncd motor activity is directed toward the microtubule's minus end. Homo sapiens CENP-E. CENP-E is a protein that associates with kinetochores during chromosome congression, relocates to the spindle midzone at anaphase, and is quantitatively discarded at the end of the cell division. CENP-E is probably an important motor molecule in chromosome movement and/or spindle elongation. H. sapiens mitotic kinesin-like protein-1 (MKLP-1), a motor protein whose activity is directed toward the microtubule's plus end. Saccharomyces cerevisiae KAR3 protein, which is essential for nuclear fusion during mating. KAR3 may mediate microtubule sliding during nuclear fusion and possibly mitosis. S. cerevisiae CIN8 and KIP1 proteins, which are required for the assembly of the mitotic spindle. Both proteins seem to interact with spindle microtubules to produce an outwardly directed force acting upon the poles. Emericella nidulans (Aspergillus nidulans) bimC, which plays an important role in nuclear division. A. nidulans klpA. Caenorhabditis elegans unc-104, which may be required for the transport of substances needed for neuronal cell differentiation. C. elegans osm-3. Xenopus laevis Eg5, which may be involved in mitosis. Arabidopsis thaliana KatA, KatB and katC. Chlamydomonas reinhardtii FLA10/KHP1 and KLP1. Both proteins seem to play a role in the rotation or twisting of the microtubules of the flagella. C. elegans hypothetical protein T09A5.2. The kinesin motor domain is located in the N-terminal part of most of the above proteins, with the exception of KAR3, klpA, and ncd, where it is located in the C-terminal section. The kinesin motor domain contains about 330 amino acids. An ATP-binding motif of type A is found near positions 80 to 90; the C-terminal half of the domain is involved in microtubule binding.
Protein Domain
Name: Kinesin-like protein
Type: Family
Description: Kinesin is a microtubule-associated force-producing protein that plays a role in organelle transport. The kinesin motor activity is directed toward the microtubule's plus end. Kinesin is an oligomeric complex composed of two heavy chains and two light chains. The maintenance of the quaternary structure does not require interchain disulphide bonds. This entry includes kinesin heavy chain and kinesin-like proteins, which are members of the kinesin superfamily (also known as KIFs). KIFs constitute 15 kinesin families. They are important molecular motors that directionally transport various cargos, including membranous organelles, protein complexes and mRNAs.
Protein Domain
Name: C2 domain
Type: Domain
Description: The C2 domain is a Ca2+-dependent membrane-targeting module found in many cellular proteins involved in signal transduction or membrane trafficking. C2 domains are unique among membrane targeting domains in that they show a wide range of lipid selectivity for the major components of cell membranes, including phosphatidylserine and phosphatidylcholine. This C2 domain is about 116 amino-acid residues long and is located between the two copies of the C1 domain in Protein Kinase C and the protein kinase catalytic domain. Regions with significant homology to the C2 domain have been found in many proteins. The C2 domain is thought to be involved in calcium-dependent phospholipid binding and in membrane targeting processes such as subcellular localisation. The 3D structure of the C2 domain of synaptotagmin has been reported; the domain forms an eight-stranded β-sandwich constructed around a conserved 4-stranded motif, designated a C2 key. Calcium binds in a cup-shaped depression formed by the N- and C-terminal loops of the C2-key motif. Structural analyses of several C2 domains have shown them to consist of similar tertiary structures in which three Ca2+-binding loops are located at the end of an 8-stranded antiparallel β-sandwich.
Protein Domain
Name: ArsR-like helix-turn-helix domain
Type: Domain
Description: This domain is found in the arsenical resistance operon repressor (ArsR) and similar prokaryotic, metal-regulated homodimeric repressors. The ArsR subfamily of helix-turn-helix bacterial transcription regulatory proteins (winged helix topology) includes several proteins that appear to dissociate from DNA in the presence of metal ions.
Protein Domain
Name: HRDC-like superfamily
Type: Homologous_superfamily
Description: This superfamily represents the HRDC (helicase and RNaseD C-terminal) domain, which comprises two orthogonally packed α-hairpin subdomains, and is involved in interactions with DNA and protein. The HRDC domain is found at the C terminus of many RecQ helicases, including the human Bifunctional 3'-5' exonuclease/ATP-dependent helicase WRN and RecQ-like DNA helicase BLM (previously known as Werner and Bloom syndrome proteins). RecQ helicases have been shown to unwind DNA in an ATP-dependent manner. The structure of the HRDC domain consists of a 4-5 helical bundle of two orthogonally packed α-hairpins, and as such it resembles auxiliary domains in bacterial DNA helicases and other proteins that interact with nucleic acids. A positively charged region on the surface of the HRDC domain is able to interact with DNA. The HRDC domain is also present in eukaryotic and archaeal RNA polymerase II subunit RPB4, the N-terminal of which forms a heterodimerisation α-hairpin.
Protein Domain
Name: DEAD/DEAH box helicase domain
Type: Domain
Description: Proteins with this domain include the DEAD and DEAH box helicases. Helicases are involved in unwinding nucleic acids. The DEAD box helicases are involved in various aspects of RNA metabolism, including nuclear transcription, pre-mRNA splicing, ribosome biogenesis, nucleocytoplasmic transport, translation, RNA decay and organellar gene expression.
Protein Domain
Name: HRDC domain
Type: Domain
Description: The HRDC (helicase and RNaseD C-terminal) domain comprises two orthogonally packed α-hairpin subdomains, and is involved in interactions with DNA and protein. It has been suggested that this domain plays a role in dissolving double Holliday junctions efficiently. HRDC domains are found at the C terminus of many RecQ helicases, including the human Bifunctional 3'-5' exonuclease/ATP-dependent helicase WRN and RecQ-like DNA helicase BLM. RecQ helicases have been shown to unwind DNA in an ATP-dependent manner. The structure of the HRDC domain consists of a 4-5 helical bundle of two orthogonally packed α-hairpins, and as such it resembles auxiliary domains in bacterial DNA helicases and other proteins that interact with nucleic acids. A positively charged region on the surface of the HRDC domain is able to interact with DNA. The HRDC domain is also present in eukaryotic and archaeal RNA polymerase II subunit RPB4, the N-terminal of which forms a heterodimerisation α-hairpin. The HRDC domain has a putative role in nucleic acid binding. Mutations in the HRDC domain associated with the human BLM gene result in Bloom Syndrome (BS), an autosomal recessive disorder characterised by proportionate pre- and postnatal growth deficiency; sun-sensitive, telangiectatic, hypo- and hyperpigmented skin; predisposition to malignancy; and chromosomal instability.
Protein Domain
Name: DNA helicase, ATP-dependent, RecQ type
Type: Family
Description: The ATP-dependent DNA helicase RecQ is involved in genome maintenance. All homologues tested to date unwind paired DNA, translocating in a 3' to 5' direction, and several have a preference for forked or 4-way DNA structures (e.g. Holliday junctions) or for G-quartet DNA. The yeast protein, Sgs1, is present in numerous foci that coincide with sites of de novo DNA synthesis, such as the replication fork, and protein levels peak during S-phase. A model has been proposed for Sgs1p action in the S-phase checkpoint response, both as a 'sensor' for damage during replication and a 'resolvase' for structures that arise at paused forks, such as the four-way 'chickenfoot' structure. The action of Sgs1p may serve to maintain the proper amount and integrity of ssDNA that is necessary for the binding of RPA (replication protein A, the eukaryotic ssDNA-binding protein)-DNA pol complexes. Sgs1p would thus function by detecting (or resolving) aberrant DNA structures, and would thus contribute to the full activation of the DNA-dependent protein kinase, Mec1p, and the effector kinase, Rad53p. Its ability to bind both the large subunit of RPA and the RecA-like protein Rad51p places it in a unique position to resolve inappropriate fork structures that can occur when either leading or lagging strand synthesis is stalled. Thus, RecQ helicases integrate checkpoint activation and checkpoint response.
Protein Domain
Name: RQC domain
Type: Domain
Description: This entry represents the RQC domain, which is a DNA-binding domain found only in RecQ family enzymes [ ]. RecQ family helicases can unwind G4 DNA, and play important roles at G-rich domains of the genome, including the telomeres, rDNA, and immunoglobulin switch regions. This domain has a helix-turn-helix structure and acts as a high affinity G4 DNA binding domain []. Binding of RecQ to Holliday junctions involves both the RQC and the HRDC domains.
Protein Domain
Name: Kelch-type beta propeller
Type: Homologous_superfamily
Description: Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified [ ]. This sequence motif represents one β-sheet blade, and several of these repeats can associate to form a β-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein (also known as ring canal kelch protein), creating a 6-bladed β-propeller. The motif is also found in mouse protein MIPP [] and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin [, ], and in galactose oxidase from the fungus Dactylium dendroides [, ]. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded antiparallel β-sheet motif that forms the repeat unit in a super-barrel structural fold []. The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila [ ]. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase []. This entry represents the 6-bladed Kelch β-propeller, which consists of six 4-stranded β-sheet motifs (or six Kelch repeats).
Protein Domain
Name: GAGA-binding transcriptional activator
Type: Family
Description: This family includes GAGA-binding protein (gbp) from Soybean that binds to GAGA element dinucleotide repeat DNA [ ]. It seems likely that the region which defines this family mediates DNA binding. This putative domain contains several conserved cysteines and a histidine, suggesting this may be a zinc-binding DNA interaction domain.
Protein Domain
Name: ENTH/VHS
Type: Homologous_superfamily
Description: This superfamily represents domains with a multi-helical, α-α 2-layered structural fold as found in: the ENTH domain of Epsin; the VHS domain of Hrs, Tom1, and ADP-ribosylation factors; the RPR domain of PCF11 protein; and the N-terminal domain of phosphoinositide-binding clathrin adaptor. The epsin NH2-terminal homology (ENTH) domain is a membrane interacting module composed of a superhelix of α-helices. It is present at the NH2-terminus of proteins that often contain consensus sequences for binding to clathrin coat components and their accessory factors, and therefore function as endocytic adaptors. ENTH domain containing proteins have additional roles in signalling and actin regulation and may have yet other actions in the nucleus. The ENTH domain is structurally similar to the VHS domain. The ENTH domain is approximately 150 amino acids long. The ENTH domain forms a compact globular structure, composed of eight α-helices connected by loops of varying length. Three helical hairpins that are stacked consecutively with a right-handed twist determine the general topology of the domain. This stacking gives the ENTH domain a rectangular appearance when viewed face on. The most highly conserved amino acids fall roughly into two classes: internal residues that are involved in packing and therefore are necessary for structural integrity, and solvent accessible residues that may be involved in protein-protein interactions [ ]. VHS domains are found at the N termini of select proteins involved in intracellular membrane trafficking. The domain consists of eight helices arranged in a superhelix. The surface of the domain has two main features: a basic patch on one side due to several conserved positively charged residues on helix 3 and a negatively charged ridge on the opposite side, formed by residues on helix 2. 
Comparison of the two VHS domains and the ENTH domain reveals a conserved surface, composed of helices 2 and 4, that is utilised for protein-protein interactions. In addition, VHS domain-containing proteins are also often localized to membranes. It has therefore been suggested that the conserved positively charged surface of helix 3 in VHS and ENTH domains plays a role in membrane binding [ ].
Protein Domain
Name: ANTH domain superfamily
Type: Homologous_superfamily
Description: The AP180 N-terminal homology (ANTH) domain is a membrane binding domain found in endocytotic accessory proteins, such as AP180. The ANTH domain is involved in phosphatidylinositol 4,5-bisphosphate (also known as PIP2) binding and is also responsible for membrane localisation of AP180. The ANTH domain containing proteins appear to be universal elements in nucleation of clathrin coats [ , ]. The N-terminal phosphoinositide-binding domain of CALM and AP180 consists of nine alpha helices forming a solenoid structure. This is most similar to the ENTH domain of epsin, with the first seven helices of epsin superimposing well on those of CALM. However, in epsin the final alpha8 helix folds back across the others, whereas in CALM and AP180 the final three long helices continue the solenoidal pattern [ ].
Protein Domain
Name: AP180 N-terminal homology (ANTH) domain
Type: Domain
Description: The AP180 N-terminal homology (ANTH) domain is a membrane binding domain found in endocytotic accessory proteins, such as AP180. AP180 has been implicated in the formation of clathrin-coated pits. The ANTH domain is involved in phosphatidylinositol 4,5-bisphosphate (also known as PIP2) binding. The ANTH domain containing proteins appear to be universal elements in nucleation of clathrin coats [ , ].
Protein Domain
Name: ENTH domain
Type: Domain
Description: The ENTH (Epsin N-terminal homology) domain is approximately 150 amino acids in length and is always found located at the N-termini of proteins. The domain forms a compact globular structure, composed of 9 α-helices connected by loops of varying length. The general topology is determined by three helical hairpins that are stacked consecutively with a right hand twist [ ]. An N-terminal helix folds back, forming a deep basic groove that forms the binding pocket for the Ins(1,4,5)P3 ligand [ ]. The ligand is coordinated by residues from surrounding α-helices and all three phosphates are multiply coordinated. The coordination of Ins(1,4,5)P3 suggests that ENTH is specific for particular head groups. Proteins containing this domain have been found to bind PtdIns(4,5)P2 and PtdIns(1,4,5)P3, suggesting that the domain may be a membrane interacting module. The main function of proteins containing this domain appears to be to act as accessory clathrin adaptors in endocytosis; epsin is able to recruit and promote clathrin polymerisation on a lipid monolayer, but may have additional roles in signalling and actin regulation [ ]. Epsin causes a strong degree of membrane curvature and tubulation, even fragmentation of membranes with a high PtdIns(4,5)P2 content. Epsin binding to membranes facilitates their deformation by insertion of the N-terminal helix into the outer leaflet of the bilayer, pushing the head groups apart. This would reduce the energy needed to curve the membrane into a vesicle, making it easier for the clathrin cage to fix and stabilise the curved membrane. This points to a pioneering role for epsin in vesicle budding as it provides both a driving force and a link between membrane invagination and clathrin polymerisation.
Protein Domain
Name: Glycoside hydrolase family 47
Type: Family
Description: O-Glycosyl hydrolases ( ) are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [ , ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. Glycoside hydrolase family 47 comprises enzymes with only one known activity: alpha-mannosidase ( ). Alpha-mannosidase is involved in the maturation of Asn-linked oligo-saccharides [ ]. The enzyme hydrolyses terminal 1,2-linked alpha-D-mannose residues in the oligo-mannose oligosaccharide man(9)(glcnac)(2) in a calcium-dependent manner. The mannose residues are trimmed away to produce, first, man(8)glcnac(2), then a man(5)(glcnac)(2) structure.
Protein Domain
Name: Nonaspanin (TM9SF)
Type: Family
Description: The transmembrane 9 superfamily protein (TM9SF) may function as a channel or small molecule transporter. Proteins in this group are endosomal integral membrane proteins.
Protein Domain
Name: Spermidine/putrescine-binding periplasmic protein
Type: Family
Description: Bacterial high affinity transport systems are involved in active transport of solutes across the cytoplasmic membrane. The protein components of these traffic systems include one or two transmembrane protein components, one or two membrane-associated ATP-binding proteins and a high affinity periplasmic solute-binding protein. In Gram-positive bacteria, which are surrounded by a single membrane and therefore have no periplasmic region, the equivalent proteins are bound to the membrane via an N-terminal lipid anchor. These homologous proteins do not play an integral role in the transport process per se, but probably serve as receptors to trigger or initiate translocation of the solute through the membrane by binding to external sites of the integral membrane proteins of the efflux system. In addition, at least some solute-binding proteins function in the initiation of sensory transduction pathways. The bacterial Spermidine/putrescine-binding periplasmic protein (PotD) is involved in the polyamine transport system. It is required for the activity of the bacterial periplasmic transport system of putrescine and spermidine [ , ]. This protein has two domains connected through two β-strands, which form a hinge at the bottom of the central cleft, and one short peptide segment []. Similar proteins with specificities for putrescine and spermidine are also included in this family, such as Putrescine-binding periplasmic protein PotF from Escherichia coli, more specifically involved in putrescine uptake [ , , ], and Spermidine-binding periplasmic protein SpuE from Pseudomonas aeruginosa [], respectively. Putrescine/cadaverine-binding protein and Putrescine/agmatine-binding protein from P. aeruginosa also belong to this entry [ ].
Protein Domain
Name: TAFII-230 TBP-binding
Type: Domain
Description: In eukaryotes, the general transcription factor TFIID helps to regulate transcription by RNA polymerase II from class II promoters. TFIID consists of TATA-box-binding proteins (TBP) and TBP-associated factors (TAFIIs), which together mediate both activation and inhibition of transcription. In Drosophila, the N-terminal region of TAFII-230 (the TFIID 230kDa subunit) binds directly to TBP, thereby inhibiting the binding of TBP to the TATA box. The N-terminal domain is comprised of three alpha helices and a beta hairpin, which forms the core that occupies the DNA-binding surface of TBP [].
Protein Domain
Name: Transcription initiation factor TFIID subunit 1, histone acetyltransferase domain
Type: Domain
Description: Transcription initiation factor TFIID is a multimeric protein complex that plays a central role in mediating promoter responses to various activators and repressors. The complex includes TATA binding protein (TBP) and various TBP-associated factors (TAFs). TFIID is a RNA polymerase II-specific TATA-binding protein-associated factor (TAF) that is essential for viability. This group represents a transcription initiation factor TFIID subunit 1 (TAF1, also known as cell cycle gene 1 protein) [ , , ]. This is the largest subunit and the core scaffold of the complex; it contains Ser/Thr kinase domains which can autophosphorylate or transphosphorylate other transcription factors including TP53, GTF2A1 and GTF2F1 [, ], and has acetyltransferase activity towards histones H3 and H4 []. It is essential for progression of the G1 phase of the cell cycle []. This entry represents the histone acetyltransferase domain (HAT, formerly known as DUF3591) that is found centrally in the protein TAF1 from eukaryotes. This region is highly conserved from yeast to human. X-ray determination of the crystal structure shows it has a compact architecture that consists of a winged helix (WH) domain that folds on top of a triple barrel and a C-terminal α-helical region. The WH domain has intrinsic DNA-binding activity [ , , , ].
Protein Domain
Name: Cysteine peptidase, histidine active site
Type: Active_site
Description: Thiol (cysteine) proteases (EC 3.4.22.-) [ ] are a family of proteolytic enzymes which contain an active site cysteine. Catalysis proceeds through a thioester intermediate and is facilitated by a nearby histidine side chain; an asparagine completes the essential catalytic triad. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad (cysteine-histidine) or triad [ ]. Modification of the catalytic triad, especially of its first amino acid (cysteine), has been postulated as a suitable target for a chemical modulation of enzyme function. This is the case for silicateins, where the cysteine residue has been replaced by a serine [ ]. Silicateins represent a group of enzymes possessing bi-functional activity; in addition to the silica-condensing activity, they possess a proteolytic (cathepsin-like) activity []. The sequences around the three active site residues are well conserved. This entry represents the histidine active site. The catalytic triad consists of this entry, and . This catalytic triad detects mainly proteases of the C1 family, including papain and several cathepsins.
Protein Domain
Name: Cysteine peptidase, asparagine active site
Type: Active_site
Description: Thiol (cysteine) proteases (EC 3.4.22.-) [ ] are a family of proteolytic enzymes which contain an active site cysteine. Catalysis proceeds through a thioester intermediate and is facilitated by a nearby histidine side chain; an asparagine completes the essential catalytic triad. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad (cysteine-histidine) or triad [ ]. Modification of the catalytic triad, especially of its first amino acid (cysteine), has been postulated as a suitable target for a chemical modulation of enzyme function. This is the case for silicateins, where the cysteine residue has been replaced by a serine [ ]. Silicateins represent a group of enzymes possessing bi-functional activity; in addition to the silica-condensing activity, they possess a proteolytic (cathepsin-like) activity []. The sequences around the three active site residues are well conserved. This entry represents the asparagine active site. The catalytic triad consists of this entry, and . This catalytic triad detects mainly proteases of the C1 family, including papain and several cathepsins.
Protein Domain
Name: Cysteine peptidase, cysteine active site
Type: Active_site
Description: Thiol (cysteine) proteases (EC 3.4.22.-) [ ] are a family of proteolytic enzymes which contain an active site cysteine. Catalysis proceeds through a thioester intermediate and is facilitated by a nearby histidine side chain; an asparagine completes the essential catalytic triad. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad (cysteine-histidine) or triad [ ]. Modification of the catalytic triad, especially of its first amino acid (cysteine), has been postulated as a suitable target for a chemical modulation of enzyme function. This is the case for silicateins, where the cysteine residue has been replaced by a serine [ ]. Silicateins represent a group of enzymes possessing bi-functional activity; in addition to the silica-condensing activity, they possess a proteolytic (cathepsin-like) activity []. The sequences around the three active site residues are well conserved. This entry represents the cysteine active site. The catalytic triad consists of this entry, and . This catalytic triad detects mainly proteases of the C1 family, including papain and several cathepsins. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. 
In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families [ ]. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases [ ]. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. 
The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues [ ]. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel [ ]. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: Peptidase C1A
Type: Family
Description: This group of cysteine peptidases belong to MEROPS peptidase family C1, sub-family C1A (papain family, clan CA). It includes related cysteine proteinases such as actinidin [ ]. This entry also includes proteins classed as non-peptidase homologues such as the catalytically inactive tubulointerstitial nephritis antigen (TIN-Ag) []. These have either been shown experimentally to lack peptidase activity or lack one or more of the active site residues.The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity [ ]. Members of the papain family are widespread, found in baculovirus [], eubacteria, yeast, and practically all protozoa, plants and mammals []. The proteins are typically lysosomal or secreted, and proteolytic cleavage of the propeptide is required for enzyme activation, although bleomycin hydrolase is cytosolic in fungi and mammals []. Papain-like cysteine proteinases are essentially synthesised as inactive proenzymes (zymogens) with N-terminal propeptide regions. The activation process of these enzymes includes the removal of propeptide regions. The propeptide regions serve a variety of functions in vivo and in vitro. The pro-region is required for the proper folding of the newly synthesised enzyme, the inactivation of the peptidase domain and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate their membrane association, and play a role in the transport of the proenzyme to lysosomes. Among the most notable features of propeptides is their ability to inhibit the activity of their cognate enzymes and that certain propeptides exhibit high selectivity for inhibition of the peptidases from which they originate [].
Protein Domain
Name: Peptidase C1A, papain C-terminal
Type: Domain
Description: This group of proteins belong to the cysteine peptidase family C1, sub-family C1A (papain family, clan CA). The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, carboxypeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity [ ]. Members of the papain family are widespread, found in bacteria, archaea, fungi, and practically all protozoa, plants and mammals [], and some viruses such as baculoviruses []. The proteins are typically lysosomal or secreted. The catalytic residues of papain are Cys-25 and His-159, other important residues being Gln-19, which helps form the 'oxyanion hole', and Asn-175, which orientates the imidazole ring of His-159. Most papain-like cysteine peptidases are irreversibly inhibited by the synthetic inhibitor E64 []. Leupeptin is a reversible inhibitor but is also an inhibitor of chymotrypsin-like serine peptidases.A papain-like cysteine proteinase is typically synthesised as an inactive precursor (or zymogen) with an N-terminal propeptide. Activation requires removal of the propeptide. The propeptide is required for the proper folding of the newly synthesised enzyme, maintaining the peptidase in an inactive state and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate their membrane association, and play a role in the transport of the proenzyme to lysosomes. 
A propeptide can exhibit high selectivity for inhibition of the peptidase from which it originates [ ]. The subfamily includes the following well characterised peptidases: Animal lysosomal peptidases such as cathepsins B (EC 3.4.22.1), L (EC 3.4.22.15), H (EC 3.4.22.16), S (EC 3.4.22.27), K (EC 3.4.22.38), F (EC 3.4.22.41), O (EC 3.4.22.42), V (EC 3.4.22.43) and X (a carboxypeptidase, EC 3.4.18.1). Plant peptidases such as papain (EC 3.4.22.2), ficin (EC 3.4.22.3), chymopapain (EC 3.4.22.6), asclepain A (EC 3.4.22.7), actinidin (EC 3.4.22.14), glycyl endopeptidase (EC 3.4.22.25), caricain (EC 3.4.22.30), ananain (EC 3.4.22.31), stem bromelain (EC 3.4.22.32) and fruit bromelain (EC 3.4.22.33). Protozoan peptidases such as histolysain (EC 3.4.22.35) and cruzipain (EC 3.4.22.51). Viral peptidases such as V-cath (EC 3.4.22.50). There are also proteins in the family that are not peptidases because one or more of the active site residues is not conserved. These include testin, tubulointerstitial nephritis antigen and silicatein. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families [ ]. 
Clans CF, CM, CN, CO, CP and PD contain only one family.Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases [ ].Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues [ ]. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds.Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. 
The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other.Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel [ ]. The active site consists of a His/Cys catalytic dyad.Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: Cathepsin propeptide inhibitor domain (I29)
Type: Domain
Description: This entry represents a peptidase inhibitor domain, which belongs to MEROPS peptidase inhibitor family I29. The domain is also found at the N terminus of a variety of peptidase precursors that belong to MEROPS peptidase subfamily C1A; these include cathepsin L, papain, and procaricain ( ) [ ]. It forms an α-helical domain that runs through the substrate-binding site, preventing access. Removal of this region by proteolytic cleavage results in activation of the enzyme. This domain is also found, in one or more copies, in a variety of cysteine peptidase inhibitors such as salarin [].
Protein Domain
Name: Formin, FH2 domain
Type: Domain
Description: Formin homology (FH) proteins play a crucial role in the reorganisation of the actin cytoskeleton, which mediates various functions of the cell cortex including motility, adhesion, and cytokinesis [ ]. Formins are multidomain proteins that interact with diverse signalling molecules and cytoskeletal proteins, although some formins have been assigned functions within the nucleus. Formins are characterised by the presence of three FH domains (FH1, FH2 and FH3), although members of the formin family do not necessarily contain all three domains []. The proline-rich FH1 domain mediates interactions with a variety of proteins, including the actin-binding protein profilin, SH3 (Src homology 3) domain proteins, and WW domain proteins. The FH2 domain is required for the self-association of formin proteins through the ability of FH2 domains to directly bind each other [], and may also act to inhibit actin polymerisation []. The FH3 domain () is less well conserved and may be important for determining intracellular localisation of formin family proteins. In addition, some formins can contain a GTPase-binding domain (GBD) ( ) required for binding to Rho small GTPases, and a C-terminal conserved Dia-autoregulatory domain (DAD). This entry represents the FH2 domain, which was shown by X-ray crystallography to have an elongated, crescent shape containing three helical subdomains [ ].
Protein Domain
Name: Formin-like family, plant
Type: Family
Description: Formins (formin homology proteins) play a crucial role in the reorganisation of the actin cytoskeleton and associate with the fast-growing end (barbed end) of actin filaments. This entry represents the formin homologues from plants. Seed plants have two formin clades with numerous paralogues, which can be classified as class I and class II formins. Class I formins include an N-terminal membrane insertion signal, a predicted extracytoplasmic Pro-rich stretch, a transmembrane region, and C-terminal FH1 and FH2 domains. Though class II formins usually contain an N-terminal PTEN domain related to the human PTEN protein (implicated in the pathogenesis of Parkinson's disease), the N-termini of type-II plant formins do not contain any recognisable domain that can provide a clue to their biological function.
Protein Domain
Name: FERM/acyl-CoA-binding protein superfamily
Type: Homologous_superfamily
Description: This superfamily represents a structural domain with a core structure consisting of a 3-helical closed bundle with a left-handed twist, in an up-and-down arrangement. This structural motif occurs as subdomain 2 within FERM domains, as well as in acyl-CoA-binding proteins. The FERM domain (band F ezrin-radixin-moesin homology domain) has such a structure, acting as a common membrane-binding module involved in localising proteins to the plasma membrane. Proteins containing FERM include cytoskeletal proteins such as erythrocyte membrane protein 4.1R, talin, and the ezrin-radixin-moesin protein family, as well as several protein tyrosine kinases and phosphatases, and the neurofibromatosis 2 tumour suppressor protein merlin. The ezrin-radixin-moesin protein family crosslinks the actin filaments of cytoskeletal structures to the plasma membrane. In addition, acyl-CoA-binding protein (ACBP) contains a domain with a similar 3-helical bundle structure. ACBP plays an important role in fatty acid metabolism, maintaining a pool of fatty acyl-CoA molecules in the cell.
Protein Domain
Name: Acyl-CoA-binding protein, ACBP
Type: Domain
Description: Acyl-CoA-binding protein (ACBP) is a small (10 kDa) protein that binds medium- and long-chain acyl-CoA esters with very high affinity and may function as an intracellular carrier of acyl-CoA esters. ACBP is also known as diazepam binding inhibitor (DBI) or endozepine (EP) because of its ability to displace diazepam from the benzodiazepine (BZD) recognition site located on the GABA type A receptor. It is therefore possible that this protein also acts as a neuropeptide to modulate the action of the GABA receptor. ACBP is a highly conserved protein of about 90 residues that is found in all four eukaryotic kingdoms (Animalia, Plantae, Fungi and Protista) and in some eubacterial species. Although ACBP occurs as a completely independent protein, intact ACB domains have been identified in a number of large, multifunctional proteins in a variety of eukaryotic species. These include large membrane-associated proteins with N-terminal ACB domains, multifunctional enzymes with both ACB and peroxisomal enoyl-CoA Delta(3), Delta(2)-enoyl-CoA isomerase domains, and proteins with both an ACB domain and ankyrin repeats. The ACB domain consists of four α-helices arranged in a bowl shape with a highly exposed acyl-CoA-binding site. The ligand is bound through specific interactions with residues on the protein, most notably several conserved positive charges that interact with the phosphate group on the adenosine-3'-phosphate moiety, and the acyl chain is sandwiched between the hydrophobic surfaces of CoA and the protein. Other proteins containing an ACB domain include: endozepine-like peptide (ELP) (gene DBIL5) from mouse, a testis-specific ACBP homologue that may be involved in the energy metabolism of the mature sperm; MA-DBI, a mammalian transmembrane protein of unknown function that contains an N-terminal ACB domain; and DRS-1, a human protein of unknown function that contains an N-terminal ACB domain and a C-terminal enoyl-CoA isomerase/hydratase domain.
Protein Domain
Name: Amino acid/polyamine transporter I
Type: Family
Description: Amino acid permeases are integral membrane proteins involved in the transport of amino acids into the cell. A number of such proteins have been found to be evolutionarily related. These proteins include several yeast specific and general amino acid permeases; Emericella nidulans (Aspergillus nidulans) proline transport protein (gene prnB); Trichoderma harzianum amino acid permease INDA1; Salmonella typhimurium L-asparagine permease (gene ansP); and several Escherichia coli and other bacterial permeases and transport proteins. These proteins seem to contain up to 12 transmembrane segments. This entry consists of members of the amino acid-polyamine-organocation (APC) superfamily. Also included in this entry is the methylthioribose transporter mtrA from Bacillus subtilis, which transports methylthioribose into the cell.
Protein Domain
Name: Zinc finger, CW-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents a CW-type zinc finger motif, named for its conserved cysteine and tryptophan residues. It is predicted to be a highly specialised mononuclear four-cysteine (C4) zinc finger that plays a role in DNA binding and/or promoting protein-protein interactions in complicated eukaryotic processes including chromatin methylation status and early embryonic development. Weak homology to members of further evidences these predictions. The domain is found exclusively in vertebrates, vertebrate-infecting parasites and higher plants.
Protein Domain
Name: Pyridoxal phosphate-dependent decarboxylase
Type: Family
Description: Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). Pyridoxal 5'-phosphate (PLP) is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy. PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the ε-amino group of an active site lysine residue on the enzyme. The α-amino group of the substrate displaces the lysine ε-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic. A number of pyridoxal-dependent decarboxylases share regions of sequence similarity, particularly in the vicinity of a conserved lysine residue, which provides the attachment site for the pyridoxal-phosphate (PLP) group. Among these enzymes are aromatic-L-amino-acid decarboxylase (L-dopa decarboxylase or tryptophan decarboxylase), which catalyses the decarboxylation of tryptophan to tryptamine; tyrosine decarboxylase, which converts tyrosine into tyramine; histidine decarboxylase, which catalyses the decarboxylation of histidine to histamine; L-aspartate decarboxylase, which converts aspartate to beta-alanine; and phenylacetaldehyde synthase, which catalyses the decarboxylation of L-phenylalanine to 2-phenylethylamine. These enzymes belong to the group II decarboxylases.
Protein Domain
Name: Glutamate decarboxylase
Type: Family
Description: This entry represents glutamate decarboxylase (Gad), a pyridoxal 5'-phosphate (PLP)-dependent enzyme that catalyses the irreversible α-decarboxylation of L-glutamate to gamma-aminobutyrate (GABA). This enzyme is widely distributed amongst eukaryotes and prokaryotes, but its function varies in different organisms. Gad has a crucial role in the vertebrate central nervous system, where it is responsible for the synthesis of GABA, the major inhibitory neurotransmitter. In the majority of vertebrates Gad occurs in two isoforms, Gad65 and Gad67, both active at neutral pH. Gad isoforms (GadA and GadB) have also been reported in some bacterial species, both Gram-negative and Gram-positive. A unique feature of plant and yeast Gad is the presence of a calmodulin (CaM)-binding domain in the C-terminal region. In Saccharomyces cerevisiae (Baker's yeast), Gad expression is required for normal oxidative stress tolerance. In plants, Gad is thought to be a stress-adapter chaperonin sensing Ca2+ signals.
Protein Domain
Name: PTR2 family proton/oligopeptide symporter, conserved site
Type: Conserved_site
Description: The transport of peptides into cells is a well-documented biological phenomenon which is accomplished by specific, energy-dependent transporters found in a number of organisms as diverse as bacteria and humans. The PTR family of proteins is distinct from the ABC-type peptide transporters and was uncovered by sequence analyses of a number of recently discovered peptide transport proteins. These proteins seem to be mainly involved in the intake of small peptides with the concomitant uptake of a proton. These integral membrane proteins are predicted to comprise twelve transmembrane regions. This entry describes two conserved sites. The first conserved site is found within a region that includes the end of the second transmembrane region, a cytoplasmic loop and the third transmembrane region. The second conserved site corresponds to the core of the fifth transmembrane region.
Protein Domain
Name: Remorin, C-terminal
Type: Domain
Description: Remorins are plant-specific plasma membrane-associated proteins. In tobacco, remorin co-purifies with lipid rafts. Most remorins have a variable, proline-rich N-half and a more conserved C-half that is predicted to form coiled coils. Consistent with this, circular dichroism studies have demonstrated that much of the protein is α-helical. Remorins exist in plasma membrane preparations as oligomeric structures and form filaments in vitro. The proteins can bind polyanions including the extracellular matrix component oligogalacturonic acid (OGA). In vitro, remorin in plasma membrane preparations is phosphorylated (principally on threonine residues) in the presence of OGA and thus co-purifies with one or more protein kinases. The biological functions of remorins are unknown, but roles as components of the membrane/cytoskeleton are possible.
Protein Domain
Name: Zinc finger, CCCH-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents C-x8-C-x5-C-x3-H (CCCH) type zinc finger (Znf) domains. Proteins containing CCCH Znf domains include Znf proteins from eukaryotes involved in cell cycle or growth phase-related regulation, e.g. human TIS11B (butyrate response factor 1), a probable regulatory protein involved in regulating the response to growth factors, and the mouse TTP growth factor-inducible nuclear protein, which has the same function. The mouse TTP protein is induced by growth factors. Another protein containing this domain is the human splicing factor U2AF 35kDa subunit, which plays a critical role in both constitutive and enhancer-dependent splicing by mediating essential protein-protein interactions and protein-RNA interactions required for 3' splice site selection. It has been shown that different CCCH-type Znf proteins interact with the 3'-untranslated region of various mRNAs. This type of Znf is very often present in two copies.
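The C-x8-C-x5-C-x3-H spacing quoted in the CCCH-type description can be read as a simple sequence pattern: a cysteine, any 8 residues, a cysteine, any 5 residues, a cysteine, any 3 residues, then a histidine. The sketch below is only an illustration of that spacing as a regular expression; the function name and the toy sequence are hypothetical, and a plain pattern match is far less sensitive than the profile-based methods that domain databases actually use.

```python
import re

# C-x8-C-x5-C-x3-H written as a regular expression. The lookahead lets
# finditer report overlapping candidate motifs as well.
CCCH_LOOKAHEAD = re.compile(r"(?=C.{8}C.{5}C.{3}H)")


def find_ccch(sequence):
    """Return 0-based start positions of C-x8-C-x5-C-x3-H matches.

    Hypothetical helper for illustration only; it checks residue spacing
    and nothing else.
    """
    return [m.start() for m in CCCH_LOOKAHEAD.finditer(sequence.upper())]


# Toy sequence built to contain exactly one motif starting at index 2
# (not a real protein).
toy = "MA" + "C" + "A" * 8 + "C" + "G" * 5 + "C" + "S" * 3 + "H" + "KK"
print(find_ccch(toy))
```

A match here only says that the cysteine and histidine spacing is present; it does not show that the region folds as a zinc finger.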
Protein Domain
Name: Glycoside hydrolase family 31
Type: Family
Description: O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. Glycoside hydrolase family 31 comprises enzymes with several known activities: alpha-glucosidase, alpha-galactosidase, glucoamylase, sucrase-isomaltase, isomaltase, alpha-xylosidase and alpha-glucan lyase. Glycoside hydrolase family 31 groups a number of glycosyl hydrolases on the basis of sequence similarities. An aspartic acid has been implicated in the catalytic activity of sucrase, isomaltase, and lysosomal alpha-glucosidase.
Protein Domain
Name: Galactose mutarotase-like domain superfamily
Type: Homologous_superfamily
Description: Proteins with this domain belong to the galactose mutarotase-like structural superfamily. The domain has a distorted supersandwich structure consisting of 18 strands in two sheets, and probably functions to bind carbohydrates in enzymes that act on sugars. Domains with this structure occur in several protein families, including galactose mutarotase; domain 5 of beta-galactosidase; the central domain of hyaluronate lyase-like enzymes, such as chondroitinase AC, xanthan lyase, chondroitin ABC lyase I, and hyaluronate lyase itself; the N-terminal domain of lactobacillus maltose phosphorylase and bacterial glucoamylase; and the C-terminal domain of eukaryotic alpha-mannosidase and archaeon 4-alpha-glucanotransferase.
Protein Domain
Name: Glycoside hydrolase family 31, N-terminal domain
Type: Domain
Description: This domain is found in proteins that belong to the glycoside hydrolase family 31. The domain appears to be similar to the galactose mutarotase superfamily.
Protein Domain
Name: THH1/TOM1/TOM3 domain
Type: Domain
Description: This domain is found in plant proteins including THH1/TOM1/TOM3 from Arabidopsis. TOM1 and TOM3 are transmembrane proteins necessary for the efficient multiplication of tobamoviruses. THH1 supports tobamovirus multiplication, but to a lesser extent than TOM1 and TOM3. Members containing this domain are part of the GPCR superfamily and involved in stress tolerance.
Protein Domain
Name: Vps72/YL1 family
Type: Family
Description: This entry includes a group of proteins involved in chromatin remodelling, including Vps72 (vacuolar protein sorting-associated protein 72) from budding yeasts. Vps72 is a Htz1-binding component of the SWR1 complex, which is required for the incorporation of the histone variant H2AZ into chromatin. It is also required for vacuolar protein sorting in budding yeasts. The Vps72 homologue from animals, YL-1, is a deposition-and-exchange histone chaperone specific for H2AZ1, which it deposits into nucleosomes. It is a component of the SRCAP and Tip60 complexes, and mediates the ATP-dependent exchange of histone H2AZ1/H2B dimers for nucleosomal H2A/H2B, leading to transcriptional regulation of selected genes by chromatin remodelling.
Protein Domain
Name: Vps72/YL1, C-terminal
Type: Domain
Description: This domain is found at the C terminus in proteins of the Vps72/YL1 family, in which it represents a proline-rich domain. These proteins are involved in chromatin remodelling. This domain is also found in proteins that do not belong to the Vps72/YL1 family.
Protein Domain
Name: Glutamine amidotransferase type 2 domain
Type: Domain
Description: A large group of biosynthetic enzymes are able to catalyse the removal of the ammonia group from glutamine and then to transfer this group to a substrate to form a new carbon-nitrogen group. This catalytic activity is known as glutamine amidotransferase (GATase). The GATase domain exists either as a separate polypeptidic subunit or as part of a larger polypeptide fused in different ways to a synthase domain. On the basis of sequence similarities two classes of GATase domains have been identified: class-I (also known as trpG-type or triad) and class-II (also known as purF-type or Ntn). Class-II (or type 2) GATase domains have been found in the following enzymes:
  • Amido phosphoribosyltransferase (glutamine phosphoribosylpyrophosphate amidotransferase). An enzyme which catalyses the first step in purine biosynthesis, the transfer of the ammonia group of glutamine to PRPP to form 5-phosphoribosylamine (gene purF in bacteria, ADE4 in yeast).
  • Glucosamine--fructose-6-phosphate aminotransferase. This enzyme catalyses a key reaction in amino sugar synthesis, the formation of glucosamine 6-phosphate from fructose 6-phosphate and glutamine (gene glmS in Escherichia coli, nodM in Rhizobium, GFA1 in yeast).
  • Asparagine synthetase (glutamine-hydrolyzing). This enzyme is responsible for the synthesis of asparagine from aspartate and glutamine.
  • Glutamate synthase (gltS), an enzyme which participates in the ammonia assimilation process by catalysing the formation of glutamate from glutamine and 2-oxoglutarate. Glutamate synthase is a multicomponent iron-sulphur flavoprotein, and three types occur, each using a different electron donor: NADPH-dependent gltS (large chain), ferredoxin-dependent gltS and NADH-dependent gltS.
The active site is formed by a cysteine present at the N-terminal extremity of the mature form of all these enzymes. Two other conserved residues, Asn and Gly, form an oxyanion hole for stabilisation of the formed tetrahedral intermediate. An insert of ~120 residues can occur between the conserved regions. In some class-II GATases (for example in Bacillus subtilis or chicken amido phosphoribosyltransferase) the enzyme is synthesised with a short propeptide which is cleaved off post-translationally by a proposed autocatalytic mechanism. Nuclear-encoded Fd-dependent gltS have a longer propeptide which may contain a chloroplast-targeting peptide in addition to the propeptide that is excised on enzyme activation. The 3-D structure of the GATase type 2 domain forms a four layer alpha/beta/beta/alpha architecture which consists of a fold similar to the N-terminal nucleophile (Ntn) hydrolases. These have the capacity for nucleophilic attack and the possibility of autocatalytic processing. The N-terminal position and the folding of the catalytic Cys differ strongly from the Cys-His-Glu triad which forms the active site of GATases of type 1.
Protein Domain
Name: Amidophosphoribosyltransferase
Type: Family
Description: Purine nucleotides are synthesised both via the de novo pathway and via the salvage pathway and are vital for cell functions and cell proliferation through DNA and RNA syntheses and ATP energy supply. Amidophosphoribosyltransferase is the rate-limiting enzyme in the de novo pathway of purine ribonucleotide synthesis and is regulated by feedback inhibition by AMP and GMP. The reaction catalysed is: 5-phospho-beta-D-ribosylamine + diphosphate + L-glutamate = L-glutamine + 5-phospho-alpha-D-ribose 1-diphosphate + H2O. This family contains sequences which are members of the MEROPS peptidase family C44 (glutamine phosphoribosylpyrophosphate amidotransferase precursor, clan PB(C)) and sequences which are classed as non-peptidase homologues. These sequences either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity.
Protein Domain
Name: Fascin
Type: Family
Description: This entry represents fascin (FSCN) and its homologues, including FSCN1/2/3 from mammals and protein singed from fruit flies. Fascin is a globular actin cross-linking protein, which functions in forming parallel actin bundles in cell protrusions that are key specialisations of the plasma membrane for environmental guidance and cell migration. Human FSCN1 organises filamentous actin into bundles with a minimum actin/fascin ratio of 4.1:1. Overexpression of fascin has been linked to a more aggressive clinical course of cancer.
Protein Domain
Name: Glycoside hydrolase, family 5
Type: Domain
Description: O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. Glycoside hydrolase family 5 comprises enzymes with several known activities: endoglucanase, beta-mannanase, exo-1,3-glucanase, endo-1,6-glucanase, xylanase and endoglycoceramidase. The microbial degradation of cellulose and xylans requires several types of enzymes. Fungi and bacteria produce a spectrum of cellulolytic enzymes (cellulases) and xylanases which, on the basis of sequence similarities, can be classified into families. One of these families is known as the cellulase family A or as the glycosyl hydrolases family 5. One of the conserved regions in this family contains a conserved glutamic acid residue which is potentially involved in the catalytic mechanism.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom