Search results 14201 to 14300 out of 30763 for seed protein

Category restricted to ProteinDomain

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Phosphatidylinositol 4-phosphate 3-kinase C2 domain-containing subunit alpha, PX domain
Type: Domain
Description: This entry represents the PX domain found in phosphatidylinositol 4-phosphate 3-kinase C2 domain-containing subunit alpha (PI3K-C2alpha). The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification. PI3K-C2alpha, the class II alpha isoform of PI3K, plays key roles in clathrin assembly and clathrin-mediated membrane trafficking, insulin signaling, vascular smooth muscle contraction, and the priming of neurosecretory granule exocytosis. The PX domain is involved in targeting of proteins to PI-enriched membranes, and may also be involved in protein-protein interaction.
The phosphoinositide 3-kinase (PI3K) family of enzymes catalyzes the phosphorylation of the 3-hydroxyl group of the inositol ring of phosphatidylinositol. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI3Ks are divided into three main classes (I, II, and III) based on their substrate specificity, regulation, and domain structure. Class II PI3Ks preferentially use PI as a substrate to produce PI3P, but can also phosphorylate PI4P to produce PI(3,4)P2. They function as monomers and do not associate with any regulatory subunits. Class II enzymes contain an N-terminal Ras binding domain, a lipid binding C2 domain, a PI3K homology domain of unknown function, an ATP-binding catalytic domain, a PX domain, and a second C2 domain at the C terminus.
Protein Domain
Name: Photosynthetic reaction centre, L/M
Type: Family
Description: The photosynthetic apparatus in non-oxygenic bacteria consists of light-harvesting (LH) protein-pigment complexes LH1 and LH2, which use carotenoid and bacteriochlorophyll as primary donors. LH1 acts as the energy collection hub, temporarily storing the energy before its transfer to the photosynthetic reaction centre (RC). Electrons are transferred from the primary donor via an intermediate acceptor (bacteriopheophytin) to the primary acceptor (quinone Qa), and finally to the secondary acceptor (quinone Qb), resulting in the formation of ubiquinol QbH2. RC uses the excitation energy to shuffle electrons across the membrane, transferring them via ubiquinol to the cytochrome bc1 complex in order to establish a proton gradient across the membrane, which is used by ATP synthetase to form ATP. The core complex is anchored in the cell membrane, consisting of one unit of RC surrounded by LH1; in some species there may be additional subunits. RC consists of three subunits: L (light), M (medium), and H (heavy). Subunits L and M provide the scaffolding for the chromophore, while subunit H contains a cytoplasmic domain. In Rhodopseudomonas viridis, there is also a non-membranous tetrahaem cytochrome (4Hcyt) subunit on the periplasmic surface.
This entry describes the photosynthetic reaction centre L and M subunits, and the homologous D1 (PsbA) and D2 (PsbD) photosystem II (PSII) reaction centre proteins from cyanobacteria, algae and plants. The D1 and D2 proteins show only approximately 15% sequence homology with the L and M subunits; however, the conserved amino acids correspond to the binding sites of the photochemically active cofactors. As a result, the reaction centres (RCs) of purple photosynthetic bacteria and PSII display considerable structural similarity in terms of cofactor organisation.
The D1 and D2 proteins occur as a heterodimer that forms the reaction core of PSII, a multisubunit protein-pigment complex containing over forty different cofactors, which is anchored in the cell membrane in cyanobacteria, and in the thylakoid membrane in algae and plants. Upon absorption of light energy, the D1/D2 heterodimer undergoes charge separation, and the electrons are transferred from the primary donor (chlorophyll a) via pheophytin to the primary acceptor quinone Qa, then to the secondary acceptor Qb, which, as in the bacterial system, culminates in the production of ATP. However, PSII has an additional function over the bacterial system. At the oxidising side of PSII, a redox-active tyrosine residue in the D1 protein reduces P680; the oxidised tyrosine then withdraws electrons from a manganese cluster, which in turn withdraws electrons from water, leading to the splitting of water and the formation of molecular oxygen. PSII thus provides a source of electrons that can be used by photosystem I to produce the reducing power (NADPH) required to convert CO2 to glucose.
Also in this entry is the light-dependent chlorophyll f synthase (ChlF) from cyanobacteria such as Chlorogloeopsis fritschii. ChlF synthesizes chlorophyll f or chlorophyllide f, which is able to absorb far red light, probably by oxidation of chlorophyll a or chlorophyllide a and reduction of plastoquinone.
Protein Domain
Name: Peptidase S9, serine active site
Type: Active_site
Description: Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
This signature defines the active site of the serine peptidases belonging to MEROPS peptidase family S9 (prolyl oligopeptidase family, clan SC). The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. Examples of protein families containing this active site are:
Prolyl endopeptidase (PE) (also called post-proline cleaving enzyme). PE is an enzyme that cleaves peptide bonds on the C-terminal side of prolyl residues. The sequence of PE has been obtained from Sus scrofa (Pig) and from bacteria (Flavobacterium meningosepticum and Aeromonas hydrophila); there is a high degree of sequence conservation between these sequences.
Escherichia coli protease II (oligopeptidase B) (gene prtB), which cleaves peptide bonds on the C-terminal side of lysyl and argininyl residues.
Dipeptidyl peptidase IV (DPP IV). DPP IV is an enzyme that removes N-terminal dipeptides sequentially from polypeptides having unsubstituted N-termini, provided that the penultimate residue is proline.
Saccharomyces cerevisiae (Baker's yeast) vacuolar dipeptidyl aminopeptidase A (DPAP A) (gene: STE13), which is responsible for the proteolytic maturation of the alpha-factor precursor.
Yeast vacuolar dipeptidyl aminopeptidase B (DPAP B) (gene: DAP2).
Acylamino-acid-releasing enzyme (acyl-peptide hydrolase). This enzyme catalyzes the hydrolysis of the amino-terminal peptide bond of an N-acetylated protein to generate an N-acetylated amino acid and a protein with a free amino-terminus.
This signature contains the conserved serine residue that has been experimentally shown (in E. coli protease II as well as in pig and bacterial PE) to be necessary for the catalytic mechanism. This serine, which is part of the catalytic triad (Ser, His, Asp), is generally located about 150 residues away from the C-terminal extremity of these enzymes (which are all proteins that contain about 700 to 800 amino acids).
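The relationship between the linear order of the catalytic residues and the clan, as described above, can be sketched as a small lookup. This is only an illustration of the three orderings named in the text (HDS, DHS, SDH); the function names `triad_order` and `guess_clan` are hypothetical, and a matching residue order is a hint rather than a clan assignment.

```python
# Linear order of the catalytic triad for the three clans named in the text.
TRIAD_ORDER_TO_CLAN = {
    "HDS": "PA (chymotrypsin clan)",
    "DHS": "SB (subtilisin clan)",
    "SDH": "SC (carboxypeptidase clan)",
}


def triad_order(positions):
    """Return the triad residues as a string ordered by sequence position.

    `positions` maps a one-letter residue code (S, D, H) to its 1-based
    position in the protein sequence.
    """
    ordered = sorted(positions.items(), key=lambda item: item[1])
    return "".join(residue for residue, _ in ordered)


def guess_clan(positions):
    """Look up the clan suggested by the triad order, or 'unknown'."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(positions), "unknown")


# Commonly cited chymotrypsin numbering: His57, Asp102, Ser195.
print(guess_clan({"H": 57, "D": 102, "S": 195}))
```

Since S9 peptidases belong to clan SC, their catalytic residues would be expected to fall in the SDH order under this scheme.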
Protein Domain
Name: Diguanylate cyclase/phosphodiesterase
Type: Family
Description: Members of this group are signal transduction proteins that are direct oxygen sensors and are involved in regulation of cellular processes via the effector molecule cyclic diguanylate (c-di-GMP, bis(3',5')-cyclic diguanylic acid). They contain PAS/PAC, GGDEF, and EAL domains and have diguanylate cyclase and phosphodiesterase activities. Related groups with similar domain architectures contain different versions of the PAS/PAC domain, and are thought to have different, often not yet determined, biological functions.
Escherichia coli Dos (YddU or DosP) and Komagataeibacter xylinus (Gluconacetobacter xylinus or Acetobacter xylinum) PdeA1 proteins have been shown to be direct, haem-based oxygen sensors. Their N-terminal PAS domains are responsible for haem-binding. PAS/PAC is a ubiquitous intracellular sensory domain. It is located in the cytoplasm and senses changes in redox potential in the electron transport system or overall cellular redox status. PAS domains can monitor changes in light, oxygen or small ligands in a cell, and sense environmental factors that cross the cell membrane and/or affect cell metabolism. In the haem-containing subgroup of PAS domains, the haem pocket acts as a ligand-specific trap. Ligand binding to a haem-containing PAS domain leads to either activation or inhibition of a regulated (catalytic) domain (here, GGDEF and/or EAL domains). Phosphodiesterase activity with cAMP of E. coli Dos has been shown to be regulated by the haem redox state. Similarly, Komagataeibacter xylinus PdeA1 is regulated by reversible binding of O2 to the haem.
The catalytic function of the members of this group has also been experimentally determined. Cyclic di-GMP (c-di-GMP) is the specific nucleotide regulator of beta-1,4-glucan (cellulose) synthase in Komagataeibacter xylinus. In a study of the regulation of biosynthesis of extracellular cellulose in Komagataeibacter xylinus, the search for the enzymes that synthesise and hydrolyse cyclic di-GMP resulted in the identification of six proteins with identical domain architecture containing PAS, GGDEF and EAL domains. Three of them exhibited diguanylate cyclase activity (Dgc1-3), and the other three exhibited phosphodiesterase activity (PdeA1-3). Likewise, E. coli Dos has been shown to have phosphodiesterase activity.
Genetic complementation using genes from three different bacteria encoding proteins with GGDEF domains as the only element in common indicates that the GGDEF domain is responsible for the diguanylate cyclase activity of these proteins. Even prior to these results, the notion that the GGDEF domain is a diguanylate cyclase was supported by the detailed analysis of its sequence, which shows conservation of the proposed nucleotide-binding loop in alignment with eukaryotic adenylate cyclases. By exclusion, the EAL domain emerged as the best candidate for the role of c-di-GMP phosphodiesterase. Indeed, the sequence of this domain contains several conserved aspartates, which could participate in metal binding and form a phosphodiesterase active site. It is not clear what differences make one subgroup of these proteins act as phosphodiesterases and another as diguanylate cyclases, given that both subgroups contain both domains.
Protein Domain
Name: TRIAD supradomain
Type: Domain
Description: The TRIAD (Two RING fingers and a DRIL, double RING finger linked) or RBR (RING-BetweenRING-RING) family of zinc finger proteins contains a tripartite motif of three double zinc fingers, the first of which, RING1, is a typical RING finger with a C3HC4 signature of conserved cysteine and histidine residues. The second (In-Between-Ring, IBR, BetweenRING or DRIL) and third (RING2) are dissimilar to RINGs but share notable similarity, as they have similar spacing of cysteines and some conserved residues. The cysteine and histidine rich TRIAD domain architecture is highly conserved and found mostly in eukaryotes. TRIAD E3 ligases (E3s) are complicated multi-domain enzymes that contain a variety of domains in addition to their TRIAD supradomain. The three fingers that define the TRIAD supradomain always appear in the same order, RING1-IBR-RING2, but the position of the supradomain itself relative to other domains varies. All characterised proteins containing the TRIAD supradomain have been found to possess E3 ligase activity. TRIAD E3s differ fundamentally from their eponymous RING E3 cousins by virtue of their possessing an active site, a feature lacking in all RING-type E3s. Similar to canonical RINGs, the RING1 finger of the TRIAD supradomain binds E2s loaded with Ub (E2-Ubs). However, RING2s contain an essential active-site Cys that receives Ub from E2-Ub to generate a covalent E3-Ub intermediate.
Each of the three fingers coordinates two Zn ions. RING1 is the only domain with a classical C3HC4 cross-brace zinc-coordination topology. IBR and RING2 fingers not only share structural similarity but also have a topology completely distinct from that of classical RINGs. The IBR finger adopts a bilobal fold about the two zinc-binding sites. This arrangement brings the N terminus of the domain into close proximity to its C terminus. The RING2 has the same domain topology as the IBR finger and coordinates its two zinc atoms in a sequential fashion. The RING2 finger contains a conserved Cys residue that is not involved in Zn coordination but serves as the active site to which Ub is attached. While they resemble RING2s in topology, IBR fingers do not contain an active-site Cys. IBRs and their linkers on either side have been implicated in binding Ub during Ub transfer reactions, but the exact function of IBRs remains unknown.
This entry represents the TRIAD supradomain found in a number of proteins with different functions, such as Ariadne (ARI) proteins, implicated in the regulation of translation, cellular proliferation, and developmental processes; TRIAD proteins, associated with the regulation of myeloid progenitor proliferation, NF-kappaB signaling, and membrane trafficking; and Parkin, implicated in a range of biological processes, including autophagy of damaged mitochondria (mitophagy), cell survival pathways, and vesicle trafficking. This supradomain is also present in heme-oxidized IRP2 ubiquitin ligase 1L (HOIL-1L) and HOIL-1L interacting protein (HOIP), two components of the linear ubiquitin chain assembly complex (LUBAC), which is associated with B-cell function, regulation of apoptosis, oncogenesis, and diverse autoimmune diseases.
Protein Domain
Name: Interleukin-1 receptor type II
Type: Family
Description: Interleukin-1 alpha and interleukin-1 beta (IL-1 alpha and IL-1 beta) are cytokines that participate in the regulation of immune responses, inflammatory reactions, and hematopoiesis. Two types of IL-1 receptor, each with three extracellular immunoglobulin (Ig)-like domains, limited sequence similarity (28%) and different pharmacological characteristics, have been cloned from mouse and human cell lines: these have been termed type I and type II receptors. The receptors both exist in transmembrane (TM) and soluble forms: the soluble IL-1 receptor is thought to be post-translationally derived from cleavage of the extracellular portion of the membrane receptors.
Both IL-1 receptors appear to be well conserved in evolution, and map to the same chromosomal location. The receptors can both bind all three forms of IL-1 (IL-1 alpha, IL-1 beta and IL-1RA).
The crystal structures of IL1A and IL1B have been solved, showing them to share the same 12-stranded β-sheet structure as both the heparin binding growth factors and the Kunitz-type soybean trypsin inhibitors. The β-sheets are arranged in 3 similar lobes around a central axis, 6 strands forming an anti-parallel β-barrel. Several regions, especially the loop between strands 4 and 5, have been implicated in receptor binding.
The Vaccinia virus genes B15R and B18R each encode proteins with N-terminal hydrophobic sequences, possible sites for attachment of N-linked carbohydrate and a short C-terminal hydrophobic domain. These properties are consistent with the mature proteins being either virion, cell surface or secretory glycoproteins. Protein sequence comparisons reveal that the gene products are related to each other (20% identity) and to the Ig superfamily. The highest degree of similarity is to the human and murine interleukin-1 receptors, although both proteins are related to a wide range of Ig superfamily members, including the interleukin-6 receptor. A novel method for virus immune evasion has been proposed in which the product of one or both of these genes may bind interleukin-1 and/or interleukin-6, preventing these cytokines from reaching their natural receptors. A similar gene product from Cowpox virus (CPV) has also been shown to specifically bind murine IL-1 beta.
This entry represents the interleukin-1 receptor, type II. The mature type II IL-1 receptor consists of (i) a ligand binding portion comprising three Ig-like domains; (ii) a single TM domain; and (iii) a short cytoplasmic domain of 29 amino acids. This contrasts with the ~215 amino acid cytoplasmic domain of the type I receptor, suggesting that the two IL-1 receptors may interact with different signal transduction pathways. The type II receptor is expressed in a number of different tissues, including both B and T lymphocytes, and can be induced in several cell types by treatment with phorbol ester. Like the type I receptor, the human type II IL-1 receptor can bind all three forms of IL-1 (IL-1 alpha, IL-1 beta and IL-1RA).
Protein Domain
Name: Neuromedin U receptor, type 1
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs.
The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices.
Neuromedin U is a neuropeptide, first isolated from porcine spinal cord and expressed widely in the gastrointestinal, genitourinary and central nervous systems. Neuromedin U has potent contractile activity on smooth muscle and this activity is believed to reside within the C-terminal portion of the peptide, which is highly conserved between species. Other roles for the peptide include: regulation of blood flow and ion transport in the intestine, regulation of adrenocortical function and increased blood pressure. The roles of neuromedin U in the central nervous system are poorly understood, but may include: regulation of food intake, neuroendocrine control, modulation of dopamine actions and involvement in neuropsychiatric disorders. Two G protein-coupled receptor subtypes, with differing expression patterns, have been identified and shown to bind neuromedin U.
The neuromedin U type 1 receptor (NMU1) is expressed predominantly in the periphery, with highest levels in the gastrointestinal and urogenital systems, particularly in the testes. The receptor is also found in the kidney, pancreas, lung, trachea, adrenal cortex, liver and mammary glands. Within the small intestine and ileum, NMU1 is specifically expressed in goblet cells. In the central nervous system, the receptor is expressed only at much lower levels and has been detected most abundantly in the cerebellum, dorsal root ganglia, hippocampus and spinal cord. Binding of neuromedin U to the receptor results in phospholipase C activation and increased intracellular calcium concentrations through coupling to Gq proteins.
Protein Domain
Name: Duffy antigen/chemokine receptor
Type: Family
Description: Chemokines (chemotactic cytokines) are a family of chemoattractant molecules. They attract leukocytes to areas of inflammation and lesions, and play a key role in leukocyte activation. Originally defined as host defense proteins, chemokines are now known to play a much broader biological role. They have a wide range of effects in many different cell types beyond the immune system, including, for example, various cells of the central nervous system, and endothelial cells, where they may act as either angiogenic or angiostatic factors.
The chemokine family is divided into four classes based on the number and spacing of their conserved cysteines: 2 Cys residues may be adjacent (the CC family); separated by an intervening residue (the CXC family); have only one of the first two Cys residues (C chemokines); or contain both cysteines, separated by three intervening residues (CX3C chemokines).
Chemokines exert their effects by binding to rhodopsin-like G protein-coupled receptors on the surface of cells. Following interaction with their specific chemokine ligands, chemokine receptors trigger a flux in intracellular calcium ions, which causes a cellular response, including the onset of chemotaxis. There are over fifty distinct chemokines and at least 18 human chemokine receptors. Although the receptors bind only a single class of chemokines, they often bind several members of the same class with high affinity. Chemokine receptors are preferentially expressed on important functional subsets of dendritic cells, monocytes and lymphocytes, including Langerhans cells and T helper cells. Chemokines and their receptors can also be subclassified into homeostatic leukocyte homing molecules (CXCR4, CXCR5, CCR7, CCR9) versus inflammatory/inducible molecules (CXCR1, CXCR2, CXCR3, CCR1-6, CX3CR1).
This entry represents the Duffy antigen/chemokine receptor, DARC (Duffy Antigen Receptor for Chemokines). It is also known as Fy protein, and was originally identified as a blood group antigen. DARC has been found to act as a multi-specific receptor for chemokines of both the C-C and C-X-C families, including CCL2, CCL5, CXCL1 and CXCL4. It has also been shown to internalise chemokines but not scavenge them. Although DARC is a 7-transmembrane protein, sharing the high content of α-helical secondary structure typical of chemokine receptors, the characteristic rhodopsin-like signature is virtually absent. As a result, unlike classical chemokine receptors, DARC does not signal through G-proteins, so it is regarded as an atypical chemokine receptor. DARC was initially described on red blood cells, but subsequent studies have demonstrated DARC protein expression on renal endothelial and epithelial cells and in Purkinje cells of the cerebellum, even in Duffy-negative individuals whose red cells lack DARC. DARC is believed to play an important role in endothelial cells, since expression on these cell types is highly conserved, whereas the function on RBCs appears to be dispensable in order to confer resistance to malaria. There is evidence suggesting a role for DARC in neutrophil migration from the blood into the tissues and in modulating inflammatory response.
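The four-class scheme described above depends only on the spacing between the first two conserved cysteines, so it can be sketched as a toy classifier. This is a deliberately simplified illustration, not a validated method: the function name `classify_chemokine` is hypothetical, the input is assumed to be just the N-terminal region containing the first conserved cysteines, and any spacing other than 0, 1 or 3 residues is treated as a lone cysteine (class C).

```python
def classify_chemokine(n_terminal_region):
    """Classify a chemokine by the spacing of its first two cysteines.

    Returns "CC", "CXC", "CX3C", "C", or None when no cysteine is found.
    Simplifying assumption: only the first two Cys in the supplied region
    are considered, and unrecognised spacings fall back to class "C".
    """
    first = n_terminal_region.find("C")
    if first == -1:
        return None
    second = n_terminal_region.find("C", first + 1)
    # Number of residues between the two cysteines, if a second one exists.
    gap = second - first - 1 if second != -1 else None
    return {0: "CC", 1: "CXC", 3: "CX3C"}.get(gap, "C")


# Made-up peptide fragments, used only to exercise each branch.
for fragment in ("AACCLK", "AACQCLK", "AACQQQCLK", "AACQQQQQK"):
    print(fragment, classify_chemokine(fragment))
```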
Protein Domain
Name: Peptidase C45
Type: Family
Description: The peptidase C45 family includes acyl-coenzyme A:6-aminopenicillanic-acid-acyltransferases from fungi. The active site residue for members of this family and family T1 is C-terminal to the autolytic cleavage site. In Penicillium chrysogenum, acyl-coenzyme A:6-aminopenicillanic-acid-acyltransferase serves as the last enzyme in the penicillin biosynthetic pathway, converting isopenicillin N (IPN) to penicillin G, using phenyl-acetyl-CoA or phenoxyacetyl-CoA as acyl donors. The active mature form of this enzyme is formed by autoproteolysis of the immature precursor, which leads to the exposure of a flexible pocket that was previously buried. This entry also includes a number of uncharacterised proteins from bacteria, archaea, animals and plants.
A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family.
Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate.
Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases.
Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds.
Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One subdomain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other.
Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel. The active site consists of a His/Cys catalytic dyad.
Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: Peptidase A22A, presenilin
Type: Family
Description: This group of aspartic peptidases belong to MEROPS peptidase family A22 (presenilin family), subfamily A22A, the type example being presenilin 1 from Homo sapiens (Human).Presenilins are polytopic transmembrane (TM) proteins, mutations in which are associated with the occurrence of early-onset familial Alzheimer'sdisease, a rare form of the disease that results from a single-gene mutation [, ]. Alzheimer's disease is associated with the formation of extracellular deposits of amyloid, which contain aggregates of the amyloid-beta peptide. The β-peptides are released from the Alzheimer's amyloid precursor protein (APP) by the action of two peptidase activities: "beta-secretase"cleaves at the N terminus of the peptide, and "gamma-secretase"cleaves at the C terminus. The gamma-secretase cleavage occurs in a transmembrane segment of APP. Presenilin, which exists in a complex with nicastrin, APH-1 and PEN-2, has been identified as gamma-secretase from its deficiency [ ] and mutation of its active site residues [], but proteolytic activity has only been directly demonstrated on a peptide derived from APP [].Presenilin-1 is also known to process notch proteins [ ] and syndecan-3 [].Presenilin has nine transmembrane regions with the active site aspartic acid residues located on TM6, within a Tyr-Asp motif, and TM7, within a Gly-Xaa-Gly-Asp motif [ ]. The protein autoprocesses to form an amino-terminal fragment (TMs 1-6) and a C-terminal fragment (TMs 7-9) []. The tertiary structure of the human gamma-sectretase complex has been solved []. Nicastrin is extracellular, whereas presenilin-1, APH-1 and PEN-2 are all transmembrane proteins. The transmembrane regions of all three proteins form a horseshoe shape.Aspartic peptidases, also known as aspartyl proteases ([intenz:3.4.23.-]), are widely distributed proteolytic enzymes [, , ] known to exist in vertebrates, fungi, plants, protozoa, bacteria, archaea, retroviruses and some plant viruses. 
All known aspartic peptidases are endopeptidases. A water molecule, activated by two aspartic acid residues, acts as the nucleophile in catalysis. Aspartic peptidases can be grouped into five clans, each of which shows a unique structural fold. Peptidases in clan AA are either bilobed (family A1 or the pepsin family) or are a homodimer (all other families in the clan, including retropepsin from HIV-1/AIDS). Each lobe consists of a single domain with a closed β-barrel and each lobe contributes one Asp to form the active site. Most peptidases in the clan are inhibited by the naturally occurring small-molecule inhibitor pepstatin. Clan AC contains the single family A8: the signal peptidase 2 family. Members of the family are found in all bacteria. Signal peptidase 2 processes the premurein precursor, removing the signal peptide. The peptidase has four transmembrane domains and the active site is on the periplasmic side of the cell membrane. Cleavage occurs on the amino side of a cysteine where the thiol group has been substituted by a diacylglyceryl group. Site-directed mutagenesis has identified two essential aspartic acid residues which occur in the motifs GNXXDRX and FNXAD (where X is a hydrophobic residue). No tertiary structures have been solved for any member of the family, but because of the intramembrane location, the structure is assumed not to be pepsin-like. Clan AD contains two families of transmembrane endopeptidases: A22 and A24. These are also known as "GXGD peptidases" because of a common GXGD motif which includes one of the pair of catalytic aspartic acid residues. Structures are known for members of both families and show a unique, common fold with up to nine transmembrane regions. The active site aspartic acids are located within a large cavity in the membrane into which water can gain access. Clan AE contains two families, A25 and A31.
Tertiary structures have been solved for members of both families and show a common fold consisting of an α-β-α sandwich, in which the β-sheet is five-stranded. Clan AF contains the single family A26. Members of the clan are membrane proteins with a unique fold. Homologues are known only from bacteria. The structure of omptin (also known as OmpT) shows a cylindrical barrel containing ten β-strands inserted in the membrane with the active site residues on the outer surface. There are two families of aspartic peptidases for which neither structure nor active site residues are known and these are not assigned to clans. Family A5 includes thermopsin, an endopeptidase found only in thermophilic archaea. Family A36 contains sporulation factor SpoIIGA, which is known to process and activate sigma factor E, one of the transcription factors that controls sporulation in bacteria.
Protein Domain
Name: Peptidase C10, streptopain
Type: Family
Description: This group of cysteine peptidases belongs to MEROPS peptidase family C10 (streptopain family, clan CA). Streptopain is a cysteine protease found in Streptococcus pyogenes that shows some structural and functional similarity to papain (family C1). The order of the catalytic cysteine/histidine dyad is the same and the surrounding sequences are similar. The two proteins also show similar specificities, both preferring a hydrophobic residue at the P2 site. Streptopain shows a high degree of sequence similarity to the S. pyogenes exotoxin B, and strong similarity to the prtT gene product of Porphyromonas gingivalis (Bacteroides gingivalis), both of which have been included in the family. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD each contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases.
They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys.
This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: Signal recognition particle, SRP54 subunit
Type: Family
Description: This entry represents the SRP54 subunit of the signal recognition particle protein translocation system. The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome. In eukaryotes this retards its elongation until SRP docks the ribosome-polypeptide complex to the RER membrane via the SR receptor. Eukaryotic SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane. In archaea, the SRP complex contains 7S RNA like its eukaryotic counterpart, yet only includes two of the six protein subunits found in the eukaryotic complex: SRP19 and SRP54. The bacterial homologues of the SRP54 protein and SRP RNA are Ffh and 4.5S RNA. They comprise a minimal bacterial SRP that can target ribosome-nascent chain complexes to the plasma membrane via interaction with FtsY, the bacterial homologue of the SRP receptor.
Protein Domain
Name: Pyridoxal 5'-phosphate synthase subunit PdxS/SNZ
Type: Family
Description: The family of pyridoxal 5'-phosphate synthase subunits, known as the PdxS/SNZ family, occurs in organisms in four kingdoms and forms one of the most highly conserved families. A PdxS/SNZ protein has a classic (beta/alpha)8-barrel fold, consisting of eight parallel β-strands alternating with eight α-helices. PdxS subunits form two hexameric rings. These proteins are involved in vitamin B6 biosynthesis. The term vitamin B6 is used to refer collectively to the compound pyridoxine and its vitameric forms, pyridoxal, pyridoxamine, and their phosphorylated derivatives. Vitamin B6 is required by all organisms and plays an essential role as a co-factor for enzymatic reactions. Plants, fungi, bacteria, archaebacteria, and protists synthesise vitamin B6. Animals and some highly specialised obligate pathogens obtain it nutritionally. Vitamin B6 has two distinct biosynthetic pathways, which do not coexist in any organism. The pdxA/pdxJ pathway, which has been extensively characterised in Escherichia coli, is found in the gamma subdivision of the proteobacteria. A second pathway of vitamin B6 synthesis involving the pdxS/SNZ and pdxT/SNO protein families, which are completely unrelated in sequence to the pdxA/pdxJ proteins, is found in plants, fungi, protists, archaebacteria and most bacteria. PdxS/SNZ and pdxT/SNO proteins form a complex which serves as a glutamine amidotransferase to supply ammonia as a source of the ring nitrogen of vitamin B6. PdxT/SNO and pdxS/SNZ appear to encode respectively the glutaminase subunit, which produces ammonia from glutamine, and the synthase subunit, which combines ammonia with five- and three-carbon phosphosugars to form vitamin B6.
Protein Domain
Name: Signal peptidase complex subunit 2
Type: Family
Description: This family represents the Signal peptidase complex subunit 2 (SPCS2) and its homologues, such as Spc2 from budding yeasts. The signal peptidase complex cleaves the signal sequence from proteins targeted to the endoplasmic reticulum (ER). Mammalian signal peptidase is a complex of five different polypeptide chains, while the budding yeast SPC comprises four proteins. Budding yeast Spc2 has been shown to be a nonessential component of the signal peptidase complex. Spc2 has been shown to enhance the enzymatic activity of the SPC and facilitate the interactions between different components of the translocation site. Translocation of polypeptide chains across the endoplasmic reticulum (ER) membrane is triggered by signal sequences. Subsequently, signal recognition particle interacts with its membrane receptor and the ribosome-bound nascent chain is targeted to the ER where it is transferred into a protein-conducting channel. At some point, a second signal sequence recognition event takes place in the membrane and translocation of the nascent chain through the membrane occurs. The signal sequence of most secretory and membrane proteins is cleaved off at this stage. Cleavage occurs by the signal peptidase complex (SPC) as soon as the lumenal domain of the translocating polypeptide is large enough to expose its cleavage site to the enzyme. The signal peptidase complex is possibly also involved in proteolytic events in the ER membrane other than the processing of the signal sequence, for example the further digestion of the cleaved signal peptide or the degradation of membrane proteins.
Protein Domain
Name: ATP-dependent Clp protease proteolytic subunit
Type: Family
Description: Clp is an ATP-dependent protease that cleaves a number of proteins, such as casein and albumin, and is a member of peptidase family S14. It exists as a heterodimer of ATP-binding regulatory A and catalytic P subunits, both of which are required for effective levels of protease activity in the presence of ATP, although the P subunit alone does possess some catalytic activity. This entry represents the P subunit. Proteases highly similar to ClpP have been found to be encoded in the genome of bacteria, metazoa, some viruses and in the chloroplast of plants, but seem to be absent in archaea, mollicutes and some fungi. Clp proteases are involved in a number of cellular processes such as degradation of misfolded proteins, regulation of short-lived proteins and housekeeping removal of dysfunctional proteins. They are also implicated in the control of cell growth, targeting DNA-binding protein from starved cells. ClpP has also been linked to the tight regulation of virulence genes in the pathogens Listeria monocytogenes and Salmonella typhimurium. The active site consists of the triad Ser, His and Asp; some members have lost all of these active site residues and are therefore inactive, while others may have one or two large insertions. ClpP seems to prefer hydrophobic or non-polar residues at P1 or P1' positions in its substrate. The protease exists as a tetradecamer made up of two heptameric rings stacked back-to-back such that the catalytic triad of each subunit is located at the interface between three monomers, thus making oligomerization essential for function.
Protein Domain
Name: Polymerase/histidinol phosphatase, N-terminal
Type: Domain
Description: This domain is associated with the N terminus of members of the PHP superfamily, which includes: subunit of bacterial DNA polymerase III, eukaryotic DNA polymerase, X-family of DNA polymerases, histidinol phosphatases, and a number of uncharacterised protein families. Common to all PHP proteins is the presence of four conserved sequence motifs that contain invariant histidine and aspartate residues implicated in metal ion coordination. As part of DNA polymerases, the PHP domain was suggested to hydrolyse pyrophosphate and thereby shift the reaction equilibrium toward nucleotide polymerisation. However, it cannot be ruled out that the PHP domain possesses a nuclease activity, particularly in the repair polymerases of the X-family. No functional information is available for standalone proteins that belong to the PHP superfamily. The crystal structure of the YcdX protein from Escherichia coli has been determined to 1.6 Å resolution. YcdX has an unusual topology of an α7-β7 barrel compared with the more common α8-β8 (TIM) barrel. The C-terminal helix caps the barrel on the N-terminal side. The deep cleft at the C-terminal side of the barrel contains the three zinc binding residues. These residues are invariant in the YcdX family, confirming their functional importance. Only four proteins with known structures have a similar trinuclear zinc catalytic site. All four (nuclease P1, endonuclease IV, alkaline phosphatase, and phospholipase C) hydrolyse the phosphoester bond. This finding suggests a similar activity for YcdX. YcdX is among the genes significantly induced in response to DNA damage, indicating that members of the YcdX family may be involved in DNA repair.
Protein Domain
Name: Lipocalin family conserved site
Type: Conserved_site
Description: The lipocalins are a diverse, interesting, yet poorly understood family of proteins composed, in the main, of extracellular ligand-binding proteins displaying high specificity for small hydrophobic molecules. Functions of these proteins include transport of nutrients, control of cell regulation, pheromone transport, cryptic colouration, and the enzymatic synthesis of prostaglandins. For example, retinol-binding protein 4 transfers retinol from the stores in the liver to peripheral tissues. The crystal structures of several lipocalins have been solved and show a novel 8-stranded anti-parallel β-barrel fold well conserved within the family. Sequence similarity within the family is at a much lower level and would seem to be restricted to conserved disulphides and 3 motifs, which form a juxtaposed cluster that may act as a common cell surface receptor site. By contrast, at the more variable end of the fold are found an internal ligand binding site and a putative surface for the formation of macromolecular complexes. The anti-parallel β-barrel fold is also exploited by the fatty acid-binding proteins, which function similarly by binding small hydrophobic molecules. Similarity at the sequence level, however, is less obvious, being confined to a single short N-terminal motif. This entry represents the Lipocalin conserved site. The sequences of most members of the family, the core or kernel lipocalins, are characterised by three short conserved stretches of residues. Others, the outlier lipocalin group, share only one or two of these. This signature pattern was built around the first, common to all outlier and kernel lipocalins, which occurs near the start of the first β-strand.
Protein Domain
Name: Transthyretin/hydroxyisourate hydrolase
Type: Family
Description: This family includes transthyretin, a thyroid hormone-binding protein that transports thyroxine from the bloodstream to the brain. However, most of the sequences listed in this family do not bind thyroid hormones. They are actually enzymes of purine catabolism that catalyse the conversion of 5-hydroxyisourate (HIU) to OHCU. HIU hydrolysis is the original function of the family and is conserved from bacteria to mammals; transthyretins arose by gene duplications in the vertebrate lineage. HIUases are distinguished in the alignment by the conserved C-terminal YRGS sequence. Transthyretin (formerly prealbumin) is one of 3 thyroid hormone-binding proteins found in the blood of vertebrates. It is produced in the liver and circulates in the bloodstream, where it binds retinol and thyroxine (T4). It differs from the other 2 hormone-binding proteins (T4-binding globulin and albumin) in 3 distinct ways: (1) the gene is expressed at a high rate in the brain choroid plexus; (2) it is enriched in cerebrospinal fluid; and (3) no genetically caused absence has been observed, suggesting an essential role in brain function, distinct from that played in the bloodstream. The protein consists of around 130 amino acids, which assemble as a homotetramer that contains an internal channel in which T4 is bound. Within this complex, T4 appears to be transported across the blood-brain barrier, where, in the choroid plexus, the hormone stimulates further synthesis of transthyretin. The protein then diffuses back into the bloodstream, where it binds T4 for transport back to the brain.
Protein Domain
Name: Transthyretin/hydroxyisourate hydrolase domain
Type: Domain
Description: This family includes transthyretin, a thyroid hormone-binding protein that transports thyroxine from the bloodstream to the brain. However, most of the sequences listed in this family do not bind thyroid hormones. They are actually enzymes of purine catabolism that catalyse the conversion of 5-hydroxyisourate (HIU) to OHCU. HIU hydrolysis is the original function of the family and is conserved from bacteria to mammals; transthyretins arose by gene duplications in the vertebrate lineage. HIUases are distinguished in the alignment by the conserved C-terminal YRGS sequence. Transthyretin (formerly prealbumin) is one of 3 thyroid hormone-binding proteins found in the blood of vertebrates. It is produced in the liver and circulates in the bloodstream, where it binds retinol and thyroxine (T4). It differs from the other 2 hormone-binding proteins (T4-binding globulin and albumin) in 3 distinct ways: (1) the gene is expressed at a high rate in the brain choroid plexus; (2) it is enriched in cerebrospinal fluid; and (3) no genetically caused absence has been observed, suggesting an essential role in brain function, distinct from that played in the bloodstream. The protein consists of around 130 amino acids, which assemble as a homotetramer that contains an internal channel in which T4 is bound. Within this complex, T4 appears to be transported across the blood-brain barrier, where, in the choroid plexus, the hormone stimulates further synthesis of transthyretin. The protein then diffuses back into the bloodstream, where it binds T4 for transport back to the brain.
Protein Domain
Name: Major facilitator superfamily
Type: Family
Description: Among the different families of transporters, only two occur ubiquitously in all classifications of organisms. These are the ATP-Binding Cassette (ABC) superfamily and the Major Facilitator Superfamily (MFS). The MFS transporters are single-polypeptide secondary carriers capable only of transporting small solutes in response to chemiosmotic ion gradients. The major facilitator superfamily (MFS) of membrane proteins represents the largest family of secondary transporters with members from Archaea to Homo sapiens. MFS proteins target a wide spectrum of substrates, including ions, carbohydrates, lipids, amino acids and peptides, nucleosides and other small molecules in both directions across the membrane, in many instances catalysing active transport by transducing the energy stored in a proton electrochemical gradient into a concentration gradient of substrate. One remarkable characteristic of the MFS is the high sequence variety within the superfamily. Sequence identity is around 12-18%, but regions of functional similarity (e.g., substrate- or H+-binding sites) align for only very closely related MFS transporters. In most MFS members, a hydrophobic amino acid content of 60-70%, high α-helix content and an inherent symmetry of the proteins with regard to helix kinks and bends provide nonspecific overlapping of residues and probably account for the reported similarities. Structures from representative members show 12 transmembrane segments (TMSs) surrounding a central cavity, forming a semi-symmetrical structure. MFS includes 105 families based on phylogenetic analysis, sequence alignments, overlap of hydropathy plots, compatibility of repeat units, similarity of complexity profiles of transmembrane segments, shared protein domains and 3D structural similarities between transport proteins.
Protein Domain
Name: NT-type C2 domain
Type: Domain
Description: The C2 domain is one of the most prevalent eukaryotic lipid-binding domains deployed in diverse functional contexts. Many C2 domains bind directly to membrane lipids and display a wide range of lipid selectivity, with preference for anionic phosphatidylserine (PS) and phosphatidylinositol-phosphates (PIPs). Despite their limited sequence similarity, all C2 domains contain at their core a compact β-sandwich composed of two four-stranded β-sheets with highly variable inter-strand regions that might contain one or more α-helices. The NT-type C2 domain shows a diverse range of domain architectures but it is nearly always found at the N-termini of proteins that contain it. Hence, it has been named the N-terminal C2 (NT-C2) family. It is typically coupled with a coiled-coil domain, which could mediate dimerization or oligomerization, and the DIL (Dilute) domain. It is also coupled with the Calponin homology (CH) domain in EHBP1 proteins, Filamin/ABP280 repeats and Mg2+ transporter MgtE N-terminal domain in proteins from chlorophyte algae such as Micromonas and Ostreococcus tauri. Thus, a common theme across the NT-type C2 domain proteins is the combination with several different domains with microfilament-binding or actin-related roles (such as CH, DIL, and Filamin). Other conserved groups of the NT-type C2 proteins prototyped by EEIG1, PMI1, and SYNC1 have their own distinct C-terminal conserved extensions that are restricted to these groups and might mediate specific interactions. The primary function of the NT-type C2 domain appears to be the linking of actin/microfilament-binding adaptors to the membrane and to act as a link that tethers endosomal vesicles to the cytoskeleton in the course of their intracellular trafficking.
Protein Domain
Name: Glutaredoxin
Type: Domain
Description: Glutaredoxins, also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system. Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin (TRX), which functions in a similar way, glutaredoxin possesses an active centre disulphide bond. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulphide bond. It contains a redox active CXXC motif in a TRX fold and uses a similar dithiol mechanism employed by TRXs for intramolecular disulfide bond reduction of protein substrates. Unlike TRX, GRX has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern T4 thioredoxin has Val instead of Pro. This entry represents Glutaredoxin.
Protein Domain
Name: Activator of Hsp90 ATPase AHSA1-like, N-terminal
Type: Domain
Description: This entry includes a group of heat shock protein interacting proteins, including AHSA1/2 from animals and Aha1/Hch1 from budding yeasts, and it represents a domain found at the N terminus of Aha1 and AHSA1/2, while in Hch1 it is the only domain. Aha1 adopts a secondary structure consisting of an N-terminal α-helix leading into a four-stranded meandering antiparallel β-sheet, followed by a C-terminal α-helix. The two α-helices are packed together, with the β-sheet curving around them. The N-terminal domain of Aha1 interacts with the central segment of Hsp90, which induces the conformational rearrangements of Hsp90 that favor the N-terminal domain-dimerized state of the chaperone and leads to the stimulation of its ATPase activity. Activator of 90kDa heat shock protein ATPase Aha1/AHSA1 (AHSA1/p38) is known to interact with the middle domain of Hsp90, and stimulate its ATPase activity, where one Aha1/AHSA1 molecule per Hsp90 dimer is sufficient for this stimulation. It is probably a general upregulator of Hsp90 function, particularly contributing to its efficiency in conditions of increased stress. It is also known to interact with the cytoplasmic domain of the VSV G protein, and may thus be involved in protein transport. It has also been reported as being underexpressed in Down's syndrome. In budding yeasts, both Hch1 and Aha1 bind to the middle domain of Hsp90 and stimulate ATPase activity. However, Aha1 but not Hch1 stimulated the intrinsic ATPase activity of Hsp90 5-fold. Hch1 and Aha1 may regulate Hsp90 function in distinct ways.
Protein Domain
Name: Zinc finger, H2C2-type, histone UAS binding
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents an H2C2-type zinc finger that binds to histone upstream activating sequence (UAS) elements found in histone gene promoters.
Protein Domain
Name: Rab27a/b
Type: Family
Description: This entry includes Rab27a and its highly homologous isoform, Rab27b. Unlike most Rab proteins whose functions remain poorly defined, Rab27a has many known functions. Rab27a has multiple effector proteins, and depending on which effector it binds, Rab27a has different functions as well as tissue distribution and/or cellular localization. Putative functions have been assigned to Rab27a when associated with the effector proteins Slp1, Slp2, Slp3, Slp4, Slp5, DmSlp, rabphilin, Dm/Ce-rabphilin, Slac2-a, Slac2-b, Slac2-c, Noc2, JFC1, and Munc13-4. Rab27a has been associated with several human diseases, including hemophagocytic syndrome (Griscelli syndrome or GS), Hermansky-Pudlak syndrome, and choroideremia. In the case of GS, a rare, autosomal recessive disease, a Rab27a mutation is directly responsible for the disorder. When Rab27a is localized to the secretory granules of pancreatic beta cells, it is believed to mediate glucose-stimulated insulin secretion, making it a potential target for diabetes therapy. When bound to JFC1 in prostate cells, Rab27a is believed to regulate the exocytosis of prostate-specific markers. Rabs are regulated by GTPase activating proteins (GAPs), which interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins.
Protein Domain
Name: Frizzled-4, transmembrane domain
Type: Domain
Description: Frizzleds are seven transmembrane-spanning proteins that constitute an unconventional class of G protein-coupled receptors [ ]. They have important regulatory roles during embryonic development [, ]. Frizzleds expose their large N terminus on the extracellular side. The N-terminal, extracellular cysteine-rich domain (CRD) has been implicated as the Wnt binding domain and its structure has been solved [ ]. The cysteine-rich domain of Frizzled (Fz) is shared with several receptor tyrosine kinases that have roles in development, including the muscle-specific receptor tyrosine kinase (MuSK), the neuronal specific kinase (NSK2), and ROR1 and ROR2. The cytoplasmic side of many Fz proteins has been shown to interact with the PDZ domains of PSD-95 family members and is thought to have a role in the assembly of signalling complexes. The conserved cytoplasmic motif of Fz, Lys-Thr-X-X-X-Trp, is required for activation of the beta-catenin pathway, and for membrane localisation and phosphorylation of Dsh. Three main signalling pathways are activated by agonist-activated Frizzled proteins: the Fz/beta-catenin pathway, the Fz/Ca2+ pathway and the Fz/PCP (planar cell polarity) pathway [ ]. The Wnt/beta-catenin pathway is the best studied signalling pathway involving Fz receptors. In the Wnt/beta-catenin pathway, the first downstream cytoplasmic components activated by Fz signalling include Dishevelled (Dsh) and/or its regulatory kinases. This entry represents the C-terminal transmembrane domain of Frizzled-4. Together with Frizzled-8, this protein has a major role in controlling ureteric growth in the developing kidney [ ]. In humans, mutations in the gene encoding Frizzled-4 are related to familial exudative vitreoretinopathy, a disorder characterized by the incomplete development of the retinal vasculature [].
Protein Domain
Name: Chorismate mutase type II superfamily
Type: Homologous_superfamily
Description: Chorismate mutase (CM) is a regulatory enzyme ( ) required for biosynthesis of the aromatic amino acids phenylalanine and tyrosine. CM catalyzes the Claisen rearrangement of chorismate to prephenate, which can subsequently be converted to precursors of either L-Phe or L-Tyr. In bifunctional enzymes, the CM domain can be fused to a prephenate dehydratase (P-protein, for Phe biosynthesis), to a prephenate dehydrogenase (T-protein, for Tyr biosynthesis), or to 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase. Besides these prokaryotic bifunctional enzymes, monofunctional CMs occur in prokaryotes as well as in fungi, plants and nematode worms []. The sequence of monofunctional chorismate mutase aligns well with the N-terminal part of P-proteins []. The type II or AroQ class of CM has an all-helical 3D structure, represented by the CM domain of the bifunctional Escherichia coli P-protein. This type is named after the Enterobacter agglomerans monofunctional CM encoded by the aroQ gene []. All CM domains from bifunctional enzymes, as well as most monofunctional CMs, belong to this class, including archaeal CM. Eukaryotic CMs from plants and fungi form a separate subclass of AroQ, represented by the Baker's yeast allosteric CM. These enzymes show only partial sequence similarity to the prokaryotic CMs due to insertions of regulatory domains, but the helix-bundle topology and catalytic residues are conserved, and the 3D structure of the E. coli CM dimer resembles a yeast CM monomer [, , ]. The E. coli P-protein CM domain consists of 3 helices and lacks allosteric regulation. The yeast CM has evolved by gene duplication and dimerization, and each monomer has 12 helices. Yeast CM is allosterically activated by Trp and inhibited by Tyr [ ].
Protein Domain
Name: Muscarinic acetylcholine receptor M5
Type: Family
Description: Muscarinic acetylcholine receptors are members of the rhodopsin-like G-protein coupled receptor family. They play several important roles: they mediate many of the effects of acetylcholine in the central and peripheral nervous system and modulate a variety of physiological functions, such as airway, eye and intestinal smooth muscle contraction, heart rate and glandular secretions. The receptors have a widespread tissue distribution and are a major drug target in human disease. They may be effective therapeutic targets in Alzheimer's disease, schizophrenia, Parkinson's disease and chronic obstructive pulmonary disease [ , ]. There are five muscarinic acetylcholine receptor subtypes, designated M1-5 [ , , , , ]. The family can be further divided into two broad groups based on their primary coupling to G-proteins. M2 and M4 receptors couple to the pertussis-toxin sensitive Gi proteins, whereas M1, M3 and M5 receptors couple to Gq proteins [, ], which activate phospholipase C. The different subtypes can also couple to a wide range of diverse signalling pathways, some of which are G protein-independent [, , ]. All subtypes seem to serve as autoreceptors [ ], and knockout mice reveal the important neuromodulatory role played by this receptor family [, , ]. The muscarinic acetylcholine M5 receptor is primarily found in the CNS [ , , ], but is also found in esophageal smooth muscle [] and in the heart []. Binding of acetylcholine to the receptor triggers a number of cellular responses, such as adenylate cyclase inhibition [], phosphoinositide degradation [], and potassium channel modulation []. The receptor has been shown to stimulate gastric acid secretion [].
Protein Domain
Name: Peptidase M12A
Type: Domain
Description: This group of metallopeptidases belongs to the MEROPS peptidase family M12, subfamily M12A (astacin family, clan MA(M)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA, and the predicted active site residues for members of this family and thermolysin occur in the motif HEXXH [ ]. The astacin ( ) family of metalloendopeptidases encompasses a range of proteins found in hydra to humans, in mature and developmental systems [ , ]. Their functions include activation of growth factors, degradation of polypeptides, and processing of extracellular proteins []. The proteins are synthesised with N-terminal signal and pro-enzyme sequences, and many contain multiple domains C-terminal to the protease domain. They are either secreted from cells, or are associated with the plasma membrane []. The astacin molecule adopts a kidney shape, with a deep active-site cleft between its N- and C-terminal domains [ ]. The zinc ion, which lies at the bottom of the cleft, exhibits a unique penta-coordinated mode of binding, involving 3 histidine residues, a tyrosine and a water molecule (which is also bound to the carboxylate side chain of Glu93) []. The N-terminal domain comprises 2 α-helices and a 5-stranded β-sheet. The overall topology of this domain is shared by the archetypal zinc-endopeptidase thermolysin. Astacin protease domains also share common features with serralysins, matrix metalloendopeptidases, and snake venom proteases; they cleave peptide bonds in polypeptides such as insulin B chain and bradykinin, and in proteins such as casein and gelatin; and they have arylamidase activity [].
Protein Domain
Name: R3H domain, Cip2-type
Type: Domain
Description: This R3H domain is found in fungal proteins that are associated with an RNA recognition motif (RRM) domain. Present in this group is the RNA-binding post-transcriptional regulator Cip2 (Csx1-interacting protein 2), involved in counteracting Csx1 function [ ]. Csx1 plays a central role in controlling gene expression during oxidative stress. The R3H domain is a conserved sequence motif found in proteins from a diverse range of organisms including eubacteria, green plants, fungi and various groups of metazoans, but not in archaea or Escherichia coli. The domain is named R3H because it contains an invariant arginine and a highly conserved histidine that are separated by three residues. It also displays a conserved pattern of hydrophobic residues, prolines and glycines. It can be found alone, in association with an AAA domain, or with various DNA/RNA binding domains like DSRM, KH, G-patch, PHD, DEAD box, or RRM. The functions of these domains indicate that the R3H domain might be involved in polynucleotide binding, including DNA, RNA and single-stranded DNA [ ]. The 3D structure of the R3H domain has been solved. The fold presents a small motif, consisting of a three-stranded antiparallel β-sheet, against which two α-helices pack from one side. This fold is related to the structures of the YhhP protein and the C-terminal domain of the translational initiation factor IF3. Three conserved basic residues cluster on the same face of the R3H domain and could play a role in nucleic acid recognition. An extended hydrophobic area at a different site of the molecular surface could act as a protein-binding site [ ].
Protein Domain
Name: Zinc finger, CHHC-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [ , , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents a putative zinc-binding domain (CHHC motif) in RNP H and F. The domain is often associated with .
Protein Domain
Name: B9-type C2 domain
Type: Family
Description: The C2 domain is one of the most prevalent eukaryotic lipid-binding domains deployed in diverse functional contexts. Distinct versions of the C2 domain have been recognized: the classical C2, the PI3K-type, the tensin-type, the B9-type, the DOCK-type, the NT-type and the Aida-type. Despite their limited sequence similarity, all C2 domains contain at their core a compact β-sandwich composed of two four-stranded β-sheets with highly variable inter-strand regions that might contain one or more α-helices. One feature that is highly conserved in the C2 domains is the pair of hydrophobic residues on the upper part of the β-sheet, which are involved in imparting a curvature of the sheet that allows formation of a concave ligand-binding area [ ]. This entry represents a family of B9-type C2 domain containing proteins, found in ciliary basal body associated proteins. Although its specific function is unknown, a cilia-specific role has been suggested for the poorly characterised B9-type C2 domain [ , , ]. Some proteins known to contain a B9-type C2 domain are listed below: (1) Mammalian Tectonic-like complex member MKS1 (also known as Meckel syndrome 1, MKS1). The tectonic-like complex is localised at the transition zone of primary cilia and acts as a barrier that prevents diffusion of transmembrane proteins between the cilia and plasma membranes. It is involved in centrosome migration to the apical cell surface during early ciliogenesis [ , , ]. The homologue in Drosophila melanogaster is required for ciliary structure and function [, ]. (2) Mammalian proteins B9D1 and B9D2. B9D1 is required for ciliogenesis and sonic hedgehog/SHH signaling [ , ].
Protein Domain
Name: Transcription factor p65, RHD domain, N-terminal
Type: Domain
Description: NF-kappaB is a pleiotropic transcription factor present in almost all cell types. It is the endpoint of a series of signal transduction events that are initiated by a vast array of stimuli related to many biological processes such as inflammation, immunity, differentiation, cell growth, tumorigenesis and apoptosis. NF-kappaB is a homo- or heterodimeric complex formed by the Rel-like domain-containing proteins RelA/p65, RelB, NFKB1/p50, c-Rel and NFKB2/p52 [ ]. Each individual NF-kappaB subunit, and perhaps each dimer, carries out unique functions in regulating transcription. Dimer-specific functions can be conferred by selective protein-protein interactions with other transcription factors, coregulatory proteins, and chromatin proteins []. The prototypical NF-kB complex is a p50/RelA heterodimer. NF-kB is largely sequestered in the cytoplasm through its association with an IkB inhibitor [ ]. Cytoplasmic events culminating in the phosphorylation of IkB-alpha lead to its polyubiquitylation and proteasome-mediated degradation. The liberated NF-kB complex translocates to the nucleus. In the nucleus, site-specific acetylation and phosphorylation of RelA regulate the actions of the NF-kB complex [, ]. This entry represents the N-terminal sub-domain of the Rel homology domain (RHD) of Transcription factor p65 (also known as RelA). The p65/p50 heterodimer is the most commonly found complex among NF-kappaB homodimers and heterodimers, and is the functional component participating in nuclear translocation and activation of NF-kappaB [ ]. This complex tightly regulates the expression of a large number of genes, and is involved in processes such as adaptive and innate immunity, stress response, inflammation, cell adhesion, proliferation and apoptosis [, , , ].
Protein Domain
Name: TRIM22, PRY/SPRY domain
Type: Domain
Description: This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C terminus of TRIM22, also known as RING finger protein 94 (RNF94) or STAF50 (Stimulated trans-acting factor of 50 kDa). TRIM proteins are defined by the presence of the tripartite motif RING/B-box/coiled-coil region and are also known as RBCC proteins [ ]. TRIM22 is an interferon-induced protein, predominantly expressed in peripheral blood leukocytes, in lymphoid tissue such as spleen and thymus, and in the ovary [, ]. TRIM22 plays an integral role in the host innate immune response to viruses; it has been shown to inhibit the replication of a number of viruses, including HIV-1, hepatitis B, and influenza A. TRIM22 inhibits influenza A virus (IAV) infection by targeting the viral nucleoprotein for degradation; it represents a novel restriction factor up-regulated upon IAV infection that curtails its replicative capacity in epithelial cells []. Altered TRIM22 expression has also been associated with multiple sclerosis, cancer, and autoimmune disease. A large number of high-risk non-synonymous (ns)SNPs have been identified in the highly polymorphic TRIM22 gene, most of which are located in the SPRY domain and could alter critical structural and functional residues of the SPRY domain, including several sites that undergo post-translational modification []. TRIM22 is a direct p53 target gene and inhibits the clonogenic growth of leukemic cells []. Its expression in Wilms tumors is negatively associated with disease relapse. It is greatly under-expressed in breast cancer cells as compared to non-malignant cell lines; p53 dysfunction may be one of the mechanisms for its down-regulation [].
Protein Domain
Name: Bacterial exotoxin B
Type: Family
Description: A large group of bacterial exotoxins are referred to as "A/B toxins", essentially because they are formed from two subunits []. The "A" subunit possesses enzyme activity, and is transferred to the host cell following a conformational change in the membrane-bound transport "B" subunit. Clostridial species are one of the major causes of food poisoning/gastro-intestinal illnesses. They are Gram-positive, spore-forming rods that occur naturally in the soil []. Among the toxins produced by certain Clostridium spp. are the binary exotoxins. These proteins consist of two independent polypeptides, which correspond to the A/B subunit moieties. The enzyme component (A) enters the cell through endosomes produced by the oligomeric binding/translocation protein (B), and prevents actin polymerisation through ADP-ribosylation of monomeric G-actin [, , ]. Members of the "B" binary toxin family also include the Bacillus anthracis protective antigen (PA) protein [ ], most likely due to a common evolutionary ancestor. B. anthracis, a large Gram-positive spore-forming rod, is the causative agent of anthrax. Its two virulence factors are the poly-D-glutamate polypeptide capsule, and the actual anthrax exotoxin []. The toxin comprises three factors: the protective antigen (PA); the oedema factor (EF); and the lethal factor (LF). Each is a thermolabile protein of ~80 kDa. PA forms the "B" part of the exotoxin and allows passage of the "A" moiety (consisting of EF and LF) into target cells. PA protein forms the central part of the complete anthrax toxin, and translocates the A moiety into host cells after assembling as a heptamer in the membrane [ , ].
Protein Domain
Name: Protective antigen, Ca-binding domain
Type: Domain
Description: This domain is a calcium-binding domain in the anthrax toxin protective antigen (PA). A large group of bacterial exotoxins are referred to as "A/B toxins", essentially because they are formed from two subunits []. The "A" subunit possesses enzyme activity, and is transferred to the host cell following a conformational change in the membrane-bound transport "B" subunit. Clostridial species are one of the major causes of food poisoning/gastro-intestinal illnesses. They are Gram-positive, spore-forming rods that occur naturally in the soil []. Among the toxins produced by certain Clostridium spp. are the binary exotoxins. These proteins consist of two independent polypeptides, which correspond to the A/B subunit moieties. The enzyme component (A) enters the cell through endosomes produced by the oligomeric binding/translocation protein (B), and prevents actin polymerisation through ADP-ribosylation of monomeric G-actin [ , , ]. Members of the "B" binary toxin family also include the Bacillus anthracis protective antigen (PA) protein [ ], most likely due to a common evolutionary ancestor. B. anthracis, a large Gram-positive spore-forming rod, is the causative agent of anthrax. Its two virulence factors are the poly-D-glutamate polypeptide capsule, and the actual anthrax exotoxin []. The toxin comprises three factors: the protective antigen (PA); the oedema factor (EF); and the lethal factor (LF). Each is a thermolabile protein of ~80 kDa. PA forms the "B" part of the exotoxin and allows passage of the "A" moiety (consisting of EF and LF) into target cells. PA protein forms the central part of the complete anthrax toxin, and translocates the A moiety into host cells after assembling as a heptamer in the membrane [ , ].
Protein Domain
Name: Saposin, chordata
Type: Family
Description: Sphingolipids are bioactive compounds found in lower and higher eukaryotes. They are involved in the regulation of various cellular functions, such as growth, differentiation and apoptosis, and are believed to be essential in a healthy diet. Sphingolipids are degraded in the lysosome, and the products from their hydrolysis are used in other biosynthetic and regulatory pathways in the host. There are a number of lysosomal enzymes involved in the breakdown of sphingolipids, and these act in sequence to degrade the moieties [ ]. These enzymes require co-proteins called sphingolipid activator proteins (SAPs or saposins) to stabilise and activate them as necessary. SAPs are non-enzymatic and usually have a low molecular weight. They are conserved across a wide range of eukaryotes and contain specific saposin domains that aid in the activation of hydrolase enzymes. Four human saposins have been described so far, sharing significant similarity with each other and with other eukaryotic SAP proteins. Mutations in SAP genes have been linked to a number of conditions. A defect in the saposin B region leads to metachromatic leucodystrophy (MLD), while a single nucleotide polymorphism in the SAP-C region may give rise to Gaucher disease [ ]. More recently, an opportunistic protozoan parasite protein has shown similarity both to the higher and lower eukaryotic saposins. The pore-forming protein isolated from virulent Naegleria fowleri (brain-eating amoeba) has been dubbed Naegleriapore A. It also shares structural similarity with cytolytic bacterial peptides, although this similarity does not extend to the sequence level. This entry represents a group of saposins found specifically in chordates.
Protein Domain
Name: Myozenin
Type: Family
Description: This family consists of several mammalian calcineurin-binding proteins. Calcineurin is a Ca2+/calmodulin-dependent serine-threonine phosphatase and has been implicated in the transduction of signals that control the hypertrophy of cardiac muscle and slow fibre gene expression in striated muscle. A novel family of striated muscle-specific calcineurin-interacting proteins called calsarcins or myozenins has been identified; these interact and co-localize with the Z-disc protein alpha-actinin, thereby coupling muscle activity to calcineurin activation [, ]. Because calcineurin responds to sustained, low amplitude calcium signals, calsarcins may serve to localize calcineurin in the vicinity of a unique intracellular calcium pool, where it can interact with specific upstream activators or downstream substrates. Therefore, calsarcins may play an important role in modulating the function and substrate specificity of calcineurin in striated muscle cells. Three isoforms of calsarcins have been identified in human, rat and mouse: Calsarcin-1 (CALS-1, Myozenin-2, MYOZ2); Calsarcin-2 (CALS-2, Myozenin-1, FATZ); and Calsarcin-3 (Myozenin-3, MYOZ3). Calsarcin-1 is expressed throughout development in all striated muscle tissues. However, CALS-1 expression is localized in slow-twitch fibres. Calsarcin-2 has approximately 30% identity with CALS-1 and is a globular protein with a central glycine-rich domain flanked by α-helical regions. CALS-2 is expressed transiently in heart during early embryogenesis and later becomes restricted to skeletal muscle, with weaker signals in adult prostate, placenta and pancreas. In contrast to CALS-1, the expression of Calsarcin-2 is restricted to fast-twitch skeletal fibres. Calsarcin-3 is expressed specifically in skeletal muscle and is enriched in fast-twitch muscle fibres. Like calsarcin-1 and calsarcin-2, calsarcin-3 interacts with calcineurin and the Z-disc proteins alpha-actinin, gamma-filamin, and telethonin [ ].
Protein Domain
Name: COP9 signalosome subunit 6
Type: Family
Description: The COP9 signalosome (CSN) is a conserved protein complex that regulates the ubiquitin (Ubl) conjugation pathway by mediating the deneddylation of the cullin subunits of SCF-type E3 ligase complexes [ ], which leads to a decrease in Ubl ligase activity of SCF-type complexes such as SCF, CSA or DDB2 []. Protein kinases CK2 and D, which phosphorylate proteins such as cJun and p53 resulting in their degradation by the ubiquitin-26S proteasome system, also bind to CSN [, ]. The mammalian CSN typically consists of eight subunits designated CSN1-CSN8. The fission yeast possesses a smaller version of the CSN, consisting only of six subunits, whereas a more distant CSN-like complex has been described in Saccharomyces cerevisiae []. CSN6 (COP9 signalosome subunit 6; COP9 subunit 6; MOV34 homolog, 34 kD; MEROPS identifier M67.972) is one of the eight subunits of the COP9 signalosome. CSN6 is an MPN-domain protein that directly interacts with the MPN+-domain subunit CSN5 [ ]. It is cleaved during apoptosis by activated caspases. CSN6 processing occurs in CSN/CRL (cullin-RING Ub ligase) complexes and is followed by the cleavage of Rbx1, the direct interaction partner of CSN6 []. CSN6 cleavage enhances CSN-mediated deneddylating activity (i.e. cleavage of the ubiquitin-like protein Nedd8 (neural precursor cell expressed, developmentally downregulated 8)) on cullin 1 in cells []. The cleavage of Rbx1 and increased deneddylation of cullins inactivate CRLs and presumably stabilize pro-apoptotic factors for final apoptotic steps. While CSN6 shows a typical MPN metalloprotease fold, it lacks the canonical JAMM motif, and therefore does not show catalytic isopeptidase activity.
Protein Domain
Name: PdxS/SNZ N-terminal domain
Type: Domain
Description: The family of pyridoxal 5'-phosphate synthase subunits, known as the PdxS/SNZ family, occurs in organisms in four kingdoms and is one of the most highly conserved protein families [ ]. A PdxS/SNZ protein has a classic (beta/alpha)8-barrel fold, consisting of eight parallel β-strands alternating with eight α-helices. PdxS subunits form two hexameric rings []. These proteins are involved in vitamin B6 biosynthesis. The term vitamin B6 is used to refer collectively to the compound pyridoxine and its vitameric forms, pyridoxal, pyridoxamine, and their phosphorylated derivatives. Vitamin B6 is required by all organisms and plays an essential role as a co-factor for enzymatic reactions. Plants, fungi, bacteria, archaebacteria, and protists synthesize vitamin B6. Animals and some highly specialised obligate pathogens obtain it nutritionally. Vitamin B6 has two distinct biosynthetic pathways, which do not coexist in any organism. The pdxA/pdxJ pathway, which has been extensively characterised in Escherichia coli, is found in the gamma subdivision of the proteobacteria. A second pathway of vitamin B6 synthesis, involving the pdxS/SNZ and pdxT/SNO protein families, which are completely unrelated in sequence to the pdxA/pdxJ proteins, is found in plants, fungi, protists, archaebacteria and most bacteria [, , ]. PdxS/SNZ and pdxT/SNO proteins form a complex which serves as a glutamine amidotransferase to supply ammonia as a source of the ring nitrogen of vitamin B6 [ ]. PdxT/SNO and pdxS/SNZ appear to encode, respectively, the glutaminase subunit, which produces ammonia from glutamine, and the synthase subunit, which combines ammonia with five- and three-carbon phosphosugars to form vitamin B6 []. This entry represents the N-terminal domain of the pyridoxal 5'-phosphate synthase subunit PdxS/SNZ.
Protein Domain
Name: Phospholipase A2 inhibitor, alpha/gamma type
Type: Family
Description: This entry represents alpha- and gamma-type phospholipase A2 inhibitors (PLI) found in a variety of snakes, including Elapidae and Viperidae. Phospholipase A2 (PLA2; ) is a calcium-dependent enzyme that is involved in inflammatory processes such as the liberation of free arachidonic acid from the membrane pool for the biosynthesis of eicosanoids. Both Elapidae and Viperidae contain PLA2 enzymes in their venoms [ , ], which can exhibit a wide variety of pharmacological effects including neurotoxicity and myotoxicity. As a result, these snakes must contain PLI in their blood in order to protect themselves from leakage of their own venom PLA2s into the circulatory system. Venomous snakes have three distinct types of PLA2-inhibitory proteins: PLI-alpha, PLI-beta, and PLI-gamma. Alpha-type PLI (PLI-alpha) proteins have been found in a number of Viperidae snakes [ ]. Most PLI-alpha proteins are homomultimers composed of 3-5 subunits, except in Trimeresurus flavoviridis (Habu), where PLI-alpha consists of a trimer of two homologous subunits (PLI-alpha-A and PLI-alpha-B), each of which contains one C-type lectin-like domain and exhibits significant homology to serum mannose-binding protein and lung-surfactant apoprotein []. A PLI-alpha homologue that lacks inhibitory activity was found in the non-venomous snake Elaphe quadrivirgata (Japanese four-lined ratsnake) []. Phospholipase A2 inhibitor CNF, a gamma-type PLI, inhibits the PLA2 activity of crotoxin (CTX) by replacing the acid subunit (CA) in the CTX complex [ ]. It has a proinflammatory action in vitro through activation of major signalling pathways in human leukocytes []. In mouse phrenic nerve-diaphragm muscle preparations, it abolishes both the muscle-paralyzing and muscle-damaging activities of CTX [].
Protein Domain
Name: Deoxyribonuclease I
Type: Family
Description: Deoxyribonuclease I (DNase I) ( ) [ ] is a vertebrate enzyme which catalyzes the endonucleolytic cleavage of double-stranded DNA to 5'-phosphodinucleotide and 5'-phosphooligonucleotide end-products. DNase I is an enzyme involved in DNA degradation; it is normally secreted outside of the cell but seems to be able to gain access to the nucleus, where it is involved in cell death by apoptosis []. DNase I is a glycoprotein of about 260 residues with two conserved disulphide bonds, as shown in this schematic representation of the sequence: xxxxxxxxxxxxxxxxx#xxxxxxCxCxxxxx#xxxxxxxxxCxxxxxxxxCxxxxxxxxxxxxx ('C': conserved cysteine involved in a disulphide bond; '#': active site residue). The first and second conserved cysteines form one disulphide bond, and the third and fourth form the other. DNase I has a pH optimum around 7.5 and requires calcium and magnesium for full activity. It causes single strand nicks in duplex DNA. A proton acceptor-donor chain composed of a histidine and a glutamic acid produces a nucleophilic hydroxyl ion from water, which cleaves the 3'-P-O bond [ ]. DNase I forms a 1:1 complex with G-actin, resulting in the inhibition of DNase activity and loss of the ability of G-actin to polymerise into fibres [ ]. DNase I has been used in the treatment of lung problems in patients with cystic fibrosis: here it acts by degrading DNA found in purulent lung secretions, reducing their viscosity and making it easier for the patient to breathe [ ]. The sequence of DNase I is evolutionarily related to that of human muscle-specific DNase-like protein and human proteins DHP1 and DHP2. However, the first disulphide bond of DNase I is not conserved in these proteins. This entry represents DNase I and related proteins such as DNase gamma.
Protein Domain
Name: TetR transcriptional regulator Rv1219c-like, C-terminal domain
Type: Domain
Description: TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity [ ]. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds the Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or trigger the cellular response itself []. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis [ ]. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain []. This entry represents the C-terminal domain present in Rv1219c of Mycobacterium tuberculosis. Structural studies indicate that helix alpha 10 at the C-terminal end of Rv1219c forms a long arm, a feature that distinguishes Rv1219c from some other members of the TetR family. Furthermore, it has been shown that substrate binding occurs in the C-terminal regulatory domain of Rv1219c [ ].
Protein Domain
Name: Melanocortin receptor 3-5
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [ ]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [, , , , ]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences rather than the ligands they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. 
Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [ , , ]. Adrenocorticotrophin (ACTH), melanocyte-stimulating hormones (MSH) and beta-endorphin are peptide products of pituitary pro-opiomelanocortin. ACTH regulates synthesis and release of glucocorticoids and aldosterone in the adrenal cortex; it also has a trophic action on these cells. ACTH and beta-endorphin are synthesised and released in response to corticotrophin-releasing factor at times of stress (heat, cold, infections, etc.); their release leads to increased metabolism and analgesia. MSH has a trophic action on melanocytes, and regulates pigment production in fish and amphibia. The ACTH receptor is found in high levels in the adrenal cortex; binding sites are present in lower levels in the CNS. The MSH receptor is expressed in high levels in melanocytes, melanomas and their derived cell lines. Receptors are found in low levels in the CNS. MSH regulates temperature control in the septal region of the brain and releases prolactin from the pituitary. This entry represents Melanocortin receptor 3-5 (MC3-5R) from chordates. These proteins are receptors for MSH (alpha, beta and gamma) and ACTH. The activity of these receptors is mediated by G proteins which activate adenylate cyclase. MC3R is required for expression of anticipatory patterns of activity and wakefulness during periods of limited nutrient availability and for the normal regulation of circadian clock activity in the brain [ ]. MC4R plays a central role in energy homeostasis and somatic growth [, , ]. MC5R is a possible mediator of the immunomodulation properties of melanocortins, playing a role in immune reaction and inflammatory response as well as in the regulation of sexual behaviour, thermoregulation, and exocrine secretion [].
Protein Domain
Name: HPT domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents a domain present at the N terminus in proteins which undergo autophosphorylation. The group includes histidine kinases such as CheA from Escherichia coli, the gliding motility regulatory protein from Myxococcus xanthus, and a number of bacterial chemotaxis proteins.Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms [ , ]. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation [], and CheA, which plays a central role in the chemotaxis system []. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water []. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily. HKs can be roughly divided into two classes: orthodox and hybrid kinases [ , ]. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK []. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. 
In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain. Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions [ ]. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk []. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more []. Two-component systems are comprised of a sensor histidine kinase (HK) and its cognate response regulator (RR) []. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HKs are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK. A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response [ , ].
Protein Domain
Name: Leukotriene B4 type 2 receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [ ]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [, , , , ]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences rather than the ligands they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [ , , ]. Leukotrienes (LT) are potent lipid mediators derived from arachidonic acid metabolism. They can be divided into two classes, based on the presence or absence of a cysteinyl group. 
Leukotriene B4 (LTB4) does not contain such a group, whereas LTC4, LTD4, LTE4 and LTF4 are cysteinyl leukotrienes.LTB4 is one of the most effective chemoattractant mediators known, and is produced predominantly by neutrophils and macrophages. It is involved in a number of events, including: stimulation of leukocyte migration from the bloodstream; activation of neutrophils; inflammatory pain; host defence against infection; increased interleukin production and transcription [ ]. It is found in elevated concentrations in a number of inflammatory and allergic conditions, such as asthma, psoriasis, rheumatoid arthritis and inflammatory bowel disease, and has been implicated in the pathogenesis of these diseases [].Binding sites for LTB4 have been observed in membrane preparations from leukocytes, macrophages and spleen. Two receptors for LTB4 have since been cloned (BLT1 and BLT2); both are members of the rhodopsin-like G-protein-coupled receptor superfamily [ ].The leukotriene B4 type 2 receptor gene (BLT2) has been located in both the human and mouse genomes, and is found in close proximity to BLT1 in both species [ ]. The receptor is expressed in most human tissues, with highest levels in the liver, spleen, ovary and leukocytes []. Binding of LTB4 to the receptor produces increased levels of inositol trisphosphate and calcium, inhibition of forskolin-stimulated adenylyl cyclase activity and chemotaxis []. These effects may be accomplished by coupling to G-proteins of the Gq, Gi and Gz classes [].
Protein Domain
Name: Signal transduction histidine kinase, phosphotransfer (Hpt) domain
Type: Domain
Description: Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions [ ]. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk []. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more []. Two-component systems are comprised of a sensor histidine kinase (HK) and its cognate response regulator (RR) []. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HKs are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK. A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response [ , ]. Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms [, ]. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation [], and CheA, which plays a central role in the chemotaxis system []. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. 
The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water []. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily. HKs can be roughly divided into two classes: orthodox and hybrid kinases [ , ]. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK []. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain. This entry represents a domain present at the N terminus in proteins which undergo autophosphorylation. The group includes the gliding motility regulatory protein from Myxococcus xanthus and a number of bacterial chemotaxis proteins.
Protein Domain
Name: Ephexin-like
Type: Family
Description: This entry includes the ephexin family [ , , ], which comprises ephexin-1 to 5 and related animal proteins, such as ARHGEF26, also called SGEF (SH3 domain-containing GEF), which shows structural similarities with ephexins []. ARHGEF26 is highly expressed in liver and may play a role in regulating membrane dynamics []. A common feature of these proteins, apart from their high sequence homology, is that they are the direct downstream proteins of Eph receptors, a large subfamily of receptor tyrosine kinases that is activated by Ephrins and involved in various cellular processes such as axon guidance, formation of tissue boundaries, long-term potentiation, angiogenesis, and cancer. They are essential for normal function of neurons and their development []. Ephexin-1 (also called NGEF/neuronal guanine nucleotide exchange factor) plays a role in the homeostatic modulation of presynaptic neurotransmitter release and plays crucial roles in axon guidance [ ]. Literature data about Ephexin-2 (also known as RhoGEF19) is limited; however, its intrinsic role as a GEF for RhoA seems to be clear. It is involved in convergent extension, a developmental step of anterior-posterior axis extension in Xenopus gastrulation, through RhoA activation, and it also participates in pronephric tubulogenesis of Xenopus and zebrafish. Elevated levels of Ephexin-2 result in increased activity of RhoA, which causes higher cancer proliferation, migration, and invasion [ ]. Ephexin-3 (also called Rho guanine nucleotide exchange factor 5/RhoGEF5) is ubiquitously expressed in many tissues, such as colon, kidney, trachea, prostate, liver, and pancreas, with a tendency to be highly expressed in tissues containing epithelial cells. It functions as a GEF for RhoA. It plays a role in cell migration and adhesion as it is involved in Src-induced podosome formation, and its deletion causes defects in immature dendritic cell migration in vivo [ ]. 
Ephexin-4 (also called RhoGEF16) acts downstream of EphA2 to promote ligand-independent breast cancer cell migration and invasion toward epidermal growth factor through activation of RhoG. This in turn results in the activation of RhoG, which recruits ELMO2 and Dock4 to form a complex with EphA2 at the tips of cortactin-rich protrusions in migrating breast cancer cells [ , , ]. Ephexin-5 (also known as RhoGEF15 and Vsm-RhoGEF) is the specific GEF for RhoA activation and the regulation of vascular smooth muscle contractility, and it is also involved in angiogenesis, as it mediates VEGF-induced Rho GTPase activity modulation. It interacts with EPHA4. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. Ephexin-5 is highly expressed in the brain, especially in the hippocampus, where it may act as a beacon locating sites of new spine formation, keeping them in check until incoming activity promotes spine formation at these sites [ ]. Members of the Ephexin family contain a RhoGEF (DH) domain followed by a PH domain and an SH3 domain, except in Ephexin-5, in which the SH3 domain is absent [ , , ]. The ephexin PH domain is believed to act with the DH domain in mediating protein-protein interactions [, , ].
Protein Domain
Name: Zinc finger, RING/FYVE/PHD-type
Type: Homologous_superfamily
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [ , , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This superfamily represents RING-, PHD-, and FYVE-type zinc finger domains, which share a common dimetal (zinc)-bound α/β structural fold, as well as the non-zinc-containing U-box domain, which is similar to the RING zinc finger only lacking the metal ion-binding residues (U-box associated with multi-ubiquitination).
Protein Domain
Name: Glutaredoxin subgroup
Type: Domain
Description: Glutaredoxins [ , , ], also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system []. Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin (TRX), which functions in a similar way, glutaredoxin possesses an active centre disulphide bond [ ]. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulphide bond. It contains a redox active CXXC motif in a TRX fold and uses a dithiol mechanism similar to that employed by TRXs for intramolecular disulfide bond reduction of protein substrates. Unlike TRX, GRX has a preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates [ , , , ]. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed [ ] that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern, T4 thioredoxin has Val instead of Pro. This entry represents a conserved region including the active site of this enzyme.
Protein Domain
Name: Glutaredoxin, eukaryotic/virial
Type: Domain
Description: Glutaredoxins [ , , ], also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system []. Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin (TRX), which functions in a similar way, glutaredoxin possesses an active centre disulphide bond [ ]. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulphide bond. It contains a redox active CXXC motif in a TRX fold and uses a dithiol mechanism similar to that employed by TRXs for intramolecular disulfide bond reduction of protein substrates. Unlike TRX, GRX has a preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates [ , , , ]. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed [ ] that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern, T4 thioredoxin has Val instead of Pro. This entry is found in eukaryotic glutaredoxins and includes sequences from fungi, plants and metazoans as well as viruses [ ].
Protein Domain
Name: Glutaredoxin active site
Type: Active_site
Description: Glutaredoxins [ , , ], also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system []. Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin (TRX), which functions in a similar way, glutaredoxin possesses an active centre disulphide bond [ ]. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulphide bond. It contains a redox active CXXC motif in a TRX fold and uses a dithiol mechanism similar to that employed by TRXs for intramolecular disulfide bond reduction of protein substrates. Unlike TRX, GRX has a preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates [ , , , ]. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed [ ] that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern, T4 thioredoxin has Val instead of Pro. This entry represents the Glutaredoxin active site.
Protein Domain
Name: Glutaredoxin domain
Type: Domain
Description: Glutaredoxins [ , , ], also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system []. Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin (TRX), which functions in a similar way, glutaredoxin possesses an active centre disulphide bond [ ]. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulphide bond. It contains a redox active CXXC motif in a TRX fold and uses a dithiol mechanism similar to that employed by TRXs for intramolecular disulfide bond reduction of protein substrates. Unlike TRX, GRX has a preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates [ , , , ]. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed [ ] that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern, T4 thioredoxin has Val instead of Pro. This C-terminal domain with homology to glutaredoxin is fused to an N-terminal peroxiredoxin-like domain.
Protein Domain
Name: DegT/DnrJ/EryC1/StrS aminotransferase
Type: Family
Description: This entry represents the 3-amino-5-hydroxybenzoic acid synthase family (AHBA_syn), whose members are probably all pyridoxal-phosphate-dependent aminotransferase enzymes with a variety of molecular functions. Members of the family have the same structural fold as members of the pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily [ ]. The AHBA_syn family members are involved in various biosynthetic pathways for secondary metabolites. The AHBA_syn family includes StsA, StsC and StsS [ ]. The aminotransferase activity was demonstrated for purified StsC protein as the L-glutamine:scyllo-inosose aminotransferase, which catalyses the first amino transfer in the biosynthesis of the streptidine subunit of streptomycin [ ]. Some other well studied proteins in this family are AHBA_synthase, the protein product of the pleiotropic regulatory gene degT, ArnB aminotransferase and pilin glycosylation protein. The prototype of this family, the AHBA_synthase, is a dimeric PLP dependent enzyme. AHBA_syn is the terminal enzyme of 3-amino-5-hydroxybenzoic acid (AHBA) formation, which is involved in the biosynthesis of ansamycin antibiotics, including rifamycin B. Some members of this family are involved in 4-amino-6-deoxy-monosaccharide D-perosamine synthesis. Perosamine is an important element in the glycosylation of several cell products, such as antibiotics and lipopolysaccharides of Gram-positive and Gram-negative bacteria. The pilin glycosylation protein, encoded by gene pglA, is a galactosyltransferase involved in pilin glycosylation. Additionally, this family includes ArnB (PmrH) aminotransferase, a 4-amino-4-deoxy-L-arabinose lipopolysaccharide-modifying enzyme. This family also includes several predicted pyridoxal phosphate-dependent enzymes apparently involved in regulation of cell wall biogenesis. 
The catalytic lysine which is present in all characterized PLP dependent enzymes is replaced by histidine in some members of this family [ , , , , , , , , , , , , ].
Protein Domain
Name: Muscarinic acetylcholine receptor M1
Type: Family
Description: Muscarinic acetylcholine receptors are members of the rhodopsin-like G-protein coupled receptor family. They play several important roles; they mediate many of the effects of acetylcholine in the central and peripheral nervous system and modulate a variety of physiological functions, such as airway, eye and intestinal smooth muscle contraction, heart rate and glandular secretions. The receptors have a widespread tissue distribution and are a major drug target in human disease. They may be effective therapeutic targets in Alzheimer's disease, schizophrenia, Parkinson's disease and chronic obstructive pulmonary disease [ , ]. There are five muscarinic acetylcholine receptor subtypes, designated M1-5 [ , , , , ]. The family can be further divided into two broad groups based on their primary coupling to G-proteins. M2 and M4 receptors couple to the pertussis-toxin sensitive Gi proteins, whereas M1, M3 and M5 receptors couple to Gq proteins [, ], which activate phospholipase C. The different subtypes can also couple to a wide range of diverse signalling pathways, some of which are G protein-independent [, , ]. All subtypes seem to serve as autoreceptors [ ], and knockout mice reveal the important neuromodulatory role played by this receptor family [, , ]. Muscarinic acetylcholine receptor M1 is common in exocrine glands and the CNS [ , , ], being particularly abundant in the cerebral cortex and hippocampus [, ]. The receptors have also been found on esophageal smooth muscle [] and bladder tissue []. Their distribution largely overlaps with that of M3 and M4 receptors. M1 receptors are involved in mediating higher cognitive processes, such as learning and memory []. They also play a role in regulation of locomotor activity [] and in salivation [].
Protein Domain
Name: Immunoglobulin V-set domain
Type: Domain
Description: The basic structure of immunoglobulin (Ig) molecules is a tetramer of two light chains and two heavy chains linked by disulphide bonds. There are two types of light chains: kappa and lambda, each composed of a constant domain (CL) and a variable domain (VL). There are five types of heavy chains: alpha, delta, epsilon, gamma and mu, all consisting of a variable domain (VH) and three (in alpha, delta and gamma) or four (in epsilon and mu) constant domains (CH1 to CH4). Ig molecules are highly modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. The domains in Ig and Ig-like molecules are grouped into four types: V-set (variable), C1-set (constant-1), C2-set (constant-2) and I-set (intermediate) [ ]. Structural studies have shown that these domains share a common core Greek-key β-sandwich structure, with the types differing in the number of strands in the β-sheets as well as in their sequence patterns [, ]. Immunoglobulin-like domains that are related in both sequence and structure can be found in several diverse protein families. Ig-like domains are involved in a variety of functions, including cell-cell recognition, cell-surface receptors, muscle structure and the immune system [ ]. This entry represents the V-set domains, which are Ig-like domains resembling the antibody variable domain. V-set domains are found in diverse protein families, including immunoglobulin light and heavy chains; in several T-cell receptors such as CD2 (Cluster of Differentiation 2), CD4, CD80, and CD86; in myelin membrane adhesion molecules; in junction adhesion molecules (JAM); in tyrosine-protein kinase receptors; and in the programmed cell death protein 1 (PD1).
Protein Domain
Name: Cucumisin-like catalytic domain
Type: Domain
Description: This entry represents a peptidase cucumisin-like domain, found in a subgroup of proteins that belong to peptidase family S8 (subtilisins), predominantly from plants. This peptidase domain has a protease-associated (PA) domain nested within it. Some members, such as cucumisin, have an extra C-terminal fibronectin-III-like domain. Cucumisin (MEROPS identifier S08.092) is an extracellular glycoprotein that is a major allergen from melon. It is synthesized as a precursor with the C terminus of the propeptide binding to the active site cleft in a substrate-like manner. Other proteins included in this entry are the ALE1 peptidase (abnormal leaf shape 1; S08.014) from Arabidopsis, which is expressed in the developing embryo and whose absence leads to a defective cuticle; and the phytaspases (S08.150), which have a caspase-like specificity and play a similar role in programmed cell death. The subtilisin family is one of the largest serine peptidase families characterised to date. Over 200 subtilases are presently known, more than 170 of them with their complete amino acid sequence. The family is widespread, being found in eubacteria, archaebacteria, eukaryotes and viruses. The vast majority of the family are endopeptidases, although there is an exopeptidase, tripeptidyl peptidase. Structures have been determined for several members of the subtilisin family: they exploit the same catalytic triad as the chymotrypsins, although the residues occur in a different order (HDS in chymotrypsin and DHS in subtilisin), but the structures show no other similarity. Some subtilisins are mosaic proteins, while others contain N- and C-terminal extensions that show no sequence similarity to any other known protein.
Protein Domain
Name: Carbohydrate-binding module superfamily 5/12
Type: Homologous_superfamily
Description: A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classifications of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with Roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. This entry represents CBM families 5 and 12. These modules have a core structure consisting of a 3-stranded meander β-sheet, which contains six aromatic groups that may be important for binding. CBM5/12 is found in proteins such as chitinase A1, chitinase B, and endoglucanase Z. The overall topology of the CBM is structurally similar to the C-terminal chitin-binding domains (ChBD) of chitinase A1 and chitinase B; however, the binding mechanism for the ChBD may be different from that of the CBM.
Protein Domain
Name: Cystathionine beta-synthase, C-terminal domain
Type: Domain
Description: This entry represents the C-terminal region of cystathionine beta-synthase (CBS), which includes two tandem repeats of the CBS domain. CBS is a hydro-lyase that catalyses the first step of the transsulfuration pathway, where the hydroxyl group of L-serine is displaced by L-homocysteine in a beta-replacement reaction to form L-cystathionine, the precursor of L-cysteine. This catabolic route allows the elimination of L-methionine and the toxic metabolite L-homocysteine. This protein is also involved in the production of hydrogen sulphide, a gasotransmitter with signalling and cytoprotective effects on neurons. CBS domains are evolutionarily conserved structural domains found in a variety of functionally unrelated proteins from all kingdoms of life. These domains pair together to form an intramolecular dimeric structure (CBS pair), termed the Bateman domain. CBS domains have been shown to bind mainly ligands with an adenosyl group such as AMP, ATP and S-AdoMet, but may also bind metal ions or nucleic acids. Hence, they play an essential role in the regulation of the activities of numerous proteins, and mutations in them are associated with several hereditary diseases. CBS domains are found attached to a wide range of other protein domains, suggesting that CBS domains may play a regulatory role, making proteins sensitive to adenosyl-carrying ligands. The region containing the CBS domains in cystathionine beta-synthase is involved in regulation by S-AdoMet. CBS domain pairs from AMPK bind AMP or ATP. The CBS domains from IMPDH, which bind ATP, have been shown to have a role in the regulation of adenylate nucleotide synthesis.
Protein Domain
Name: Transthyretin/hydroxyisourate hydrolase domain superfamily
Type: Homologous_superfamily
Description: This entry includes transthyretin, a thyroid hormone-binding protein that transports thyroxine from the bloodstream to the brain. However, most of the sequences listed in this entry do not bind thyroid hormones. They are actually enzymes of purine catabolism that catalyse the conversion of 5-hydroxyisourate (HIU) to OHCU. HIU hydrolysis is the original function of the family and is conserved from bacteria to mammals; transthyretins arose by gene duplications in the vertebrate lineage. HIUases are distinguished in the alignment by the conserved C-terminal YRGS sequence. Transthyretin (formerly prealbumin) is one of 3 thyroid hormone-binding proteins found in the blood of vertebrates. It is produced in the liver and circulates in the bloodstream, where it binds retinol and thyroxine (T4). It differs from the other 2 hormone-binding proteins (T4-binding globulin and albumin) in 3 distinct ways: (1) the gene is expressed at a high rate in the brain choroid plexus; (2) it is enriched in cerebrospinal fluid; and (3) no genetically caused absence has been observed, suggesting an essential role in brain function, distinct from that played in the bloodstream. The protein consists of around 130 amino acids, which assemble as a homotetramer that contains an internal channel in which T4 is bound. Within this complex, T4 appears to be transported across the blood-brain barrier, where, in the choroid plexus, the hormone stimulates further synthesis of transthyretin. The protein then diffuses back into the bloodstream, where it binds T4 for transport back to the brain. Structurally, it consists of a sandwich fold with seven strands arranged in two sheets and a Greek-key topology.
Protein Domain
Name: Glycosyl amidation-associated protein, WbuZ
Type: Family
Description: This entry represents a protein highly similar to the HisF protein, but generally represents the second HisF homologue in the genome where the other is an authentic HisF observed in the context of a complete histidine biosynthesis operon. The similarity between these WbuZ sequences and true HisFs is such that often the closest match by BLAST of a WbuZ is a HisF. Only by making a multiple sequence alignment is the homology relationship among the WbuZ sequences made apparent. WbuZ genes are invariably observed in the presence of a homologue of the HisH protein (designated WbuY) and a proposed N-acetyl sugar amidotransferase designated WbuX in Escherichia coli, IfnA in Pseudomonas aeruginosa and PseA in Campylobacter jejuni. Similarly, this trio of genes is invariably found in the context of saccharide biosynthesis loci. It has been shown that the WbuYZ homologues are not essential components of the activity expressed by WbuX, leading to the proposal that these two proteins provide ammonium ions to the amidotransferase when these are in low concentration. WbuY (like HisH) is proposed to act as a glutaminase to release ammonium. In histidine biosynthesis this is also dispensable in the presence of exogenous ammonium ion. HisH and HisF form a complex such that the ammonium ion is passed directly to HisF, where it is used in an amidation reaction causing a subsequent cleavage and cyclization. In the case of WbuYZ, the ammonium ion would be passed from WbuY to WbuZ. WbuZ, being non-essential and so similar to HisF that a sugar substrate is unlikely, would function instead as an ammonium channel to the WbuX protein, which does the enzymatic work.
Protein Domain
Name: Myosin phosphatase-RhoA interacting protein, PH domain
Type: Domain
Description: Myosin phosphatase-RhoA interacting protein (M-RIP) is proposed to play a role in myosin phosphatase regulation by RhoA. M-RIP contains two PH domains followed by a Rho binding domain (Rho-BD), and a C-terminal myosin binding subunit (MBS) binding domain (MBS-BD). The amino terminus of M-RIP with its adjacent PH domains and polyproline motifs mediates binding to both actin and Galpha. M-RIP brings RhoA and MBS into close proximity where M-RIP can target RhoA to the myosin phosphatase complex to regulate the myosin phosphorylation state. M-RIP does this via its C-terminal coiled-coil domain, which interacts with the MBS leucine zipper domain of myosin phosphatase, while its Rho-BD directly binds RhoA in a nucleotide-independent manner. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3, which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases and adaptors, as well as in cytoskeletal associated molecules and lipid associated enzymes.
Protein Domain
Name: Zinc finger, A20-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the zinc finger domain found in A20. A20 is an inhibitor of cell death that inhibits NF-kappaB activation via the tumour necrosis factor receptor associated factor pathway. The zinc finger domains appear to mediate self-association in A20. These fingers also mediate IL-1-induced NF-kappaB activation.
Protein Domain
Name: BLOC-2 complex member HPS5
Type: Family
Description: Lysosome-related organelles comprise a group of specialised intracellular compartments that include melanosomes and platelet dense granules in mammals and eye pigment granules in insects. Hermansky-Pudlak syndrome (HPS) is a disorder of lysosome-related organelle biogenesis. Genes associated with HPS encode subunits of three complexes that are known as biogenesis of lysosome-related organelles complex (BLOC)-1, -2 and -3. There are eight known HPS proteins of the BLOCs. Organelles affected in HPS include the melanosome, resulting in hypopigmentation, and the platelet delta (dense) granule, resulting in prolonged bleeding times. HPS in humans or mice is caused by mutations in any of 15 genes, five of which encode subunits of BLOC-1. BLOC-1 and BLOC-2 act sequentially in the same pathway. Melanosome maturation requires at least two cargo transport pathways directly from early endosomes to melanosomes: one pathway is mediated by AP-3, and the other by BLOC-1 and BLOC-2. The adaptor protein AP-3 complex is a component of the cellular machinery that controls protein sorting from endosomes to lysosomes and melanosomes. BLOC-1 interacts physically and functionally with AP-3 to facilitate the trafficking of a known AP-3 cargo, CD63, and of tyrosinase-related protein 1 (Tyrp1). BLOC-1 also interacts with BLOC-2 to facilitate Tyrp1 trafficking by a mechanism apparently independent of AP-3 function. Both BLOC-1 and -2 predominantly localise to early endosome-associated tubules. BLOC-2 contains the HPS3, HPS5 and HPS6 proteins as subunits. Fibroblasts deficient in the BLOC-2 subunits HPS3 or HPS6 show normal basal secretion of the lysosomal enzyme beta-hexosaminidase. This entry also includes HPS5 homologues from insects. Fruit fly HPS5 (also known as p) has a role in the biogenesis of eye pigment granules.
Protein Domain
Name: Zinc finger, CGNR
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents a C-terminal zinc finger domain. It seems likely to be DNA-binding given the conservation of many positively charged residues. The domain is named after a highly conserved motif found in many members of the family.
Protein Domain
Name: Alpha giardin
Type: Family
Description: Giardia lamblia (Giardia intestinalis) is a protozoan parasite of numerous mammals, including Homo sapiens. It belongs to the phylum Sarcomastigophora, and is amongst the most primitive eukaryotes identified to date. It is the main causative agent of global protozoan diarrhoea, and severe infection can cause giardiasis. G. lamblia exists as either trophozoites that live in the small intestine of the host and cause the disease symptoms, or cysts that are passed in the faeces of the host and infect the next host through contaminated water or food. Trophozoites exhibit antigenic variation to evade the host immune system, expressing a number of virulence factors to aid adherence and invasion of the small intestine endothelium. The molecular basis for its antigenic variation has been well characterised, and it is believed that its phenotypic heterogeneity arises from sexual reproduction. One of the major virulence factors of G. lamblia is giardin, an antigen expressed as several variants on the trophozoite surface. Alpha giardin is the predominant immunotypic giardin present, although beta and gamma giardin have also been identified. This entry represents a family of alpha giardin proteins specific to Giardia. The biochemical properties of the subunit identify the protein as an annexin, a eukaryotic protein widely conserved amongst plants and animals. This protein associates with phosphatidyl serine-containing vesicles in a Ca2+-dependent manner, and has very low sequence similarity with human annexin XIX. This protein shows a fourfold cyclic configuration of five-helix bundles with repeats I/IV, and II/III each forming a module. The canonical membrane binding site of the annexins is located on the convex side of the molecule.
Protein Domain
Name: Glutaredoxin, GrxA
Type: Family
Description: Glutaredoxins, also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system. Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin (TRX), which functions in a similar way, glutaredoxin possesses an active centre disulphide bond. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulphide bond. It contains a redox active CXXC motif in a TRX fold and uses a dithiol mechanism similar to that employed by TRXs for intramolecular disulfide bond reduction of protein substrates. Unlike TRX, GRX has a preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern T4 thioredoxin has Val instead of Pro. This entry includes the Escherichia coli glutaredoxin GrxA, which appears to have primary responsibility for the reduction of ribonucleotide reductase.
Protein Domain
Name: Dystroglycan, C-terminal
Type: Domain
Description: Dystroglycan is one of the dystrophin-associated glycoproteins, which is encoded by a 5.5 kb transcript in Homo sapiens. The protein product is cleaved into two non-covalently associated subunits, [alpha] (N-terminal) and [beta] (C-terminal). In skeletal muscle the dystroglycan complex works as a transmembrane linkage between the extracellular matrix and the cytoskeleton. [alpha]-dystroglycan is extracellular and binds to merosin ([alpha]-2 laminin) in the basement membrane, while [beta]-dystroglycan is a transmembrane protein and binds to dystrophin, which is a large rod-like cytoskeletal protein, absent in Duchenne muscular dystrophy patients. Dystrophin binds to intracellular actin cables. In this way, the dystroglycan complex, which links the extracellular matrix to the intracellular actin cables, is thought to provide structural integrity in muscle tissues. The dystroglycan complex is also known to serve as an agrin receptor in muscle, where it may regulate agrin-induced acetylcholine receptor clustering at the neuromuscular junction. There is also evidence that dystroglycan functions as part of a signal transduction pathway, because Grb2, a mediator of the Ras-related signal pathway, has been shown to interact with the cytoplasmic domain of dystroglycan. In general, aberrant expression of the dystrophin-associated protein complex underlies the pathogenesis of Duchenne muscular dystrophy, Becker muscular dystrophy and severe childhood autosomal recessive muscular dystrophy. Interestingly, no genetic disease has been described for either [alpha]- or [beta]-dystroglycan. Dystroglycan is widely distributed in non-muscle tissues as well as in muscle tissues. During epithelial morphogenesis of kidney, the dystroglycan complex has been shown to act as a receptor for the basement membrane. Dystroglycan expression in Mus musculus brain and neural retina has also been reported. However, the physiological role of dystroglycan in non-muscle tissues has remained unclear.
Protein Domain
Name: Helper-component proteinase (HC-Pro) cysteine protease (CPD) domain
Type: Domain
Description: This entry represents the CPD domain of the HC-Pro protein. Potyviruses form one of the most numerous groups of plant viruses and are a major cause of crop loss worldwide. The helper-component proteinase (HC-Pro) is an indispensable, multifunctional protein of members of the genus Potyvirus and other viruses of the family Potyviridae. It is directly involved in diverse steps of viral infection, such as aphid plant-to-plant transmission, polyprotein processing, and suppression of host antiviral RNA silencing. HC-Pro is generally divided into three functional domains: an N-terminal domain, a central region, and a cysteine protease domain (CPD) in the C-terminal region. The HC-Pro CPD domain has a protease activity that autocatalytically cleaves a Gly-Gly dipeptide at its own C terminus to release HC-Pro from the rest of the viral polyprotein. Cysteine and histidine residues form the catalytic dyad at the active site. The HC-Pro CPD domain constitutes the peptidase family C6 of the CA clan. The HC-Pro CPD domain adopts a compact oval-shaped alpha/beta fold. The secondary structure elements include four α-helices (alpha1-alpha4) and two short β-strands (beta1 and beta2) arranged in the order alpha1-alpha2-alpha3-beta1-beta2-alpha4. In addition, two 3(10) helices are located between alpha3 and beta1 and downstream of alpha4. The four helices form a helix bundle packed against one face of a short β-hairpin formed by strands beta1 and beta2. The catalytic residue Cys is located at the N terminus of helix alpha1, and the other catalytic residue His is located on strand beta2. The substrate binding cleft is lined by the loop connecting helices alpha2 and alpha3 and the N-terminal region of helix alpha1 on one side and by strand beta2 on the other side.
Protein Domain
Name: Peptidase C30, domain 3, coronavirus
Type: Homologous_superfamily
Description: This group of cysteine peptidases corresponds to MEROPS peptidase family C30 (clan PA(C)). These peptidases are related to serine endopeptidases of family S1 and are restricted to coronaviruses, where they are involved in viral polyprotein processing during replication. This coronavirus (CoV) domain, peptidase C30, is also known as the 3C-like proteinase (3CL-pro) or CoV main protease (M-pro) domain, and it is highly conserved among coronaviruses. CoV M-pro is a dimer where each subunit is composed of three domains I, II and III. Domains I and II consist of six-stranded antiparallel beta barrels and together resemble the architecture of chymotrypsin and of picornavirus 3C proteinases. The substrate-binding site is located in a cleft between these two domains. The catalytic site is situated at the centre of the cleft. A long loop connects domain II to the C-terminal domain (domain III). This latter domain, a globular cluster of five helices, has been implicated in the proteolytic activity of M-pro. In the active site of M-pro, Cys and His form a catalytic dyad. In contrast to serine proteinases and other cysteine proteinases, which have a catalytic triad, there is no third catalytic residue present. Many drugs have been developed to inhibit CoV M-pro. This superfamily represents CoV M-pro domain III, which is reported to be required for dimerisation and regulation. Whereas the chymotrypsin-like fold formed by domains I and II is also present in MEROPS family S1 peptidases found in plants, animals, fungi, eubacteria, archaea and viruses, the C-terminal extra helical domain III is unique to the coronavirus 3CL proteases.
Protein Domain
Name: G protein-coupled receptor, rhodopsin-like
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. This entry represents the G protein-coupled receptor, rhodopsin-like family.
Protein Domain
Name: Retinal-specific ATP-binding cassette transporter
Type: Family
Description: The ABC transporter family is a group of membrane proteins that use the hydrolysis of ATP to power the translocation of a wide variety of substrates across cellular membranes. ABC transporters minimally consist of two conserved regions: a highly conserved nucleotide-binding domain (NBD) and a less conserved transmembrane domain (TMD). Eukaryotic ABC proteins are usually organised either as full transporters (containing two NBDs and two TMDs), or as half transporters (containing one NBD and one TMD), that have to form homo- or heterodimers in order to constitute a functional protein. Retinal-specific ATP-binding cassette transporter ABCA4 (also known as the Rim protein, ABCR) is a eukaryotic protein belonging to the ABC-A subfamily of the ABC transporter family. In humans, ABCA4 is localised with opsin photopigments in outer segment disc membranes of rod and cone photoreceptor cells. It serves as an N-retinylidene-phosphatidylethanolamine and phosphatidylethanolamine importer. Mutations in the ABCA4 gene cause Stargardt macular degeneration, a recessive disease characterised by the loss of central vision, progressive bilateral atrophy of photoreceptor and retinal pigment epithelial (RPE) cells, accumulation of fluorescent deposits in the macula, and a delay in dark adaptation. ABCR contains eight glycosylation sites. Four sites reside in a 600-amino acid exocytoplasmic domain of the N-terminal half between the first transmembrane segment H1 and the first multi-spanning membrane domain, and four sites are in a 275-amino acid domain of the C-terminal half between transmembrane segment H7 and the second multi-spanning membrane domain. This leads to a model in which each half has a transmembrane segment followed by a large exocytoplasmic domain, a multi-spanning membrane domain, and a nucleotide binding domain.
Protein Domain
Name: Carbohydrate-binding module family 5/12
Type: Domain
Description: A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classifications of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with Roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. This entry represents CBM families 5 and 12. These modules have a core structure consisting of a 3-stranded meander β-sheet, which contains six aromatic groups that may be important for binding. CBM5/12 is found in proteins such as chitinase A1, chitinase B, and endoglucanase Z. The overall topology of the CBM is structurally similar to the C-terminal chitin-binding domains (ChBD) of chitinase A1 and chitinase B; however, the binding mechanism for the ChBD may be different from that of the CBM.
Protein Domain
Name: Sortase A
Type: Family
Description: Class A sortases are membrane-bound cysteine transpeptidases distributed in Gram-positive bacteria (mainly present in Firmicutes). They perform a housekeeping role in the cell, as members of this group are capable of anchoring a large number of functionally distinct surface proteins containing a cell wall sorting signal to an amino group located on the bacterial cell wall. They do so by catalyzing a transpeptidation reaction in which the surface protein substrate is cleaved at a conserved cell wall sorting signal (Class A sortases recognize a canonical LPXTG motif, where X can be any amino acid) and covalently linked to peptidoglycan for display on the bacterial surface. The prototypical sortase A protein from Staphylococcus aureus (named Sa-SrtA) cleaves the amide bond between the threonine and glycine residues of the canonical LPXTG motif in a wide range of protein substrates with diverse functions that can promote bacterial adhesion, nutrient acquisition, host cell invasion, and immune evasion. Next, it catalyzes a transpeptidation reaction by which the proteins are covalently linked to the peptidoglycan precursor lipid II. SrtA therefore affects the ability of a pathogen to establish a successful infection. SrtA contains an N-terminal hydrophobic segment, a linker region and an extracellular C-terminal catalytic domain. The hydrophobic segment functions as both a signal peptide for secretion and a stop-transfer signal for membrane anchoring. The catalytic domain contains the catalytic TLXTC signature sequence, where X is usually a valine, isoleucine or threonine. The gene encoding SrtA is generally not located in the same gene cluster as its substrates, while the gene encoding SrtB is usually clustered in the same locus as its substrate.
Protein Domain
Name: Succinate dehydrogenase [ubiquinone] cytochrome b small subunit, CybS
Type: Family
Description: This family consists of mitochondrial succinate dehydrogenase [ubiquinone] cytochrome b small subunit CybS (also known as SQR) and import inner membrane translocase subunit Tim18. Members of this subfamily are classified as Type C SQRs because they contain two transmembrane subunits and one heme group. CybS and CybL are the two transmembrane proteins of eukaryotic SQRs. They contain heme and quinone binding sites. CybS is the eukaryotic homologue of the bacterial SdhD subunit. CybS is a membrane-anchoring subunit of succinate dehydrogenase (SDH), which is involved in complex II of the mitochondrial electron transport chain and is responsible for transferring electrons from succinate to ubiquinone (coenzyme Q). SDH is composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. The two-electron oxidation of succinate in the flavoprotein active site is coupled to the two-electron reduction of quinone in the transmembrane subunits via electron transport through FAD and three iron-sulfur centres. The reversible reduction of quinone is an essential feature of respiration, allowing transfer of electrons between respiratory complexes. Mutations in human Complex II result in various physiological disorders, including hereditary paraganglioma and pheochromocytoma tumors. The gene encoding the SdhD subunit is classified as a tumor suppressor gene. In Saccharomyces cerevisiae, Tim18 is a component of the TIM22 complex, which mediates the import and insertion of multi-pass transmembrane proteins into the mitochondrial inner membrane. The TIM22 complex forms a twin-pore translocase that uses the membrane potential as an external driving force. The role of Tim18 in the complex is unclear, but it may be involved in the assembly and stabilisation of the TIM22 complex.
Protein Domain
Name: Helper-component proteinase (HC-Pro) cysteine protease (CPD) domain superfamily
Type: Homologous_superfamily
Description: Potyviruses form one of the most numerous groups of plant viruses and are a major cause of crop loss worldwide. The helper-component proteinase (HC-Pro) is an indispensable, multifunctional protein of members of the genus Potyvirus and other viruses of the family Potyviridae. It is directly involved in diverse steps of viral infection, such as aphid plant-to-plant transmission, polyprotein processing, and suppression of host antiviral RNA silencing. HC-Pro is generally divided into three functional domains: an N-terminal domain, a central region, and a cysteine protease domain (CPD) in the C-terminal region. The HC-Pro CPD domain has a protease activity that autocatalytically cleaves a Gly-Gly dipeptide at its own C terminus to release HC-Pro from the rest of the viral polyprotein. Cysteine and histidine residues form the catalytic dyad at the active site. The HC-Pro CPD domain constitutes the peptidase family C6 of the CA clan. The HC-Pro CPD domain adopts a compact oval-shaped alpha/beta fold. The secondary structure elements include four α-helices (alpha1-alpha4) and two short β-strands (beta1 and beta2) arranged in the order alpha1-alpha2-alpha3-beta1-beta2-alpha4. In addition, two 3(10) helices are located between alpha3 and beta1 and downstream of alpha4. The four helices form a helix bundle packed against one face of a short β-hairpin formed by strands beta1 and beta2. The catalytic residue Cys is located at the N terminus of helix alpha1, and the other catalytic residue His is located on strand beta2. The substrate binding cleft is lined by the loop connecting helices alpha2 and alpha3 and the N-terminal region of helix alpha1 on one side and by strand beta2 on the other side. This superfamily represents the CPD domain of the HC-Pro protein.
Protein Domain
Name: M polyprotein precursor, phlebovirus
Type: Family
Description: This group represents the Phlebovirus M polyprotein precursor that contains the nonstructural protein NS-M, glycoprotein G1 and glycoprotein G2.
Protein Domain
Name: Pilus assembly, Flp-type CpaB
Type: Family
Description: Members of this protein family are the CpaB protein of Flp-type pilus assembly. Similar proteins include the FlgA protein of bacterial flagellum biosynthesis.
Protein Domain
Name: Transcription factor IIIC, putative zinc-finger
Type: Domain
Description: This zinc-finger domain is at the very C terminus of a number of different TFIIIC subunit proteins. This domain might be involved in protein-DNA and/or protein-protein interactions.
Protein Domain
Name: SAF domain
Type: Domain
Description: This entry includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins. This domain adopts a β-clip fold.
Protein Domain
Name: Jiraiya
Type: Family
Description: The membrane protein Jiraiya from Xenopus inhibits bone morphogenetic protein (BMP) signalling during embryogenesis. The human member of this family is uncharacterised protein TMEM221 (transmembrane protein 221).
Protein Domain
Name: Ubiquitin specific protease domain
Type: Domain
Description: Protein ubiquitination is a reversible posttranslational modification that affects a large number of cellular processes, including protein degradation, trafficking, cell signaling and the DNA damage response. Dedicated deubiquitinases exist which hydrolyze isopeptide bonds. Ubiquitin specific proteases (USPs) are the largest family of deubiquitinating enzymes. USP domains consist of a common conserved catalytic core which is interspersed at five points with insertions, some of which are as large as the catalytic domain itself. The insertions can fold into independent domains that can be involved in the regulation of deubiquitinase activity. As commonly found in signaling proteins, many USP deubiquitinases have a modular architecture, and contain not only a catalytic domain but also additional protein-protein interaction and localisation domains. Most USP domains cleave the isopeptide linkage between two ubiquitin molecules, and hence contain (at least) two ubiquitin-binding sites: one for the distal ubiquitin, whose C terminus is linked to the Lys residue on the proximal ubiquitin, and a second, proximal binding site. The USP domain forms the peptidase family C19. The USP catalytic core can be divided into six conserved boxes that are present in all USP domains. Box 1 contains the catalytic Cys residue, box 5 contains the catalytic His, and box 6 contains the catalytic Asp/Asn residue. All boxes show several additional conserved features and residues. Boxes 3 and 4 each contain a Cys-X-X-Cys motif, and together these have been shown to constitute a functional zinc-binding motif. Potentially, zinc binding facilitates folding of the USP core, helping the interaction of sequence motifs several hundred residues apart. USP domains share a common conserved fold. The USP domain resembles an open hand containing Thumb, Palm and Fingers subdomains. The catalytic triad resides between the Thumb (Cys) and Palm (His/Asp) subdomains. This entry represents the entire USP domain.
Protein Domain
Name: Beta-ketoacyl synthase, active site
Type: Active_site
Description: Beta-ketoacyl-ACP synthase (EC 2.3.1.41) (KAS) is the enzyme that catalyses the condensation of malonyl-ACP with the growing fatty acid chain. It is found as a component of the following enzymatic systems:
  • Fatty acid synthase (FAS), which catalyses the formation of long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH. Bacterial and plant chloroplast FAS are composed of eight separate subunits which correspond to different enzymatic activities; beta-ketoacyl synthase is one of these polypeptides. Fungal FAS consists of two multifunctional proteins, FAS1 and FAS2; the beta-ketoacyl synthase domain is located in the C-terminal section of FAS2. Vertebrate FAS consists of a single multifunctional chain; the beta-ketoacyl synthase domain is located in the N-terminal section.
  • The multifunctional 6-methylsalicylic acid synthase (MSAS) from Penicillium patulum. This is a multifunctional enzyme that is involved in the biosynthesis of a polyketide antibiotic and has a KAS domain in its N-terminal section.
  • Polyketide antibiotic synthase enzyme systems. Polyketides are secondary metabolites produced by microorganisms and plants from simple fatty acids. KAS is one of the components involved in the biosynthesis of the Streptomyces polyketide antibiotics granaticin, tetracenomycin C and erythromycin.
  • Emericella nidulans multifunctional protein Wa. Wa is involved in the biosynthesis of conidial green pigment. Wa is a protein of 216 kDa that contains a KAS domain.
  • Rhizobium nodulation protein nodE, which probably acts as a beta-ketoacyl synthase in the synthesis of the nodulation Nod factor fatty acyl chain.
  • Yeast mitochondrial protein Cem1.
The condensation reaction is a two-step process: the acyl component of an activated acyl primer is transferred to a cysteine residue of the enzyme and is then condensed with an activated malonyl donor, with the concomitant release of carbon dioxide. This entry represents the active site of beta-ketoacyl-ACP synthases.
Protein Domain
Name: Anthranilate synthase/para-aminobenzoate synthase like domain
Type: Domain
Description: This entry represents the anthranilate synthase/para-aminobenzoate synthase domain, which shares sequence similarity with the glutamine amidotransferase domain. Anthranilate synthase (ASase) plays a role in the tryptophan biosynthetic pathway, while para-aminobenzoate synthase is involved in the folate biosynthetic pathway. In at least one case, a single polypeptide from Bacillus subtilis was shown to have both functions. This entry contains proteins similar to para-aminobenzoate (PABA) synthase and ASase. These enzymes catalyze similar reactions and produce similar products, PABA and ortho-aminobenzoate (anthranilate). Each enzyme is composed of non-identical subunits: a glutamine amidotransferase subunit (component II) and a subunit that produces an aminobenzoate product (component I). ASase catalyses the synthesis of anthranilate from chorismate and glutamine and is a tetrameric protein comprising two copies each of components I and II. Component II of ASase belongs to the family of triad glutamine amidotransferases (GATases), which hydrolyze glutamine and transfer nascent ammonia between the active sites. In some bacteria, such as Escherichia coli, component II can be much larger than in other organisms, due to the presence of phosphoribosyl-anthranilate transferase (PRTase) activity. PRTase catalyses the second step in tryptophan biosynthesis and results in the addition of 5-phosphoribosyl-1-pyrophosphate to anthranilate to create N-5'-phosphoribosyl-anthranilate. In E. coli, the first step in the conversion of chorismate to PABA involves two proteins, PabA and PabB, which co-operate to transfer the amide nitrogen of glutamine to chorismate, forming 4-amino-4-deoxychorismate (ADC). PabA acts as a glutamine amidotransferase, supplying an amino group to PabB, which carries out the amination reaction. A third protein, PabC, then mediates elimination of pyruvate and aromatization to give PABA. Several organisms have bipartite proteins containing fused domains homologous to PabA and PabB, commonly called PABA synthases. These hybrid PABA synthases may produce ADC and not PABA.
Protein Domain
Name: tRNA pseudouridine synthase B family
Type: Family
Description: This family, found in archaea and eukaryotes, includes the only archaeal proteins markedly similar to bacterial TruB, the tRNA pseudouridine 55 synthase. However, among two related yeast proteins, the archaeal set matches yeast YLR175w far better than YNL292w. The first, termed centromere/microtubule binding protein 5 (CBF5), is an apparent rRNA pseudouridine synthase, while the second is the exclusive tRNA pseudouridine 55 synthase for both cytosolic and mitochondrial compartments. It is unclear whether archaeal proteins found by this entry modify tRNA, rRNA, or both. Yeast CBF5 plays a central role in ribosomal RNA processing. It is a probable catalytic subunit of the H/ACA small nucleolar ribonucleoprotein (H/ACA snoRNP) complex, which catalyzes pseudouridylation of rRNA. This involves the isomerization of uridine such that the ribose is subsequently attached to C5, instead of the normal N1. The resulting pseudouridine ('psi') residues may serve to stabilise the conformation of rRNAs. CBF5 is also a centromeric DNA-CBF3-binding factor which is involved in mitotic chromosome segregation. The human CBF5 homologue, DKC1 (also called dyskerin), has been implicated in a variety of disparate cellular functions. DKC1 isoform 1 is required for correct processing or intranuclear trafficking of TERC, the RNA component of the telomerase reverse transcriptase (TERT) holoenzyme. In HeLa cells, overexpression of DKC1 isoform 3 promotes cell-to-cell and cell-to-substratum adhesion, increases the cell proliferation rate and leads to cytokeratin hyper-expression. Mutations in the human DKC1 gene cause the X-linked form of dyskeratosis congenita (DC), a bone marrow failure syndrome characterised by mucosal leukoplakia, nail dystrophy, abnormal skin pigmentation, premature aging, stem cell dysfunction and increased susceptibility to cancer. DKC1 loss of function also causes the Hoyeraal-Hreidarsson syndrome, recognised as a severe X-DC allelic variant.
Protein Domain
Name: Terpenoid cyclases/protein prenyltransferase alpha-alpha toroid
Type: Homologous_superfamily
Description: Protein prenyltransferases catalyse the transfer of the carbon moiety of C15 farnesyl pyrophosphate or geranylgeranyl pyrophosphate to a conserved cysteine residue in a CaaX motif of protein and peptide substrates. The addition of a farnesyl group is required to anchor proteins to the cell membrane. In the 3D structure of a mammalian Ras farnesyltransferase (FTase), both subunits are largely composed of α-helices. The α-2 to α-15 helices in the alpha subunit fold into a novel helical hairpin structure, resulting in a crescent-shaped domain that envelops part of the subunit. The 12 helices of the beta subunit form an α-α barrel. Six additional helices connect the inner core of helices and form the outside of the helical barrel. A deep cleft surrounded by hydrophobic amino acids in the centre of the barrel is proposed as the FPP-binding pocket. A single Zn2+ ion is located at the junction between the hydrophilic surface groove near the subunit interface. Terpenoid cyclases such as squalene cyclase, pentalenene synthase, 5-epi-aristolochene synthase, and trichodiene synthase are responsible for the synthesis of cholesterol, a hydrocarbon precursor of the pentalenolactone family of antibiotics, a precursor of the antifungal phytoalexin capsidiol, and the precursor of antibiotics and mycotoxins, respectively. In the structures of these three enzymes, a similar structural feature referred to as the 'terpenoid synthase fold', with 10-12 mostly antiparallel α-helices, is found, as also observed in protein prenyltransferases. The high structural similarity provides support for the hypothesis that the three families of prenyltransferases are evolutionarily related despite their low sequence similarity. Alpha-2-macroglobulin inhibits all four classes of proteinases by a unique 'trapping' mechanism in which the inhibitor undergoes global structural transformation to lead active proteases into its molecular cage. It also has other functions related to immune-cell function, such as the binding of cytokines or the facilitation of cell migration.
Protein Domain
Name: CheY-like superfamily
Type: Homologous_superfamily
Description: CheY is a member of the response regulator family in bacterial two-component signalling systems, where CheY receives the signal from the sensor partner, usually a histidine protein kinase. Signal transduction involves phosphotransfer, whereby the histidine kinase phosphorylates a conserved aspartate in the response regulator to activate responses to environmental signals. CheY is a single domain protein that folds into a compact globular unit with a flavodoxin-like fold consisting of a three-layer alpha/beta/alpha sandwich with 21345 beta topology, where the phosphorylation region lies in a cavity. Other members of the response regulator family contain a CheY-like receiver domain, which is often found N-terminal to a DNA-binding effector domain. Examples include NarL (nitrate/nitrite response regulator), NtrC (nitrogen regulatory protein C), Spo0A and Spo0F (sporulation response) from Bacillus, PhoA and PhoB cyclin-dependent kinases from Aspergillus, among others. AmiR, the positive regulator of the amidase operon in Pseudomonas, is an unusual member of the bacterial response regulator family; AmiR is able to bind RNA and uses ligand-regulated activation rather than phospho-activation. It has a CheY-like fold at its N terminus, but contains two subdomains in a C-terminal extension, one forming a coiled-coil and the other a long α-helix. As such, AmiR may represent a new family of RNA-binding response regulators. CheY-like domains can be found in other protein families as well. Examples include the receiver domain of the ethylene receptor (ETR1) from Arabidopsis, which is involved in ethylene detection and signal transduction, and the N-terminal 'wing' domain of ornithine decarboxylase from Lactobacilli, which catalyses the conversion of ornithine to putrescine at the beginning of the polyamine pathway. The N-terminal domain of the circadian clock protein, KaiA, from cyanobacteria, acts as a pseudo-receiver domain, but lacks the conserved aspartyl residue required for phosphotransfer in response regulators.
Protein Domain
Name: Thiolase
Type: Family
Description: Thiolases are ubiquitous enzymes that catalyze the reversible thiolytic cleavage of 3-ketoacyl-CoA into acyl-CoA and acetyl-CoA, a 2-step reaction involving a covalent intermediate formed with a catalytic cysteine. They are found in prokaryotes and eukaryotes. Two different types of thiolase are found both in eukaryotes and in prokaryotes: acetoacetyl-CoA thiolase and 3-ketoacyl-CoA thiolase. 3-ketoacyl-CoA thiolase (also called thiolase I) has a broad chain-length specificity for its substrates and is involved in degradative pathways such as fatty acid beta-oxidation. Acetoacetyl-CoA thiolase (also called thiolase II) is specific for the thiolysis of acetoacetyl-CoA and is involved in biosynthetic pathways such as poly beta-hydroxybutyrate synthesis or steroid biogenesis. In eukaryotes, there are two forms of 3-ketoacyl-CoA thiolase: one located in the mitochondrion and the other in peroxisomes. There are two conserved cysteine residues important for thiolase activity. The first, located in the N-terminal section of the enzymes, is involved in the formation of an acyl-enzyme intermediate; the second, located at the C-terminal extremity, is the active site base involved in deprotonation in the condensation reaction. Mammalian nonspecific lipid-transfer protein (nsL-TP) (also known as sterol carrier protein 2) is a protein which seems to exist in two different forms: a 14 kDa protein (SCP-2) and a larger 58 kDa protein (SCP-x). The former is found in the cytoplasm or the mitochondria and is involved in lipid transport; the latter is found in peroxisomes. The C-terminal part of SCP-x is identical to SCP-2, while the N-terminal portion is evolutionarily related to thiolases. The beta-ketothiolases BktB and PhaA from Haloferax mediterranei have different substrate specificities and are involved in PHBV biosynthesis. Their catalytic residues have been identified.
Protein Domain
Name: Cadherin-like
Type: Domain
Description: Cadherins are a group of transmembrane proteins that serve as the major adhesion molecules located within adherens junctions. They can regulate cell-cell adhesion through their extracellular domain, and their cytosolic domains connect to the actin cytoskeleton by binding to catenins. These proteins preferentially interact with themselves in a homophilic manner in connecting cells, thus acting as both receptor and ligand. They may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains. Classically, these include a signal sequence; a propeptide of around 130 residues; five tandemly repeated extracellular cadherin domains, four of which are cadherin repeats and the fifth of which contains four conserved cysteines; a single transmembrane domain; and a C-terminal cytoplasmic domain. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats. A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding and to confer rigidity upon the extracellular domain, and calcium is essential for cadherin adhesive function and for protection against protease digestion. This entry represents the extracellular repeated domains found in cadherins and related proteins.
Protein Domain
Name: Chorismate mutase II, prokaryotic-type
Type: Domain
Description: Chorismate mutase (CM) is a regulatory enzyme required for biosynthesis of the aromatic amino acids phenylalanine and tyrosine. CM catalyzes the Claisen rearrangement of chorismate to prephenate, which can subsequently be converted to precursors of either L-Phe or L-Tyr. In bifunctional enzymes the CM domain can be fused to a prephenate dehydratase (P-protein, for Phe biosynthesis), to a prephenate dehydrogenase (T-protein, for Tyr biosynthesis), or to 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase. Besides these prokaryotic bifunctional enzymes, monofunctional CMs occur in prokaryotes as well as in fungi, plants and nematode worms. The sequence of monofunctional chorismate mutase aligns well with the N-terminal part of P-proteins. The type II or AroQ class of CM has an all-helical 3D structure, represented by the CM domain of the bifunctional Escherichia coli P-protein. This type is named after the Enterobacter agglomerans monofunctional CM encoded by the aroQ gene. All CM domains from bifunctional enzymes as well as most monofunctional CMs belong to this class, including archaeal CM. Eukaryotic CMs from plants and fungi form a separate subclass of AroQ, represented by the baker's yeast allosteric CM. These enzymes show only partial sequence similarity to the prokaryotic CMs due to insertions of regulatory domains, but the helix-bundle topology and catalytic residues are conserved, and the 3D structure of the E. coli CM dimer resembles a yeast CM monomer. The E. coli P-protein CM domain consists of 3 helices and lacks allosteric regulation. The yeast CM has evolved by gene duplication and dimerization, and each monomer has 12 helices. Yeast CM is allosterically activated by Trp and inhibited by Tyr. This entry represents the CM type 2 domain, mainly from prokaryotes. It does not include the CM from plants or baker's yeast.
Protein Domain
Name: POU-specific domain
Type: Domain
Description: POU proteins are eukaryotic transcription factors containing a bipartite DNA binding domain referred to as the POU domain. The acronym POU (pronounced 'pow') is derived from the names of three mammalian transcription factors, the pituitary-specific Pit-1, the octamer-binding proteins Oct-1 and Oct-2, and the neural Unc-86 from Caenorhabditis elegans. POU domain genes have been identified in diverse organisms including nematodes, flies, amphibians, fish and mammals, but have not yet been identified in plants and fungi. The various members of the POU family have a wide variety of functions, all of which are related to the function of the neuroendocrine system and the development of an organism. Some other genes are also regulated, including those for immunoglobulin light and heavy chains (Oct-2), and trophic hormone genes, such as those for prolactin and growth hormone (Pit-1). The POU domain is a bipartite domain composed of two subunits separated by a non-conserved region of 15-55 aa. The N-terminal subunit is known as the POU-specific (POUs) domain, while the C-terminal subunit is a homeobox domain. 3D structures of complexes including both POU subdomains bound to DNA are available. Both subdomains contain the structural motif 'helix-turn-helix', which directly associates with the two components of bipartite DNA binding sites, and both are required for high affinity sequence-specific DNA-binding. The domain may also be involved in protein-protein interactions. The subdomains are connected by a flexible linker. In proteins, a POU-specific domain is always accompanied by a homeodomain. Despite the lack of sequence homology, the 3D structure of POUs is similar to the 3D structure of the bacteriophage lambda repressor and other members of the HTH_3 family. This entry represents the POU-specific subunit of the POU domain.
Protein Domain
Name: PSF, RNA recognition motif 1
Type: Domain
Description: This entry represents the RNA recognition motif 1 (RRM1) of PSF. PSF is a member of the DBHS (Drosophila behavior human splicing) family. It participates in a wide range of gene regulatory processes and cellular response pathways. It has been shown to affect the alternative splicing of CD45 and Tau and to regulate the 3' polyadenylation of mRNAs. It is often localised in the paraspeckles and may be involved in the nuclear retention of mRNAs. It is involved in translation and transcription. It can bind directly to DNA double-strand breaks (DSBs) and play a role in DNA repair. PSF can also be utilized as an essential host factor for viral RNA multiplication and replication. In addition to the common DBHS core, which encompasses RRM1 and RRM2, the protein-protein interaction NOPS domain and the coiled-coil domain, PSF features additional domains, such as an RGG motif and a proline-rich region in its N terminus. DBHS family members are characterised by a core domain arrangement consisting of tandem RNA recognition motifs (RRMs), a conserved intervening sequence referred to as a NONA/ParaSpeckle (NOPS) domain, and a ~100 amino acid coiled-coil domain. Members include p54nrb (also known as NONO), PTB-associated splicing factor/splicing factor proline-glutamine rich (PSF or SFPQ) and PSPC1 (paraspeckle protein component 1). They are found in the nucleoplasm and can be triggered by binding to local high concentrations of various nucleic acids to form microscopically visible nuclear bodies, paraspeckles or large complexes such as DNA repair foci. They may also function cytoplasmically and on the cell surface in defined cell types. All three DBHS proteins are conserved throughout vertebrate species, while flies, worms, and yeast express a single DBHS protein.
Protein Domain
Name: Coactivator CBP, KIX domain superfamily
Type: Homologous_superfamily
Description: Transcriptional activators are believed to stimulate gene expression via protein-protein interactions with the basal machinery. The cAMP-regulated transcription factor CREB has been shown to stimulate target gene expression, in part by associating with the coactivator paralogues p300 and CREB binding protein (CBP). CBP and p300 bind to the Ser-133-phosphorylated kinase-inducible domain (KID) of CREB via a region of approximately 90 residues referred to as the KIX domain, which is highly conserved in CBP homologues from Caenorhabditis elegans and Drosophila melanogaster. In addition to CREB, the KIX domain of CBP also recognises the transactivation domains of other nuclear factors, including Myb, Jun, cubitus interruptus, and the HTLV-1 virally encoded Tax protein. Thus the KIX domain appears to be a common docking site on CBP for many transcriptional activators. The KIX domain is found in association with other domains, such as the bromodomain, the ZZ-type zinc finger, or the TAZ-type zinc finger. The KIX domain of CBP is composed of three mutually interacting alpha helices, designated alpha1, alpha2 and alpha3, and two short 3(10) helices, G1 and G2, that together with the interconnecting loops define a compact structural domain with an extensive hydrophobic core. Helices alpha1 and alpha3 constitute the primary interacting surface for the phosphorylated KID domain (pKID), forming a hydrophobic patch on the protein surface that is large enough to accommodate up to 3 turns of an amphipathic alpha helix, designated alphaB, in pKID. A second alpha helix in pKID, referred to as alphaA, interacts with a different face of the alpha3 helix of KIX. The two helices of pKID are arranged at an angle of about 90 degrees and essentially wrap around the alpha3 helix of KIX.
Protein Domain
Name: GRASP-type PDZ domain
Type: Domain
Description: The Golgi apparatus is a highly dynamic organelle responsible for sorting proteins and other biomolecules to the cell surface and to the extracellular milieu. The Golgi apparatus is composed of flattened membrane-bound compartments called cisternae, which are apposed to one another to form a Golgi stack. The structural organization of the cisternae into stacks and their lateral connection, building the Golgi ribbon, requires a family of proteins called Golgi ReAssembly and Stacking Proteins (GRASP). Two homologues (GRASP55 and GRASP65) have been described in vertebrates, and their functions have been associated with Golgi phosphorylation-regulated assembly/disassembly, protein secretion, and Golgi remodeling in migrating cells, among others. There is only one gene for GRASP in lower eukaryotes. Essentially all GRASPs contain a conserved N-terminal GRASP region, which comprises two tandem PDZ domains (PDZ1 and PDZ2), a classical protein-peptide interaction domain, and is responsible for GRASP homo-oligomerization and for the attachment to the Golgi membrane. The C-terminal half is not conserved between species but is rich in proline and serine residues, as well as glutamine and asparagine residues. The GRASP-type PDZ domains adopt a canonical PDZ fold with a β-sandwich of five β-strands and two α-helices. The PDZ1 and PDZ2 domains are nearly superimposable. The peptide-binding pockets of both PDZ domains are formed by alpha2 and beta5. A typical ligand peptide is predicted to form antiparallel β-strand interactions with beta5 and insert hydrophobic side chains between alpha2 and beta5. The two PDZ domains cooperate to achieve dimerization and oligomerization. In the dimers the PDZ2 domains interact in a way that positions the peptide-binding pockets facing each other. In addition, the dimers are linked through interactions between the two C-terminal tails (CTs) of one dimer and two peptide-binding pockets of the PDZ1 domains in the next dimer. This entry represents the GRASP-type PDZ domain.
Protein Domain
Name: Muscarinic acetylcholine receptor M4
Type: Family
Description: Muscarinic acetylcholine receptors are members of the rhodopsin-like G-protein coupled receptor family. They play several important roles: they mediate many of the effects of acetylcholine in the central and peripheral nervous system and modulate a variety of physiological functions, such as airway, eye and intestinal smooth muscle contraction, heart rate and glandular secretions. The receptors have a widespread tissue distribution and are a major drug target in human disease. They may be effective therapeutic targets in Alzheimer's disease, schizophrenia, Parkinson's disease and chronic obstructive pulmonary disease. There are five muscarinic acetylcholine receptor subtypes, designated M1-5. The family can be further divided into two broad groups based on their primary coupling to G-proteins. M2 and M4 receptors couple to the pertussis-toxin sensitive Gi proteins, whereas M1, M3 and M5 receptors couple to Gq proteins, which activate phospholipase C. The different subtypes can also couple to a wide range of diverse signalling pathways, some of which are G protein-independent. All subtypes seem to serve as autoreceptors, and knockout mice reveal the important neuromodulatory role played by this receptor family. The muscarinic acetylcholine M4 receptor is primarily found in the CNS, its distribution largely overlapping with that of M1 and M3 subtypes. M4 receptors function as inhibitory autoreceptors for acetylcholine, activation of which inhibits acetylcholine release in the striatum. Muscarinic acetylcholine receptors possess a regulatory effect on dopaminergic neurotransmission, and activation of M4 receptors in the striatum inhibits dopamine-induced locomotor stimulation in mice. M4 receptor-deficient mice exhibit increased locomotor stimulation in response to dopamine agonists, such as amphetamine and cocaine. Neurotransmission in the striatum influences extrapyramidal motor control. Therefore, alterations in M4 receptor activity may contribute to conditions such as Parkinson's disease.
Protein Domain
Name: SET domain superfamily
Type: Homologous_superfamily
Description: The SET domain is a 130 to 140 amino acid, evolutionarily well-conserved sequence motif that was initially characterised in the Drosophila proteins Su(var)3-9, Enhancer-of-zeste and Trithorax. In eukaryotic organisms, it appears in proteins with an important role in regulating chromatin-mediated gene transcriptional activation and silencing. In viruses, bacteria and archaea, its function is not yet clear. This superfamily includes eukaryotic proteins with histone methyltransferase activity, which requires the combination of the SET domain with the adjacent cysteine-rich regions, one located N-terminally (pre-SET) and the other posterior to the SET domain (post-SET). The pre-SET and post-SET regions thus seem to play a crucial role in substrate recognition and enzymatic activity. Other SET domain-containing proteins function as transcription factors (such as PR domain zinc finger protein 1 from humans). The structures of the SET domain and the two adjacent regions, pre-SET and post-SET, have been solved. The SET domain structure is all-β, but consists only of sets of a few short strands composing no more than a couple of small sheets. Consequently, the SET structure is mostly defined by turns and loops. An unusual feature is that the SET core is made up of two discontinuous segments of the primary sequence forming an approximate L-shape. Two of the most conserved motifs in the SET domain are a stretch at the C terminus containing a strictly conserved tyrosine residue and a preceding loop through which the C-terminal segment passes, forming a knot-like structure that is not quite a true knot. These two regions have been proven to be essential for SAM binding and catalysis, particularly the invariant tyrosine, where in all likelihood catalysis takes place.
Protein Domain
Name: Peptidase M38, beta-aspartyl dipeptidase
Type: Family
Description: Over 70 metallopeptidase families have been identified to date. In these enzymes a divalent cation, which is usually zinc but may be cobalt, manganese or copper, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. In some families of co-catalytic metallopeptidases, two metal ions are observed in crystal structures ligated by five amino acids, with one amino acid ligating both metal ions. The known metal ligands are His, Glu, Asp or Lys. At least one other residue is required for catalysis, which may play an electrophilic role. Many metalloproteases contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. This group of proteins includes metallopeptidases belonging to the MEROPS peptidase family M38 (clan MJ, beta-aspartyl dipeptidase family). This entry includes the beta-aspartyl dipeptidase from Escherichia coli (IadA; MEROPS identifier M38.001), which degrades isoaspartyl dipeptides and may unblock degradation of proteins that cannot be repaired. This entry also describes closely related proteins from other species (e.g. Clostridium perfringens, Thermoanaerobacter tengcongensis) that may have an equivalent function. This family shows homology to dihydroorotases. The L-isoaspartyl derivative of Asp arises non-enzymatically over time as a form of protein damage. In this isomerisation, the connectivity of the polypeptide changes to pass through the β-carboxyl of the side chain. Much but not all of this damage can be repaired by protein-L-isoaspartate (D-aspartate) O-methyltransferase.
Protein Domain
Name: ADAMTS/ADAMTS-like, Spacer 1
Type: Domain
Description: This entry represents the Spacer-1 domain found in ADAM-TS and ADAM-TS-like proteins. Proteolysis of the extracellular matrix plays a critical role in establishing tissue architecture during development and in tissue degradation in diseases such as cancer, arthritis, Alzheimer's disease and a variety of inflammatory conditions. The proteolytic enzymes responsible for this process are members of diverse protease families, including the secreted zinc metalloproteases (MPs). ADAM-TS (A Disintegrin and Metalloproteinase with Thrombospondin Motifs), which is closely related to the ADAM family (A Disintegrin and Metalloproteinase) and is a subfamily of the MP family, consists of at least 20 members sharing a high degree of sequence similarity and conserved domain organisation. The defining domains of the ADAM-TS family are (from N to C terminus) a pre-pro metalloprotease domain of the reprolysin type, a snake venom disintegrin-like domain, a thrombospondin type-I (TS) module, a cysteine-rich region, and a cysteine-free (spacer) domain. Domain organisation following the spacer domain C terminus shows some variability in certain ADAM-TS members, principally in the number of additional TS domains. These enzymes have a wide-spectrum role in vascular biology and cardiovascular pathophysiology. Members of the ADAM-TS family have been implicated in a range of diseases. For instance, members of this family have been found to participate directly in processes in the central nervous system (CNS) such as the regulation of brain plasticity. ADAM-TS1 is reported to be involved in inflammation and cancer cachexia, whilst recessively inherited ADAM-TS2 mutations cause Ehlers-Danlos syndrome type VIIC, a disorder characterised clinically by severe skin fragility. ADAM-TS4 is an aggrecanase involved in arthritic destruction of cartilage. ADAM-TS-like proteins lack a metalloprotease domain. They reside in the ECM and have regulatory roles. Examples of ADAM-TS-like proteins are papilin and punctin.
Protein Domain
Name: Type VI secretion system TssC-like
Type: Family
Description: The long cytoplasmic tubular structure of the T6SS system is wrapped by a sheath structure composed of two proteins, TssB and TssC. Contraction of the sheath causes the internal tube of the T6SS with associated effectors to be propelled out of the effector cell and across the membranes of bacterial or eukaryotic target cells. TssB and TssC assemble into tubular structures with cogwheel patterns resembling the bacteriophage contractile sheath. Several structures of T6SS sheath assemblies have been solved, displaying a helical assembly. Interactions between TssB and TssC occur between the N-terminal region of TssC and the conserved α-helix of TssB. The two proteins of the F. novicida T6SS outer sheath, IglA (TssB) and IglB (TssC), are interdigitated into a single fold similar to that of the phage sheath. The F. novicida T6SS outer sheath has a highly interlaced two-dimensional array architecture with augmented beta sheets that is essential to secretory function. Three distinct T6SS subtypes exist: T6SSi, in which most proteobacterial T6SSs are found, including V. cholerae and P. aeruginosa; T6SSii for the Francisella T6SS; and T6SSiii for Bacteroidetes systems. TssB/TssC are also known as IglA/IglB and VipA/VipB. The type VI secretion system (T6SS) is a supra-molecular bacterial complex that resembles phage tails. It is a toxin delivery system that fires toxins into target cells upon contraction of its TssBC sheath. Thirteen essential core proteins are conserved in all T6SSs: the membrane associated complex TssJ-TssL-TssM, the baseplate proteins TssE, TssF, TssG, and TssK, the bacteriophage-related puncturing complex composed of the tube (Hcp), the tip/puncturing device VgrG, and the contractile sheath structure (TssB and TssC). Finally, the starfish-shaped dodecameric protein, TssA, limits contractile sheath polymerization at its distal part when TagA captures TssA.
Protein Domain
Name: Haemagglutinin-esterase glycoprotein, core
Type: Domain
Description: Haemagglutinin-esterase fusion glycoprotein (HEF) is a multi-functional protein embedded in the viral envelope of several viruses, including influenza C virus, coronaviruses and toroviruses. HEF is required for infectivity, and functions to recognise the host cell surface receptor, to fuse the viral and host cell membranes, and to destroy the receptor upon host cell infection. The haemagglutinin region of HEF is responsible for receptor recognition and membrane fusion, and bears a strong resemblance to the sialic acid-binding haemagglutinin found in influenza A and B viruses, except that it binds 9-O-acetylsialic acid. The esterase region of HEF is responsible for the destruction of the receptor, an action that is carried out by neuraminidase in influenza A and B viruses. The esterase domain is similar in structure to Streptomyces scabies esterase, and to acetylhydrolase, thioesterase I and rhamnogalacturonan acetylesterase. The haemagglutinin-esterase glycoprotein HEF must be cleaved by the host's trypsin-like proteases to produce two peptides (HEF1 and HEF2) in order for the virus to be infectious. Once HEF is cleaved, the newly exposed N terminus of the HEF2 peptide acts to fuse the viral envelope to the cellular membrane of the host cell, which allows the virus to infect the host cell. The haemagglutinin-esterase glycoprotein is a trimer, where each monomer is composed of three domains: an elongated stem active in membrane fusion, an esterase domain, and a receptor-binding domain, where the stem and receptor-binding domains together resemble influenza A virus haemagglutinin. Two of these domains are composed of non-contiguous sequence: the receptor-binding haemagglutinin domain is inserted into a surface loop of the esterase domain, and the esterase domain is inserted into a surface loop of the haemagglutinin stem. This entry represents the core of the haemagglutinin-esterase glycoprotein, including the haemagglutinin receptor-binding domain and the esterase domain.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
LegumeMine || ArachisMine | CicerMine | GlycineMine | LensMine | LupinusMine | PhaseolusMine | VignaMine | MedicagoMine
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom