Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, pathways, authors, ontology terms, etc. (e.g. eve, embryo, zen, allele)
  • Use OR to search for either of two terms (e.g. fly OR drosophila) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean search syntax is also supported, e.g. dros* for partial matches or fly AND NOT embryo to exclude a term.
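
The query forms listed above can be composed programmatically. The following is a minimal sketch that only builds query strings in the syntax shown in the examples; the helper names (`phrase`, `any_of`, `excluding`, `prefix`) are hypothetical and are not part of this website or of any documented API.

```python
# Hypothetical helpers that assemble keyword-search strings.
# They only format text; they do not contact any search service.

def phrase(text: str) -> str:
    """Wrap a multi-word phrase in quotation marks so it is searched as a unit."""
    return f'"{text}"'


def any_of(*terms: str) -> str:
    """Join terms with OR so that a hit on any one of them matches."""
    return " OR ".join(terms)


def excluding(term: str, unwanted: str) -> str:
    """Require `term` while excluding records that mention `unwanted`."""
    return f"{term} AND NOT {unwanted}"


def prefix(stem: str) -> str:
    """Append the wildcard used for partial matches."""
    return f"{stem}*"


if __name__ == "__main__":
    print(any_of("fly", "drosophila"))
    print(phrase("dna binding"))
    print(excluding("fly", "embryo"))
    print(prefix("dros"))
```

The four printed strings reproduce the example queries given in the list above.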

Search results 15701 to 15800 out of 30763 for seed protein
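
The result range above implies a page size and a page position. As a small worked example, the sketch below derives them from the three numbers in the header, assuming that every page holds the same number of results; the function name `page_info` is illustrative only.

```python
import math


def page_info(first: int, last: int, total: int) -> tuple[int, int, int]:
    """Return (page size, current page, number of pages) for a 1-based result range."""
    page_size = last - first + 1              # results shown on this page
    page = (first - 1) // page_size + 1       # 1-based index of this page
    pages = math.ceil(total / page_size)      # the last page may be partially filled
    return page_size, page, pages


if __name__ == "__main__":
    print(page_info(15701, 15800, 30763))  # (100, 158, 308)
```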

Category restricted to ProteinDomain

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Folliculin/SMCR8, longin domain
Type: Domain
Description: Folliculin (FLCN) is a tumor suppressor that enables nutrient-dependent activation of the mechanistic target of rapamycin complex 1 (mTORC1) protein kinase via its guanosine triphosphatase (GTPase) activating protein (GAP) activity. It belongs to the DENN module family of proteins and contains a divergent DENN module composed of an N-terminal longin domain (also known as upstream DENN domain, u-DENN), followed by a DENN domain. It forms a complex with its partners, FNIP1 or FNIP2 (Folliculin interacting protein 1 or 2), which directly contacts the Rag GTPases RagC/D to stimulate GTP hydrolysis and thus promote the conversion to the GDP-bound state. FLCN-FNIP2 adopts an extended conformation with two pairs of heterodimerized domains. Both proteins contain longin domains that heterodimerize and contact both nucleotide binding domains of the Rag heterodimer, and C-terminal DENN domains which interact at the distal end of the structure. This is the N-terminal domain of folliculin, the longin domain. An arginine residue located in this domain (Arg164) is the catalytic residue for GAP activity. This domain can also be found in SMCR8, a component of the C9orf72-SMCR8 complex, which has guanine nucleotide exchange factor (GEF) activity and regulates autophagy.
Protein Domain
Name: MMS19, C-terminal
Type: Domain
Description: This entry represents the C-terminal domain of MMS19. This domain shares homology with some HEAT repeat sequences. MMS19 is a key component of the cytosolic iron-sulfur protein assembly (CIA) complex, a multiprotein complex that mediates the incorporation of iron-sulfur clusters into apoproteins specifically involved in DNA metabolism and genomic integrity. In humans, MMS19 acts as an adapter between early-acting CIA components and a subset of cellular target iron-sulfur proteins such as ERCC2/XPD, FANCJ and RTEL1, thereby playing a key role in nucleotide excision repair (NER) and RNA polymerase II (POL II) transcription. It is also part of the MMXD (MMS19-MIP18-XPD) complex, which plays a role in chromosome segregation, probably by facilitating iron-sulfur cluster assembly into ERCC2/XPD. In budding yeast, mms19 mutants were originally isolated in a screen for mutants hypersensitive to the alkylating agent methyl methanesulfonate (MMS). Unlike human MMS19, Mms19 in budding yeast (also known as Met18) does not participate directly in NER. In fission yeast, Mms19 is part of a silencing complex named the Rik1-Dos2 complex, which contains Dos2, Rik1, Mms19 and Cdc20. This complex regulates RNA Pol II activity in heterochromatin, and is required for DNA replication and heterochromatin assembly.
Protein Domain
Name: Chemotaxis protein-glutamate methylesterase
Type: Family
Description: In bacterial chemotaxis, cellular movement is directed in response to chemical gradients. Transmembrane chemoreceptors that sense the stimuli are coupled (via a coupling protein, CheW) with a signal transduction histidine kinase (CheA). CheA phosphorylates the response regulators CheB and CheY. Phosphorylated CheY binds to FliM, a component of the flagellar motor switch complex, and modulates the direction of flagellar rotation. The response regulator CheB (receptor modification enzyme, protein-glutamate methylesterase) modulates the signalling output of the chemotaxis receptors by controlling the level of chemoreceptor methylation. Specific glutamyl residues in the cytoplasmic domain of the transmembrane chemoreceptor are methylated by the methyltransferase CheR to form γ-carboxyl glutamyl methyl esters. These esters can be hydrolyzed by the methylesterase CheB. Receptor modification resets the signalling states of receptors, allowing responses to changes in the concentration of chemical stimuli irrespective of their absolute concentrations. Proteins in this family contain a divergent form of the CheB-like protein-glutamate methylesterase domain in the C-terminal region. This domain is usually found fused with the CheY-like receiver (response regulator) domain, forming the CheB response regulator methylesterase. The stand-alone form is presumed also to be involved in regulating bacterial chemotaxis, but there is no experimental evidence to confirm this.
Protein Domain
Name: VAV1 protein, first SH3 domain
Type: Domain
Description: VAV1 (also known as proto-oncogene vav) is expressed predominantly in the hematopoietic system and plays an important role in the development and activation of B and T cells. It is activated by tyrosine phosphorylation to function as a guanine nucleotide exchange factor (GEF) for Rho GTPases following cell surface receptor activation, triggering various effects such as cytoskeletal reorganization, transcription regulation, cell cycle progression, and calcium mobilization. It also serves as a scaffold protein and has been shown to interact with Ku70, Socs1, Janus kinase 2, SIAH2, S100B, Abl gene, ZAP-70, SLP76, and Syk, among others. The VAV protein family members are multiple domain proteins, including Vav from flies and VAV1/2/3 from mammals. VAV1 predominates in hematopoietic cells, whereas VAV2 and VAV3 are more broadly expressed. They have a calponin homology (CH) domain, an acidic domain (AC), a Dbl homology (DH) domain, a pleckstrin homology (PH) domain, a cysteine-rich (CR) domain containing a zinc finger, and a complex region with SH2 and SH3 domains. Therefore they may participate in the activity of several pathways. They are signal transducer proteins that couple tyrosine kinase signals with the activation of the Rho/Rac GTPases. This entry represents the first SH3 domain of VAV1.
Protein Domain
Name: Notch-like domain superfamily
Type: Homologous_superfamily
Description: The Notch domain is also called the 'DSL' domain or the Lin-12/Notch repeat (LNR). The LNR region is present only in Notch-related proteins, C-terminal to the EGF repeats. The Lin-12/Notch proteins act as transmembrane receptors for intercellular signals that specify cell fates during animal development. In response to a ligand, proteolytic cleavages release the intracellular domain of Notch, which then gains access to the nucleus and acts as a transcriptional co-activator. The LNR region is thought to negatively regulate the activity of the Lin-12/Notch proteins. It is a triplication of a module of about 35-40 amino acids located on the extracellular part of the protein. Each module contains six cysteine residues engaged in three disulphide bonds and three conserved aspartate and asparagine residues. The biochemical characterisation of a recombinantly expressed LIN-12.1 module from the human Notch1 receptor indicates that the disulphide bonds are formed between the first and fifth, second and fourth, and third and sixth cysteines. The formation of this particular disulphide isomer is favored by the presence of Ca2+, which is also required to maintain the structural integrity of the rLIN-12.1 module. The conserved aspartate and asparagine residues are likely to be important for Ca2+ binding, and thereby contribute to the native fold.
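
The disulphide pairing described above (first with fifth, second with fourth, third with sixth cysteine) can be written down as a small lookup. This is a minimal sketch, assuming a module with exactly six cysteines; the sequence used in the demonstration is a made-up toy string, not a real LNR module, and the function name is hypothetical.

```python
# Pairing of cysteines by their order of appearance, as stated in the entry:
# 1st-5th, 2nd-4th, 3rd-6th.
PAIRING = [(1, 5), (2, 4), (3, 6)]


def disulphide_pairs(module: str) -> list[tuple[int, int]]:
    """Return 1-based residue positions of the paired cysteines in one module."""
    cys_positions = [i + 1 for i, residue in enumerate(module) if residue == "C"]
    if len(cys_positions) != 6:
        raise ValueError("expected exactly six cysteines in the module")
    return [(cys_positions[a - 1], cys_positions[b - 1]) for a, b in PAIRING]


if __name__ == "__main__":
    toy_module = "ACGGCDDCNNCAACDDC"  # invented sequence with six cysteines
    print(disulphide_pairs(toy_module))  # [(2, 14), (5, 11), (8, 17)]
```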
Protein Domain
Name: Elongation Factor G, domain II
Type: Domain
Description: This entry represents domain II of elongation factor G (EFG). It shares a similar structure with domain V. EF2 (or EFG) participates in the elongation phase of protein synthesis by promoting the GTP-dependent translocation of the peptidyl tRNA of the nascent protein chain from the A-site (acceptor site) to the P-site (peptidyl tRNA site) of the ribosome. EF2 also has a role after the termination phase of translation, where, together with the ribosomal recycling factor, it facilitates the release of tRNA and mRNA from the ribosome, and the splitting of the ribosome into two subunits. EF2 is folded into five domains, with domains I and II forming the N-terminal block, domains IV and V forming the C-terminal block, and domain III providing the covalently-linked flexible connection between the two. Domains III and V have the same fold (although they are not completely superimposable and domain III lacks some of the superfamily characteristics), consisting of an alpha/beta sandwich with an antiparallel β-sheet in a (beta/alpha/beta)x2 topology. This double split beta/alpha/beta fold is also seen in a number of ribonucleotide binding proteins. It is the most common motif occurring in the translation system and is referred to as the ribonucleoprotein (RNP) or RNA recognition (RRM) motif.
Protein Domain
Name: Guanine nucleotide exchange factor SopE, N-terminal domain
Type: Domain
Description: The type III secretion system of Gram-negative bacteria is used to transport virulence factors from the pathogen directly into the host cell and is only triggered when the bacterium comes into close contact with the host. Effector proteins secreted by the type III system do not possess a secretion signal, and are considered unique because of this. Salmonella spp. secrete an effector protein called SopE that is responsible for stimulating the reorganisation of the host cell actin cytoskeleton, and ruffling of the cellular membrane. It acts as a guanyl-nucleotide-exchange factor on Rho-GTPase proteins such as Cdc42 and Rac. As it is imperative for the bacterium to revert the cell back to its "normal" state as quickly as possible, another effector, a tyrosine phosphatase called SptP, reverses the actions brought about by SopE. It has also been found that SopE and its protein homologue SopE2 can activate different sets of Rho-GTPases in the host cell. Far from being a redundant set of two similar type III effectors, they act in unison to specifically activate different Rho-GTPase signalling cascades in the host cell during infection. This entry represents the N-terminal domain of SopE and SopE2. The function of this domain is unknown.
Protein Domain
Name: CCN, TSP1 domain
Type: Domain
Description: The CCN family of intercellular signalling proteins includes six homologous members: cysteine-rich 61 (Cyr61)/CCN1, connective tissue growth factor (Ctgf)/CCN2, nephroblastoma overexpressed gene (Nov)/CCN3, Wnt-induced secreted protein 1 (Wisp1)/CCN4, Wisp2/CCN5, and Wisp3/CCN6. They are involved in many biological processes, such as angiogenesis, wound healing, and tumorigenesis, by regulating the proliferation, migration, and differentiation of the target cells. They interact with extracellular matrix components and growth factors via one of their four domains. In particular, CCN3 (previously known as Nov) plays an important role in the generation of various types of tissues, such as muscle, fat, cartilage, and bone. CCN3 is expressed in notochord and presomitic mesoderm in early stage development, and in adults it is expressed in diverse tissues, including the nervous system, muscle, cartilage, and bone. This entry represents a sub-type of thrombospondin module 1 (TSP1) domains found in matricellular CCN proteins, which shares a similar three-stranded fold with the thrombospondin type 1 repeats of thrombospondin-1 and spondin-1. This domain has an alternative disulphide bonding pattern compared to the canonical TSP1 domain and a conserved charged cluster in the centre of the domain, which suggests a potential functional binding site for heparan sulfate.
Protein Domain
Name: Tenascin, EGF-like domain
Type: Domain
Description: This entry represents the EGF-like domains found in Tenascin and Reelin proteins. A common feature of all EGF-like domains is that they are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF-like domain includes six cysteine residues which have been shown to be involved in disulfide bonds. The structure of several EGF-like domains has been solved. The fold consists of a two-stranded β-sheet followed by a loop to a C-terminal short two-stranded sheet. Tenascins are extracellular matrix glycoproteins that act both as integrin ligands and as modifiers of fibronectin-integrin interactions to regulate cell adhesion, migration, proliferation and differentiation. Tenascins are usually composed of repeated epidermal growth factor (EGF)-like domains, fibronectin-type III (FNIII) domains and a C-terminal fibrinogen related domain (FReD). Reelin is an extracellular matrix serine protease that regulates neuronal migration during embryonic development and acts as a modulator of synaptic transmission in the adult brain. Reelin acts through its receptors, VLDLR and ApoER2, on the cytoskeleton, controlling migration and the subsequent positioning and stabilization of cortical neurons. In the adult brain, reelin stabilizes the actin cytoskeleton by inducing cofilin phosphorylation. Decreased Reelin expression causes destabilization of neurons, which could have implications for brain disorders, such as epilepsy and schizophrenia.
Protein Domain
Name: Carbohydrate binding module family 15
Type: Domain
Description: A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classifications of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with Roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. This entry represents CBM family 15, which binds to xylan and xylooligosaccharides.
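
The correspondence between the older Roman-numeral "Types" and the current Arabic-numeral families (families 1 to 13 equal Types I to XIII) is a simple numeral conversion. The sketch below builds that lookup table; the names `to_roman` and `TYPE_FOR_FAMILY` are illustrative, and the converter only handles the small range needed here.

```python
def to_roman(n: int) -> str:
    """Convert an integer from 1 to 39 to a Roman numeral (enough for Types I to XIII)."""
    if not 1 <= n <= 39:
        raise ValueError("only 1..39 are supported by this sketch")
    numerals = [(10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    parts = []
    for value, symbol in numerals:
        while n >= value:
            parts.append(symbol)
            n -= value
    return "".join(parts)


# Families 1 to 13 correspond to the former CBD Types I to XIII.
TYPE_FOR_FAMILY = {family: f"Type {to_roman(family)}" for family in range(1, 14)}

if __name__ == "__main__":
    print(TYPE_FOR_FAMILY[2])   # Type II
    print(TYPE_FOR_FAMILY[13])  # Type XIII
```

Families numbered above 13, such as the family described in this entry, have no Roman-numeral Type equivalent and are therefore absent from the table.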
Protein Domain
Name: Carbohydrate-binding module family 19
Type: Domain
Description: A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classifications of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with Roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. This entry represents CBM family 19, which consists of 60-70 residues with chitin-binding function.
Protein Domain
Name: Carbohydrate binding module family 11
Type: Domain
Description: A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classifications of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with Roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. This entry represents CBM family 11, which binds both beta-1,4-glucan and beta-1,3-1,4-mixed linked glucans.
Protein Domain
Name: Cyclophilin-type peptidyl-prolyl cis-trans isomerase, E. coli cyclophilin A-like
Type: Family
Description: Cyclophilins exhibit peptidyl-prolyl cis-trans isomerase (PPIase) activity, accelerating protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. They also have protein chaperone-like functions and are the major high-affinity binding proteins for the immunosuppressive drug cyclosporin A (CSA) in vertebrates. Cyclophilins are found in all prokaryotes and eukaryotes, and have been structurally conserved throughout evolution, implying their importance in cellular function. They share a common 109 amino acid cyclophilin-like domain (CLD) and additional domains unique to each member of the family. The CLD contains the PPIase activity, while the unique domains are important for selection of protein substrates and subcellular compartmentalisation. This entry includes the cyclophilin-type peptidyl-prolyl cis-trans isomerases mostly found in bacteria, archaea and plants, including E. coli cyclophilin A and Streptomyces antibioticus SanCyp18. Compared to the archetypal cyclophilin, human cyclophilin A, these have reduced affinity for cyclosporin A. E. coli cyclophilin A has a peptidyl-prolyl cis-trans isomerase activity similar to that of human cyclophilin A. Most members of this subfamily contain a phenylalanine residue at the position equivalent to human cyclophilin W121, where a tryptophan has been shown to be important for cyclosporin binding.
Protein Domain
Name: HTH-type transcriptional regulator GbpR, PBP2 domain
Type: Domain
Description: Galactose-binding protein regulator (GbpR), a member of the LysR family of bacterial transcriptional regulators, regulates the expression of the chromosomal virulence gene chvE. The chvE gene is involved in the uptake of specific sugars, in chemotaxis to these sugars, and in the VirA-VirG two-component signal transduction system. In the presence of an inducing sugar such as L-arabinose, D-fucose, or D-galactose, GbpR activates chvE expression, while in the absence of an inducing sugar, GbpR represses expression. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2). The PBP2 are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between their domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex composed of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers ligand translocation across the cytoplasmic membrane, energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.
Protein Domain
Name: Tubby, N-terminal
Type: Domain
Description: A mutation in the mouse tub gene causes maturity-onset obesity, insulin resistance and sensory deficits. By contrast with the rapid juvenile-onset weight gain seen in diabetes (db) and obese (ob) mice, obesity in tubby mice develops gradually, and strongly resembles the late-onset obesity observed in the human population. Excessive deposition of adipose tissue culminates in a two-fold increase of body weight. Tubby mice also suffer retinal degeneration and neurosensory hearing loss. The tripartite character of the tubby phenotype is highly similar to human obesity syndromes, such as Alstrom and Bardet-Biedl. These phenotypes indicate a vital role for tubby proteins, yet no biochemical function has been ascribed to any family member; it has been suggested that the phenotypic features of tubby mice may be the result of cellular apoptosis triggered by expression of the mutated tub gene. Mammalian tub is a hydrophilic protein of ~500 residues. Tub carries a nuclear localisation signal and is able to activate transcription. The N-terminal portion of the protein is conserved neither in length nor sequence, but the C-terminal 250 residues are highly conserved. The C-terminal extremity contains a cysteine residue that might play an important role in the normal functioning of these proteins. The C-terminal region is represented by a separate entry.
Protein Domain
Name: Sulphonylurea receptor
Type: Family
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and are therefore made up of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop, Walker A, and a magnesium binding site, Walker B.
Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer, the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families, with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette.
More than 50 subfamilies have been described based on a phylogenetic and functional classification. The sulphonylurea receptor (SUR) is a member of the ATP-binding cassette superfamily that associates with certain K+ channel inward rectifier subunits to form ATP-sensitive K+ channels (KATP channels). These are a family of K+ channels that are inhibited by intracellular ATP, which can couple metabolic state to cell excitability. Their presence on pancreatic islet beta cells allows the cells to function as metabolic sensors, regulating insulin release in relation to glucose metabolism. Furthermore, SUR is the site of action for the sulphonylurea oral hypoglycaemic agents that are used widely for the treatment of non-insulin dependent diabetes mellitus. When these agents bind to the sulphonylurea receptor, they reduce KATP channel activity, stimulating insulin release. As mentioned, SUR is a member of the ATP-binding cassette superfamily. This raises the possibility that SUR may transport some endogenous substance, as yet unidentified. Two closely related genes have been found to encode the sulphonylurea receptors, SUR1 and SUR2, there being three splice variants of the second form. They are thought to contain 13-17 transmembrane (TM) domains, with two potential nucleotide binding folds, and a large number of possible protein kinase A or C phosphorylation sites. Comparison of the properties of cloned and wild-type KATP channels suggests that SUR1 may associate with the inward rectifier subunit Kir6.2 to form the pancreatic beta cell KATP channel. Splice variants of SUR2 (termed SUR2A and SUR2B) may form the cardiac and smooth muscle isoforms, respectively, again when combined with Kir6.2.
This co-assembly likely occurs with an obligate 4:4 stoichiometry, giving rise to an octameric channel. Mutations in SUR genes have been characterised; these can result in truncations of the second predicted nucleotide binding fold, leading to persistent hyperinsulinemic hypoglycaemia of infancy, a rare familial disorder characterised by excessive, unregulated insulin secretion.
Protein Domain
Name: Potassium channel, voltage-dependent, beta subunit, KCNE
Type: Family
Description: Two types of beta subunit (KCNE and KCNAB) are presently known to associate with voltage-gated alpha subunits (Kv, KCNQ and eag-like). However, not all combinations of alpha and beta subunits are possible. The KCNE family of K+ channel subunits are membrane glycoproteins that possess a single transmembrane (TM) domain. They share no structural relationship with the alpha subunit proteins, which possess pore forming domains. The subunits appear to have a regulatory function, modulating the kinetics and voltage dependence of the alpha subunits of voltage-dependent K+ channels. KCNE subunits are formed from short polypeptides of ~130 amino acids, and are divided into five subfamilies: KCNE1 (MinK/IsK), KCNE2 (MiRP1), KCNE3 (MiRP2), KCNE4 (MiRP3) and KCNE1L (AMMECR2). Potassium channels are the most diverse group of the ion channel family. They are important in shaping the action potential, and in neuronal excitability and plasticity. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group. These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers.
In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis. All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned to one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK). The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.
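
The K+ selectivity sequence quoted above, T/SxxTxGxG, reads naturally as a pattern in which x stands for any residue. The following is a minimal sketch of scanning a protein sequence for it with a regular expression; the function name is hypothetical and the demonstration sequence is an invented toy string, not a real channel.

```python
import re

# T/SxxTxGxG: T or S, two arbitrary residues, T, one arbitrary residue, G,
# one arbitrary residue, G. A lookahead is used so overlapping hits are reported.
SELECTIVITY_MOTIF = re.compile(r"(?=([TS]..T.G.G))")


def find_selectivity_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched residues) for every motif occurrence."""
    return [(m.start() + 1, m.group(1)) for m in SELECTIVITY_MOTIF.finditer(sequence)]


if __name__ == "__main__":
    toy_sequence = "MKLATATTVGYGDLQ"  # invented example sequence
    print(find_selectivity_motifs(toy_sequence))  # [(5, 'TATTVGYG')]
```

A pattern match of this kind only flags candidate sites; it says nothing about whether the surrounding sequence forms a functional pore loop.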
Protein Domain
Name: GPCR, family 2, latrophilin
Type: Family
Description: Latrophilins are a family of secretin-like GPCRs that can be subdivided into 3 subtypes: LPH1, LPH2 and LPH3. LPH1 is a brain-specific calcium independent receptor of alpha-latrotoxin (LTX), a neurotoxin. It is the affinity of this form of the receptor for LTX that gives the family its name. LPH2 and LPH3, whilst sharing extensive sequence similarity to LPH1, do not bind LTX. LPH2 is distributed throughout most tissues, whereas LPH3 is also brain-specific. The endogenous ligand(s) for these receptors are at present unknown. Binding of LTX to LPH1 stimulates exocytosis and the subsequent release of large amounts of neurotransmitters from neuronal and endocrine cells. The latrophilins possess up to 7 sites of alternative splicing; the resulting number of possible splice variants leads to a highly variable family of proteins. Structurally, these proteins have a seven-transmembrane region and a large extracellular N-terminal region which consists of several domains: a rhamnose binding lectin (RBL) domain; an olfactomedin-like (OLF) domain followed by a serine/threonine-rich domain that is O-glycosylated; a hormone binding (HR) domain; and a GPCR Autoproteolysis INducing (GAIN) domain. G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence.
The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [, , , , ]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind, those that are unmatched to known natural ligands are designated by as orphan GPCRs, or unclassified GPCRs [].The secretin-like GPCRs include secretin [ ], calcitonin [], parathyroid hormone/parathyroid hormone-related peptides [] and vasoactive intestinal peptide [], all of which activate adenylyl cyclase and the phosphatidyl-inositol-calcium pathway. These receptors contain seven transmembrane regions, in a manner reminiscent of the rhodopsins and other receptors believed to interact with G-proteins (however there is no significant sequence identity between these families, the secretin-like receptors thus bear their own unique '7TM' signature). Their N-terminal is probably located on the extracellular side of the membrane and potentially glycosylated. This N-terminal region contains a long conserved region which allows the binding of large peptidic ligand such as glucagon, secretin, VIP and PACAP; this region contains five conserved cysteines residues which could be involved in disulphide bond. The C-terminal region of these receptor is probably cytoplasmic. Every receptor gene in this family is encoded on multiple exons, and several of these genes are alternatively spliced to yield functionally distinct products.
Protein Domain
Name: Myoglobin-like, M family globin domain
Type: Domain
Description: This entry represents the M family of globin domain which includes chimeric (FHbs/flavohemoglobins) and single-domain globins: FHbs, Ngbs/neuroglobins, Cygb/cytoglobins, GbE/avian eye specific globin E, GbX/globin X, amphibian GbY/globin Y, Mb/myoglobin, HbA/hemoglobin-alpha, HbB/hemoglobin-beta, SDgbs/single-domain globins related to FHbs, and Adgb/androglobin. The M family exhibits the canonical secondary structure of hemoglobins, a 3-over-3 α-helical sandwich structure (3/3 Mb-fold), built by eight α-helical segments (named A through H). In Adgbs, the globin domain is split into two: helices C-H are followed by helices A-B and the two parts are separated by the IQ calmodulin-binding motif. Although rearranged, the globin domain of most Adgbs contains a number of conserved residues which play critical roles in heme-coordination and gas ligand binding.
Globins are haem-containing proteins involved in binding and/or transporting oxygen. They belong to a very large and well studied family that is widely distributed in many organisms. Globins have evolved from a common ancestor and can be divided into three groups: single-domain globins, and two types of chimeric globins, flavohaemoglobins and globin-coupled sensors. Bacteria have all three types of globins, while archaea lack flavohaemoglobins, and eukaryotes lack globin-coupled sensors. Several functionally different haemoglobins can coexist in the same species. The major types of globins include:
  • Haemoglobin (Hb): tetramer of two alpha and two beta chains, although embryonic and foetal forms can substitute the alpha or beta chain for ones with higher oxygen affinity, such as gamma, delta, epsilon or zeta chains. Hb transports oxygen from lungs to other tissues in vertebrates. Hb proteins are also present in unicellular organisms where they act as enzymes or sensors.
  • Myoglobin (Mb): monomeric protein responsible for oxygen storage in vertebrate muscle.
  • Neuroglobin: a myoglobin-like haemoprotein expressed in vertebrate brain and retina, where it is involved in neuroprotection from damage due to hypoxia or ischemia. Neuroglobin belongs to a branch of the globin family that diverged early in evolution.
  • Cytoglobin: an oxygen sensor expressed in multiple tissues. Related to neuroglobin.
  • Erythrocruorin: highly cooperative extracellular respiratory proteins found in annelids and arthropods that are assembled from as many as 180 subunits into hexagonal bilayers.
  • Leghaemoglobin (legHb or symbiotic Hb): occurs in the root nodules of leguminous plants, where it facilitates the diffusion of oxygen to symbiotic bacteroids in order to promote nitrogen fixation.
  • Non-symbiotic haemoglobin (NsHb): occurs in non-leguminous plants, and can be over-expressed in stressed plants.
  • Flavohaemoglobins (FHb): chimeric, with an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD/FAD-binding domain. FHb provides protection against nitric oxide via its C-terminal domain, which transfers electrons to haem in the globin.
  • Globin-coupled sensors: chimeric, with an N-terminal myoglobin-like domain and a C-terminal domain that resembles the cytoplasmic signalling domain of bacterial chemoreceptors. They bind oxygen, and act to initiate an aerotactic response or regulate gene expression.
  • Protoglobin: a single domain globin found in archaea that is related to the N-terminal domain of globin-coupled sensors.
  • Truncated 2/2 globin: lack the first helix, giving them a 2-over-2 instead of the canonical 3-over-3 α-helical sandwich fold. Can be divided into three main groups (I, II and III) based on structural features.
Protein Domain
Name: Glutathione S-transferase, alpha class
Type: Family
Description: Glutathione S-transferases (GSTs) are soluble proteins with typical molecular masses of around 50kDa, each composed of two polypeptide subunits. GSTs catalyse the transfer of the tripeptide glutathione (gamma-glutamyl-cysteinyl-glycine; GSH) to a co-substrate (R-X) containing a reactive electrophilic centre to form a polar S-glutathionylated reaction product (R-SG). Each soluble GST is a dimer of approximately 26kDa subunits, typically forming a hydrophobic 50kDa protein with an isoelectric point in the pH range 4-5. The ability to form heterodimers greatly increases the diversity of the GSTs, but the functional significance of this mixing and matching of subunits has yet to be determined. Each GST subunit of the protein dimer contains an independent catalytic site composed of two components. The first is a binding site specific for GSH or a closely related homologue (the G site) formed from a conserved group of amino-acid residues in the amino-terminal domain of the polypeptide. The second component is a site that binds the hydrophobic substrate (the H site), which is much more structurally variable and is formed from residues in the carboxy-terminal domain. Between the two domains is a short variable linker region of 5-10 residues. The GST proteins have evolved by gene duplication to perform a range of functional roles. GSTs also have non-catalytic roles, binding flavonoid natural products in the cytosol prior to their deposition in the vacuole. Recent studies have also implicated GSTs as components of ultraviolet-inducible cell signalling pathways and as potential regulators of apoptosis. The mammalian GSTs active in drug metabolism are now classified into the alpha, mu and pi classes. Additional classes of GSTs have been identified in animals that do not have major roles in drug metabolism; these include the sigma GSTs, which function as prostaglandin synthases.
In cephalopods, however, sigma GSTs are lens S-crystallins, giving an indication of the functional diversity of these proteins. The soluble glutathione transferases can be divided into the phi, tau, theta, zeta and lambda classes. The theta and zeta GSTs have counterparts in animals, whereas the other classes are plant-specific. In the case of phi and tau GSTs, only subunits from the same class will dimerise. Within a class, however, the subunits can dimerise even if they are quite different in amino-acid sequence. An insect-specific delta class has also been described, and bacteria contain a prokaryote-specific beta class of GST. Alpha-class GSTs show substrate specificity for cumene hydroperoxide (CuOOH) and 7-chloro-4-nitrobenz-2-oxa-1,3-diazole (NBD-Cl), amongst others. In addition, this class exhibits a number of differences from the characteristic GST structure: within domain II, there is a short 3-residue β-strand near the C-terminal segment and a longer alpha-7 helix (due to insertions at the N terminus and near to the middle of this helix); domain I is formed from two separate segments of the sequence. This occurs because an extra helix (alpha-11) formed via folding of the C-terminal region of the polypeptide chain is also part of this domain. This helix covers the substrate bound in the H subsite, which is thought to explain the preference of alpha class GSTs for more hydrophobic compounds.
Protein Domain
Name: GPCR, family 2, calcitonin receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The secretin-like GPCRs include secretin, calcitonin, parathyroid hormone/parathyroid hormone-related peptides and vasoactive intestinal peptide, all of which activate adenylyl cyclase and the phosphatidyl-inositol-calcium pathway. These receptors contain seven transmembrane regions, in a manner reminiscent of the rhodopsins and other receptors believed to interact with G-proteins (however, there is no significant sequence identity between these families; the secretin-like receptors thus bear their own unique '7TM' signature). Their N-terminus is probably located on the extracellular side of the membrane and potentially glycosylated.
This N-terminal region contains a long conserved region which allows the binding of large peptidic ligands such as glucagon, secretin, VIP and PACAP; this region contains five conserved cysteine residues which could be involved in disulphide bonds. The C-terminal region of these receptors is probably cytoplasmic. Every receptor gene in this family is encoded on multiple exons, and several of these genes are alternatively spliced to yield functionally distinct products. The major physiological role of calcitonin is to inhibit bone resorption, thereby leading to a reduction in plasma Ca2+. Further, it enhances excretion of ions in the kidney, prevents absorption of ions in the intestine, and inhibits secretion in endocrine cells (e.g. pancreas and pituitary). In the CNS, calcitonin has been reported to be analgesic and to suppress feeding and gastric acid secretion. It is used to treat Paget's disease of the bone. Calcitonin receptors are found predominantly on osteoclasts or on immortal cell lines derived from these cells. They are found in lower amounts in the brain (e.g. in hypothalamus and pituitary tissues) and in peripheral tissues (e.g. testes, kidney, liver and lymphocytes). They have also been described in lung and breast cancer cell lines. The predominant signalling pathway is activation of adenylyl cyclase through guanine nucleotide-binding proteins (G proteins), but calcitonin has also been described to have both stimulatory and inhibitory actions on the phosphoinositide pathway.
Protein Domain
Name: GPCR fungal pheromone A receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. GPCR fungal pheromone mating factor receptors form a distinct family of G-protein-coupled receptors, and are also known as Class D GPCRs. The fungal pheromone mating factor receptors STE2 and STE3 are integral membrane proteins that may be involved in the response to mating factors on the cell membrane. The amino acid sequences of both receptors contain high proportions of hydrophobic residues grouped into 7 domains, in a manner reminiscent of the rhodopsins and other receptors believed to interact with G-proteins.
However, while a similar 3D framework has been proposed to account for this, there is no significant sequence similarity either between STE2 and STE3, or between these and the rhodopsin-type family: the receptors therefore bear their own unique '7TM' signatures, which is why they have been given their own GPCR group: Class D fungal mating pheromone receptors. The STE3 gene in Saccharomyces cerevisiae encodes the cell-surface receptor that binds the 13-residue lipopeptide a-factor. Several related fungal pheromone receptor sequences are known: these include pheromone B alpha 1 and B alpha 3, and pheromone B beta 1 receptors from Schizophyllum commune; pheromone receptor 1 from Ustilago hordei; and pheromone receptors 1 and 2 from Ustilago maydis. Members of the family share about 20% sequence identity. U. maydis, a tetrapolar fungal species, has two genetically unlinked loci that encode the distinct mating functions of cell fusion (the a locus) and subsequent sexual development and pathogenicity (the b locus). The a locus exists in two alleles, the mating type in each of which is determined by a set of two genes; one encodes a precursor for a lipopeptide mating factor, while the other specifies the receptor for the pheromone secreted by cells of opposite mating type. U. maydis thus employs a novel strategy to determine its mating type by providing the primary determinants of cell-cell recognition directly from the mating type locus. The bipolar species, U. hordei, contains both a and b loci; physical linkage of these loci in this bipolar fungus accounts for its distinct mating system. This entry represents mating-type a receptors.
Protein Domain
Name: Potassium channel, voltage dependent, KCNQ, C-terminal
Type: Domain
Description: Potassium channels are the most diverse group of the ion channel family. They are important in shaping the action potential, and in neuronal excitability and plasticity. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group. These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis. All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore.
The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK). The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+-selective leak channels. KCNQ channels (also known as KQT-like channels) differ from other voltage-gated 6 TM helix channels, chiefly in that they possess no tetramerisation domain. Consequently, they rely on interaction with accessory subunits, or form heterotetramers with other members of the family. Currently, 5 members of the KCNQ family are known. These have been found to be widely distributed within the body, having been shown to be expressed in the heart, brain, pancreas, lung, placenta and ear. They were initially cloned as a result of a search for proteins involved in cardiac arrhythmia. Subsequently, mutations in other KCNQ family members have been shown to be responsible for some forms of hereditary deafness and benign familial neonatal epilepsy. This entry represents a region found at the C terminus of these proteins.
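The K+ selectivity sequence quoted in this description, (T/SxxTxGxG), can be read as a simple pattern: T or S, any two residues, T, any residue, G, any residue, G. The following is a minimal Python sketch of scanning a one-letter protein sequence for that pattern. The function name and the example sequence are made up for illustration, and a plain pattern match of this kind is only a rough screen; it is not how the database assigns domains.

```python
import re

# (T/SxxTxGxG) written as a regular expression; "." stands for "x" (any residue).
# The lookahead lets overlapping occurrences be reported as well.
SELECTIVITY_MOTIF = re.compile(r"(?=([TS]..T.G.G))")


def find_selectivity_motifs(sequence):
    """Return (0-based start, matched residues) for every motif occurrence."""
    sequence = sequence.upper()
    return [(m.start(), m.group(1)) for m in SELECTIVITY_MOTIF.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from any database entry.
    toy_sequence = "MKLSAATVGYGDAA"
    for start, residues in find_selectivity_motifs(toy_sequence):
        print(start, residues)
```

A sequence with two P-domains would be expected to give two hits, but a hit on its own says nothing about whether the region actually forms a pore loop.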
Protein Domain
Name: Glutamate synthase large subunit-like protein, archaeal type
Type: Family
Description: This entry represents the predicted archaeal type glutamate synthase large subunit, which includes stand-alone proteins corresponding to the N-terminal, FMN-binding, and the C-terminal domains of the large subunit. All members in this entry contain the FMN-binding domain and some have 1-3 copies of 4Fe-4S binding domain in the N-terminal region, but they lack the linker domain found in the bacterial glutamate synthase large subunit. The large (alpha, GltB) subunit of bacterial glutamate synthase (GOGAT) consists of three domains. represents a stand-alone version of the central domain, and this subgroup contains proteins that are predicted to function as part of GOGAT. This stand-alone form occurs in the archaeal type of GOGAT, where the large subunit is represented by three separate proteins, corresponding to the three domains of the "standard" bacterial enzyme. Similar organization of GOGAT with stand-alone domains has been found in some bacteria (e.g., Sinorhizobium meliloti, Thermotoga maritima), but its function is not clear in those organisms where the "standard" bacterial form is also present (e.g., Sinorhizobium meliloti). The second (central) domain of the bacterial GOGAT large subunit consists of a linker domain and the FMN-binding domain ( ). The FMN-binding domain has a beta/alpha barrel topology. In this domain, the 2-iminoglutarate intermediate, formed upon the addition of ammonia onto 2-oxoglutarate, is reduced by the FMN cofactor, producing the second molecule of L-glutamate. This domain also contains the enzyme 3Fe-4S cluster. Originally, only the ORF encoding the central domain of GOGAT was recognised and annotated as GltB in archaea, and the rest of the large subunit was thought to be missing, which may lead to some misannotations. This led to speculations that the archaeal form of the GOGAT large subunit is the ancestral minimum form of the enzyme.
Later analysis showed, however, that in all archaea where the large subunit has been found, its entire sequence is represented by three separate ORFs. Glutamate synthase (GOGAT, GltS) is a complex iron-sulphur flavoprotein that catalyses the reductive synthesis of L-glutamate from 2-oxoglutarate (2-OG) and L-glutamine via intramolecular channeling of ammonia, a reaction in the bacterial, yeast and plant pathways for ammonia assimilation. GOGAT is a multifunctional enzyme that functions through three distinct active centres carrying out multiple reaction steps: L-glutamine hydrolysis, conversion of 2-oxoglutarate into L-glutamate, and electron uptake from an electron donor. There are four classes of GOGAT:
1. Bacterial NADPH-dependent GOGAT (NADPH-GOGAT, ). This standard bacterial NADPH-GOGAT is composed of a large (alpha, GltB) subunit and a small (beta, GltD) subunit.
2. Ferredoxin-dependent form in cyanobacteria and plants (Fd-GOGAT from photosynthetic cells, ), which displays a single-subunit structure corresponding to the large bacterial subunit.
3. Pyridine-linked form in both photosynthetic and nonphotosynthetic eukaryotes (eukaryotic GOGAT or NADH-GOGAT, ), which displays a single-subunit structure corresponding to the fusion of the small and the large bacterial subunits ( ).
4. The archaeal type with stand-alone proteins corresponding to the N-terminal, FMN-binding, and the C-terminal domains of the large subunit (, , ), and to the small subunit.
Protein Domain
Name: Signal transduction histidine kinase, CHASE2 sensor domain-containing, predicted
Type: Family
Description: Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more. Two-component systems are comprised of a sensor histidine kinase (HK) and its cognate response regulator (RR). The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on the RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HKs are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK. A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response. Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation, and CheA, which plays a central role in the chemotaxis system. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present.
The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily. HKs can be roughly divided into two classes: orthodox and hybrid kinases. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain. This entry represents proteins predicted to act as signal transduction histidine kinases, most of which contain a CHASE2 sensor domain. CHASE2 extracellular sensory domains are found in at least four classes of transmembrane receptors: histidine kinases, adenylate cyclases, predicted diguanylate cyclases, and serine/threonine protein kinases.
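The difference between the orthodox two-component route and the hybrid phospho-relay route described in this entry can be laid out as an ordered list of phosphoryl-carrying sites. The Python sketch below is a toy illustration only: the function names and the step labels are invented here, and it models nothing beyond the order of transfers stated in the description.

```python
from typing import List


def phosphotransfer_path(hybrid: bool) -> List[str]:
    """Ordered sites that carry the phosphoryl group, as described in the text.

    hybrid=False: orthodox HK passing the group directly to its cognate RR.
    hybrid=True:  hybrid HK using the multi-step phospho-relay.
    """
    if hybrid:
        return [
            "HK histidine (autophosphorylation)",
            "internal receiver domain of the hybrid HK",
            "histidine phosphotransferase (HPt)",
            "receiver domain of the terminal RR",
        ]
    return [
        "HK histidine (autophosphorylation)",
        "receiver domain of the cognate RR",
    ]


def count_transfers(hybrid: bool) -> int:
    """Number of phosphoryl transfers after the initial autophosphorylation."""
    return len(phosphotransfer_path(hybrid)) - 1


if __name__ == "__main__":
    for hybrid in (False, True):
        label = "phospho-relay" if hybrid else "two-component"
        print(label, "->", " -> ".join(phosphotransfer_path(hybrid)))
```

Laid out this way, the orthodox route involves a single transfer and the phospho-relay three, which matches the description's contrast between "a single phosphoryl transfer" and "multi-step phospho-relay schemes".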
Protein Domain
Name: DPH-type metal-binding domain
Type: Domain
Description: Diphthamide is a unique post-translationally modified histidine residue found only in translation elongation factor 2 (eEF-2). It is conserved from archaea to humans and serves as the target for diphtheria toxin and Pseudomonas exotoxin A. These two toxins catalyse the transfer of ADP-ribose to diphthamide on eEF-2, thus inactivating eEF-2, halting cellular protein synthesis, and causing cell death. The biosynthesis of diphthamide is dependent on at least five proteins, DPH1 to -5, and a still unidentified amidating enzyme. DPH3 and DPH4 share a conserved region, which encodes a putative zinc finger, the DPH-type or CSL-type (after the final conserved cysteine of the zinc finger and the next two residues) MB domain. It contains a Cys-X-Cys...Cys-X2-Cys motif which tetrahedrally coordinates both Fe and Zn. The Fe-containing DPH-type MBD has an electron transfer activity. This entry represents the DPH-type metal binding domain, which consists of a three-stranded β-sandwich with one sheet comprising two parallel strands: (i) β1 and (ii) β6 and one antiparallel strand: β5. The second sheet in the β-sandwich is comprised of strands β2, β3, and β4 running anti-parallel to each other. The two β-sheets are separated by a short stretch of α-helix. It can be found in proteins such as DPH3 and DPH4. This domain is also found associated with the N-terminal domain of heat shock protein DnaJ.
Protein Domain
Name: GINS complex, subunit Psf1
Type: Family
Description: DNA replication in eukaryotes results from a highly coordinated interaction between proteins, often as part of protein complexes, and the DNA template. One of the key early steps leading to DNA replication is formation of the prereplication complex, or pre-RC. The pre-RC is formed by the sequential binding of the origin recognition complex (ORC), Cdc6 and Cdt1 proteins, and the MCM complex. Activation of the pre-RC into the initiation complex (IC) is achieved via the action of S-phase kinases, eventually leading to the loading of the replication machinery. Recently, a novel replication complex, GINS (for Go, Ichi, Nii, and San; five, one, two, and three in Japanese), has been identified. The precise function of GINS is not known. However, genetic and two-hybrid interactions indicate that it mediates the loading of the enzymatic replication machinery at a step after the action of the S-phase kinases. Furthermore, GINS may be a part of the replication machinery itself, since it is found associated with replicating DNA. Electron microscopy of GINS shows that it forms a ring-like structure, reminiscent of the structure of PCNA, the DNA polymerase delta replication clamp. This observation, coupled with the observed interactions for GINS, indicates that the complex may represent the replication clamp for DNA polymerase epsilon. This family of proteins represents the PSF1 component (for partner of SLD five) of the GINS complex.
Protein Domain
Name: C-type lectin-like
Type: Domain
Description: A number of different families of proteins share a conserved domain which was first characterised in some animal lectins and which seems to function as a calcium-dependent carbohydrate-recognition domain. This domain, which is known as the C-type lectin domain (CTL) or as the carbohydrate-recognition domain (CRD), consists of about 110 to 130 residues. There are four cysteines which are perfectly conserved and involved in two disulphide bonds. There are proteins with modules similar in overall structure to CRDs that serve functions other than sugar binding. Therefore, a more general term, C-type lectin-like domain, was introduced to refer to such domains, although both terms C-type lectin and C-type lectin-like are sometimes used interchangeably. C-type lectins can be further divided into seven subgroups based on additional non-lectin domains and gene structure: (I) hyalectans, (II) asialoglycoprotein receptors, (III) collectins, (IV) selectins, (V) NK group transmembrane receptors, (VI) macrophage mannose receptors, and (VII) simple (single domain) lectins. Lectins are a diverse group of proteins, both in terms of structure and activity. Carbohydrate binding ability may have evolved independently and sporadically in numerous unrelated families, where each evolved a structure that was conserved to fulfil some other activity and function. In general, animal lectins act as recognition molecules within the immune system, their functions involving defence against pathogens, cell trafficking, immune regulation and the prevention of autoimmunity.
Protein Domain
Name: Carbohydrate binding domain CBM49
Type: Domain
Description: A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classification of cellulose-binding domains was based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with Roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. This domain is found at the C-terminal of cellulases, and in vitro binding studies have shown it to bind to crystalline cellulose.
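The renumbering stated in this description (Types I to XIII correspond to families 1 to 13) amounts to a small lookup. The Python sketch below shows one way to convert an old "Type" label to the family number. The function name and the accepted label formats are assumptions made for illustration; only the I-XIII to 1-13 correspondence comes from the text, so anything outside that range is rejected rather than guessed.

```python
# Roman numerals for the thirteen old CBD "Types" named in the description.
ROMAN_TYPES = ["I", "II", "III", "IV", "V", "VI", "VII",
               "VIII", "IX", "X", "XI", "XII", "XIII"]

# Type I -> family 1, ..., Type XIII -> family 13.
TYPE_TO_FAMILY = {numeral: number
                  for number, numeral in enumerate(ROMAN_TYPES, start=1)}


def cbd_type_to_family(type_label):
    """Convert an old CBD type label ("Type II" or "II") to its family number."""
    numeral = type_label.strip().upper()
    if numeral.startswith("TYPE"):
        numeral = numeral[len("TYPE"):].strip()
    if numeral not in TYPE_TO_FAMILY:
        raise ValueError("no family correspondence stated for %r" % type_label)
    return TYPE_TO_FAMILY[numeral]


if __name__ == "__main__":
    for label in ("Type I", "Type II", "XIII"):
        print(label, "-> family", cbd_type_to_family(label))
```

Families numbered above 13, such as the family 49 domain in this entry, have no old Type label, so the mapping is deliberately one-way.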
Protein Domain
Name: Peptidase M17, leucine aminopeptidase
Type: Family
Description: The majority of members of this family are zinc-dependent exopeptidases belonging to MEROPS peptidase family M17 (leucyl aminopeptidase, clan MF). This family excludes pepB aminopeptidases, which are also members of MEROPS family M17. Leucyl aminopeptidases (LAPs) selectively release N-terminal amino acid residues from polypeptides and proteins; in general they are involved in the processing, catabolism and degradation of intracellular proteins [ , , ]. Leucyl aminopeptidase forms a homohexamer containing two trimers stacked on top of one another []. Each monomer binds two zinc ions. The zinc-binding and catalytic sites are located within the C-terminal catalytic domain []. Leucine aminopeptidase has been shown to be identical with prolyl aminopeptidase in mammals [ ]. Interestingly, members of this group are also implicated in transcriptional regulation and are thought to combine catalytic and regulatory properties [ ]. The N-terminal domain of these proteins has been shown in Escherichia coli PepA to function as a DNA-binding protein in Xer site-specific recombination and in transcriptional control of the carAB operon [, ]. It is not well conserved and in some members can be found only by PSI-BLAST (after 4-6 iterations). It is not clear if the DNA binding function is preserved in all or even in most of the members. For additional information please see [ , , , ].
Protein Domain
Name: DSBA-like thioredoxin domain
Type: Domain
Description: DSBA is a sub-family of the Thioredoxin family [ ]. The efficient and correct folding of bacterial disulphide bonded proteins in vivo is dependent upon a class of periplasmic oxidoreductase proteins called DsbA, after the Escherichia coli enzyme. The bacterial protein-folding factor DsbA is the most oxidizing of the thioredoxin family. DsbA catalyses disulphide-bond formation during the folding of secreted proteins. The extremely oxidizing nature of DsbA has been proposed to result from either domain motion or stabilising active-site interactions in the reduced form. DsbA's highly oxidizing nature is a result of hydrogen bond, electrostatic and helix-dipole interactions that favour the thiolate over the disulphide at the active site [ ]. In the pathogenic bacterium Vibrio cholerae, the DsbA homologue (TcpG) is responsible for the folding, maturation and secretion of virulence factors. While the overall architecture of TcpG and DsbA is similar and the surface features are retained in TcpG, there are significant differences. For example, the kinked active site helix results from a three-residue loop in DsbA, but is caused by a proline in TcpG (making TcpG more similar to thioredoxin in this respect). Furthermore, the proposed peptide binding groove of TcpG is substantially shortened compared with that of DsbA due to a six-residue deletion. Also, the hydrophobic pocket of TcpG is more shallow and the acidic patch is much less extensive than that of E. coli DsbA [ ].
Protein Domain
Name: Transcription factor, MADS-box
Type: Domain
Description: Human serum response factor (SRF) is a ubiquitous nuclear protein important for cell proliferation and differentiation. SRF function is essential for transcriptional regulation of numerous growth-factor-inducible genes, such as the c-fos oncogene and muscle-specific actin genes. A core domain of around 90 amino acids is sufficient for the activities of DNA-binding, dimerisation and interaction with accessory factors. Within the core is a DNA-binding region, designated the MADS box [ ], that is highly similar to regions of many eukaryotic regulatory proteins: among these are MCM1, the regulator of cell type-specific genes in budding yeast; DSRF, a Drosophila trachea development factor; the MEF2 family of myocyte-specific enhancer factors; and the Agamous and Deficiens families of plant homeotic proteins. In SRF, the MADS box has been shown to be involved in DNA-binding and dimerisation [ ]. Proteins belonging to the MADS family function as dimers, the primary DNA-binding element of which is an anti-parallel coiled coil of two amphipathic α-helices, one from each subunit. The DNA wraps around the coiled coil, allowing the basic N-termini of the helices to fit into the DNA major groove. The chain extending from the helix N-termini reaches over the DNA backbone and penetrates into the minor groove. A 4-stranded, anti-parallel β-sheet packs against the coiled-coil face opposite the DNA and is the central element of the dimerisation interface. The MADS-box domain is commonly found associated with the K-box region.
Protein Domain
Name: Importin subunit alpha
Type: Family
Description: The exchange of macromolecules between the nucleus and cytoplasm takes place through nuclear pore complexes within the nuclear membrane. Active transport of large molecules through these pore complexes requires carrier proteins, called karyopherins (importins and exportins), which shuttle between the two compartments. Members of the importin-alpha (karyopherin-alpha) family can form heterodimers with importin-beta. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. Proteins can contain one (monopartite) or two (bipartite) NLS motifs. Importin-alpha contains several armadillo (ARM) repeats, which produce a curving structure with two NLS-binding sites, a major one close to the N terminus and a minor one close to the C terminus. Ran GTPase helps to control the unidirectional transfer of cargo. The cytoplasm contains primarily RanGDP and the nucleus RanGTP through the actions of RanGAP and RanGEF, respectively. In the nucleus, RanGTP binds to importin-beta within the importin/cargo complex, causing a conformational change in importin-beta that releases it from importin-alpha-bound cargo. The N-terminal importin-beta-binding (IBB) domain of importin-alpha contains an auto-regulatory region that mimics the NLS motif [ ]. The release of importin-beta frees the auto-regulatory region on importin-alpha to loop back and bind to the major NLS-binding site, causing the cargo to be released []. This entry represents importin alpha.
Protein Domain
Name: Peptidase S24, LexA-like
Type: Family
Description: This signature defines serine peptidases belonging to MEROPS peptidase family S24 (LexA family, clan SF). They include: LexA, the repressor of genes in the cellular SOS response to DNA damage; MucA and the related UmuD proteins, which are lesion-bypass DNA polymerases, induced in response to mitogenic DNA damage; RumA, a plasmid-encoded homologue of UmuD [ ]; and RuvA, which is a component of the RuvABC resolvasome that catalyses the resolution of Holliday junctions that arise during genetic recombination and DNA repair [ ]. The LexA, UmuD and MucA proteins interact with RecA, which activates self-cleavage, either derepressing transcription in the case of LexA [ ] or activating the lesion-bypass polymerase in the case of UmuD and MucA. UmuD'2 is the homodimeric component of DNA pol V, which is produced from UmuD by RecA-facilitated self-cleavage. The first 24 N-terminal residues of UmuD are removed; UmuD'2 is a DNA lesion bypass polymerase [ , ]. MucA [, ], like UmuD, is a plasmid-encoded DNA polymerase (pol RI) which is converted into the active lesion-bypass polymerase by a self-cleavage reaction involving RecA []. This group of proteins also contains proteins classified as non-peptidase homologues, as they either have been found experimentally to be without peptidase activity or lack amino acid residues that are believed to be essential for catalytic activity.
Protein Domain
Name: ABC transporter type 1, transmembrane domain MetI-like
Type: Domain
Description: ABC transporters are minimally constituted of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These regions can be found on the same protein (mostly in eukaryotes and bacterial exporters) or on two different ones (mostly bacterial importers) [ ]. In importers, the TMD displays a distinctive signature, the EAA motif, a 20 amino acid conserved sequence located about 100 residues from the C terminus. The motif is hydrophilic and has been found to reside in a cytoplasmic loop located between the penultimate and the antepenultimate transmembrane segment in all proteins with a known topology []. It appears to play an important role in ensuring the correct assembly of the prokaryotic ABC transport complex [] and constituting an interaction site with the so-called helical domain of the ABC module [, ]. This entry recognises ABC transmembrane domains where the TMD is on a separate protein, such as the D-methionine transport system permease protein MetI. The crystal structure of the high-affinity Escherichia coli MetNI methionine uptake transporter has been solved. Each MetI subunit is organised around a core of five transmembrane helices that correspond to a subset of the helices observed in the larger membrane-spanning subunits of the molybdate (ModBC) and maltose (MalFGK) ABC transporters, which contain six helices [ , ].
Protein Domain
Name: Cyclin CLN
Type: Family
Description: This entry represents a G1 class of cyclins which has so far only been identified in fungi [ , ]. These proteins are important for the control of the cell cycle at the G1/S transition and interact with the cdc2 protein kinase. Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles [ ], and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins: G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E. Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi's sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus [ ].
Protein Domain
Name: Heme transporter HRG
Type: Family
Description: Haems are metalloporphyrins that serve as prosthetic groups for a variety of biological processes, including respiration, gas sensing, xenobiotic detoxification, cell differentiation, circadian clock control, metabolic reprogramming and microRNA processing. Haem is usually synthesised by a multistep biosynthetic pathway. The cellular pathways and molecules that mediate intracellular haem trafficking are still largely unknown [ ]. Caenorhabditis elegans and related helminths are natural haem auxotrophs that acquire environmental haem for incorporation into haemoproteins. In C. elegans, it has been shown that HRG-1 proteins are essential for haem homeostasis. In worms, depletion of hrg-1, or its paralogue hrg-4, results in the disruption of organismal haem sensing, and an abnormal response to haem analogues [ ]. HRG-1 and HRG-4 are transmembrane (TM) proteins that reside in distinct intracellular compartments. Transient knockdown of hrg-1 in zebrafish leads to hydrocephalus, yolk tube malformations and profound defects in erythropoiesis, phenotypes that are fully rescued by worm HRG-1. Human and worm proteins have been shown to co-localise, and bind and transport haem, thus establishing an evolutionarily conserved function for HRG-1 [ ]. Sequence analysis of HRG-1 has identified 4 predicted TM domains, and a conserved tyrosine and acidic-di-leucine-based sorting signal in the cytoplasmic C terminus. In addition, residues that could potentially either directly bind haem (H90 in TM2) or interact with the haem side chains (FARKY) are situated in the C-terminal tail [ ].
Protein Domain
Name: Rab20
Type: Family
Description: Rab20 is one of several Rab proteins that appear to be restricted in expression to the apical domain of murine polarized epithelial cells. It is expressed on the apical side of polarized kidney tubule and intestinal epithelial cells, and in non-polarized cells. It also localizes to vesico-tubular structures below the apical brush border of renal proximal tubule cells and in the apical region of duodenal epithelial cells. Rab20 has also been shown to colocalize with vacuolar H+-ATPases (V-ATPases) in mouse kidney cells, suggesting a role in the regulation of V-ATPase traffic in specific portions of the nephron [ ]. It was also shown to be one of several proteins whose expression is upregulated in human myelodysplastic syndrome (MDS) patients []. Rabs are regulated by GTPase activating proteins (GAPs), which interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins [ , ].
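The C-terminal lipid modification motifs named above (CC, CXC, CCX) can be checked with a few lines of Python. This is only an illustrative sketch: the function name `rab_cterminal_motif` is hypothetical, X is assumed here to mean any non-cysteine residue, and the test sequences are made up rather than taken from real Rab proteins.

```python
from typing import Optional


def rab_cterminal_motif(sequence: str) -> Optional[str]:
    """Return 'CC', 'CXC' or 'CCX' if the sequence ends with that motif, else None.

    Assumption: 'X' stands for any residue other than cysteine.
    """
    tail = sequence.strip().upper()
    if tail.endswith("CC"):
        return "CC"
    if len(tail) >= 3:
        # CXC: cysteine, one non-cysteine residue, cysteine at the very end
        if tail[-3] == "C" and tail[-1] == "C":
            return "CXC"
        # CCX: two cysteines followed by one final non-cysteine residue
        if tail[-3:-1] == "CC":
            return "CCX"
    return None


if __name__ == "__main__":
    for seq in ("MSKGGCC", "MSKGCSC", "MSKGCCS", "MSKGSSS"):
        print(seq, rab_cterminal_motif(seq))
```

A match only indicates that the sequence ends with one of the listed motifs; it says nothing about whether the protein is actually lipid-modified.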
Protein Domain
Name: Copper chaperone PCuAC superfamily
Type: Homologous_superfamily
Description: CuA is a dinuclear copper site within the soluble domain of subunit II (Cox2) of bacterial and eukaryotic cytochrome c oxidases (CcOs), whose function is to convey electrons from a soluble cytochrome c to the catalytic heme a3-CuB centre of CcO. The proper assembly of the CuA site is essential for the catalytic machinery of a functional oxidase. In prokaryotes two protein families have been proposed to be involved in CuA site formation, with a key role in the delivery of metal ions to the CuA site. The first includes proteins that are able to bind Cu(I) through methionine and histidine residues arranged in a highly conserved H(M)X(10)MX(21)HXM motif 5 (referred to as periplasmic CuA chaperone, PCu(A)C). The second consists of the Sco proteins, whose mechanism of action in CuA assembly as thioredoxins or metallochaperones is still debated. These proteins (PCuAC and Sco) are often found in the same bacterial operon, and most of the identified operons that encode Sco also contain a gene for Cox2 [ ]. PCu(A)C is a periplasmic copper chaperone. It selectively inserts Cu(I) ions into subunit II of Thermus thermophilus ba3 oxidase to generate a native Cu(A) site. Its role may be to capture and transfer copper to two other copper chaperones, PrrC and Cox11, which in turn deliver Cu(I) to cytochrome c oxidase [ ].
Protein Domain
Name: Perforin-1, C2 domain
Type: Domain
Description: Perforin-1 contains a single copy of a C2 domain in its C terminus and plays a role in lymphocyte-mediated cytotoxicity [ ]. Mutations in perforin-1 lead to familial hemophagocytic lymphohistiocytosis type 2, a rare, rapidly fatal, autosomal recessive immune disorder characterized by uncontrolled activation of T cells and macrophages and overproduction of inflammatory cytokines []. The function of perforin-1 is calcium dependent and the C2 domain is thought to confer this binding to target cell membranes []. C2 domains fold into an 8-stranded β-sandwich that can adopt 2 structural arrangements, type I and type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this, including RIM isoforms and some splice variants of piccolo/aczonin and intersectin, which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions [, , , , , , , ].
Protein Domain
Name: CheW-like domain superfamily
Type: Homologous_superfamily
Description: This entry represents the CheW-like domain superfamily. The CheW-like domain is a domain of around 150 residues that is found in proteins involved in the two-component signaling systems regulating bacterial chemotaxis. Two-component systems are composed of a receptor kinase, which monitors the environmental conditions, and its substrate, the response regulator, which acts as a binary switch depending on the phosphorylation state. In Escherichia coli, the signal transduction pathway for chemotaxis consists of specialised membrane receptors, termed chemotaxis transducers; a CheA-CheY two-component system, which transmits the signal from transducers to flagellar motors; and a docking protein, CheW, which couples the CheA histidine kinase to transducers. Whereas CheW is only made of a CheW-like domain, CheA additionally contains an HPt domain and a histidine kinase domain. The CheW-like domain has been shown to mediate the interaction between CheA and the adaptor protein CheW. Some bacteria contain another bifunctional protein, CheV, consisting of an N-terminal CheW-like domain and a C-terminal response regulatory domain. Although its precise function in chemotaxis is unknown, CheV probably acts in adaptation to attractants [ , , , ]. The CheW-like domain is composed of two β-sheet subdomains, each of which forms a loose five-stranded β-barrel around an internal hydrophobic core. The interactions between the subdomains are contributed by a third hydrophobic core sandwiched between the two β-sheet subdomains. The CheW-like structure is stabilised by extensive hydrophobic interactions [, ].
Protein Domain
Name: Kazal domain superfamily
Type: Homologous_superfamily
Description: This entry represents the Kazal domain superfamily. Canonical serine proteinase inhibitors are distributed in a wide range of organisms from all kingdoms of life and play a crucial role in various physiological mechanisms [ ]. They interact through the canonical proteinase-inhibitor binding loop, where the P1 residue has a predominant role (the residue at the P1 position contributing the carbonyl portion to the reactive-site peptide bond). These so-called canonical inhibitors bind to their cognate enzymes in the same manner as a good substrate, but are cleaved extremely slowly. Kazal-type inhibitors represent the most studied canonical proteinase inhibitors. Kazal inhibitors are extremely variable at their reactive sites. However, some regularity prevails, such as the presence of lysine at position P1 indicating strong inhibition of trypsin []. The Kazal inhibitor has six cysteine residues engaged in disulfide bonds arranged as shown in the following schematic representation:

               +------------------+
               |                  |
               *******************|***
xxxxxxxxCxxxxxxCx#xxxxxCxxxxxxxxxxCxxCxxxxxxxxxxxxxxxxxC
        |              |             |                 |
        |              +-------------|-----------------+
        +----------------------------+

'C': conserved cysteine involved in a disulfide bond. '#': active site residue. '*': position of the pattern.

The structure of classical Kazal domains consists of a central α-helix, which is inserted between two β-strands and a third that is toward the C terminus []. The reactive site P1 and the conformation of the reactive site loop are structurally highly conserved, similar to the canonical conformation of small serine proteinase inhibitors.
Protein Domain
Name: Coagulin
Type: Family
Description: Coagulogen is a gel-forming protein of hemolymph that hinders the spread of invaders by immobilising them [ , ]. The protein contains a single 175-residue polypeptide chain; this is cleaved after Arg-18 and Arg-46 by a clotting enzyme contained in the hemocyte and activated by a bacterial endotoxin (lipopolysaccharide). Cleavage releases two chains of coagulin, A and B, linked by two disulphide bonds, together with the peptide C [, ]. Gel formation results from interlinking of coagulin molecules. The crystal structure of a coagulogen has been determined. Coagulogen contains a C-terminal domain that has an NGF-like fold []. Mammalian blood coagulation is based on the proteolytically induced polymerisation of fibrinogens. Initially, fibrin monomers noncovalently interact with each other. The resulting homopolymers are further stabilised when the plasma transglutaminase (TGase) intermolecularly cross-links epsilon-(gamma-glutamyl)lysine bonds. In crustaceans, hemolymph coagulation depends on the TGase-mediated cross-linking of specific plasma-clotting proteins, but without the proteolytic cascade. In horseshoe crabs, the proteolytic coagulation cascade triggered by lipopolysaccharides and beta-1,3-glucans leads to the conversion of coagulogen into coagulin, resulting in noncovalent coagulin homopolymers through head-to-tail interaction. Horseshoe crab TGase, however, does not cross-link coagulins intermolecularly. It has been shown that coagulins are cross-linked on hemocyte cell surface proteins called proxins. This indicates that a cross-linking reaction at the final stage of hemolymph coagulation is an important component of the innate immune system of horseshoe crabs [ ].
Protein Domain
Name: SOS response associated peptidase-like
Type: Homologous_superfamily
Description: This entry represents the SOS response associated peptidase (SRAP) superfamily. The SRAP (SOS-response associated peptidase) family is characterised by the SRAP domain with a novel thiol autopeptidase activity, whose active site in human HMCES is comprised of the catalytic triad residues C2, E127, and H210 [ ]. SRAP proteins are evolutionarily conserved in all domains of life. For instance, human HMCES and E. coli YedK are similar in both sequence and structure []. HMCES was originally identified as a possible reader of 5hmC in embryonic stem cell extracts using a double-stranded DNA molecule containing 5hmC as bait []. The bacterial members have operonic associations with the SOS DNA damage response, mutagenic translesion DNA polymerases, non-homologous DNA end-joining networks that employ Ku and an ATP-dependent ligase, and other repair systems []. Abasic (AP) sites are one of the most common DNA lesions that block replicative polymerases. SRAP proteins shield the AP site from endonucleases and error-prone polymerases [ ]. Both HMCES and YedK have been found to preferentially bind ssDNA and efficiently form DNA-protein crosslinks (DPCs) to AP sites in ssDNA. They crosslink to AP sites via a stable thiazolidine DNA-protein linkage formed with the N-terminal cysteine and the aldehyde form of the AP deoxyribose []. In B cells, HMCES has also been shown to mediate microhomology-mediated alternative end-joining through its SRAP domain [ ].
Protein Domain
Name: Tumour necrosis factor receptor 7, N-terminal
Type: Domain
Description: Tumor necrosis factor receptor superfamily member 7 (TNFRSF7), also known as CD27, T14, S152, Tp55 or LPFS2, has a key role in the generation of immunological memory via effects on T-cell expansion and survival, and B cell development [ , ]. It binds to ligand CD70, and plays a key role in regulating B-cell activation and immunoglobulin synthesis. CD27 transduces signals that lead to the activation of NF-kappaB and MAPK8/JNK, and mediates the signaling process through adaptor proteins TRAF2 and TRAF5. CD27-binding protein (SIVA), a pro-apoptotic protein, can bind to CD27 and may play an important role in the apoptosis induced by this receptor []. The potential role of the CD27/CD70 pathway in the course of inflammatory diseases, such as arthritis and inflammatory bowel disease, suggests that CD70 may be a target for immune intervention. The expression of CD27 and CD44 molecules correlates with the differentiation stage of B cell precursors and has been shown to have a biological significance in acute lymphoblastic leukemia []. This entry represents the N-terminal domain of TNFRSF7. TNF-receptors are modular proteins. The N-terminal extracellular part contains a cysteine-rich region responsible for ligand-binding. This region is composed of small modules of about 40 residues containing 6 conserved cysteines; the number and type of modules can vary in different members of the family [ , , ].
Protein Domain
Name: AAA domain-containing protein, R3H domain
Type: Domain
Description: This R3H domain is found in a group of proteins with unknown function, which also contain an AAA-ATPase (AAA) domain. The R3H domain is a conserved sequence motif found in proteins from a diverse range of organisms including eubacteria, green plants, fungi and various groups of metazoans, but not in archaea and Escherichia coli. The domain is named R3H because it contains an invariant arginine and a highly conserved histidine that are separated by three residues. It also displays a conserved pattern of hydrophobic residues, prolines and glycines. It can be found alone, in association with an AAA domain or with various DNA/RNA binding domains like DSRM, KH, G-patch, PHD, DEAD box, or RRM. The functions of these domains indicate that the R3H domain might be involved in polynucleotide-binding, including DNA, RNA and single-stranded DNA [ ]. The 3D structure of the R3H domain has been solved. The fold presents a small motif, consisting of a three-stranded antiparallel β-sheet, against which two α-helices pack from one side. This fold is related to the structures of the YhhP protein and the C-terminal domain of the translational initiation factor IF3. Three conserved basic residues cluster on the same face of the R3H domain and could play a role in nucleic acid recognition. An extended hydrophobic area at a different site of the molecular surface could act as a protein-binding site [ ].
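The naming rule described above (an arginine and a histidine separated by three residues) can be written as the regular expression `R.{3}H`. The Python sketch below lists every position where that spacing occurs; the function name and the example sequence are hypothetical. Such a short pattern occurs by chance in many unrelated proteins, so a hit is only a candidate position and not evidence of an R3H domain.

```python
import re
from typing import List, Tuple

# Lookahead so that overlapping candidates are all reported.
R3H_CORE = re.compile(r"(?=(R.{3}H))")


def r3h_core_candidates(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched 5-mer) for every R-x(3)-H occurrence."""
    seq = sequence.strip().upper()
    return [(m.start(1), m.group(1)) for m in R3H_CORE.finditer(seq)]


if __name__ == "__main__":
    # Made-up peptide used only to exercise the function.
    print(r3h_core_candidates("AARKLSHGG"))
```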
Protein Domain
Name: Sperm-associated antigen 7, R3H domain
Type: Domain
Description: This is the R3H domain of a group of metazoan proteins that is related to the sperm-associated antigen 7 [ ]. The R3H domain is a conserved sequence motif found in proteins from a diverse range of organisms including eubacteria, green plants, fungi and various groups of metazoans, but not in archaea and Escherichia coli. The domain is named R3H because it contains an invariant arginine and a highly conserved histidine that are separated by three residues. It also displays a conserved pattern of hydrophobic residues, prolines and glycines. It can be found alone, in association with an AAA domain or with various DNA/RNA binding domains like DSRM, KH, G-patch, PHD, DEAD box, or RRM. The functions of these domains indicate that the R3H domain might be involved in polynucleotide-binding, including DNA, RNA and single-stranded DNA [ ]. The 3D structure of the R3H domain has been solved. The fold presents a small motif, consisting of a three-stranded antiparallel β-sheet, against which two α-helices pack from one side. This fold is related to the structures of the YhhP protein and the C-terminal domain of the translational initiation factor IF3. Three conserved basic residues cluster on the same face of the R3H domain and could play a role in nucleic acid recognition. An extended hydrophobic area at a different site of the molecular surface could act as a protein-binding site [ ].
Protein Domain
Name: Orange carotenoid-binding protein, N-terminal domain superfamily
Type: Homologous_superfamily
Description: Carotenoids such as beta-carotene, lycopene, lutein and beta-cryptoxanthin are produced in plants and certain bacteria, algae and fungi, where they function as accessory photosynthetic pigments and as scavengers of oxygen radicals for photoprotection. They are also essential dietary nutrients in animals. Orange carotenoid-binding proteins (OCP) were first identified in cyanobacterial species, where they occur associated with the phycobilisome in the cellular thylakoid membrane. These proteins function in photoprotection, and are essential for inhibiting white and blue-green light non-photochemical quenching (NPQ) [ , ]. Carotenoids improve the photoprotectant activity by broadening OCP's absorption spectrum and facilitating the dissipation of absorbed energy. OCP acts as a homodimer, and binds one molecule of carotenoid (3'-hydroxyechinenone) and one chloride ion per subunit, where the carotenoid binding site is lined with a striking number of methionine residues. The carotenoid 3'-hydroxyechinenone is not found in higher plants. OCP has two domains: an N-terminal helical domain and a C-terminal domain that resembles an NTF2 (nuclear transport factor 2) domain. OCP can be proteolytically cleaved into a red form (RCP), which lacks 15 residues from the N terminus and approximately 150 residues from the C terminus []. This entry represents the N-terminal domain superfamily found predominantly in prokaryotic orange carotenoid proteins and related carotenoid-binding proteins. It adopts an α-helical structure consisting of two four-helix bundles [ ].
Protein Domain
Name: STAT3, SH2 domain
Type: Domain
Description: STAT3 is a member of the STAT protein family. STAT3 mediates the expression of a variety of genes in response to cell stimuli, and plays a key role in many cellular processes such as cell growth and apoptosis. STAT3 has been shown to interact with Rho GTPases [ ]. Three alternatively spliced transcript variants encoding distinct isoforms have been described. STAT3 activation is required for self-renewal of embryonic stem cells (ESCs) [] and is essential for the differentiation of the TH17 helper T cells []. Mutations in the STAT3 gene result in hyperimmunoglobulin E syndrome and human cancers []. This entry represents the SH2 domain of STAT3. STAT proteins have a dual function: signal transduction and activation of transcription. When cytokines are bound to cell surface receptors, the associated Janus kinases (JAKs) are activated, leading to tyrosine phosphorylation of the given STAT proteins [ ]. Phosphorylated STATs form dimers, translocate to the nucleus, and bind specific response elements to activate transcription of target genes []. STAT proteins contain an N-terminal domain (NTD), a coiled-coil domain (CCD), a DNA-binding domain (DBD), an α-helical linker domain (LD), an SH2 domain, and a transactivation domain (TAD). The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Seven mammalian STAT family members have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6 [].
Protein Domain
Name: Flagellin, C-terminal domain
Type: Domain
Description: Bacterial flagella are responsible for motility and chemotaxis [ ]. They comprise a basal body, a hook and a filament, the latter accounting for 98% of the mass []. Flagellin is the subunit protein that polymerises to form the flagella [], the subunits being transported through the centre of the filament to the tip, where they then polymerise []. Both the N- and C-termini of the subunit protein, which are α-helical in structure [ ], are required to mediate polymerisation. Although no export or assembly consensus sequences have been identified, Ala, Val, Leu, Ile, Gly, Ser, Thr, Asn, Gln and Asp tend to make up around 90% of the sequence, Cys and Trp being absent []. Flagellin plays a role in the activation of innate and adaptive immunity. It is a specific ligand for Toll-like receptor 5 (TLR5) in the host, which has led to great interest in using it as an adjuvant for vaccines [ , , ]. The protein is also recognised by the intracellular NAIP5/NLRC4 inflammasome receptor [ ]. This entry represents the C-terminal domain of flagellin and similar bacterial proteins. This domain comes together with the N-terminal domain to form the D0 and D1 structural domains [ ]. These domains are responsible for the activation of TLR5, with the C-terminal D0 region playing a key role [, , ].
Protein Domain
Name: MAN1, RNA recognition motif
Type: Domain
Description: This entry represents the RNA recognition motif 1 (RRM1) of Man1, an integral protein of the inner nuclear membrane that binds to nuclear lamins and emerin, thus playing a role in nuclear organization [ ]. It is part of a protein complex essential for chromatin organization and cell division. It also functions as an important negative regulator for the transforming growth factor (TGF) beta/activin/Nodal signaling pathway by directly interacting with chromatin-associated proteins and transcriptional regulators, including the R-Smads, Smad1, Smad2, and Smad3 [, ]. Moreover, Man1 is a unique type of left-right (LR) signaling regulator that acts on the inner nuclear membrane. Man1 plays a crucial role in angiogenesis. Vascular remodeling can be regulated at the inner nuclear membrane through the interaction between Man1 and Smads []. Man1 contains an N-terminal LEM domain, two putative transmembrane domains, a MAN1-Src1p C-terminal (MSC) domain, and a C-terminal RNA recognition motif (RRM) [ ]. The LEM domain interacts with the DNA and chromatin-binding protein Barrier-to-Autointegration Factor, and is also necessary for efficient localization of MAN1 in the inner nuclear membrane []. Research has indicated that the C-terminal nucleoplasmic region of Man1 exhibits a DNA-binding winged helix domain and is responsible for both DNA- and Smad-binding []. Mutations in the Man1 gene cause Buschke-Ollendorff syndrome (BOS), an uncommon syndrome characterised by osteopoikilosis and other bone abnormalities [ ].
Protein Domain
Name: Metallothionein, family 6, nematoda
Type: Family
Description: Metallothioneins (MT) are small proteins that bind heavy metals, such as zinc, copper, cadmium, nickel, etc. They have a high content of cysteine residues that bind the metal ions through clusters of thiolate bonds [ , , ]. The metallothionein superfamily comprises all polypeptides that resemble equine renal metallothionein in several respects, e.g. low molecular weight; high metal content; amino acid composition with high Cys and low aromatic residue content; unique sequence with characteristic distribution of cysteines, and spectroscopic manifestations indicative of metal thiolate clusters. A MT family subsumes MTs that share particular sequence-specific features and are thought to be evolutionarily related. Fifteen MT families have been characterised, each family being identified by its number and its taxonomic range. Nematode (family 6) MTs are 62-74 residue proteins, containing 18 conserved cysteines and binding 6 cadmium ions. The protein also binds cations of several transition elements. The cysteine residues are arranged in C-X-C and X-C-C-X groups. In particular, the consensus pattern K-C-C-x(3)-C-C has been found to be diagnostic of family 6 metallothioneins. The protein is induced by cadmium, and is abundantly and exclusively expressed in the intestinal cells of larvae and adult animals [ ]. Subfamilies of this family, n1 and n2, hit the same entry. It is known that the identity between n1 and n2 is about 60% and n2 is longer than n1.
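The consensus pattern K-C-C-x(3)-C-C quoted in the description above is written in PROSITE-style notation, where `x(3)` stands for any three residues. As a minimal sketch, assuming a simple subset of that notation (literal residues, `x`, and `x(n)` only), the pattern can be converted to a regular expression and used to scan a sequence. The function names and the test sequences are hypothetical and are not taken from the database.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern to a regex string.

    Only literal residues, 'x' and 'x(n)' are handled here; this is an
    illustrative subset, not a full PROSITE parser.
    """
    parts = []
    for element in pattern.split("-"):
        repeat = re.fullmatch(r"x\((\d+)\)", element)
        if repeat:
            parts.append(".{%s}" % repeat.group(1))  # x(n): any n residues
        elif element == "x":
            parts.append(".")                        # x: any single residue
        else:
            parts.append(element)                    # literal residue
    return "".join(parts)


# Pattern quoted in the family 6 metallothionein description.
MT6_REGEX = re.compile(prosite_to_regex("K-C-C-x(3)-C-C"))


def has_family6_signature(sequence: str) -> bool:
    """Return True if the sequence contains the K-C-C-x(3)-C-C pattern."""
    return MT6_REGEX.search(sequence.upper()) is not None
```

A match only indicates that the diagnostic pattern is present; it does not replace the curated family assignment.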
Protein Domain
Name: Transcription factor, MADS-box superfamily
Type: Homologous_superfamily
Description: Human serum response factor (SRF) is a ubiquitous nuclear protein important for cell proliferation and differentiation. SRF function is essential for transcriptional regulation of numerous growth-factor-inducible genes, such as c-fos oncogene and muscle-specific actin genes. A core domain of around 90 amino acids is sufficient for the activities of DNA-binding, dimerisation and interaction with accessory factors. Within the core is a DNA-binding region, designated the MADS box [ ], that is highly similar to many eukaryotic regulatory proteins: among these are MCM1, the regulator of cell type-specific genes in budding yeast; DSRF, a Drosophila trachea development factor; the MEF2 family of myocyte-specific enhancer factors; and the Agamous and Deficiens families of plant homeotic proteins. In SRF, the MADS box has been shown to be involved in DNA-binding and dimerisation [ ]. Proteins belonging to the MADS family function as dimers, the primary DNA-binding element of which is an anti-parallel coiled coil of two amphipathic α-helices, one from each subunit. The DNA wraps around the coiled coil allowing the basic N-termini of the helices to fit into the DNA major groove. The chain extending from the helix N-termini reaches over the DNA backbone and penetrates into the minor groove. A 4-stranded, anti-parallel β-sheet packs against the coiled-coil face opposite the DNA and is the central element of the dimerisation interface. The MADS-box domain is commonly found associated with the K-box region.
Protein Domain
Name: MDM2, modified RING finger, HC subclass
Type: Domain
Description: MDM2 is an E3 ubiquitin-protein ligase that mediates ubiquitination of p53/TP53, leading to its degradation by the proteasome [ ]. p53 acts as an important defense mechanism against cancer, and is negatively regulated by interaction with the oncoprotein MDM2 []. MDM2 overexpression correlates with metastasis and advanced forms of several cancers and may be used as a cancer drug target []. In addition, MDM2 has important roles in the cell independent of p53. It interacts with several proteins such as Rb/E2F-1 complex [], the DNA methyltransferase DNMT3A [], p107 [], MTBP [] and the cyclin kinase inhibitor p21 []. MDM2 also affects cell apoptosis [, ]. MDM2 contains an N-terminal p53-binding domain, and a C-terminal modified C2H2C4-type RING-HC finger conferring E3 ligase activity that is required for ubiquitination and nuclear export of p53. It is also responsible for the hetero-oligomerization of MDM2, which is crucial for the suppression of p53 activity during embryonic development, and the recruitment of E2 ubiquitin-conjugating enzymes [ ]. MDM2 also harbours a RanBP2-type zinc finger (Znf-RanBP2) domain, as well as a nuclear localisation signal (NLS) and a nuclear export signal (NES), near the central acidic region. The Znf-RanBP2 domain plays an important role in mediating MDM2 binding to ribosomal proteins and thus is involved in MDM2-mediated p53 suppression. This entry represents the C-terminal modified C2H2C4-type RING-HC finger.
Protein Domain
Name: Copper chaperone PCuAC
Type: Family
Description: CuA is a dinuclear copper site within the soluble domain of subunit II (Cox2) of bacterial and eukaryotic cytochrome c oxidases (CcOs), whose function is to convey electrons from a soluble cytochrome c to the catalytic heme a3-CuB centre of CcO. The proper assembly of the CuA site is essential for the catalytic machinery of a functional oxidase. In prokaryotes two protein families have been proposed to be involved in CuA site formation, with a key role in the delivery of metal ions to the CuA site. The first includes proteins that are able to bind Cu(I) through methionine and histidine residues arranged in a highly conserved H(M)X(10)MX(21)HXM motif 5 (referred to as periplasmic CuA chaperone, PCu(A)C). The second consists of the Sco proteins, whose mechanism of action in CuA assembly as thioredoxins or metallochaperones is still debated. These proteins (PCuAC and Sco) are often found in the same bacterial operon, and most of the identified operons that encode Sco also contain a gene for Cox2 [ ]. PCu(A)C is a periplasmic copper chaperone. It selectively inserts Cu(I) ions into subunit II of Thermus thermophilus ba3 oxidase to generate a native Cu(A) site. Its role may be to capture and transfer copper to two other copper chaperones, PrrC and Cox11, which in turn deliver Cu(I) to cytochrome c oxidase [ ].
Protein Domain
Name: Dedicator of cytokinesis D, C2 domain
Type: Domain
Description: DOCK family members are evolutionarily conserved guanine nucleotide exchange factors (GEFs) for Rho-family GTPases [ ]. DOCK proteins are required during several cellular processes, such as cell motility and phagocytosis. The N-terminal SH3 domain of the DOCK proteins functions as an inhibitor of GEF, which can be relieved upon its binding to the ELMO1-3 adaptor proteins, after their binding to active RhoG at the plasma membrane [, ]. DOCK family proteins are categorised into four subfamilies based on their sequence homology: DOCK-A subfamily (DOCK1/180, 2, 5), DOCK-B subfamily (DOCK3, 4), DOCK-C subfamily (DOCK6, 7, 8), DOCK-D subfamily (DOCK9, 10, 11) []. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). This entry represents the C2 domain of the Dock-D members. In addition to the C2 domain (also known as the DHR-1 domain) and the DHR-2, Dock-D members contain a functionally uncharacterised domain and a PH domain upstream of the C2 domain. DHR-2 has the catalytic activity for Rac and/or Cdc42, but is structurally unrelated to the DH domain. The C2/DHR-1 domains of Dock1 (also known as Dock180) and Dock4 have been shown to bind phosphatidylinositol-3,4,5-trisphosphate (PtdIns(3,4,5)P3). The PH domain broadly binds to phospholipids and is thought to be involved in targeting the plasma membrane [ , , ].
Protein Domain
Name: Signal transducer and activator of transcription 4, DNA-binding domain
Type: Domain
Description: Signal transducer and activator of transcription 4 (STAT4) transduces interleukin-12, interleukin-23, and type I interferon cytokine signals in T cells and monocytes [ , ]. It plays an important role in CD4+ Th1 lineage differentiation and IFN-gamma protein expression by CD4+ T cells []. It is crucial for both innate and adaptive immune responses to viral infection []. Variations of the STAT4 gene affect the susceptibility to autoimmune diseases [], such as systemic lupus erythematosus 11 (SLEB11) [] and rheumatoid arthritis (RA) []. STAT proteins have a dual function: signal transduction and activation of transcription. When cytokines are bound to cell surface receptors, the associated Janus kinases (JAKs) are activated, leading to tyrosine phosphorylation of the given STAT proteins [ ]. Phosphorylated STATs form dimers, translocate to the nucleus, and bind specific response elements to activate transcription of target genes []. STAT proteins contain an N-terminal domain (NTD), a coiled-coil domain (CCD), a DNA-binding domain (DBD), an α-helical linker domain (LD), an SH2 domain, and a transactivation domain (TAD). The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6 []. This entry represents the DNA-binding domain (DBD) of STAT4. It has an immunoglobulin-like structural fold.
Protein Domain
Name: Orange carotenoid-binding protein, N-terminal
Type: Domain
Description: Carotenoids such as beta-carotene, lycopene, lutein and beta-cryptoxanthin are produced in plants and certain bacteria, algae and fungi, where they function as accessory photosynthetic pigments and as scavengers of oxygen radicals for photoprotection. They are also essential dietary nutrients in animals. Orange carotenoid-binding proteins (OCP) were first identified in cyanobacterial species, where they occur associated with phycobilisome in the cellular thylakoid membrane. These proteins function in photoprotection, and are essential for inhibiting white and blue-green light non-photochemical quenching (NPQ) [ , ]. Carotenoids improve the photoprotectant activity by broadening OCP's absorption spectrum and facilitating the dissipation of absorbed energy. OCP acts as a homodimer, and binds one molecule of carotenoid (3'-hydroxyechinenone) and one chloride ion per subunit, where the carotenoid binding site is lined with a striking number of methionine residues. The carotenoid 3'-hydroxyechinenone is not found in higher plants. OCP has two domains: an N-terminal helical domain and a C-terminal domain that resembles an NTF2 (nuclear transport factor 2) domain. OCP can be proteolytically cleaved into a red form (RCP), which lacks 15 residues from the N terminus and approximately 150 residues from the C terminus []. This entry represents the N-terminal domain found predominantly in prokaryotic orange carotenoid proteins and related carotenoid-binding proteins. It adopts an α-helical structure consisting of two four-helix bundles [ ].
Protein Domain
Name: KaiA/RsbU helical domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents a structural domain consisting of four helices in a bundle with a right-handed superhelix. Homologous structural domains can be found in the C-terminal domain of the circadian clock protein KaiA and in the N-terminal domain of the phosphoserine phosphatase protein RsbU. The cyanobacterial clock proteins KaiA, KaiB and KaiC are proposed as regulators of the circadian rhythm in cyanobacteria. KaiA activates the expression of the kaiBC locus, while KaiC represses it. KaiA is composed of three functional domains: the N-terminal amplitude-amplifier domain, the central period-adjuster domain and the C-terminal clock-oscillator domain. The C-terminal domain is responsible for dimer formation, binding to KaiC, enhancing KaiC phosphorylation and generating the circadian oscillations [ ]. The KaiA protein from Anabaena sp. (strain PCC 7120) lacks the N-terminal CheY-like domain. The phosphoserine phosphatase RsbU acts as a positive regulator of the general stress-response factor of Gram-positive organisms, sigma-B. RsbU dephosphorylates RsbV in response to environmental stress conveyed from the rsbXST module. The phosphatase activity of RsbU is stimulated during the stress response by associating with the RsbT kinase. This association leads to the induction of sigma-B activity. The N-terminal domain forms a helix-swapped dimer that is otherwise similar to the KaiA domain dimer. Deletions in the N-terminal domain are deleterious to the activity of RsbU. The C-terminal domain of RsbU is similar to the catalytic domains of PP2C-type phosphatases [ ].
Protein Domain
Name: Enterobactin synthetase-like, component D
Type: Family
Description: Iron is essential for growth in both bacteria and mammals. Controlling the amount of free iron in solution is often used as a tactic by hosts to limit invasion of pathogenic microbes; binding iron tightly within protein molecules can accomplish this. Such iron-protein complexes include haem in blood, lactoferrin in tears/saliva and transferrin in blood plasma. Some bacteria express surface receptors to capture eukaryotic iron-binding compounds, while others have evolved siderophores to scavenge iron from iron-binding host proteins [ ]. The absence of free iron in the surrounding environment triggers transcription of gene clusters that encode both siderophore-synthesis enzymes, and receptors that recognise iron-bound siderophores [ ]. Classic examples are the enterobactin/enterochelin clusters found in Escherichia coli and Salmonella, although similar moieties in other pathogens have been identified. The enzymic machinery that produces vibriobactin in Vibrio cholerae is such a homologue []. EntD (also known as 4'-phosphopantetheinyl transferase EntD) forms part of the enterobactin-synthetase enzyme complex. It plays an essential role in the assembly of enterobactin by catalysing the transfer of the 4'-phosphopantetheine (Ppant) moiety from coenzyme A to the apo-domains of both EntB (ArCP domain) and EntF (PCP domain) to yield their holo-forms, which makes them competent for the activation of 2,3-dihydroxybenzoate (DHB) and L-serine, respectively []. Deletion studies involving EntD- mutants have shown that it is essential for virulence []. This entry also identifies some 4'-phosphopantetheinyl transferase proteins.
Protein Domain
Name: Dedicator of cytokinesis 4, SH3 domain
Type: Domain
Description: DOCK family members are evolutionarily conserved guanine nucleotide exchange factors (GEFs) for Rho-family GTPases [ ]. DOCK proteins are required during several cellular processes, such as cell motility and phagocytosis. The N-terminal SH3 domain of the DOCK proteins functions as an inhibitor of GEF, which can be relieved upon its binding to the ELMO1-3 adaptor proteins, after their binding to active RhoG at the plasma membrane [, ]. DOCK family proteins are categorised into four subfamilies based on their sequence homology: DOCK-A subfamily (DOCK1/180, 2, 5), DOCK-B subfamily (DOCK3, 4), DOCK-C subfamily (DOCK6, 7, 8), DOCK-D subfamily (DOCK9, 10, 11) []. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). DOCK4 is an atypical guanine nucleotide exchange factor (GEF) that lacks the conventional Dbl homology (DH) domain. It activates small GTPases by exchanging bound GDP for free GTP. It plays a role in regulating dendritic growth and branching in hippocampal neurons, where it is highly expressed. It may also regulate spine morphology and synapse formation and has been linked to autism, dyslexia, and schizophrenia [ ]. It also plays a critical role in mediating TGF-beta's prometastatic effects in lung cancer []. This entry represents the SH3 domain found in DOCK4.
Protein Domain
Name: Closterovirus 1a polyprotein, central region
Type: Family
Description: This family represents an alignment of the Zemlya region of closteroviruses. The alignment of the 1a polyprotein of the Closteroviridae family members revealed that this region was not conserved in other genera. Homologs of the Zemlya region are not found in other viral or cellular proteins. The region is named "the Zemlya region" (zemlya is the Russian word for "earth"), meaning that its conserved amino acid sequence represents solid ground within the highly variable central region of the 1a polyprotein. It is composed of four predicted α-helices, alphaA to alphaD, and contains three conserved positions: i) a strictly conserved glutamate (E) in helix alphaA (E1291 in Beet yellows closterovirus (BYV)); ii) a strictly conserved proline (P1380) in alphaD; and iii) a conserved basic position (arginine or lysine; R1384 in BYV). The presence of a conserved proline in helix alphaD is noteworthy because prolines are strongly disfavoured in helices; this proline most probably induces a kink in the helix. Functional studies have suggested that most of the Zemlya region targets the ER and remodels the ER membranes. More specifically, deletion analysis and substitutions of the conserved hydrophobic amino acid residues suggest a role of the putative amphipathic helix 1368-1385 (alphaD) in the formation of globules. Hence it was proposed that this specific region in the 1a protein may be involved in the biogenesis of closterovirus [ ].
Protein Domain
Name: Voltage gated sodium channel, alpha subunit
Type: Family
Description: Voltage-dependent sodium channels are transmembrane (TM) proteins responsible for the depolarising phase of the action potential in most electrically excitable cells []. They may exist in 3 states []: the resting state, where the channel is closed; the activated state, where the channel is open; and the inactivated state, where the channel is closed and refractory to opening. Several different structurally and functionally distinct isoforms are found in mammals, coded for by a multigene family, these being responsible for the different types of sodium ion currents found in excitable tissues. There are nine pore-forming alpha subunits of voltage-gated sodium channels, each consisting of four membrane-embedded homologous domains (I-IV), each containing six α-helical segments (S1-S6), three cytoplasmic loops connecting the domains, and a cytoplasmic C-terminal tail. The S6 segments of the four domains form the inner surface of the pore, while the S4 segments bear clusters of basic residues that constitute the channel's voltage sensors [ , , ]. Cation channels are transport proteins responsible for the movement of cations through the membrane. These proteins contain 6 transmembrane helices in which the last two helices flank a loop which determines ion selectivity. In some sub-families (e.g. Na channels) the domain is repeated four times, whereas in others (e.g. K channels) the protein forms as a tetramer in the membrane. This entry represents alpha subunits of the voltage-gated Na+ channel superfamily.
Protein Domain
Name: SPARC, follistatin-like domain
Type: Domain
Description: SPARC (also known as BM-40 or osteonectin) is a matricellular protein essential for embryo development in invertebrates and highly expressed in bone [ ]. It participates in normal tissue remodeling as it regulates the deposition of extracellular matrix, as well as in neoplastic transformation []. It is involved in extracellular matrix (ECM) assembly and fibrosis through binding both fibrillar collagen and basal lamina collagen IV []. It regulates the activity of matrix metalloproteinases (MMPs), as well as the growth factor signaling mediated by cell surface receptors including vascular endothelial growth factor (VEGF) receptor, basic fibroblast growth factor (bFGF), and transforming growth factor (TGF) beta1. Overexpression of SPARC has been linked to cancers []. SPARC is also a bone-associated protein that has a major role in bone development and mineralisation. It is involved in the initiation and progression of vascular calcification and upregulated by adiponectin []. Furthermore, SPARC may be one of the molecules that govern the uptake and delivery of proteins from blood to the cerebrospinal fluid (CSF) during brain development []. SPARC contains an N-terminal acidic 52-residue segment followed by a follistatin-like (FS) domain, and an α-helical EC domain with 2 unusual calcium-binding EF-hands and the collagen-binding site [ ]. Platelet-derived growth factor (PDGF) interacts with its EC domain, but in a calcium-independent manner, whereas collagen binding is calcium-dependent [, , ]. This entry represents the FS domain of SPARC.
Protein Domain
Name: Malic enzyme, N-terminal domain
Type: Domain
Description: Malic enzymes (malate oxidoreductases) catalyse the oxidative decarboxylation of malate to form pyruvate, a reaction important in a number of metabolic pathways - e.g. carbon dioxide released from the reaction may be used in sugar production during the Calvin cycle of photosynthesis [ ]. There are 3 forms of the enzyme []: an NAD-dependent form that decarboxylates oxaloacetate; an NAD-dependent form that does not decarboxylate oxaloacetate; and an NADPH-dependent form []. Other proteins known to be similar to malic enzymes are the Escherichia coli scfA protein; an enzyme from Zea mays (Maize), formerly thought to be cinnamyl-alcohol dehydrogenase []; and the hypothetical Saccharomyces cerevisiae protein YKL029c. Studies on the duck liver malic enzyme reveal that it can be alkylated by bromopyruvate, resulting in the loss of oxidative decarboxylation and the subsequent enhancement of pyruvate reductase activity [ ]. The alkylated form is able to bind NADPH but not L-malate, indicating impaired substrate or divalent metal ion-binding in the active site []. Sequence analysis has highlighted a cysteine residue as the point of alkylation, suggesting that it may play an important role in the activity of the enzyme [], although it is absent in the sequences from some species. Malic enzyme is a tetramer comprised of subunits with four domains each [ , , ]. This entry represents the N-terminal domain of the NAD(P)-dependent malic enzyme and related proteins from bacteria, eukaryotes and archaea.
Protein Domain
Name: CENP-T/Histone H4, histone fold
Type: Domain
Description: This domain is the C-terminal histone fold domain of CENP-T, which associates with chromatin [ , ]. Proteins containing this domain also include Histone H4. CENP-T is a family of vertebrate kinetochore proteins that associates directly with CENP-W. The N terminus of CENP-T proteins interacts directly with the Ndc80 complex in the outer kinetochore. Importantly, the CENP-T-W complex does not directly associate with CENP-A, but with histone H3 in the centromere region. CENP-T and -W form a hetero-tetramer with CENP-S and -X and bind to a ~100 bp region of nucleosome-free DNA forming a nucleosome-like structure. The DNA-CENP-T-W-S-X complex is likely to be associated with histone H3-containing nucleosomes rather than with CENP-nucleosomes [ , , ]. Histone H4 is one of the five histones, along with H1/H5, H2A, H2B and H3. Two copies of each of the H2A, H2B, H3, and H4 histones assemble to form the core of the nucleosome [ ]. The nucleosome forms an octameric structure that wraps DNA in a left-handed manner. H3 is a highly conserved protein of 135 amino acid residues [, ]. Histones can undergo several different types of post-translational modifications that affect transcription, DNA repair, DNA replication and chromosomal stability. The sequence of histone H4 has remained almost invariant in more than 2 billion years of evolution [, , ].
Protein Domain
Name: p53, DNA-binding domain
Type: Domain
Description: P53 is a tumor suppressor gene product; mutations in p53 or lack of expression are found associated with a large fraction of all human cancers. P53 is activated by DNA damage and acts as a regulator of gene expression that ultimately blocks progression through the cell cycle. P53 binds to DNA as a tetrameric transcription factor. In its inactive form, p53 is bound to the ring finger protein Mdm2, which promotes its ubiquitinylation and subsequent proteasomal degradation. Phosphorylation of p53 disrupts the Mdm2-p53 complex, while the stable and active p53 binds to regulatory regions of its target genes, such as the cyclin-kinase inhibitor p21, which complexes and inactivates cdk2 and other cyclin complexes [ , , , , , , , , , ]. This domain is found in p53 transcription factors, where it is responsible for DNA-binding. The DNA-binding domain acts to clamp, or in the case of TonEBP, encircle the DNA target in order to stabilise the protein-DNA complex [ ]. Protein interactions may also serve to stabilise the protein-DNA complex, for example in the STAT-1 dimer the SH2 (Src homology 2) domain in each monomer is coupled to the DNA-binding domain to increase stability []. The DNA-binding domain consists of a β-sandwich formed of 9 strands in 2 sheets with a Greek-key topology. This structure is found in many transcription factors, often within the DNA-binding domain.
Protein Domain
Name: Cytochrome c6
Type: Family
Description: Cytochrome c (CytC) proteins can be defined as electron-transfer proteins having one or several haem c groups, bound to the protein by one or, more generally, two thioether bonds involving sulphydryl groups of cysteine residues. The fifth haem iron ligand is always provided by a histidine residue. CytC possess a wide range of properties and function in a large number of different redox processes. Ambler [ ] recognised four classes of CytC. Class I includes the low-spin soluble CytC of mitochondria and bacteria, with the haem-attachment site towards the N terminus, and the sixth ligand provided by a methionine residue about 40 residues further on towards the C terminus. On the basis of sequence similarity, class I CytC were further subdivided into five classes, IA to IE. Class IC, 'split-alpha-band' Cyt C, possess a widened or split alpha-band of lowered absorptivity. This class includes dihaem Cyt C4 and monohaem Cyt C6 (Cyt C-553) and Cyt C-554. The 3D structures of Chlamydomonas reinhardtii Cyt C6 [ ] and Desulfovibrio vulgaris Cyt C-553 [] have been determined. The proteins consist of 4 α-helices; three 'core' helices form a 'basket' around the haem group, with one haem edge exposed to the solvent. This entry also includes Cytochrome c6 from Arabidopsis, which functions as an electron carrier between membrane-bound cytochrome b6-f and photosystem I in oxygenic photosynthesis [].
Protein Domain
Name: SopE-like, GEF domain superfamily
Type: Homologous_superfamily
Description: The type III secretion system of Gram-negative bacteria is used to transport virulence factors from the pathogen directly into the host cell [ ] and is only triggered when the bacterium comes into close contact with the host. Effector proteins secreted by the type III system do not possess a secretion signal, and are considered unique because of this. Salmonella spp. secrete an effector protein called SopE that is responsible for stimulating the reorganisation of the host cell actin cytoskeleton, and ruffling of the cellular membrane []. It acts as a guanyl-nucleotide-exchange factor on Rho-GTPase proteins such as Cdc42 and Rac. As it is imperative for the bacterium to revert the cell back to its "normal" state as quickly as possible, another effector, a tyrosine phosphatase called SptP, reverses the actions brought about by SopE [ ]. Recently, it has been found that SopE and its protein homologue SopE2 can activate different sets of Rho-GTPases in the host cell [ ]. Far from being a redundant set of two similar type III effectors, they both act in unison to specifically activate different Rho-GTPase signalling cascades in the host cell during infection. This entry represents the guanine nucleotide exchange factor domain of SopE and homologues. This domain has an α-helical structure consisting of two three-helix bundles arranged in a lambda shape [ , ].
Protein Domain
Name: Glucose-fructose oxidoreductase, bacterial
Type: Family
Description: Glucose-fructose oxidoreductase (GFOR) catalyses the formation of D-gluconolactone and D-glucitol from D-glucose and D-fructose. It has one tightly-bound NADP(H) per enzyme subunit, it exists as a homotetramer, and is one of the pivotal proteins in the sorbitol-gluconate pathway. It is targeted to the periplasm of the Gram-negative cell envelope, and belongs to the GFO/IDH/MOCA superfamily. First discovered in Zymomonas mobilis, homologues have also been found in Caulobacter crescentus and Deinococcus radiodurans. GFOR is of great interest as its mechanism of secretion into the bacterial periplasm differs from other precursor proteins of the Twin Arginine Translocation (TAT) pathway [ ]. Although it exhibits the consensus TAT signal motif (S/T-R-R-x-L-F-K) at its N terminus, unlike other TAT proteins that can be universally secreted across a number of Gram-negative microbes, GFOR is only translocated in Z. mobilis. However, replacing the Z. mobilis signal peptide with one from Escherichia coli restores this function. This observation has led to the suggestion that TAT-dependent precursors are optimally adapted only to their particular cognate secretion apparatus []. Recently, the crystal structure of Z. mobilis GFOR was solved at 2.5 Å resolution by X-ray crystallography. This revealed that the protein indeed exists as a homotetramer, and has 4 active sites. There are 2 distinct domains: a classical dinucleotide binding fold at the N terminus and a 9-stranded β-sheet at the C terminus. NADP(H) is bound to the N terminus of the first α-helix.
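The consensus TAT signal motif S/T-R-R-x-L-F-K mentioned in the description above can be written as a regular expression, with `[ST]` for the first position and `.` for the variable residue. The following is a minimal sketch only: the function name, the 60-residue N-terminal search window and the example sequences are assumptions made for illustration, and a strict motif match is not a substitute for a dedicated signal-peptide predictor.

```python
import re
from typing import Optional

# S/T-R-R-x-L-F-K written as a regex: [ST] then RR, any residue, then LFK.
TAT_MOTIF = re.compile(r"[ST]RR.LFK")


def find_tat_motif(sequence: str, n_terminal_window: int = 60) -> Optional[int]:
    """Return the 0-based start of the first motif match, or None.

    Only the first `n_terminal_window` residues are searched, because the
    motif is described as lying at the N terminus. The window size is an
    assumed value, not one given in the description.
    """
    match = TAT_MOTIF.search(sequence.upper()[:n_terminal_window])
    return match.start() if match else None
```

Restricting the search to an N-terminal window avoids reporting chance matches deep inside the mature protein.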
Protein Domain
Name: V0 complex accessory subunit Ac45
Type: Family
Description: V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release [ ]. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins []. This entry represents the V0 complex Ac45 accessory subunit (ATP6AP1, also known as V-type proton ATPase subunit S1), an ER/Golgi membrane protein. This subunit is synthesized as an N-glycosylated 60 kDa precursor that is intracellularly cleaved to a protein of about 45 kDa. This subunit plays a crucial role in V0 assembly, stability and function as it connects to multiple V0 subunits and phospholipids in the c-ring [ ]. This subunit assists the V-ATPase in the acidification of neuroendocrine granules [] and guides the V-ATPase into specialized subcellular compartments such as neuroendocrine regulated secretory vesicles or the ruffled border of the osteoclast []. In humans, mutations of ATP6AP1 cause immunodeficiency with hypogammaglobulinemia, hepatopathy and neurocognitive abnormalities [].
Protein Domain
Name: FAAP20, zinc finger UBZ2-type
Type: Domain
Description: The ubiquitin-binding zinc finger (UBZ) is a type of zinc-coordinating β-β-α fold domain found mainly in proteins involved in DNA repair and transcriptional regulation. UBZ domains coordinate a zinc ion with cysteine or histidine residues; depending on their amino acid sequence, UBZ domains are classified into several families. Type 1 UBZs are CCHH-type zinc fingers found in tandem UBZ domains of TAX1-binding protein 1 (TAX1BP1); type 2 UBZs are CCHC-type zinc fingers found in FAAP20, which is a subunit of the Fanconi anemia (FA) core complex; type 3 UBZs are CCHH-type zinc fingers found only in the Y-family translesion polymerase eta; and type 4 UBZs are CCHC-type zinc fingers found in Y-family translesion polymerase kappa, Werner helicase-interacting protein 1 (WRNIP1), and Rad18. The UBZ domain consists of two short antiparallel β-strands followed by one α-helix. The α-helix packs against the β-strands with a zinc ion sandwiched between the α-helix and the β-strands. The zinc ion is coordinated by two cysteines located on the fingertip formed by the β-strands and by two histidines, or one histidine and one cysteine, on the α-helix. This domain is the type 2 UBZ found in Fanconi anemia-associated protein of 20 kDa (FAAP20).
Protein Domain
Name: Guanine nucleotide exchange factor SopE, GEF domain
Type: Domain
Description: The type III secretion system of Gram-negative bacteria is used to transport virulence factors from the pathogen directly into the host cell and is only triggered when the bacterium comes into close contact with the host. Effector proteins secreted by the type III system do not possess a secretion signal, and are considered unique because of this. Salmonella spp. secrete an effector protein called SopE that is responsible for stimulating the reorganisation of the host cell actin cytoskeleton, and ruffling of the cellular membrane. It acts as a guanyl-nucleotide-exchange factor on Rho-GTPase proteins such as Cdc42 and Rac. As it is imperative for the bacterium to revert the cell back to its "normal" state as quickly as possible, another effector, a tyrosine phosphatase called SptP, reverses the actions brought about by SopE. Recently, it has been found that SopE and its protein homologue SopE2 can activate different sets of Rho-GTPases in the host cell. Far from being a redundant set of two similar type III effectors, they act in unison to specifically activate different Rho-GTPase signalling cascades in the host cell during infection. This entry represents the guanine nucleotide exchange factor domain of SopE. This domain has an α-helical structure consisting of two three-helix bundles arranged in a lambda shape.
Protein Domain
Name: Spectrin, beta subunit
Type: Family
Description: Spectrins are involved in the support of general membrane integrity, stabilisation of cell-cell interactions, axonal growth, normal functioning of the Golgi complex and organisation of synaptic vesicles. Spectrin is a tetrameric actin cross-linking protein, which contains two alpha and two beta subunits. Two genes for alpha-spectrin and five for beta-spectrin have been identified in both mice and humans, each of which is alternatively spliced to produce multiple spectrin isoforms. Beta-spectrins are more diverse than alpha-spectrin, and include mammalian erythrocytic beta-spectrin, non-erythroid beta-spectrin/Fodrin (beta-G, the general form of beta-spectrin expressed in multiple tissues), a novel beta-G spectrin (ELF1-4) that lacks the C-terminal PH (pleckstrin homology) domain, the brain-specific SPTBN2, and beta-V spectrin. ELF, a modulator of the Smad adaptor proteins involved in the TGF-beta signalling pathway, was originally identified from endodermal stem/progenitor cells committed to foregut lineage. ELF is a beta-spectrin that is important for distinct functional membrane generation, protein sorting, cell adhesion and the development of a polarized differentiated epithelial cell. ELF-deficient mice display disruption of transforming growth factor-beta (TGF-beta) signalling by Smad proteins. Evidence from null mutants of ELF confirms that ELF is a novel beta-G spectrin and not an isoform of beta-spectrins. Aberrations in ELF's involvement in Smad4 localization and subsequent activation of Smad4 could result in tumourigenesis.
Protein Domain
Name: Pancreatic lipase
Type: Family
Description: Triglyceride lipases are lipolytic enzymes that hydrolyse ester linkages of triglycerides. Lipases are widely distributed in animals, plants and prokaryotes. At least three tissue-specific isozymes exist in higher vertebrates: pancreatic, hepatic and gastric/lingual. These lipases are closely related to each other and to lipoprotein lipase, which hydrolyses triglycerides of chylomicrons and very low density lipoproteins (VLDL). Pancreatic lipase (triacylglycerol acylhydrolase) plays a key role in dietary fat absorption by hydrolysing dietary long chain triacylglycerol to free fatty acids and monoacylglycerols in the intestinal lumen. The activity of lipase is stimulated by colipase in the presence of bile acids. The 3D structure of human pancreatic lipase has been determined by X-ray crystallography. The enzyme is a single-chain glycoprotein of 449 amino acids. Structural results suggest that Ser 152 is the nucleophilic residue essential for catalysis. The residue is located in the N-terminal domain at the C-terminal edge of a doubly-wound parallel β-sheet, and forms part of an Asp-His-Ser triad that is chemically analogous to, but structurally different from, that of the serine proteases. The putative hydrolytic site is covered by a surface loop, and is thus inaccessible to solvent. It is thought that interfacial activation may involve a reorientation of this flap, not only in pancreatic lipases but also in the related hepatic and lipoprotein lipases.
Protein Domain
Name: EXOC6/PINT-1/Sec15/Tip20, C-terminal, domain 2
Type: Homologous_superfamily
Description: This superfamily represents a subdomain known as domain D found at the C terminus of Tip20, which is part of the Dsl1p vesicle tethering complex essential for trafficking from the Golgi apparatus to the ER. The structure of Tip20p consists entirely of α-helices and intervening loops of variable length, organised into a series of helix bundle domains. Subunits of the vesicle tethering complex (such as Tip20 and Dsl1) share protein sequence similarity with known subunits of the exocyst complex, establishing a structural connection among several multi-subunit tethering complexes and implying that many of their subunits are derived from a common progenitor. Proteins containing this domain include EXOC6/PINT-1 from animals, Tip20 from budding yeast, Sec15 from fission yeasts and MAIGO2 (Mag2) from plants. Sec15 (EXOC6 homologue) is an exocyst complex component that links Sec4 and downstream fusion effectors at discrete cellular locations. Its C-terminal domain mediates Rab GTPase binding, which occurs in a GTP-dependent manner. PINT-1/Tip20/MAIGO2 play a role in anterograde transport from the endoplasmic reticulum (ER) to the Golgi and/or retrograde transport from the Golgi to the ER. They share a similar domain organisation with an N-terminal leucine heptad repeat-rich coiled coil and an ~500-residue C-terminal RINT1/TIP20 domain, which might be a protein-protein interaction module necessary for the formation of functional complexes.
Protein Domain
Name: HTH-type transcriptional activator IlvY, PBP2 domain
Type: Domain
Description: In Escherichia coli, IlvY is required for regulating expression of the ilvC gene, which encodes acetohydroxy acid isomeroreductase (AHIR), a key enzyme in the biosynthesis of branched-chain amino acids (isoleucine, valine, and leucine). The ilvGMEDA operon genes encode the remaining enzyme activities required for the biosynthesis of these amino acids. Activation of ilvC transcription by IlvY requires the additional binding of a co-inducer molecule (either alpha-acetolactate or alpha-acetohydroxybutyrate, the substrates for AHIR) to a preformed complex of IlvY protein-DNA. Like many other LysR-family members, IlvY negatively auto-regulates the transcription of its own divergently transcribed ilvY gene in an inducer-independent manner. This substrate-binding domain has significant homology to the type 2 periplasmic binding proteins (PBP2). The PBP2 are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between their domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.
Protein Domain
Name: HTH-type transcriptional regulator AlsR, PBP2 domain
Type: Domain
Description: AlsR is responsible for activating the expression of the acetoin operon (alsSD) in response to inducing signals such as glucose and acetate. Like many other LysR family proteins, AlsR is transcribed divergently from the alsSD operon. The alsS gene encodes acetolactate synthase, an enzyme involved in the production of acetoin in stationary-phase cells. AlsS catalyzes the conversion of two pyruvate molecules to acetolactate and carbon dioxide. Acetolactate is then converted to acetoin at low pH by acetolactate decarboxylase, which is encoded by the alsD gene. Acetoin is an important physiological metabolite excreted by many microorganisms grown on glucose or other fermentable carbon sources. This substrate-binding domain shows significant homology to the type 2 periplasmic binding proteins (PBP2). The PBP2 are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between their domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.
Protein Domain
Name: Fringe
Type: Family
Description: The Notch receptor is a large, cell surface transmembrane protein involved in a wide variety of developmental processes in higher organisms. It becomes activated when its extracellular region binds to ligands located on adjacent cells. Much of this extracellular region is composed of EGF-like repeats, many of which can be O-fucosylated. A number of these O-fucosylated repeats can in turn be further modified by the action of a beta-1,3-N-acetylglucosaminyltransferase enzyme known as Fringe. Fringe potentiates the activation of Notch by Delta ligands, while inhibiting activation by Serrate/Jagged ligands. This regulation of Notch signalling by Fringe is important in many processes. Four distinct Fringe proteins have so far been studied in detail: Drosophila Fringe (Dfng) and its three mammalian homologues Lunatic Fringe (Lfng), Radical Fringe (Rfng) and Manic Fringe (Mfng). Dfng, Lfng and Rfng have all been shown to play important roles in developmental processes within their host, though the phenotype of mutants can vary between species, e.g. Rfng mutants are retarded in wing development in chickens, but have no obvious phenotype in mice. Mfng mutants have not, so far, been characterised. Biochemical studies indicate that the Fringe proteins are fucose-specific transferases requiring manganese for activity and utilising UDP-N-acetylglucosamine as a donor substrate. The three mammalian proteins show distinct variations in their catalytic efficiencies with different substrates.
Protein Domain
Name: Eukaryotic translation initiation factor 3 subunit G, N-terminal
Type: Domain
Description: At least eleven different protein factors are involved in initiation of protein synthesis in eukaryotes. Binding of initiator tRNA and mRNA to the 40S subunit requires the presence of the translation initiation factors eIF-2 and eIF-3, with eIF-3 being particularly important for 80S ribosome dissociation and mRNA binding. eIF-3 is the most complex translation initiation factor, consisting of about 13 putative subunits and having a molecular weight of between 550 and 700 kDa in mammalian cells. Subunits are designated eIF-3a to eIF-3m; the large number of subunits means that the interactions between the individual subunits that make up the eIF-3 complex are complex and varied. eIF-3G (also termed eIF-3 subunit 4, eIF-3-delta, eIF3-p42, or eIF3-p44) is the RNA-binding subunit of eIF3. Subunit eIF-3G binds 18S rRNA and beta-globin mRNA, and therefore appears to be a nonspecific RNA-binding protein. It is one of the cytosolic targets of, and interacts with, mature apoptosis-inducing factor (AIF). The yeast orthologue is known as eIF3-p33; it plays an important role in the initiation phase of protein synthesis in yeast. It binds both mRNA and rRNA fragments due to an RNA recognition motif near its C terminus. This entry represents a domain of approximately 130 amino acids in length found at the N terminus of eukaryotic translation initiation factor 3 subunit G. This domain is commonly found in association with the RNA recognition domain.
Protein Domain
Name: Rubicon Homology Domain
Type: Domain
Description: This is the Rubicon homology domain (RH) characterised at the C terminus of Rubicon, PLEKHM1 and Pacer, proteins that modulate late steps in autophagy. Rubicon (RUBCN) negatively regulates autophagy and endolysosomal trafficking by inhibiting PI3K complex II (PI3KC3-C2), which impairs the autophagosome maturation process. A decrease in autophagy is associated with aging, and suppression of this process by Rubicon has been linked to decreased clearance of alpha-synuclein aggregates in neural tissues, impairment of liver cell homeostasis, and interstitial fibrosis in the kidney. PLEKHM1 is an adapter protein that regulates Rab7-dependent and HOPS complex-dependent fusion events in the endolysosomal system and couples the autophagic and endocytic trafficking pathways, being involved in the suppression of endocytic transport rather than autophagosome maturation. Mutations in PLEKHM1 cause osteopetrosis. On the other hand, Pacer (Protein associated with UVRAG as autophagy enhancer or Rubicon-like) positively regulates autophagy, promoting autophagosome maturation by facilitating the biogenesis of phosphatidylinositol 3-phosphate (PtdIns3P) in late steps of autophagy. It antagonizes RUBCN, thereby stimulating phosphatidylinositol 3-kinase activity of the PI3K/PI3KC3 complex. Pacer is involved in neuronal autophagy; its deficiency leads to impaired autophagy and accumulation of protein aggregates in ALS, which correlates with cell death and vulnerability of motoneurons during ALS pathogenesis. This domain contains nine conserved cysteines and one conserved histidine, which have been predicted to bind divalent zinc cations and are required for Rubicon and PLEKHM1 to interact with Rab7.
Protein Domain
Name: Transglutaminase-like
Type: Domain
Description: This domain is found in many proteins known to have transglutaminase activity, i.e. which cross-link proteins through an acyl-transfer reaction between the gamma-carboxamide group of peptide-bound glutamine and the ε-amino group of peptide-bound lysine, resulting in an epsilon-(gamma-glutamyl)lysine isopeptide bond. Transglutaminases have been found in a diverse range of species, from bacteria to mammals. The enzymes require calcium binding and their activity leads to post-translational modification of proteins through acyl-transfer reactions, involving peptidyl glutamine residues as acyl donors and a variety of primary amines as acyl acceptors, with the generation of proteinase-resistant isopeptide bonds. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterised transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage psiM2 pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease. Studies of the A subunit of plasma Factor XIII revealed that each Factor XIIIA subunit is composed of four domains (termed N-terminal β-sandwich, core domain (containing the catalytic and the regulatory sites), and C-terminal β-barrels 1 and 2) and that two monomers assemble into the native dimer through the surfaces in domains 1 and 2, in opposite orientation. This four-domain organisation is highly conserved during evolution among transglutaminase isoforms.
Protein Domain
Name: Rad9/Ddc1
Type: Family
Description: This entry represents the DNA damage checkpoint protein Rad9 and its homologue in budding yeast, Ddc1. Rad9 forms a complex with Hus1 and Rad1 (called 9-1-1 complex). Ddc1 forms a similar complex with Mec3 and Rad17. Structurally, the 9-1-1 / Ddc1-Mec3-Rad17 complex is similar to the PCNA complex, which forms trimeric ring-shaped clamps. The 9-1-1 / Ddc1-Mec3-Rad17 complex plays a role in checkpoint activation that permits DNA-repair pathways to prevent cell cycle progression in response to DNA damage and replication stress. In humans, 9-1-1 binds to TopBP1 and activates the ATR-Chk1 checkpoint pathway. Besides its function in the 9-1-1 complex, Rad9 can also act as a transcriptional factor and participate in immunoglobulin class switch recombination. It also shows 3'-5' exonuclease activity. Aberrant Rad9 expression has been associated with prostate, breast, lung, skin, thyroid, and gastric cancers. In budding yeast, Ddc1 can activate Mec1 (the principal checkpoint protein kinase, human ATR homologue) in G1 phase. In G2 phase, Ddc1 can either activate Mec1 directly or recruit Dpb11 (the orthologue of human TopBP1) and subsequently activate Mec1. Ddc1 does not have DNA exonuclease function. It is worth noting that the Rad9 proteins referred to in this entry are the mammalian and fission yeast homologues of budding yeast Ddc1. Members of this family do not share sequence homology with another DNA damage-dependent checkpoint protein from budding yeast, confusingly also called Rad9.
Protein Domain
Name: Ubiquitin E2 variant, N-terminal
Type: Domain
Description: The N-terminal ubiquitin E2 variant (UEV) domain is ~145 amino acid residues in length and shows significant sequence similarity to E2 ubiquitin ligases but is unable to catalyze ubiquitin transfer as it lacks the active site cysteine that forms the transient thioester bond with the C terminus of ubiquitin (Ub). Nevertheless, at least some UEVs have retained the ability to bind Ub, and appear to act either as cofactors in ubiquitylation reactions, or as ubiquitin sensors. UEV domains also frequently contain other protein recognition motifs, and may generally serve to couple protein and Ub binding functions to facilitate the formation of multiprotein complexes. The UEV domain consists of a twisted four-stranded antiparallel β-sheet having a meander topology, with four α-helices packed against one face of the sheet. The UEV fold is generally similar to canonical E2 ligases in the hydrophobic core and 'active site' regions, but differs significantly at both its N- and C-termini. The UEV domain is found in the eukaryotic tumour susceptibility gene 101 protein (TSG101). Altered transcripts of this gene have been detected in sporadic breast cancers and many other human malignancies. However, the involvement of this gene in neoplastic transformation and tumourigenesis remains unclear. TSG101 is required for normal cell function of embryonic and adult tissues, but this gene is not a tumour suppressor for sporadic forms of breast cancer.
Protein Domain
Name: Signal recognition particle, SRP19 subunit
Type: Family
Description: The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome. In eukaryotes this retards its elongation until SRP docks the ribosome-polypeptide complex to the RER membrane via the SR receptor. Eukaryotic SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300-nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane. In archaea, the SRP complex contains 7S RNA like its eukaryotic counterpart, yet only includes two of the six protein subunits found in the eukaryotic complex: SRP19 and SRP54. This entry represents the SRP19 subunit. The SRP19 protein is unstructured but forms a compact core domain and two extended RNA-binding loops upon binding the signal recognition particle (SRP) RNA.
Protein Domain
Name: Carbohydrate binding module family 20
Type: Domain
Description: A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classifications of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with Roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. This entry represents CBM20, which binds starch. The crystal structure of CBM20 has been solved. It consists of seven β-strands forming an open-sided distorted β-barrel. Several aromatic residues, especially the well-conserved Trp and Tyr residues, participate in granular starch binding.
Protein Domain
Name: Seven-in-absentia protein, TRAF-like domain
Type: Domain
Description: Seven in absentia (sina) is a RING-type E3 ubiquitin ligase first identified in Drosophila. The Drosophila Sina protein is essential for the determination of the R7 pathway in photoreceptor cell development: the loss of functional Sina results in the transformation of the R7 precursor cell to a non-neuronal cell type. The Sina protein contains an N-terminal C3HC4-type RING finger domain, through which it binds E2 ubiquitin-conjugating enzymes (UbcD1). Sina also interacts with Tramtrack (TTK88) via PHYL. Tramtrack is a transcriptional repressor that blocks photoreceptor determination, while PHYL down-regulates the activity of TTK88. In turn, the activity of PHYL requires the activation of the Sevenless receptor tyrosine kinase, a process essential for R7 determination. It is thought that Sina targets TTK88 for degradation, therefore promoting the R7 pathway. The remaining C-terminal part is involved in interactions with other proteins, and consists of two zinc fingers and a TRAF-like domain. Murine and human homologues of Sina have also been identified, namely Siah1 and Siah2. The human homologue Siah-1 also binds E2 enzymes (UbcH5) and, through a series of physical interactions, targets beta-catenin for ubiquitin-mediated degradation. Siah-1 expression is enhanced by p53, itself promoted by DNA damage. Thus, this pathway links DNA damage to beta-catenin degradation. In addition to the Drosophila protein and mammalian homologues, whose similarity was noted previously, this family also includes putative homologues from Caenorhabditis elegans and Arabidopsis thaliana.
Protein Domain
Name: Translation elongation factor EFTu-like, domain 2
Type: Domain
Description: Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution. EF1A (also known as EF-1alpha or EF-Tu) is a G-protein. It forms a ternary complex of EF1A-GTP-aminoacyl-tRNA. The binding of aminoacyl-tRNA stimulates GTP hydrolysis by EF1A, causing a conformational change in EF1A that causes EF1A-GDP to detach from the ribosome, leaving the aminoacyl-tRNA attached at the A-site. Only the cognate aminoacyl-tRNA can induce the required conformational change in EF1A through its tight anticodon-codon binding. EF1A-GDP is returned to its active state, EF1A-GTP, through the action of another elongation factor, EF1B (also known as EF-Ts or EF-1beta/gamma/delta). EF1A consists of three structural domains. This entry represents domain 2 of EF1A, which adopts a β-barrel structure and is involved in binding to charged tRNA. This domain is structurally related to the C-terminal domain of EF2, to which it displays weak sequence matches. This domain is also found in other proteins such as translation initiation factor IF-2 and tetracycline-resistance proteins.
Protein Domain
Name: T2SS_GspF/T4SS_PilC conserved site
Type: Conserved_site
Description: GspF is the inner membrane component of the type II secretion system (T2SS). It interacts with GspE, a cytoplasmic hexameric ATPase of the T2SS. It shows considerable sequence similarity to PilC, which is required for the formation of type IV pili. The type II secretion system (T2SS) is one of several extracellular secretion systems in Gram-negative bacteria. It delivers toxins and a range of hydrolytic enzymes including proteases, lipases and carbohydrate-active enzymes to the cell surface or extracellular space. T2SS systems are composed of 11 to 15 different proteins, which are generally called GspA to GspO and GspS. The T2SS spans the two bacterial membranes and ensures secretion of folded proteins across the outer membrane pore formed by GspD. The inner membrane complex contains GspC, GspL, GspM, and GspF. The cytoplasmic domains of GspL and GspF interact with an ATPase, GspE. GspE is thought to energize the formation of a short pseudopilus by several pilin-like proteins, GspG to GspK. GspD has been shown to interact with the inner membrane component GspC. The T2SS pseudopilus is a periplasmic filament composed of the major pseudopilin, EpsG, and four minor pseudopilins, EpsH, EpsI, EpsJ and EpsK. The pseudopilus is assembled by the polymerization of GspG (also known as PulG) subunits. Pseudopilin proteins have a conserved N-terminal hydrophobic segment followed by a more variable C-terminal periplasmic and globular domain.
Protein Domain
Name: GspF/PilC family
Type: Family
Description: GspF is the inner membrane component of the type II secretion system (T2SS). It interacts with GspE, a cytoplasmic hexameric ATPase of the T2SS. It shows considerable sequence similarity to PilC, which is required for the formation of type IV pili. The type II secretion system (T2SS) is one of several extracellular secretion systems in Gram-negative bacteria. It delivers toxins and a range of hydrolytic enzymes including proteases, lipases and carbohydrate-active enzymes to the cell surface or extracellular space. T2SS systems are composed of 11 to 15 different proteins, which are generally called GspA to GspO and GspS. The T2SS spans the two bacterial membranes and ensures secretion of folded proteins across the outer membrane pore formed by GspD. The inner membrane complex contains GspC, GspL, GspM, and GspF. The cytoplasmic domains of GspL and GspF interact with an ATPase, GspE. GspE is thought to energize the formation of a short pseudopilus by several pilin-like proteins, GspG to GspK. GspD has been shown to interact with the inner membrane component GspC. The T2SS pseudopilus is a periplasmic filament composed of the major pseudopilin, EpsG, and four minor pseudopilins, EpsH, EpsI, EpsJ and EpsK. The pseudopilus is assembled by the polymerization of GspG (also known as PulG) subunits. Pseudopilin proteins have a conserved N-terminal hydrophobic segment followed by a more variable C-terminal periplasmic and globular domain.
Protein Domain
Name: Lutropin-choriogonadotropic hormone receptor
Type: Family
Description: Glycoprotein hormone receptors are members of the rhodopsin-like G-protein coupled receptor (GPCR) family. They function as receptors for the pituitary hormones thyrotropin (TSH receptor), follitropin (FSH receptor) and lutropin (LH receptor). In mammals the LH receptor is also the receptor for the placental hormone, human chorionic gonadotropin (hCG), and so is designated the lutropin-choriogonadotropic hormone receptor (LHCG receptor). The receptors share close sequence similarity, and are characterised by large extracellular domains believed to be involved in hormone binding via leucine-rich repeats (LRR). The beta subunits of the luteinizing hormone and human chorionic gonadotropin are closely related in sequence and elicit their biological actions via the same receptor. LH is released from the anterior pituitary under the influence of gonadotrophin-releasing hormone and progesterones. hCG is released by the placenta during pregnancy. In females, LH stimulates ovulation and is the major hormone involved in the regulation of progesterone secretion by the corpus luteum. In males, it stimulates Leydig cells to secrete androgens, particularly testosterone. This entry represents lutropin-choriogonadotropic hormone receptor (LCGR), also known as luteinizing hormone-choriogonadotropin receptor (LHCGR), or luteinizing hormone receptor (LH receptor). It is a transmembrane receptor found predominantly in organs involved in reproductive biology such as the ovary and testis. The receptor interacts with both luteinizing hormone and human chorionic gonadotropins and represents a G protein-coupled receptor, rhodopsin-type (GPCRA). Its activation is necessary for hormonal functioning during reproduction and activates adenylyl cyclase through G proteins.
Protein Domain
Name: Frizzled domain
Type: Domain
Description: The frizzled (fz) domain is an extracellular domain of about 120 amino acids. It was first identified in the alpha-1 chain of type XVIII collagen and in members of the Frizzled family of seven transmembrane (7TM) proteins which act as receptors for secreted Wingless (Wg)/Wnt glycoproteins. In addition to these proteins, one or two copies of the fz domain are also found in: the Frzb family of secreted frizzled-like proteins; Smoothened, another 7TM receptor involved in hedgehog signaling; carboxypeptidase Z (CPZ); the transmembrane serine protease corin (atrial natriuretic peptide-converting enzyme); and two receptor tyrosine kinase (RTK) subfamilies, the Ror family and the muscle-specific kinase (MuSK) family. As the fz domain contains 10 cysteines which are largely conserved, it has also been called the cysteine-rich domain (CRD). The fz domain also contains several other highly conserved residues: for example, a basic amino acid follows C6, and a conserved proline residue lies four residues C-terminal to C9. The crystal structure of a fz domain shows that it is predominantly α-helical with all cysteines forming disulphide bonds. In addition to helical regions, two short β-strands at the N terminus form a minimal β-sheet with the second β-sheet passing through a knot created by disulphide bonds. Several fz domains have been shown to be both necessary and sufficient for Wg/Wnt ligand binding, strongly suggesting that the fz domain is a Wg/Wnt interacting domain.
Protein Domain
Name: Type III secretion system outer membrane pore YscC/HrcC
Type: Family
Description: Secretion of virulence factors in Gram-negative bacteria involves transportation of the protein across two membranes to reach the cell exterior. There have been four secretion systems described in animal enteropathogens such as Salmonella and Yersinia, with further sequence similarities in plant pathogens like Ralstonia and Erwinia. The type III secretion system is of great interest, as it is used to transport virulence factors from the pathogen directly into the host cell and is only triggered when the bacterium comes into close contact with the host. The protein subunits of the system are very similar to those of bacterial flagellar biosynthesis. However, while the latter forms a ring structure to allow secretion of flagellin and is an integral part of the flagellum itself, type III subunits in the outer membrane translocate secreted proteins through a channel-like structure. The structural core of this system consists of the needle complex, a continuous channel formed by the highly oligomerized inner and outer membrane hollow rings and a polymerized helical needle filament which spans through and projects into the infected host cell. This family aids in the structural assembly of the invasion complex. Another characteristic of this family is its ability to form a channel through the outer bacterial membrane, allowing secretion to take place. Members include the Salmonella InvG and SpiA gene, the Shigella MxiD, and the Yersinia Kim5 and YscC proteins. Plant pathogen members include the Hypersensitivity Response (HR) genes of Burkholderia and Erwinia.
Protein Domain
Name: Signal recognition particle, subunit SRP19-like superfamily
Type: Homologous_superfamily
Description: The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome. In eukaryotes this retards its elongation until SRP docks the ribosome-polypeptide complex to the RER membrane via the SR receptor. Eukaryotic SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane. In archaea, the SRP complex contains 7S RNA like its eukaryotic counterpart, yet only includes two of the six protein subunits found in the eukaryotic complex: SRP19 and SRP54. This entry represents the SRP19 subunit. The SRP19 protein is unstructured but forms a compact core domain and two extended RNA-binding loops upon binding the signal recognition particle (SRP) RNA.
Protein Domain
Name: Frizzled cysteine-rich domain superfamily
Type: Homologous_superfamily
Description: This entry represents the frizzled domain superfamily. The frizzled (fz) domain is an extracellular domain of about 120 amino acids. It was first identified in the alpha-1 chain of type XVIII collagen and in members of the Frizzled family of seven transmembrane (7TM) proteins which act as receptors for secreted Wingless (Wg)/Wnt glycoproteins. In addition to these proteins, one or two copies of the fz domain are also found in: the Frzb family of secreted frizzled-like proteins; Smoothened, another 7TM receptor involved in hedgehog signaling; carboxypeptidase Z (CPZ); the transmembrane serine protease corin (atrial natriuretic peptide-converting enzyme); and two receptor tyrosine kinase (RTK) subfamilies, the Ror family and the muscle-specific kinase (MuSK) family. As the fz domain contains 10 cysteines which are largely conserved, it has also been called the cysteine-rich domain (CRD). The fz domain also contains several other highly conserved residues: for example, a basic amino acid follows C6, and a conserved proline residue lies four residues C-terminal to C9. The crystal structure of a fz domain shows that it is predominantly α-helical with all cysteines forming disulphide bonds. In addition to helical regions, two short β-strands at the N terminus form a minimal β-sheet with the second β-sheet passing through a knot created by disulphide bonds. Several fz domains have been shown to be both necessary and sufficient for Wg/Wnt ligand binding, strongly suggesting that the fz domain is a Wg/Wnt interacting domain.
Protein Domain
Name: DExH-box ATP-dependent RNA helicase, R3H domain
Type: Domain
Description: This R3H domain is found in a group of proteins which also contain a DExH-box helicase domain, and may function as ATP-dependent DNA or RNA helicases. The R3H domain is a conserved sequence motif found in proteins from a diverse range of organisms including eubacteria, green plants, fungi and various groups of metazoans, but not in archaea and Escherichia coli. The domain is named R3H because it contains an invariant arginine and a highly conserved histidine that are separated by three residues. It also displays a conserved pattern of hydrophobic residues, prolines and glycines. It can be found alone, in association with an AAA domain, or with various DNA/RNA binding domains like DSRM, KH, G-patch, PHD, DEAD box, or RRM. The functions of these domains indicate that the R3H domain might be involved in polynucleotide binding, including DNA, RNA and single-stranded DNA. The 3D structure of the R3H domain has been solved. The fold presents a small motif, consisting of a three-stranded antiparallel β-sheet, against which two α-helices pack from one side. This fold is related to the structures of the YhhP protein and the C-terminal domain of the translational initiation factor IF3. Three conserved basic residues cluster on the same face of the R3H domain and could play a role in nucleic acid recognition. An extended hydrophobic area at a different site of the molecular surface could act as a protein-binding site.
Protein Domain
Name: Tensin-like, SH2 domain
Type: Domain
Description: This entry represents the SH2 domain found in tensin-like proteins. The tensins are a family of intracellular proteins that interact with receptor tyrosine kinases (RTKs), integrins, and actin. They are thought to act as signaling bridges between the extracellular space and the cytoskeleton. There are four homologues: tensin1, tensin2 (TENC1, C1-TEN), tensin3 and tensin4 (cten), all of which contain a C-terminal tandem SH2-PTB domain pairing, as well as actin-binding regions that may localize them to focal adhesions. The isoforms of tensin2 and tensin3 contain N-terminal C1 domains, which are atypical and not expected to bind to phorbol esters. Tensins 1-3 contain a phosphatase (PTPase) and C2 domain pairing which resembles the PTEN (phosphatase and tensin homologue deleted on chromosome 10) protein. PTEN is a lipid phosphatase that dephosphorylates phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3) to yield phosphatidylinositol 4,5-bisphosphate (PtdIns(4,5)P2). As PtdIns(3,4,5)P3 is the product of phosphatidylinositol 3-kinase (PI3K) activity, PTEN is a key negative regulator of the PI3K pathway. Because of their PTEN-like domains, the tensins may also possess phosphoinositide-binding or phosphatase capabilities. However, only tensin2 and tensin3 have the potential to be phosphatases, since only their PTPase domains contain a cysteine residue that is essential for catalytic activity. In general, SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr-binding pocket and a hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine-phosphorylated sites.
Protein Domain
Name: Transglutaminase-like superfamily
Type: Homologous_superfamily
Description: This domain superfamily is found in many proteins known to have transglutaminase activity, i.e. which cross-link proteins through an acyl-transfer reaction between the γ-carboxamide group of peptide-bound glutamine and the ε-amino group of peptide-bound lysine, resulting in an ε-(γ-glutamyl)lysine isopeptide bond. Transglutaminases have been found in a diverse range of species, from bacteria through to mammals. The enzymes require calcium binding and their activity leads to post-translational modification of proteins through acyl-transfer reactions, involving peptidyl glutamine residues as acyl donors and a variety of primary amines as acyl acceptors, with the generation of proteinase-resistant isopeptide bonds. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterised transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage psiM2 pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease. Structural analysis of the A subunit of plasma Factor XIII revealed that each Factor XIIIA subunit is composed of four domains (termed N-terminal β-sandwich, core domain (containing the catalytic and the regulatory sites), and C-terminal β-barrels 1 and 2) and that two monomers assemble into the native dimer through the surfaces in domains 1 and 2, in opposite orientation. This organisation in four domains is highly conserved during evolution among transglutaminase isoforms.
Protein Domain
Name: Importin-alpha, importin-beta-binding domain superfamily
Type: Homologous_superfamily
Description: The exchange of macromolecules between the nucleus and cytoplasm takes place through nuclear pore complexes within the nuclear membrane. Active transport of large molecules through these pore complexes requires carrier proteins, called karyopherins (importins and exportins), which shuttle between the two compartments. Members of the importin-alpha (karyopherin-alpha) family can form heterodimers with importin-beta. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein that binds the nuclear localisation signal (NLS) on the cargo in the classical NLS import pathway. Proteins can contain one (monopartite) or two (bipartite) NLS motifs. Importin-alpha contains several armadillo (ARM) repeats, which produce a curving structure with two NLS-binding sites, a major one close to the N terminus and a minor one close to the C terminus. Ran GTPase helps to control the unidirectional transfer of cargo. The cytoplasm contains primarily RanGDP and the nucleus RanGTP through the actions of RanGAP and RanGEF, respectively. In the nucleus, RanGTP binds to importin-beta within the importin/cargo complex, causing a conformational change in importin-beta that releases it from importin-alpha-bound cargo. The N-terminal importin-beta-binding (IBB) domain of importin-alpha contains an auto-regulatory region that mimics the NLS motif. The release of importin-beta frees the auto-regulatory region on importin-alpha to loop back and bind to the major NLS-binding site, causing the cargo to be released. This entry represents the N-terminal IBB domain superfamily of importin-alpha that contains the auto-regulatory region.
Protein Domain
Name: Immunoglobulin subtype 2
Type: Domain
Description: The basic structure of immunoglobulin (Ig) molecules is a tetramer of two light chains and two heavy chains linked by disulphide bonds. There are two types of light chains: kappa and lambda, each composed of a constant domain (CL) and a variable domain (VL). There are five types of heavy chains: alpha, delta, epsilon, gamma and mu, all consisting of a variable domain (VH) and three (in alpha, delta and gamma) or four (in epsilon and mu) constant domains (CH1 to CH4). Ig molecules are highly modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. The domains in Ig and Ig-like molecules are grouped into four types: V-set (variable), C1-set (constant-1), C2-set (constant-2) and I-set (intermediate). Structural studies have shown that these domains share a common core Greek-key β-sandwich structure, with the types differing in the number of strands in the β-sheets as well as in their sequence patterns. Immunoglobulin-like domains that are related in both sequence and structure can be found in several diverse protein families. Ig-like domains are involved in a variety of functions, including cell-cell recognition, cell-surface receptors, muscle structure and the immune system. This entry represents a subtype of the immunoglobulin domain, and is found in a diverse range of protein families that includes glycoproteins, fibroblast growth factor receptors, vascular endothelial growth factor receptors, interleukin-6 receptor, and neural cell adhesion molecules.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom