Search our database by keyword

Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, pathways, authors, ontology terms, etc. (e.g. eve, embryo, zen, allele)
  • Use OR to search for either of two terms (e.g. fly OR drosophila) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean and wildcard search syntax is supported (e.g. dros* for partial matches, or fly AND NOT embryo to exclude a term).
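The query forms listed above can be composed programmatically. The following is a minimal sketch in Python; the helper names (`phrase`, `any_of`, `prefix`, `excluding`) are hypothetical and not part of this website. The functions only build query strings that follow the syntax described in the examples and do not contact any search service.

```python
def phrase(text: str) -> str:
    """Wrap a multi-word phrase in double quotes so it is searched as a unit."""
    return f'"{text}"'


def any_of(*terms: str) -> str:
    """Join terms with OR so that a hit on any one of them matches."""
    return " OR ".join(terms)


def prefix(stem: str) -> str:
    """Append the * wildcard to request partial (prefix) matches."""
    return f"{stem}*"


def excluding(term: str, unwanted: str) -> str:
    """Require `term` while excluding records that mention `unwanted`."""
    return f"{term} AND NOT {unwanted}"


if __name__ == "__main__":
    # Print the example queries from the list above.
    for query in (
        any_of("fly", "drosophila"),
        phrase("dna binding"),
        prefix("dros"),
        excluding("fly", "embryo"),
    ):
        print(query)
```

Each resulting string can be pasted into the search box as-is.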

Search results 13801 to 13900 out of 30763 for seed protein

Category restricted to ProteinDomain

Category: ProteinDomain
Type Details Score
Protein Domain
Name: WASH complex, subunit strumpellin
Type: Family
Description: The WASH (WASP and Scar homologue) complex is present at the surface of endosomes, where it recruits and activates the Arp2/3 complex to induce actin nucleation. The WASH complex plays a key role in the fission of tubules that serve as transport intermediates during endosome sorting. The WASH complex consists of several subunits: F-actin-capping protein subunit alpha (CAPZA1, CAPZA2 or CAPZA3), F-actin-capping protein subunit beta (CAPZB), WASH (WASH1, WASH2P, WASH3P, WASH4P, WASH5P or WASH6P), FAM21 (FAM21A, FAM21B or FAM21C), KIAA1033, KIAA0196 (strumpellin) and CCDC53. Strumpellin contains one known domain, a spectrin repeat, which consists of three α-helices of a characteristic length wrapped in a left-handed coiled coil. The spectrin proteins have multiple copies of this repeat, which can then form multimers in the cell. Spectrin associates with the cell membrane via spectrin repeats in the ankyrin protein. The spectrin repeat is a structural platform for cytoskeletal protein assemblies. Two closely situated point mutations in human strumpellin lead to hereditary spastic paraplegia.
Protein Domain
Name: Neutrophil cytosol factor 4, PX domain
Type: Domain
Description: The PX domain is a phosphoinositide binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. p40phox contains an N-terminal PX domain, a central SH3 domain that binds p47phox, and a C-terminal PB1 domain that interacts with p67phox. It is a cytosolic subunit of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox), which plays a crucial role in the cellular response to bacterial infection. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis, forming superoxide and reactive oxygen species. p40phox positively regulates NADPH oxidase in both phosphatidylinositol-3-phosphate (PI3P)-dependent and PI3P-independent manners. The PX domain is a phospholipid-binding module involved in the membrane targeting of proteins. The p40phox PX domain binds to PI3P, an abundant lipid in phagosomal membranes, playing an important role in the localization of NADPH oxidase. The PX domain of p40phox is also involved in protein-protein interaction.
Protein Domain
Name: Sorting nexin 4, PX domain
Type: Domain
Description: Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds phosphoinositides (PIs) and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and in the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. SNX4 is involved in recycling traffic from the sorting endosome (post-Golgi endosome) back to the late Golgi. It shows a domain architecture similar to that of SNX1-2, among others, containing a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain. SNX4 is implicated in the regulation of plasma membrane receptor trafficking and interacts with receptors for EGF, insulin, platelet-derived growth factor and the long form of the leptin receptor. This entry represents the SNX4 Phox Homology (PX) domain.
Protein Domain
Name: Tim23-like
Type: Family
Description: This entry includes mitochondrial import inner membrane translocase subunit Tim23 and chloroplastic outer envelope pore protein 16-1/2 (OEP161/162). In plants, OEP161/162 are distantly related to the translocase of the inner mitochondrial membrane Tim17. OEP161/162 are amino acid-selective channel proteins and translocation pores for NADPH:protochlorophyllide oxidoreductase A (PORA). The TIM23 complex forms a dynamic multisubunit machinery that recognises preproteins and can transport them into the inner membrane or into the matrix. It is composed of three subunits, Tim50, Tim23, and Tim17, that expose domains to the intermembrane space. This entry includes the mitochondrial import inner membrane translocase subunit Tim23, consisting of an N-terminal intermembrane space domain that interacts with Tim50, and a C-terminal membrane-embedded domain that, together with Tim17, forms a protein-conducting channel across the inner membrane. Tim50 helps keep the Tim23 channel in a closed state in the absence of presequence proteins, regulating the gating through the TIM channel.
Protein Domain
Name: E3 ubiquitin-protein ligase Dma1/Dma2, RING finger, H2 subclass
Type: Domain
Description: Proteins containing this domain include one Schizosaccharomyces pombe protein, Dma1 (SpDma1p), two Saccharomyces cerevisiae proteins, Dma1 (ScDma1p) and Dma2 (ScDma2p), and their homologues from fungi. They are related to the mammalian CHFR/RNF8 family of ubiquitin ligases. SpDma1p functions to prevent mitotic exit and cytokinesis during spindle checkpoint arrest by inhibiting septation initiation network (SIN) signalling. ScDma1p and ScDma2p, also known as checkpoint forkhead associated with RING domains-containing protein 1 and 2 respectively, seem to be functionally redundant. They are involved in proper septin ring positioning and cytokinesis. Dma1 and Dma2 are ubiquitin ligases that regulate protein kinase Swe1 levels and localization, and hence play a role in cell cycle control. The simultaneous lack of Dma1 and Dma2 leads to spindle mispositioning and defects in the spindle position checkpoint. All members in this family contain a forkhead-associated domain (FHA) and a C3H2C3-type RING-H2 finger, the latter suggesting that they may possess E3 ubiquitin-ligase activity.
Protein Domain
Name: SWA2-like, ubiquitin-associated domain
Type: Domain
Description: Ubiquitin-associated (UBA) domains contain approximately 40 residues and bind ubiquitin non-covalently. They adopt a secondary structure consisting of three α-helices, and have been identified in various modular proteins involved in protein trafficking, clathrin assembly/disassembly, DNA repair, proteasomal degradation, and cell cycle regulation. Proteins containing this domain include Swa2 and other uncharacterised hypothetical proteins from Saccharomyces. Swa2 is the yeast auxilin ortholog, a multifunctional protein with three N-terminal clathrin-binding (CB) motifs, a ubiquitin-association (UBA) domain, a tetratricopeptide repeat (TPR) domain, and a C-terminal J-domain. It is required for disassembly of clathrin-coated vesicles (CCVs) in an ATP-dependent manner, as well as for cortical endoplasmic reticulum (ER) inheritance. The N-terminal region of SWA2, the Saccharomyces cerevisiae orthologue of mammalian auxilin, has a characteristic three-helix UBA fold. However, the third helix in the SWA2 UBA contains a bulkier tyrosine in place of the smaller residues found in other UBAs, and cannot pack as closely to the second helix.
Protein Domain
Name: AAA+ ATPase ClpV1
Type: Family
Description: The type VI secretion system (T6SS) is a supra-molecular bacterial complex that resembles phage tails. It is a toxin delivery system that fires toxins into target cells upon contraction of its TssBC sheath. Thirteen essential core proteins are conserved in all T6SSs: the membrane associated complex TssJ-TssL-TssM, the baseplate proteins TssE, TssF, TssG, and TssK, the bacteriophage-related puncturing complex composed of the tube (Hcp), the tip/puncturing device VgrG, and the contractile sheath structure (TssB and TssC). Finally, the starfish-shaped dodecameric protein, TssA, limits contractile sheath polymerization at its distal part when TagA captures TssA. ClpV is an AAA(+) ATPase that disassembles the contracted sheath of the type VI secretion system, which resets the system for reassembly of an extended sheath that is ready to fire again. The ClpV proteins are most similar to ClpB proteins within the Hsp100/Clp family, but cluster in a separate phylogenetic tree with a remarkable distance to ClpB. However, this entry also includes ClpB from Yersinia enterocolitica.
Protein Domain
Name: PID1, PTB domain
Type: Domain
Description: PID1 (PTB-containing, cubilin and LRP1-interacting protein; also known as NYGGF4) is a phosphotyrosine-binding (PTB) domain-containing protein. It is an inhibitor of insulin-mediated signaling in adipocytes and muscle cells. It binds through its PTB domain to the second NPXY motif in the cytoplasmic tail of the low density lipoprotein receptor-related protein 1 (LRP1). Besides being involved in obesity-associated insulin resistance, PID1 has been shown to inhibit growth of medulloblastoma, glioblastoma and atypical teratoid rhabdoid tumour cell lines. This entry represents the PTB domain of PID1. Proteins containing phosphotyrosine binding (PTB) domains function as adaptors or scaffolds to organise the signaling complexes involved in wide-ranging physiological processes including neural development, immunity, tissue homeostasis and cell growth. Due to structural differences, PTB domains are divided into three groups represented by phosphotyrosine-dependent IRS-like, phosphotyrosine-dependent Shc-like, and phosphotyrosine-independent Dab-like PTBs. The last two PTBs have been named phosphotyrosine interaction domains (PID or PI domains). The PID domain has an average length of about 160 amino acids.
Protein Domain
Name: Proteinase inhibitor I1, Kazal-type, metazoa
Type: Family
Description: This family of Kazal eukaryotic proteinase inhibitors belongs to MEROPS inhibitor family I1, clan IA. They inhibit serine peptidases of the S1 family. They are restricted to the metazoa (arthropoda and chordata), with a single exception in the alveolata (apicomplexa). No members are found in nematodes, fungi or plants. Kazal inhibitors, which inhibit a number of serine proteinases (such as trypsin and elastase), belong to a family of proteins that includes pancreatic secretory trypsin inhibitor, avian ovomucoid, acrosin inhibitor, and elastase inhibitor. These proteins contain between 1 and 7 Kazal-type inhibitor repeats. The structure of the Kazal repeat includes a large quantity of extended chain, 2 short α-helices and a 3-stranded anti-parallel beta sheet. The inhibitor makes 11 contacts with its enzyme substrate: unusually, 8 of these important residues are hypervariable. Altering the enzyme-contact residues, and especially that of the active site bond, affects the strength of inhibition and the specificity of the inhibitor for particular serine proteinases.
Protein Domain
Name: Glycophorin
Type: Family
Description: Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook. Academic Press, London / San Diego (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a) and ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry, such as glycophorin B and glycophorin E, represent minor sialoglycoproteins in the erythrocyte membrane.
Protein Domain
Name: Nck2, SH3 domain 1
Type: Domain
Description: This entry represents the first SH3 domain of Nck2. It binds the PxxDY sequence in the CD3e cytoplasmic tail; this binding inhibits phosphorylation by Src kinases, resulting in the downregulation of TCR surface expression. Nck2 (also known as Grb4) is a member of the Nck family. It plays a crucial role in connecting signaling pathways of tyrosine kinase receptors and important effectors in actin dynamics and cytoskeletal remodeling. It binds neuronal signaling proteins such as ephrinB. Nck proteins are cytoplasmic, non-enzymatic adaptor proteins composed of three SH3 (Src homology 3) domains and a C-terminal SH2 domain. They regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics. They associate with tyrosine-phosphorylated growth factor receptors or their cellular substrates. There are two vertebrate Nck proteins, Nck1 and Nck2.
Protein Domain
Name: Enhancer of filamentation 1, SH3 domain
Type: Domain
Description: Enhancer of filamentation 1 (also known as NEDD9 or Cas-L) is a member of the CAS family. It is a scaffolding protein that assembles signaling complexes regulating multiple cellular processes, such as cell adhesion, migration, invasion, and metastasis. It is commonly dysregulated during cancer progression. It interacts with Aurora-A kinase to control ciliary resorption, and with Src and other partners to influence proliferative signaling pathways often activated in autosomal dominant polycystic kidney disease. CAS (Crk-associated substrate) family members are adaptor proteins that contain a highly conserved N-terminal SH3 domain, an adjacent unstructured domain (substrate domain) containing multiple tyrosine phosphorylation sites that enable binding by SH2-domain containing proteins, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. Most of these domains mediate protein-protein interactions. Through these interactions, they assemble larger signaling complexes that are essential for cell proliferation, survival, migration, and other processes. The CAS family consists of four members: BCAR1, HEF1, EFS, and CASS4.
Protein Domain
Name: Dedicator of cytokinesis 3, SH3 domain
Type: Domain
Description: DOCK family members are evolutionarily conserved guanine nucleotide exchange factors (GEFs) for Rho-family GTPases. DOCK proteins are required during several cellular processes, such as cell motility and phagocytosis. The N-terminal SH3 domain of the DOCK proteins functions as an inhibitor of GEF activity; this inhibition can be relieved when the SH3 domain binds the ELMO1-3 adaptor proteins, after their binding to active RhoG at the plasma membrane. DOCK family proteins are categorised into four subfamilies based on their sequence homology: DOCK-A subfamily (DOCK1/180, 2, 5), DOCK-B subfamily (DOCK3, 4), DOCK-C subfamily (DOCK6, 7, 8), and DOCK-D subfamily (DOCK9, 10, 11). All DOCKs contain two homology domains: DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). This entry represents the SH3 domain found in DOCK3, which has been linked to Alzheimer's disease due to its interaction with presenilin proteins and ability to stimulate Tau/MAPT phosphorylation.
Protein Domain
Name: Calsequestrin
Type: Family
Description: Calsequestrin is the principal calcium-binding protein present in the sarcoplasmic reticulum of cardiac and skeletal muscle. It is a highly acidic protein that is able to bind over 40 calcium ions and acts as an internal calcium store in muscle. Sequence analysis has suggested that calcium is not bound in distinct pockets via EF-hand motifs, but rather via presentation of a charged protein surface. Two forms of calsequestrin have been identified. The cardiac form is present in cardiac and slow skeletal muscle, and the fast skeletal form is found in fast skeletal muscle. The release of calsequestrin-bound calcium (through a calcium release channel) triggers muscle contraction. The active protein is not highly structured, more than 50% of it adopting a random coil conformation. When calcium binds there is a structural change whereby the α-helical content of the protein increases from 3 to 11%. Both forms of calsequestrin are phosphorylated by casein kinase II, but the cardiac form is phosphorylated more rapidly and to a higher degree.
Protein Domain
Name: Fibrinogen-binding domain 2
Type: Domain
Description: This entry represents the fibrinogen-binding domain from bacterial proteins such as fibrinogen-binding adhesin SdrG and clumping factor A. In both SdrG and clumping factor A, there are two fibrinogen-binding domains with similar core β-sandwich topologies, but with different modulations in their structure. This entry represents the second domain, while a separate entry represents the first domain. Gram-positive pathogens, such as Staphylococci, Streptococci, and Enterococci, contain multiple cell wall-anchored proteins. Some of these proteins act as adhesins and mediate bacterial attachment to host tissues through lock-and-key interactions with host ligands, such as fibrinogen, a glycoprotein found in blood plasma that plays a key role in haemostasis and coagulation. For pathogenic bacteria that do not invade host cells, extracellular matrix proteins are preferred targets for bacterial adhesion; adhesins mediating these interactions have been termed MSCRAMMs (microbial surface components recognizing adhesive matrix molecules). A common binding domain organisation found within MSCRAMMs suggests a common ancestry. Both fibrinogen-binding adhesin SdrG and clumping factor A are MSCRAMMs.
Protein Domain
Name: NAF/FISL domain
Type: Domain
Description: The NAF domain is a 24 amino acid domain that is found in a plant-specific subgroup of serine-threonine protein kinases (CIPKs) that interact with calcineurin B-like calcium sensor proteins (CBLs). Whereas the N-terminal part of CIPKs comprises a conserved catalytic domain typical of Ser-Thr kinases, the much less conserved C-terminal domain appears to be unique to this subgroup of kinases. The only exception is the NAF domain, which forms an 'island of conservation' in this otherwise variable region. The NAF domain has been named after the prominent conserved amino acids Asn-Ala-Phe. It represents a minimum protein interaction module that is both necessary and sufficient to mediate the interaction with the CBL calcium sensor proteins. The secondary structure of the NAF domain is currently not known, but secondary structure computation of the C-terminal region of Arabidopsis thaliana CBL-interacting protein kinase 1 revealed a long helical structure. The NAF domain has also been named the FISL motif for its conserved amino acid residues.
Protein Domain
Name: Cysteine-rich transmembrane CYSTM domain
Type: Domain
Description: Proteins containing the CYSTM domain are short cysteine-rich membrane proteins that most probably dimerise to form a transmembrane sulfhydryl-lined pore. The CYSTM domain is always present at the extreme C-terminus of the protein in which it is found. Furthermore, like the yeast prototypes, the majority of the proteins also possess a proline/glutamine-rich segment upstream of the CYSTM domain that is likely to form a polar, disordered head in the cytoplasm. The presence of an atypical well-conserved acidic residue at the C-terminal end of the TM helix suggests that this might interact with a positively charged moiety in the lipid head group. Consistently across the eukaryotes, the different versions of the CYSTM domain appear to have roles in stress response or stress tolerance, and, more specifically, in resistance to deleterious substances, implying that these might be general functions of the whole family. This entry also includes Protein CADMIUM TOLERANCE 1-5 from rice, which confers resistance to heavy metal ions such as cadmium and copper.
Protein Domain
Name: Abr/Bcr
Type: Family
Description: Abr (active breakpoint cluster region-related protein) and Bcr (breakpoint cluster region protein) are homologous proteins containing a C-terminal domain with GTPase-activating protein (GAP) activity specific for Rac. They control multiple cellular functions of murine macrophages. They contain several domains, including tandem DH-PH, C2 and GAP domains. Bcr has an extra N-terminal oligomerization domain. Bcr has been shown to be fused to Abl tyrosine kinase in leukemia. The fusion of Bcr to Abl deregulates the tyrosine kinase activity of Abl. The N-terminal oligomerization domain is thought to be the most critical component that allows the formation of homo-tetramer Bcr/Abl complexes and deregulates the Abl tyrosine kinase. The GTPase-activating activity of Bcr has been shown to be regulated by transglutaminase 2 (TG2), a multifunctional protein that has been implicated in numerous pathologies, including neurodegeneration and celiac disease. Abr is a critical regulator of Rho and Cdc42 during single-cell wound healing.
Protein Domain
Name: GRAF, BAR domain
Type: Domain
Description: This entry represents the BAR domain of GRAF. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The BAR domain of GRAF directly interacts with its Rho GAP domain and inhibits its activity. Autoinhibited GRAF is capable of binding membranes and tubulating liposomes, showing that the membrane-tubulation and GAP-inhibitory functions of the BAR domain can occur simultaneously. Rho GTPase-activating protein 26 (ARHGAP26), also known as GTPase regulator associated with focal adhesion kinase (GRAF), is a GTPase-activating protein for the small GTPases of the Rho family RhoA and CDC42. GRAF influences cytoskeletal changes mediated by Rho proteins. It is recognised as a tumor suppressor. GRAF contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, a Rho GAP domain, and a C-terminal SH3 domain. The SH3 domain of GRAF binds PKNbeta, a target of the small GTPase Rho.
Protein Domain
Name: Dynamin, N-terminal
Type: Domain
Description: This entry represents the N-terminal domain found in dynamins. Membrane transport between compartments in eukaryotic cells requires proteins that allow the budding and scission of nascent cargo vesicles from one compartment and their targeting and fusion with another. Dynamins are large GTPases that belong to a protein superfamily that, in eukaryotic cells, includes classical dynamins, dynamin-like proteins, OPA1, Mx proteins, mitofusins and guanylate-binding proteins/atlastins, and are involved in the scission of a wide range of vesicles and organelles. They play a role in many processes including budding of transport vesicles, division of organelles, cytokinesis and pathogen resistance. The minimal distinguishing architectural features that are common to all dynamins and are distinct from other GTPases are the structure of the large GTPase domain (300 amino acids) and the presence of two additional domains: the middle domain and the GTPase effector domain (GED), which are involved in oligomerization and regulation of the GTPase activity.
Protein Domain
Name: Casein Kinase 2, subunit alpha
Type: Family
Description: Ser/Thr protein kinase CK2 is a tetrameric protein with two catalytic (alpha) and two regulatory (beta) subunits. It is constitutively active and ubiquitously expressed, and is found in the cytoplasm and nucleus, as well as in the plasma membrane. It phosphorylates a wide variety of substrates including glycogen synthase, cell cycle proteins, nuclear proteins (e.g. DNA topoisomerase II), and ion channels (e.g. ENaC), among others. It may be considered a master kinase controlling the activity or lifespan of many other kinases and exerting its effect over cell fate, gene expression, protein synthesis and degradation, and viral infection. CK2 is implicated in every stage of the cell cycle and is required for cell cycle progression. It plays crucial roles in cell differentiation, proliferation, and survival, and is thus implicated in cancer. CK2 is not an oncogene by itself, but elevated CK2 levels create an environment that enhances the survival of tumour cells.
Protein Domain
Name: B-cell lymphoma/leukemia 10/E10
Type: Family
Description: This entry includes animal B-cell lymphoma/leukemia 10 (BCL10) and Equine herpesvirus protein E10. In lymphoid cells, BCL10 plays a critical role in the activation of the transcription factor nuclear factor kappa B (NF-kB) downstream of a variety of immune receptors, including the TCR, B cell receptor, NK-cell receptors, C type family lectin receptors, Ig family receptors and G protein-coupled receptors. Activation of NF-kB through receptor-mediated pathways requires BCL10 to form a complex (known as CBM) with the paracaspase MALT1 and with CARD-containing adaptor protein CARMA. BCL10 is involved in the regulation of apoptosis; a BCL10 gene translocation is found in mucosa-associated lymphoid tissue (MALT) lymphomas. BCL10 is involved in the adaptive and innate immune response. Equine herpesvirus protein E10 (v-E10) is the viral homologue of the BCL10 protein. It induces membrane recruitment of cellular BCL10, and this induces TRAF-mediated NF-kB activation.
Protein Domain
Name: UHRF1/2-like
Type: Family
Description: This entry includes UHRF1/2 from animals and ORTHRUS 1-5 from Arabidopsis. They are ubiquitin-like proteins with PHD and RING finger domains. UHRF1, also known as ICBP90, is a transcription and cell cycle regulator and a methyl K9 H3-specific binding protein. UHRF2 is a ubiquitin E3 ligase for cell cycle proteins, such as CCND1 and CCNE1. It can also act as a SUMO E3 ligase for ZNF131. This family also includes UHRF1-like protein from Cryptococcus neoformans, which binds hemimethylated DNA and is involved in DNA methylation maintenance. UHRF1-like lacks the Tudor H3K9me reader and RING E3 ligase domains found in its human orthologue. In plants, ORTHRUS family members are E3 ligases mediating DNA methylation status in vivo. ORTH1-ORTH5 are predicted to encode proteins that contain one plant homeodomain (PHD), two really interesting new gene (RING) domains, and one SET and RING associated (SRA) domain.
Protein Domain
Name: Effector-associated domain 8
Type: Domain
Description: This entry represents the effector-associated domain 8 (EAD8). Similarly to EAD1, it may primarily recruit other effectors to the system. It is predicted to be an all α-helical domain. Effector-associated domains (EADs) are predicted to function as adaptor domains mediating protein-protein interactions. The EADs show a characteristic architectural pattern. One copy is always fused, typically at the N- or C-terminus, to a core component of a biological conflict system; examples include VMAP (vWA-MoxR associated protein), iSTAND (inactive STAND NTPase system), or GAP1 (GTPase-associated protein 1). Further copies of the same EAD are fused to either effector or signal-transducing domains, or additional EADs. EAD pairs are frequently observed together on the genome in conserved gene neighborhoods, but can also be severed from such neighborhoods and located in distant regions, indicating that EAD-EAD protein domain coupling approximates the advantages of collinear transcription. EADs are all small domains with no enzymatic features.
Protein Domain
Name: Effector-associated domain 9
Type: Domain
Description: This entry represents the effector-associated domain 9 (EAD9). Similarly to EAD2, it may primarily recruit the signalling components of the system. It is predicted to be an all α-helical domain. Effector-associated domains (EADs) are predicted to function as adaptor domains mediating protein-protein interactions. The EADs show a characteristic architectural pattern. One copy is always fused, typically at the N- or C-terminus, to a core component of a biological conflict system; examples include VMAP (vWA-MoxR associated protein), iSTAND (inactive STAND NTPase system), or GAP1 (GTPase-associated protein 1). Further copies of the same EAD are fused to either effector or signal-transducing domains, or additional EADs. EAD pairs are frequently observed together on the genome in conserved gene neighborhoods, but can also be severed from such neighborhoods and located in distant regions, indicating that EAD-EAD protein domain coupling approximates the advantages of collinear transcription. EADs are all small domains with no enzymatic features.
Protein Domain
Name: Effector-associated domain 7
Type: Domain
Description: This entry represents the effector-associated domain 7 (EAD7). Similarly to EAD1, it may primarily recruit other effectors to the system. It is predicted to be an all α-helical domain. Effector-associated domains (EADs) are predicted to function as adaptor domains mediating protein-protein interactions. The EADs show a characteristic architectural pattern. One copy is always fused, typically at the N- or C-terminus, to a core component of a biological conflict system; examples include VMAP (vWA-MoxR associated protein), iSTAND (inactive STAND NTPase system), or GAP1 (GTPase-associated protein 1). Further copies of the same EAD are fused to either effector or signal-transducing domains, or additional EADs. EAD pairs are frequently observed together on the genome in conserved gene neighborhoods, but can also be severed from such neighborhoods and located in distant regions, indicating that EAD-EAD protein domain coupling approximates the advantages of collinear transcription. EADs are all small domains with no enzymatic features.
Protein Domain
Name: Effector-associated domain 6
Type: Domain
Description: This entry represents the effector-associated domain 6 (EAD6) found in cyanobacteria. Similarly to EAD2, it has been suggested to recruit the signalling components of the system. It is predicted to be an all α-helical domain. Effector-associated domains (EADs) are predicted to function as adaptor domains mediating protein-protein interactions. The EADs show a characteristic architectural pattern. One copy is always fused, typically at the N- or C-terminus, to a core component of a biological conflict system; examples include VMAP (vWA-MoxR associated protein), iSTAND (inactive STAND NTPase system), or GAP1 (GTPase-associated protein 1). Further copies of the same EAD are fused to either effector or signal-transducing domains, or additional EADs. EAD pairs are frequently observed together on the genome in conserved gene neighborhoods, but can also be severed from such neighborhoods and located in distant regions, indicating that EAD-EAD protein domain coupling approximates the advantages of collinear transcription. EADs are all small domains with no enzymatic features.
Protein Domain
Name: Effector-associated domain 4
Type: Domain
Description: This entry represents the effector-associated domain 4 (EAD4) found in cyanobacteria, which appears to primarily recruit other effectors to the system, similarly to EAD1. This domain has a predicted mixed alpha+beta arrangement. Effector-associated domains (EADs) are predicted to function as adaptor domains mediating protein-protein interactions. The EADs show a characteristic architectural pattern. One copy is always fused, typically at the N- or C-terminus, to a core component of a biological conflict system; examples include VMAP (vWA-MoxR associated protein), iSTAND (inactive STAND NTPase system), or GAP1 (GTPase-associated protein 1). Further copies of the same EAD are fused to either effector or signal-transducing domains, or additional EADs. EAD pairs are frequently observed together on the genome in conserved gene neighborhoods, but can also be severed from such neighborhoods and located in distant regions, indicating that EAD-EAD protein domain coupling approximates the advantages of collinear transcription. EADs are all small domains with no enzymatic features.
Protein Domain
Name: Effector-associated domain 5
Type: Domain
Description: This entry represents the effector-associated domain 5 (EAD5) that appears to primarily recruit other effectors to the systems, similar to EAD1. This domain has a predicted mixed alpha+beta arrangement [ ]. Effector-associated domains (EADs) are predicted to function as adaptor domains mediating protein-protein interactions. The EADs show a characteristic architectural pattern. One copy is always fused, typically at the N- or C-terminus, to a core component of a biological conflict system; examples include VMAP (vWA-MoxR associated protein), iSTAND (inactive STAND NTPase system), or GAP1 (GTPase-associated protein 1). Further copies of the same EAD are fused to either effector or signal-transducing domains, or additional EADs. EAD pairs are frequently observed together on the genome in conserved gene neighborhoods, but can also be severed from such neighborhoods and located in distant regions, indicating that EAD-EAD protein domain coupling approximates the advantages of collinear transcription [ , ]. EADs are all small domains with no enzymatic features.
Protein Domain
Name: Antifreeze, type III
Type: Family
Description: Marine teleosts from polar oceans can be protected from freezing in icy sea-water by serum antifreeze proteins (AFPs) or glycoproteins (AFGPs) [ ]: these function by binding to, and preventing the growth of, ice crystals within the fish. Despite functional similarity, the proteins are structurally diverse and include glycosylated and at least 3 non-glycosylated forms: the AFGP of notothenioids and cods are polymers of a tripeptide repeat, Ala-Ala-Thr, with a disaccharide attached to the threonine residue; type I AFPs are Ala-rich, α-helical peptides found in flounder and sculpin; type II AFPs of sea-raven, smelt and herring are Cys-rich proteins; and type III AFPs, found in eelpouts, are rich in β-structure [ ]. As well as antifreeze proteins, this domain is also found in the C-terminal region of at least one sialic acid synthase (SAS). The function of this domain in SAS is not known, but it may be involved in sugar binding [].
Protein Domain
Name: MRVI1
Type: Family
Description: This family consists of mammalian MRVI1 proteins, which are related to the lymphoid-restricted membrane protein (JAW1) and the IP3 receptor-associated cGMP kinase substrates A and B (IRAGA and IRAGB). The function of MRVI1 is unknown, although mutations in the Mrvi1 gene induce myeloid leukaemia by altering the expression of a gene important for myeloid cell growth and/or differentiation, so it has been speculated that Mrvi1 is a tumour suppressor gene [ ]. IRAG is very similar in sequence to MRVI1 and is an essential NO/cGKI-dependent regulator of IP3-induced calcium release. Activation of cGKI decreases IP3-stimulated elevations in intracellular calcium, induces smooth muscle relaxation and contributes to the antiproliferative and pro-apoptotic effects of NO/cGMP [ , ]. Jaw1 is a member of a class of proteins with COOH-terminal hydrophobic membrane anchors and is structurally similar to proteins involved in vesicle targeting and fusion. This suggests that the function and/or the structure of the ER in lymphocytes may be modified by lymphoid-restricted resident ER proteins [].
Protein Domain
Name: Beta-1,3-glucan-binding protein, N-terminal domain superfamily
Type: Homologous_superfamily
Description: Beta 1,3-glucan recognition proteins (GRP, also called Gram-negative bacteria binding proteins or GNBPs) have specific affinity for beta 1,3-glucan, a component on the surface of fungi and bacteria. Beta-GRP (beta-1,3-glucan recognition protein) is one of several pattern recognition receptors (PRRs), also referred to as biosensor proteins, that complexes with pathogen-associated beta-1,3-glucans and then transduces signals necessary for activation of an appropriate innate immune response. They are present in insects and lack all catalytic residues [ , , , , ]. BGRP consists of a well conserved N-terminal domain and a C-terminal beta-1,3-glucanase-like domain [ , ]. The N-terminal domain of BGRP plays a critical role in the detection of pathogens. In contrast, the C-terminal glucanase-like domain has neither glucanase activity nor affinity for beta-1,3-glucan [, ]. This superfamily represents a domain found at the N terminus of beta-1,3-glucan-binding proteins (BGRP-N). Structurally, BGRP-N has an immunoglobulin-like β-sandwich fold composed of two antiparallel β-sheets containing three and five β-strands [ ].
Protein Domain
Name: Viral ssDNA binding protein, head domain
Type: Homologous_superfamily
Description: This entry includes the major DNA-binding protein (DBP, UL57 or ICP8) from Herpesviruses. DBP binds single-stranded DNA, and the region encompassing residues 368-902 contains the DNA-binding site [ ]. UL5, UL8 and UL52 genes encode an essential heterotrimeric DNA helicase-primase that is responsible for concomitant DNA unwinding and primer synthesis at the viral DNA replication fork. DBP may stimulate DNA unwinding and enable bypass of cisplatin damaged DNA by recruiting the helicase-primase to the DNA []. DBP helps initiate DNA replication by binding to the origin-binding protein (UL9) []. It also reorganizes the host nucleus leading to the formation of prereplicative sites and replication compartments []. This superfamily represents the head domain found in viral ssDNA-binding proteins. The head domain interacts with the C-terminal domain (CTD) of the protein and gives the CTD structure. The CTD is involved in increasing the ssDNA binding protein's cooperativity when binding ICP8, which is believed to stimulate helicase activity. Structurally, this domain consists of 8 alpha helices [ ].
Protein Domain
Name: Calsequestrin, conserved site
Type: Conserved_site
Description: Calsequestrin is the principal calcium-binding protein present in the sarcoplasmic reticulum of cardiac and skeletal muscle []. It is a highly acidic protein that is able to bind over 40 calcium ions and acts as an internal calcium store in muscle. Sequence analysis has suggested that calcium is not bound in distinct pockets via EF-hand motifs, but rather via presentation of a charged protein surface. Two forms of calsequestrin have been identified. The cardiac form is present in cardiac and slow skeletal muscle and the fast skeletal form is found in fast skeletal muscle. The release of calsequestrin-bound calcium (through a calcium release channel) triggers muscle contraction. The active protein is not highly structured, more than 50% of it adopting a random coil conformation [ ]. When calcium binds there is a structural change whereby the α-helical content of the protein increases from 3 to 11% [ ]. Both forms of calsequestrin are phosphorylated by casein kinase II, but the cardiac form is phosphorylated more rapidly and to a higher degree [].
Protein Domain
Name: Tumor necrosis factor receptor 1A, death domain
Type: Domain
Description: This entry represents the death domain (DD) found in tumor necrosis factor receptor-1 (TNFR-1). TNFR-1 has many names including TNFRSF1A, CD120a, p55, p60, and TNFR60. It activates two major intracellular signaling pathways that lead to the activation of the transcription factor NF-kB and the induction of cell death. Upon binding of its ligand TNF, TNFR-1 trimerizes, which leads to the recruitment of an adaptor protein named TNFR-associated death domain protein (TRADD) through a DD/DD interaction [ ]. Mutations in the TNFRSF1A gene cause TNFR-associated periodic syndrome (TRAPS), a rare disorder characterized by recurrent fever, myalgia, abdominal pain, conjunctivitis and skin eruptions [, ]. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes [ , ].
Protein Domain
Name: Chitin-binding type R&R consensus
Type: Conserved_site
Description: This entry represents the 35-36 amino acid motif known as the R&R consensus. This is a conserved region found in insect cuticular proteins [ ]. Insect cuticle is composed of proteins and chitin. The cuticular proteins seem to be specific to the type of cuticle (flexible or stiff) that occurs at particular stages of insect development. The proteins found in the flexible cuticle of larva and pupa of different insects share a conserved C-terminal section [ ]; such a region is also found in the soft endocuticle of adult insects [] as well as in other cuticular proteins, including those of arachnids []. This conserved motif of 35-36 amino acids is known as the R&R consensus since it was first recognised by Rebers and Riddiford. N-terminal to the consensus is a region of hydrophilic amino acids. The two regions together have been called the extended R&R consensus, and form a chitin-binding domain of approximately 70 amino acids [, ].
Protein Domain
Name: Flap endonuclease GEN, chromatin organization modifier domain
Type: Domain
Description: Chromodomains serve as chromatin-targeting modules, general protein interaction elements as well as dimerization sites. They are found in many chromatin-associated proteins that bind modified histone tails for chromatin targeting. Chromodomains often recognize modified lysines through their aromatic cage, thus targeting proteins to chromatin. Family members such as GEN1 carry a chromodomain which directly contacts DNA. The chromodomain allows GEN1 to correctly position itself against DNA molecules, and its truncation severely impairs GEN1's ability to cut DNA. The GEN1 chromodomain was found to be distantly related to the CDY chromodomains and chromobox proteins, particularly to the chromo-shadow domains of CBX1, CBX3 and CBX5. Furthermore, it is conserved from yeast (Yen1) to humans with the only exception being the Caenorhabditis elegans GEN1, which has a much smaller protein size of 443 amino acids compared to yeast Yen1 (759 aa) or human GEN1 (908 aa) [ ].
Protein Domain
Name: Tol-Pal system-associated acyl-CoA thioesterase
Type: Family
Description: The tol-pal system consists of five critical genes. Inner membrane proteins TolQ and TolR convert proton motive force to energy that is transduced through TolA to an outer membrane complex of TolB and Pal. The system is known to be required to maintain outer membrane integrity. In a system with several homologous parts, ExbB and ExbD transduce energy through TonB to a variety of outer membrane proteins, many of which are siderophore receptors. The tol-pal system therefore may also be involved in transport. This family consists of a protein nearly always found in operons with the genes of the tol-pal system. The significance of this thioesterase to the tol-pal system is unclear, but either of two observations may be relevant. First, Pal, or peptidoglycan-associated lipoprotein, has a conserved N-terminal cleavage and acylation that makes it a lipoprotein. Second, the tol-pal system is implicated not only in the import of certain organics but also in the maintenance of outer membrane integrity (by an unknown mechanism) [ ].
Protein Domain
Name: Peptidyl-prolyl cis-trans isomerase E
Type: Family
Description: Cyclophilins exhibit peptidyl-prolyl cis-trans isomerase (PPIase) activity ( ), accelerating protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides [ , ]. They also have protein chaperone-like functions [] and are the major high-affinity binding proteins for the immunosuppressive drug cyclosporin A (CSA) in vertebrates []. Cyclophilins are found in all prokaryotes and eukaryotes, and have been structurally conserved throughout evolution, implying their importance in cellular function [ ]. They share a common 109 amino acid cyclophilin-like domain (CLD) and additional domains unique to each member of the family. The CLD contains the PPIase activity, while the unique domains are important for selection of protein substrates and subcellular compartmentalisation []. This entry represents the peptidyl-prolyl cis-trans isomerase E family of enzymes, which are a type of cyclophilin. In addition to their PPIase activity and role in protein folding, PPIase E family members also possess RNA-binding activity and may be involved in pre-mRNA splicing [ , ].
Protein Domain
Name: RelA-associated inhibitor
Type: Family
Description: This entry represents RelA-associated inhibitor (also known as iASPP), which is a regulator that plays a central role in regulation of apoptosis and transcription via its interaction with NF-kappa-B and p53/TP53 proteins [ ]. iASPP is an ankyrin-repeat-, SH3-domain- and proline-rich-region-containing protein that is homologous with ASPP1 and ASPP2 (). The ASPP proteins regulate the apoptotic function of p53; iASPP inhibits p53, whereas ASPP1 and ASPP2 activate p53 [ ]. The p53 tumour suppressor gene, one of the most frequently mutated genes in human cancer, can suppress tumour growth through its ability to induce apoptosis or cell-cycle arrest. Therefore, the ASPP family of proteins may be a novel target for cancer therapy []. This entry also includes ANK repeat-containing protein nipk-1 from the nematode Caenorhabditis elegans, which has been shown to mediate signaling of the receptor complex composed of ilcr-1 and ilcr-2. This complex acts directly on neurons, altering their response properties and modifying behaviour, and is the receptor for interleukin-17 [].
Protein Domain
Name: Transcriptional enhancer factor TEF-3 (TEAD4)
Type: Family
Description: The TEAD family (also known as the TEF family) transcription factors play a key role in the Hippo signaling pathway, a pathway involved in organ size control and tumor suppression by restricting proliferation and promoting apoptosis. The core of this pathway is composed of a kinase cascade wherein MST1/MST2, in complex with its regulatory protein SAV1, phosphorylates and activates LATS1/2 in complex with its regulatory protein MOB1, which in turn phosphorylates and inactivates YAP1 oncoprotein and WWTR1/TAZ. TEAD transcription factors act by mediating gene expression of YAP1 and WWTR1/TAZ, thereby regulating cell proliferation, migration and epithelial mesenchymal transition (EMT) induction [, ]. Four TEAD genes exist in mammals (TEAD 1 to 4). TEAD4 protein (also known as TEF-3) was reported to regulate muscle-specific genes in cardiac and smooth muscle cells [ ]. Alternatively spliced transcripts for TEAD4 have been identified in human retinal vascular endothelial cells []. TEAD4 protein has been shown to enhance VEGF gene expression in bovine aortic endothelial cells [].
Protein Domain
Name: Calcium-activated chloride channel protein, chordata
Type: Family
Description: This entry represents a family of Ca(2+)-regulated chloride channels (CLCA) which includes bovine, murine and human proteins [ , ]. Each CLCA exhibits a distinct, often overlapping, tissue expression pattern. With the exception of the truncated, secreted protein hCLCA3 [], they are synthesized as an approximately 125kDa precursor transmembrane glycoprotein that is rapidly cleaved into 90 and 35kDa subunits. The human proteins have been shown to affect a large number of cell functions including chloride conductance, epithelial secretion, cell-cell adhesion, apoptosis, cell cycle control, mucus production in asthma, and blood pressure. The CLCA proteins expressed on the luminal surface of lung vascular endothelia (bCLCA2; mCLCA1; hCLCA2) serve as adhesion molecules for lung metastatic cancer cells, mediating vascular arrest and lung colonization. Expression of hCLCA2 in normal mammary epithelium is consistently lost in human breast cancer and in all tumorigenic breast cancer cell lines. Re-expression of hCLCA2 in human breast cancer cells abrogates tumorigenicity in nude mice, implying that hCLCA2 acts as a tumour suppressor in breast cancer.
Protein Domain
Name: Structural maintenance of chromosomes 3, ABC domain, eukaryotic
Type: Domain
Description: The structural maintenance of chromosomes (SMC) proteins are large (approximately 110 to 170kDa), and each is arranged into five recognizable domains. Amino-acid sequence homology of SMC proteins between species is largely confined to the amino- and carboxy-terminal globular domains. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif, and a motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. In eukaryotic cells, the proteins are found as heterodimers of SMC1 paired with SMC3, SMC2 with SMC4, and SMC5 with SMC6 [ ]. This entry represents the ATP-binding cassette domain of eukaryotic SMC3 proteins, which is found at the N terminus.
Protein Domain
Name: Iron sulphur domain-containing, mitoNEET, N-terminal
Type: Domain
Description: CDGSH iron-sulphur domain proteins are a group of iron-sulphur (Fe-S) cluster-containing proteins that share a unique 39 amino acid CDGSH domain [C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H]. The CDGSH iron sulphur domain protein (also referred to as mitoNEET) is an integral membrane protein located in the outer mitochondrial membrane and whose function may be to transport iron into the mitochondria [ ]. Iron in turn is essential for the function of several mitochondrial enzymes. This entry represents the N-terminal region of the mitoNEET and Miner-type proteins that carry a CDGSH-type cluster-binding domain ( ) that coordinates a redox-active 2Fe-2S cluster. In the outer mitochondrial membrane (OMM), the CDGSH 2Fe-2S-containing domain is oriented towards the cytoplasm and is tethered to the mitochondrial membrane by the N-terminal domain found in higher vertebrates [ , , ]. The whole protein regulates oxidative capacity and may function in electron transfer, for instance in redox reactions with metabolic intermediates, cofactors and/or proteins localized at the OMM.
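Example: the consensus quoted in the description above, C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H, can be read as a 16-residue pattern in which X stands for any residue. A minimal sketch of scanning a sequence for it follows. The function name `find_cdgsh` and the test sequence are illustrative assumptions, not part of this database, and a plain regular expression is only a rough stand-in for the profile-based methods that domain databases typically use.

```python
import re

# Regex transcription of the consensus quoted in the description:
# C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H, where X is any residue.
CDGSH_MOTIF = re.compile(r"C.C.{2}[ST].{3}P.CDG[SAT]H")


def find_cdgsh(sequence):
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group())
            for m in CDGSH_MOTIF.finditer(sequence.upper())]


# Synthetic sequence built for illustration only; it is not a real protein.
example = "MKAAACACGGSLLLPACDGSHKK"
print(find_cdgsh(example))
```

A hit only indicates that the short consensus is present; it does not by itself establish that the full 39 amino acid domain or a bound 2Fe-2S cluster is present.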
Protein Domain
Name: AIR12, DOMON domain
Type: Domain
Description: This entry represents the DOMON domain found in AIR12 and related proteins. Proteins containing this domain are plant proteins that may have a cytochrome b561 domain C-terminal to the DOMON domain. In Arabidopsis, AIR12 is a plasma membrane b-type cytochrome specific to flowering plants [ ]. AIR12 functions as either an antioxidant in its oxidized state or a pro-oxidant in its reduced state, depending on the redox status of the plasma membrane and the apoplast []. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N terminus of sensor histidine kinases [ , ].
Protein Domain
Name: Bacteriophage lambda, Tail fiber protein, repeat-1
Type: Repeat
Description: This entry represents repeat 1 of Tail fibre protein from Bacteriophage lambda (Stf or Gp27) and similar proteins from the tailed bacteriophages Caudovirales, such as Long-tail fibre protein gp37 from Bacteriophage T4 (Gp37), and prophages mainly found in Gammaproteobacteria such as Prophage side tail fibre protein homologue StfQ from Escherichia coli (strain K12). The strain of Bacteriophage lambda used in most laboratories in the early 1990s carried some mutations with respect to the wild type. Stf is the gene product of one of these mutations, which allows the virus to bind to an additional outer membrane receptor and accelerates the rate of adsorption onto the host cell surface, but with a higher failed-infection frequency [ , ]. Gp37 is a structural component of the distal half of the long-tail fibre. It constitutes the part of the long-tail fibres that recognises the bacterial receptor [ ]. Antibodies to this protein have been shown to inactivate T4 by blocking infection. Its crystal structure shows three intertwined chains forming a trimer [, ].
Protein Domain
Name: Solute carrier family 17 member 9-like
Type: Family
Description: This subfamily includes solute carrier family 17 member 9 (SLC17A9) and similar proteins including plant inorganic phosphate transporters (PHT4) that are also probably anion transporters. SLC17A9, also called vesicular nucleotide transporter (VNUT), is involved in vesicular storage and exocytosis of ATP. It facilitates the accumulation of ATP and other nucleotides in secretory vesicles such as adrenal chromaffin granules and synaptic vesicles [ ]. It also functions as a lysosomal ATP transporter and regulates cell viability []. Plant PHT4 family transporters mediate the transport of inorganic phosphate and may also transport organic anions. The Arabidopsis protein AtPHT4;4 is a chloroplast-localized ascorbate transporter []. PHT4 proteins show differential expression that suggests specialized functions []. The SLC17A9-like subfamily belongs to the Solute carrier 17 (SLC17) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement [ ].
Protein Domain
Name: Fibulin-1
Type: Family
Description: Fibulins are a family of ECM glycoproteins characterized by a fibulin-type C-terminal domain preceded by tandem calcium-binding epidermal growth factor (EGF)-like modules. They are involved in protein-protein interaction with the components of basement membrane and extracellular matrix proteins. There are five fibulins, which can be classified into two subgroups. Fibulin-1 and -2 constitute one subgroup. These fibulins are larger than the others due to the presence of a higher number of EGF modules and an extra domain with three anaphylatoxin modules [ ]. Members of the second subgroup, fibulin-3, -4, and -5, are similarly small in size and highly homologous to one another in modular structure. They consist of a modified cbEGF domain at the N terminus followed by five tandem cbEGF modules and the fibulin-type C-terminal region. This entry represents the fibulin-1 proteins, which are incorporated into fibronectin-containing matrix fibres and may play a role in cell adhesion and migration [ ].
Protein Domain
Name: BRPF1, PHD domain
Type: Domain
Description: This entry represents the PHD finger of BRPF1. BRPF1, also termed peregrin, or protein Br140, is a multi-domain protein that binds histones, mediates monocytic leukemic zinc-finger protein (MOZ)-dependent histone acetylation, and is required for Hox gene expression and segmental identity [ ]. It is a close partner of the MOZ histone acetyltransferase (HAT) complex and a novel Trithorax group (TrxG) member with a central role during development [, ]. BRPF1 is primarily a nuclear protein that has a broad tissue distribution and is abundant in testes and spermatogonia []. It contains a plant homeodomain (PHD) zinc finger followed by a non-canonical ePHD finger, a bromodomain and a proline-tryptophan-tryptophan-proline (PWWP) domain. PHD and ePHD fingers both bind to lysine 4 of histone H3 (K4H3), bromodomains interact with acetylated lysines on N-terminal tails of histones and other proteins, and PWWP domains show histone-binding and chromatin association properties. BRPF1 may be involved in chromatin remodeling [, ].
Protein Domain
Name: L,D-transpeptidase catalytic domain
Type: Domain
Description: This family of proteins is found in a range of bacteria. The conserved region contains a conserved histidine and cysteine, suggesting that these proteins have an enzymatic activity. Several members of this family contain peptidoglycan binding domains, so these proteins may use peptidoglycan or a precursor as a substrate. The molecular structure of YkuD protein shows this domain has a novel tertiary fold consisting of a β-sandwich with two mixed sheets, one containing five strands and the other six strands. The two β-sheets form a cradle capped by an α-helix. This domain contains a putative catalytic site with a tetrad of invariant His123, Gly124, Cys139, and Arg141. The stereochemistry of this active site shows similarities to peptidotransferases and sortases, and suggests that the enzymes of this family may play an important role in cell wall biology. This family was formerly called the ErfK/YbiS/YcfS/YnhG family, but is now named after the first protein of known structure [ , ].
Protein Domain
Name: ABC transporter Tap-like
Type: Family
Description: The ABC transporter family is a group of membrane proteins that use the hydrolysis of ATP to power the translocation of a wide variety of substrates across cellular membranes. ABC transporters minimally consist of two conserved regions: a highly conserved nucleotide-binding domain (NBD) and a less conserved transmembrane domain (TMD). Eukaryotic ABC proteins are usually organised either as full transporters (containing two NBDs and two TMDs), or as half transporters (containing one NBD and one TMD), that have to form homo- or heterodimers in order to constitute a functional protein [ ]. This entry includes Tap1/2 (transporter associated with antigen processing 1/2) and its homologue, Mdl2, from budding yeasts. They are a group of eukaryotic proteins belonging to the ABC transporter family. Tap proteins play a crucial role in the processing and presentation of the MHC class I-restricted antigens. Mdl2 is a mitochondrial inner membrane half-type ABC transporter that is required for respiratory growth at high temperature [ , ].
Protein Domain
Name: TccP2/EspF(U)-like superfamily
Type: Homologous_superfamily
Description: This superfamily includes EspF(U) and related proteins. Enteropathogenic Escherichia coli O127:H6 attaches to the intestinal mucosa through actin pedestals that are created after it has injected the Type III secretion protein EspF (E. coli secreted protein F-like protein from prophage U) into the cells. EspF recruits the actin machinery by activating the WASP (Wiskott-Aldrich syndrome protein) family of actin nucleating factors [ ]. Subsequent cell death (apoptosis) is caused by EspF being targeted to the mitochondria as a consequence of its mitochondrial targeting sequence. Import into mitochondria leads to a loss of membrane potential, leakage of cytochrome c and activation of the apoptotic caspase cascade. Mutation of leucine to glutamic acid at position 16 of EspF (L16E) resulted in the failure of EspF import into mitochondria; mitochondrial membrane potential was not affected and cell death was abolished. This suggests that the targeting of EspF to mitochondria is essential for bacterial pathogenesis and apoptosis [, ].
Protein Domain
Name: Type VII secretion system EssB, C-terminal
Type: Homologous_superfamily
Description: This superfamily includes proteins homologous to YukC in B. subtilis and EssB in Staphylococcus aureus. The YukC protein family is thought to participate in the formation of a translocon required for the secretion (type VII secretion) of WXG100 proteins ( ) in monoderm bacteria, the WXG100 protein secretion system (Wss) ( ). The membrane-bound EssB is an integral and essential component of the bacterial type VII secretion system that can contribute to pathogenicity. The asymmetric unit consists of a single polypeptide folded into an elongated structure, which consists of two domains. This superfamily represents the C-terminal segment, which is positioned on the exterior of the membrane and adopts a helical fold. This segment contributes most to dimer formation. The domain may serve as an anchor point for the secretion apparatus, which is embedded in the cytoplasmic membrane, the C-terminal domain protruding out to interact with partner proteins or components of peptidoglycan [ ].
Protein Domain
Name: Translation elongation factor EFG/EF2
Type: Family
Description: Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome [ , , ]. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution. EF-G is a large, five-domain GTPase that promotes the directional movement of mRNA and tRNAs on the ribosome in a GTP-dependent manner. Unlike other GTPases, but by analogy to the myosin motor, EF-G performs its function of powering translocation in the GDP-bound form; that is, in a kinetically stable ribosome-EF-G(GDP) complex formed by GTP hydrolysis on the ribosome. The complex undergoes an extensive structural rearrangement, in particular affecting the small ribosomal subunit, which leads to mRNA-tRNA movement. Domain 4, which extends from the 'body' of the EF-G molecule much like a lever arm, appears to be essential for the structural transition to take place. In a hypothetical model, GTP hydrolysis induces a conformational change in the G domain of EF-G, which affects the interactions with neighbouring domains within EF-G. The resulting rearrangement of the domains relative to each other generates conformational strain in the ribosome to which EF-G is fixed. Because of structural features of the tRNA-ribosome complex, this conformational strain results in directional tRNA-mRNA movement.
The functional parallels between EF-G and motor proteins suggest that EF-G differs from classical G-proteins in that it functions as a force-generating mechanochemical device rather than a conformational switch [ ]. Every completed bacterial genome has at least one copy, but some species have additional EF-G-like proteins. The closest homologue to canonical (e.g. Escherichia coli) EF-G in the spirochetes clusters as if it is derived from mitochondrial forms, while a more distant second copy is also present. Synechocystis sp. (strain PCC 6803) has a few proteins more closely related to EF-G than to any other characterised protein. Two of these resemble E. coli EF-G more closely than does the best match from the spirochetes; it may be that both function as authentic EF-G.
Protein Domain
Name: Peptidase S8/S53 domain
Type: Domain
Description: These proteins contain a domain found in serine peptidases belonging to the MEROPS peptidase families S8 (subfamilies S8A (subtilisin) and S8B (kexin)) and S53 (sedolisin), both of which are members of clan SB [ ]. The subtilisin family is one of the largest serine peptidase families characterised to date. Over 200 subtilases are presently known, more than 170 of them with complete amino acid sequences [ ]. The family is widespread, being found in eubacteria, archaebacteria, eukaryotes and viruses []. The vast majority of the family are endopeptidases, although there is an exopeptidase, tripeptidyl peptidase [, ]. Structures have been determined for several members of the subtilisin family: they exploit the same catalytic triad as the chymotrypsins, although the residues occur in a different order (HDS in chymotrypsin and DHS in subtilisin), but the structures show no other similarity [, ]. Some subtilisins are mosaic proteins, while others contain N- and C-terminal extensions that show no sequence similarity to any other known protein []. The proprotein-processing endopeptidases kexin, furin and related enzymes form a distinct subfamily known as the kexin subfamily (S8B). These preferentially cleave C-terminally to paired basic amino acids. Members of this subfamily can be identified by subtly different motifs around the active site [, ]. Members of the kexin subfamily, along with endopeptidases R, T and K from the yeast Tritirachium and cuticle-degrading peptidase from Metarhizium, require thiol activation. This can be attributed to the presence of a cysteine near to the active site histidine []. Only one viral member of the subtilisin family is known, a 56kDa protease from herpes virus 1, which infects the channel catfish []. Sedolisins (serine-carboxyl peptidases) are proteolytic enzymes whose fold resembles that of subtilisin; however, they are considerably larger, with the mature catalytic domains containing approximately 375 amino acids.
The defining features of these enzymes are a unique catalytic triad, Ser-Glu-Asp, as well as the presence of an aspartic acid residue in the oxyanion hole. High-resolution crystal structures have now been solved for sedolisin from Pseudomonas sp. 101, as well as for kumamolisin from a thermophilic bacterium, Bacillus sp. MN-32. Mutations in the human gene lead to a fatal neurodegenerative disease. This domain is also found in Neisserial autotransporter lipoprotein NalP from Neisseria meningitidis, a major human immunogenic protein that cleaves human (host) complement factor C3, generating a shorter alpha chain and a longer beta chain than normal.
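The kexin-subfamily preference for cleaving C-terminally to paired basic amino acids can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that "paired basic" means any two consecutive lysine (K) or arginine (R) residues; real enzymes are more selective, and the function names here are hypothetical.

```python
import re

# Assumption: a "paired basic" site is any two consecutive K/R residues.
# The lookahead lets overlapping pairs (e.g. KRR) each be reported.
DIBASIC = re.compile(r"(?=[KR][KR])")


def dibasic_cleavage_sites(seq):
    """Return 0-based indices of the residue that follows each dibasic pair."""
    seq = seq.upper()
    return [m.start() + 2 for m in DIBASIC.finditer(seq)]


def cleave(seq):
    """Split a sequence C-terminally to every dibasic pair (toy model)."""
    seq = seq.upper()
    fragments, start = [], 0
    for site in dibasic_cleavage_sites(seq):
        if site < len(seq):  # a pair at the very C terminus yields no new fragment
            fragments.append(seq[start:site])
            start = site
    fragments.append(seq[start:])
    return fragments
```

For example, cleave("AAKRGG") splits the toy sequence after the KR pair. The sketch ignores sequence context around the site, which the description notes is what distinguishes members of the subfamily.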
Protein Domain
Name: Peroxidases heam-ligand binding site
Type: Binding_site
Description: Peroxidases are haem-containing enzymes that use hydrogen peroxide as the electron acceptor to catalyse a number of oxidative reactions. Most haem peroxidases follow the reaction scheme: (1) Fe3+ + H2O2 --> [Fe4+=O]R' (Compound I) + H2O; (2) [Fe4+=O]R' + substrate --> [Fe4+=O]R (Compound II) + oxidised substrate; (3) [Fe4+=O]R + substrate --> Fe3+ + H2O + oxidised substrate. In this mechanism, the enzyme reacts with one equivalent of H2O2 to give [Fe4+=O]R' (compound I). This is a two-electron oxidation/reduction reaction where H2O2 is reduced to water and the enzyme is oxidised. One oxidising equivalent resides on iron, giving the oxyferryl intermediate, while in many peroxidases the porphyrin (R) is oxidised to the porphyrin pi-cation radical (R'). Compound I then oxidises an organic substrate to give a substrate radical. Haem peroxidases include two superfamilies: one found in bacteria, fungi and plants, and the second found in animals. The first one can be viewed as consisting of 3 major classes. Class I, the intracellular peroxidases, includes: yeast cytochrome c peroxidase (CCP), a soluble protein found in the mitochondrial electron transport chain, where it probably protects against toxic peroxides; ascorbate peroxidase (AP), the main enzyme responsible for hydrogen peroxide removal in chloroplasts and cytosol of higher plants; and bacterial catalase-peroxidases, exhibiting both peroxidase and catalase activities. It is thought that catalase-peroxidase provides protection to cells under oxidative stress. Class II consists of secretory fungal peroxidases: ligninases, or lignin peroxidases (LiPs), and manganese-dependent peroxidases (MnPs). These are monomeric glycoproteins involved in the degradation of lignin. In MnP, Mn2+ serves as the reducing substrate. Class II proteins contain four conserved disulphide bridges and two conserved calcium-binding sites.
Class III consists of the secretory plant peroxidases, which have multiple tissue-specific functions: e.g., removal of hydrogen peroxide from chloroplasts and cytosol; oxidation of toxic compounds; biosynthesis of the cell wall; defence responses towards wounding; indole-3-acetic acid (IAA) catabolism; ethylene biosynthesis; and so on. Class III proteins are also monomeric glycoproteins, containing four conserved disulphide bridges and two calcium ions, although the placement of the disulphides differs from class II enzymes. The crystal structures of a number of these proteins show that they share the same architecture: two all-alpha domains between which the haem group is embedded. This entry represents the binding site for haem in a number of peroxidases.
Protein Domain
Name: Haem peroxidase
Type: Domain
Description: Peroxidases are haem-containing enzymes that use hydrogen peroxide as the electron acceptor to catalyse a number of oxidative reactions. Most haem peroxidases follow the reaction scheme: (1) Fe3+ + H2O2 --> [Fe4+=O]R' (Compound I) + H2O; (2) [Fe4+=O]R' + substrate --> [Fe4+=O]R (Compound II) + oxidised substrate; (3) [Fe4+=O]R + substrate --> Fe3+ + H2O + oxidised substrate. In this mechanism, the enzyme reacts with one equivalent of H2O2 to give [Fe4+=O]R' (compound I). This is a two-electron oxidation/reduction reaction where H2O2 is reduced to water and the enzyme is oxidised. One oxidising equivalent resides on iron, giving the oxyferryl intermediate, while in many peroxidases the porphyrin (R) is oxidised to the porphyrin pi-cation radical (R'). Compound I then oxidises an organic substrate to give a substrate radical. Haem peroxidases include two superfamilies: one found in bacteria, fungi and plants, and the second found in animals. The first one can be viewed as consisting of 3 major classes. Class I, the intracellular peroxidases, includes: yeast cytochrome c peroxidase (CCP), a soluble protein found in the mitochondrial electron transport chain, where it probably protects against toxic peroxides; ascorbate peroxidase (AP), the main enzyme responsible for hydrogen peroxide removal in chloroplasts and cytosol of higher plants; and bacterial catalase-peroxidases, exhibiting both peroxidase and catalase activities. It is thought that catalase-peroxidase provides protection to cells under oxidative stress. Class II consists of secretory fungal peroxidases: ligninases, or lignin peroxidases (LiPs), and manganese-dependent peroxidases (MnPs). These are monomeric glycoproteins involved in the degradation of lignin. In MnP, Mn2+ serves as the reducing substrate. Class II proteins contain four conserved disulphide bridges and two conserved calcium-binding sites.
Class III consists of the secretory plant peroxidases, which have multiple tissue-specific functions: e.g., removal of hydrogen peroxide from chloroplasts and cytosol; oxidation of toxic compounds; biosynthesis of the cell wall; defence responses towards wounding; indole-3-acetic acid (IAA) catabolism; ethylene biosynthesis; and so on. Class III proteins are also monomeric glycoproteins, containing four conserved disulphide bridges and two calcium ions, although the placement of the disulphides differs from class II enzymes. The crystal structures of a number of these proteins show that they share the same architecture: two all-alpha domains between which the haem group is embedded. This entry represents the first of the two superfamilies, the haem peroxidases found in bacteria, fungi and plants.
Protein Domain
Name: Signal transduction response regulator, receiver domain
Type: Domain
Description: Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more. Two-component systems comprise a sensor histidine kinase (HK) and its cognate response regulator (RR). The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on the RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HKs are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK. A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response. Bipartite response regulator proteins are involved in a two-component signal transduction system in bacteria, and certain eukaryotes like protozoa, that functions to detect and respond to environmental changes. These systems have been detected during host invasion, drug resistance, motility, phosphate uptake, osmoregulation, and nitrogen fixation, amongst others.
The two-component system consists of a histidine protein kinase environmental sensor that phosphorylates the receiver domain of a response regulator protein; phosphorylation induces a conformational change in the response regulator, which activates the effector domain, triggering the cellular response. The domains of the two-component proteins are highly modular, but the core structures and activities are maintained. The response regulators act as phosphorylation-activated switches to affect a cellular response, usually by transcriptional regulation. Most of these proteins consist of two domains, an N-terminal response regulator receiver domain, and a variable C-terminal effector domain with DNA-binding activity. This entry represents the response regulator receiver domain, which belongs to the CheY family, and receives the signal from the sensor partner in the two-component system.
Protein Domain
Name: Zinc finger C2H2-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short β hairpin and an α helix (β/β/α structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2.
C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved β/β/α structure) capable of making multiple contacts along the DNA; the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short α-helix in the Znf, with each Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets. This entry represents the classical C2H2 zinc finger domain.
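The Cys(2)His(2) arrangement described above can be sketched as a sequence pattern. The regular expression below is written in the style of the PROSITE C2H2 signature (C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H); the exact spacings should be treated as an assumption here rather than as this entry's definition, and the function name is hypothetical.

```python
import re

# Assumed PROSITE-style C2H2 pattern: two Cys, a hydrophobic position, two His.
# The spacing values are an illustrative assumption, not taken from this entry.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")


def find_c2h2_candidates(seq):
    """Return (start, end) spans of non-overlapping candidate C2H2 fingers."""
    return [m.span() for m in C2H2_PATTERN.finditer(seq.upper())]


# Synthetic toy sequence built to satisfy the pattern (not a real protein).
toy = "GG" + "CAACAAAFAAAAAAAAHAAAH" + "GG"
candidates = find_c2h2_candidates(toy)
```

A pattern match only flags candidate fingers; it says nothing about zinc binding or fold, which is why signature databases combine such patterns with profile-based methods.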
Protein Domain
Name: Fibrinogen, alpha/beta/gamma chain, C-terminal globular domain
Type: Domain
Description: Fibrinogen plays key roles in both blood clotting and platelet aggregation. During blood clot formation, the conversion of soluble fibrinogen to insoluble fibrin is triggered by thrombin, resulting in the polymerisation of fibrin, which forms a soft clot; this is then converted to a hard clot by factor XIIIA, which cross-links fibrin molecules. Platelet aggregation involves the binding of the platelet protein receptor integrin alpha(IIb)-beta(3) to the C-terminal D domain of fibrinogen [ ]. In addition to platelet aggregation, platelet-fibrinogen interaction mediates both adhesion and fibrin clot retraction. Fibrinogen occurs as a dimer, where each monomer is composed of three non-identical chains, alpha, beta and gamma, linked together by several disulphide bonds [ ]. The N-terminals of all six chains come together to form the centre of the molecule (E domain), from which the monomers extend in opposite directions as coiled coils, followed by C-terminal globular domains (D domains). Therefore, the domain composition is: D-coil-E-coil-D. At each end, the C-terminal of the alpha chain extends beyond the D domain as a protuberance that is important for cross-linking the molecule. During clot formation, the N-terminal fragments of the alpha and beta chains (within the E domain) in fibrinogen are cleaved by thrombin, releasing fibrinopeptides A and B, respectively, and producing fibrin. This cleavage results in the exposure of four binding sites on the E domain, each of which can bind to a D domain from different fibrin molecules. The binding of fibrin molecules produces a polymer consisting of a lattice network of fibrins that form a long, branching, flexible fibre [ , ]. 
Fibrin fibres interact with platelets to increase the size of the clot, as well as with several different proteins and cells, thereby promoting the inflammatory response and concentrating the cells required for wound repair at the site of damage. This entry represents the C-terminal globular D domain of the alpha, beta and gamma chains. These domains are related to domains in other proteins: in the Parastichopus parvimensis (Sea cucumber) fibrinogen-like FreP-A and FreP-B proteins; in the C terminus of the Drosophila scabrous protein that is involved in the regulation of neurogenesis, possibly through the inhibition of R8 cell differentiation; and in ficolin proteins, which display lectin activity towards N-acetylglucosamine through their fibrinogen-like domains.
Protein Domain
Name: Fibrinogen-like, C-terminal
Type: Homologous_superfamily
Description: Fibrinogen plays key roles in both blood clotting and platelet aggregation. During blood clot formation, the conversion of soluble fibrinogen to insoluble fibrin is triggered by thrombin, resulting in the polymerisation of fibrin, which forms a soft clot; this is then converted to a hard clot by factor XIIIA, which cross-links fibrin molecules. Platelet aggregation involves the binding of the platelet protein receptor integrin alpha(IIb)-beta(3) to the C-terminal D domain of fibrinogen [ ]. In addition to platelet aggregation, platelet-fibrinogen interaction mediates both adhesion and fibrin clot retraction. Fibrinogen occurs as a dimer, where each monomer is composed of three non-identical chains, alpha, beta and gamma, linked together by several disulphide bonds [ ]. The N-terminals of all six chains come together to form the centre of the molecule (E domain), from which the monomers extend in opposite directions as coiled coils, followed by C-terminal globular domains (D domains). Therefore, the domain composition is: D-coil-E-coil-D. At each end, the C-terminal of the alpha chain extends beyond the D domain as a protuberance that is important for cross-linking the molecule. During clot formation, the N-terminal fragments of the alpha and beta chains (within the E domain) in fibrinogen are cleaved by thrombin, releasing fibrinopeptides A and B, respectively, and producing fibrin. This cleavage results in the exposure of four binding sites on the E domain, each of which can bind to a D domain from different fibrin molecules. The binding of fibrin molecules produces a polymer consisting of a lattice network of fibrins that form a long, branching, flexible fibre [ , ]. 
Fibrin fibres interact with platelets to increase the size of the clot, as well as with several different proteins and cells, thereby promoting the inflammatory response and concentrating the cells required for wound repair at the site of damage. This entry represents the C-terminal globular D domain of the alpha, beta and gamma chains. These domains are related to domains in other proteins: in the Parastichopus parvimensis (Sea cucumber) fibrinogen-like FreP-A and FreP-B proteins; in the C terminus of the Drosophila scabrous protein that is involved in the regulation of neurogenesis, possibly through the inhibition of R8 cell differentiation; and in ficolin proteins, which display lectin activity towards N-acetylglucosamine through their fibrinogen-like domains.
Protein Domain
Name: Dbl homology (DH) domain
Type: Domain
Description: The Rho family GTPases Rho, Rac and CDC42 regulate a diverse array of cellular processes. Like all members of the Ras superfamily, the Rho proteins cycle between active GTP-bound and inactive GDP-bound conformational states. Activation of Rho proteins, through release of bound GDP and subsequent binding of GTP, is catalysed by guanine nucleotide exchange factors (GEFs) in the Dbl family. The proteins encoded by members of the Dbl family share a common domain, presented in this entry, of about 200 residues (designated the Dbl homology or DH domain) that has been shown to encode a GEF activity specific for a number of Rho family members. In addition, all family members possess a second, shared domain designated the pleckstrin homology (PH) domain. Trio and its homologue UNC-73 are unique within the Dbl family insomuch as they encode two distinct DH/PH domain modules. The PH domain is invariably located immediately C-terminal to the DH domain and this invariant topography suggests a functional interdependence between these two structural modules. Biochemical data have established the role of the conserved DH domain in Rho GTPase interaction and activation, and the role of the tandem PH domain in intracellular targeting and/or regulation of DH domain function. The DH domain of Dbl has been shown to mediate oligomerisation that is mostly homophilic in nature. In addition to the tandem DH/PH domains, Dbl family GEFs contain diverse structural motifs like serine/threonine kinase, RBD, PDZ, RGS, IQ, REM, Cdc25, RasGEF, CH, SH2, SH3, EF, spectrin or Ig. The DH domain is composed of three structurally conserved regions separated by more variable regions.
It does not share significant sequence homology with other subtypes of small G-protein GEF motifs such as the Cdc25 domain and the Sec7 domain, which specifically interact with Ras and ARF family small GTPases, respectively, nor with other Rho protein interactive motifs, indicating that the Dbl family proteins are evolutionarily unique. The DH domain is composed of 11 alpha helices that are folded into a flattened, elongated α-helix bundle in which two of the three conserved regions, conserved region 1 (CR1) and conserved region 3 (CR3), are exposed near the centre of one surface. CR1 and CR3, together with a part of alpha-6 and the DH/PH junction site, constitute the Rho GTPase interacting pocket.
Protein Domain
Name: Zinc finger C2H2 superfamily
Type: Homologous_superfamily
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short β hairpin and an α helix (β/β/α structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2.
C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved β/β/α structure) capable of making multiple contacts along the DNA; the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short α-helix in the Znf, with each Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets.
Protein Domain
Name: Apelin receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. The human APJ gene, which encodes this receptor, was originally cloned in 1993 using a set of primers based on the 7 conserved TM domains.
The putative sequence is closest in terms of identity (40-50% in the TM regions) to the angiotensin receptor (AT1); however, angiotensin II shows no affinity for the receptor. It is a receptor for the hormones apelin receptor early endogenous ligand (APELA) and apelin (APLN), and is coupled to G proteins that inhibit adenylate cyclase activity. The mature transcript encodes a preproprotein that yields a 13 amino acid active peptide from the C-terminal end. Apelin has a similar mRNA distribution to angiotensin II and the active peptides share some similarity. It plays a role in regulation of blood vessel formation, blood pressure, heart contractility and heart failure.
Protein Domain
Name: Psychosine receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. Psychosine is a glycosphingolipid implicated in the pathology of globoid cell leukodystrophy (GLD), a hereditary metabolic disorder that results from the absence of the enzyme galactosylceramidase.
This deficiency results in the accumulation of psychosine in the brain, leading to apoptosis of oligodendrocytes, progressive demyelination and the existence of large, multinuclear cells (globoid cells) derived from microglia. The molecular mechanism by which these toxic effects might be mediated has recently been elucidated by the identification of TDAG8, an orphan G protein-coupled receptor, as a receptor for psychosine. TDAG8 is expressed at high levels in the spleen, peripheral blood leukocytes, lymph nodes and lung. Activation of the receptor in RH7777 hepatoma cells by psychosine and related lysoglycolipids results in a pertussis toxin-insensitive inhibition of forskolin-induced cAMP accumulation, possibly through coupling to Gz proteins.
Protein Domain
Name: CELF1/2, RNA recognition motif 2
Type: Domain
Description: The human CELF family has six members, which can be divided into two subfamilies based on their phylogeny: CELF1-2 and CELF3-6. This entry represents the RNA recognition motif 2 (RRM2) of the CELF-1 and CELF-2 proteins. CELF-1 and CELF-2 belong to the CELF (CUGBP and ETR-3 Like Factor)/Bruno-like protein family, whose members play important roles in the regulation of alternative splicing and translation. CELF-1 and CELF-2 share sequence similarity with the Drosophila Bruno protein and bind to the Bruno response elements (cis-acting sequences in the 3'-untranslated region (UTR) of oskar mRNA). The human CELF-1 (also known as CUG-BP or BRUNOL-2) binds to RNA substrates and recruits PARN deadenylase. It preferentially targets UGU-rich mRNA elements. CELF-1 has been implicated in the onset of type 1 myotonic dystrophy (DM1), a neuromuscular disease associated with an unstable CUG triplet expansion in the 3'-UTR of the DMPK (myotonic dystrophy protein kinase) gene. CELF-1 contains three highly conserved RNA recognition motifs (RRMs): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region, followed by a linker region and the third RRM (RRM3) close to the C terminus of the protein. The Xenopus homologue of CELF-1 is EDEN-BP (embryo deadenylation element-binding protein), which mediates sequence-specific deadenylation of Eg5 mRNA. It binds specifically to the EDEN motif in the 3'-untranslated regions of maternal mRNAs and targets these mRNAs for deadenylation and translational repression. The two N-terminal RRMs of EDEN-BP, as well as a part of the linker region (between RRM2 and RRM3), are necessary for the interaction with EDEN. Oligomerization of EDEN-BP is required for specific mRNA deadenylation and binding. CELF-2 (also known as CUGBP2 or ETR-3) shares high sequence identity with CELF-1, but shows different binding specificity; it binds preferentially to sequences with UG repeats and UGUU motifs.
It also binds to the 3'-UTR of cyclooxygenase-2 messages, affecting both translation and mRNA stability, and binds to apoB mRNA, regulating its C to U editing. CELF-2 also contains three highly conserved RRMs. It binds to RNA via the first two RRMs, which are also important for localization in the cytoplasm. The splicing activation or repression activity of CELF-2 on some specific substrates is mediated by RRM1/RRM2. Both RRM1 and RRM2 of CELF-2 can activate cardiac troponin T (cTNT) exon 5 inclusion. In addition, CELF-2 possesses a typical arginine and lysine-rich nuclear localization signal (NLS) in the C terminus, within RRM3.
Protein Domain
Name: CELF1/2, RNA recognition motif 3
Type: Domain
Description: The human CELF family has six members, which can be divided into two subfamilies based on their phylogeny: CELF1-2 and CELF3-6. This entry represents the RNA recognition motif 3 (RRM3) of the CELF-1 and CELF-2 proteins. CELF-1 and CELF-2 belong to the CELF (CUGBP and ETR-3 Like Factor)/Bruno-like protein family, whose members play important roles in the regulation of alternative splicing and translation. CELF-1 and CELF-2 share sequence similarity with the Drosophila Bruno protein and bind to the Bruno response elements (cis-acting sequences in the 3'-untranslated region (UTR) of oskar mRNA). The human CELF-1 (also known as CUG-BP or BRUNOL-2) binds to RNA substrates and recruits PARN deadenylase. It preferentially targets UGU-rich mRNA elements. CELF-1 has been implicated in the onset of type 1 myotonic dystrophy (DM1), a neuromuscular disease associated with an unstable CUG triplet expansion in the 3'-UTR of the DMPK (myotonic dystrophy protein kinase) gene. CELF-1 contains three highly conserved RNA recognition motifs (RRMs): two consecutive RRMs (RRM1 and RRM2) situated in the N-terminal region, followed by a linker region and the third RRM (RRM3) close to the C terminus of the protein. The Xenopus homologue of CELF-1 is EDEN-BP (embryo deadenylation element-binding protein), which mediates sequence-specific deadenylation of Eg5 mRNA. It binds specifically to the EDEN motif in the 3'-untranslated regions of maternal mRNAs and targets these mRNAs for deadenylation and translational repression. The two N-terminal RRMs of EDEN-BP, as well as a part of the linker region (between RRM2 and RRM3), are necessary for the interaction with EDEN. Oligomerization of EDEN-BP is required for specific mRNA deadenylation and binding. CELF-2 (also known as CUGBP2 or ETR-3) shares high sequence identity with CELF-1, but shows different binding specificity; it binds preferentially to sequences with UG repeats and UGUU motifs.
It also binds to the 3'-UTR of cyclooxygenase-2 messages, affecting both translation and mRNA stability, and binds to apoB mRNA, regulating its C to U editing. CELF-2 also contains three highly conserved RRMs. It binds to RNA via the first two RRMs, which are also important for localization in the cytoplasm. The splicing activation or repression activity of CELF-2 on some specific substrates is mediated by RRM1/RRM2. Both RRM1 and RRM2 of CELF-2 can activate cardiac troponin T (cTNT) exon 5 inclusion. In addition, CELF-2 possesses a typical arginine and lysine-rich nuclear localization signal (NLS) in the C terminus, within RRM3.
Protein Domain
Name: DNA integrity scanning protein, DisA, N-terminal domain superfamily
Type: Homologous_superfamily
Description: Cyclic di-AMP (c-di-AMP) is a bacterial secondary messenger molecule associated with various physiological functions. It is involved in several important cellular processes, such as cell wall metabolism, maintenance of DNA integrity, ion transport, transcription regulation, and allosteric regulation of enzyme function. The 120-amino acid-long diadenylate cyclase (DAC) domain converts two ATP or ADP molecules into one c-di-AMP molecule. The majority of DAC domain-containing proteins are found in bacterial species, but a small number are also present in archaea of the phylum Euryarchaeota. In bacteria, DAC domain proteins are most frequently found in Gram-positive bacteria belonging to the phyla Firmicutes and Actinobacteria, including pathogenic bacteria such as Listeria monocytogenes or Staphylococcus aureus. Whereas the majority of bacterial species encode only one DAC enzyme, members of the genus Bacillus generally encode three DAC domain-containing proteins: DisA, CdaA (previously named YbbP in the genus Bacillus or DacA in other genera) and CdaS (previously named YojJ in the genus Bacillus or DacB in others). The DAC domain exhibits an overall globular alpha/beta fold with the long N-terminally located helix (alpha1) flanking the core. A slightly twisted central β-sheet, made up of seven mixed-parallel and antiparallel β-strands, forms the core globular part. Both sides of the β-sheet are flanked by a total of five α-helices (alpha1-alpha5), resulting in the observed globular shape. The DisA protein is a bacterial checkpoint protein that dimerises into an octameric complex. The protein consists of three distinct domains.
The DAC domain is the first and is a globular, nucleotide-binding region; the next region, residues 146-289, constitutes the DisA-linker family, which consists of an elongated bundle of three alpha helices (alpha6, alpha10 and alpha11), one side of which carries an additional three helices (alpha7-9), and which thus forms a spine-like linker between domains 1 and 3. The C-terminal residues, forming domain 3, are represented by the HhH family, the specific DNA-binding domain. The octameric complex thus has structurally linked nucleotide-binding and DNA-binding HhH domains, and the nucleotide-binding domains are bound to a cyclic di-adenosine phosphate, such that DisA is a specific di-adenylate cyclase. The di-adenylate cyclase activity is strongly suppressed by binding to branched DNA, but not to duplex or single-stranded DNA, suggesting a role for DisA as a monitor of the presence of stalled replication forks or recombination intermediates via DNA structure-modulated c-di-AMP synthesis.
Protein Domain
Name: DNA integrity scanning protein, DisA, N-terminal
Type: Domain
Description: Cyclic di-AMP (c-di-AMP) is a bacterial secondary messenger molecule associated with various physiological functions. It is involved in several important cellular processes, such as cell wall metabolism, maintenance of DNA integrity, ion transport, transcription regulation, and allosteric regulation of enzyme function. The 120-amino acid-long diadenylate cyclase (DAC) domain converts two ATP or ADP molecules into one c-di-AMP molecule. The majority of DAC domain-containing proteins are found in bacterial species, but a small number are also present in archaea of the phylum Euryarchaeota. In bacteria, DAC domain proteins are most frequently found in Gram-positive bacteria belonging to the phyla Firmicutes and Actinobacteria, including pathogenic bacteria such as Listeria monocytogenes or Staphylococcus aureus. Whereas the majority of bacterial species encode only one DAC enzyme, members of the genus Bacillus generally encode three DAC domain-containing proteins: DisA, CdaA (previously named YbbP in the genus Bacillus or DacA in other genera) and CdaS (previously named YojJ in the genus Bacillus or DacB in others). The DAC domain exhibits an overall globular alpha/beta fold with the long N-terminally located helix (alpha1) flanking the core. A slightly twisted central β-sheet, made up of seven mixed-parallel and antiparallel β-strands, forms the core globular part. Both sides of the β-sheet are flanked by a total of five α-helices (alpha1-alpha5), resulting in the observed globular shape. The DisA protein is a bacterial checkpoint protein that dimerises into an octameric complex. The protein consists of three distinct domains.
The DAC domain is the first and is a globular, nucleotide-binding region; the next region, residues 146-289, constitutes the DisA-linker family, which consists of an elongated bundle of three alpha helices (alpha6, alpha10 and alpha11), one side of which carries an additional three helices (alpha7-9), and which thus forms a spine-like linker between domains 1 and 3. The C-terminal residues, forming domain 3, are represented by the HhH family, the specific DNA-binding domain. The octameric complex thus has structurally linked nucleotide-binding and DNA-binding HhH domains, and the nucleotide-binding domains are bound to a cyclic di-adenosine phosphate, such that DisA is a specific di-adenylate cyclase. The di-adenylate cyclase activity is strongly suppressed by binding to branched DNA, but not to duplex or single-stranded DNA, suggesting a role for DisA as a monitor of the presence of stalled replication forks or recombination intermediates via DNA structure-modulated c-di-AMP synthesis.
Protein Domain
Name: RTX toxin determinant A
Type: Family
Description: Secretion of virulence factors in Gram-negative bacteria involves transportation of the protein across two membranes to reach the cell exterior. Four principal exotoxin secretion systems have been described. In the type II and IV secretion systems, toxins are first exported to the periplasm by way of a cleaved N-terminal signal sequence; a second set of proteins is used for extracellular transport (type II), or the C terminus of the exotoxin itself is used (type IV). Type III secretion involves at least 20 molecules that assemble into a needle; effector proteins are then translocated through this without need of a signal sequence. In the type I system, a complete channel is formed through both membranes, and the secretion signal is carried on the C terminus of the exotoxin. The RTX (repeats in toxin) family of cytolytic toxins belongs to the type I secretion system, and its members are important virulence factors in Gram-negative bacteria, such as Escherichia coli, Actinobacillus pleuropneumoniae and Kingella kingae. They consist of a hydrophobic pore-forming domain at the N terminus that harbors four putative transmembrane α-helices, a typical glycine-rich repeat segment and a C-terminal signal sequence. The glycine-rich repeats are essential for binding calcium, and are critical for the biological activity of the secreted toxins. RTX toxins can be divided into two groups: (i) hemolysins, which cause the lysis of erythrocytes and exhibit toxicity towards a wide range of cell types from various species; and (ii) leukotoxins, which exhibit narrow cell type and species specificity due to cell-specific binding through the beta2-integrins expressed on the cell surface of leukocytes.
All RTX toxin operons exist in the order rtxCABD: the RtxA protein is the structural component of the exotoxin, RtxB and RtxD are both required for its export from the bacterial cell, and RtxC is an acyl-carrier-protein-dependent acyl-modification enzyme required to convert RtxA to its active form. Escherichia coli haemolysin (HlyA) is often quoted as the model for RTX toxins. Recent work on the corresponding rtxC gene product, HlyC, has revealed that it carries out the acylation of two internal lysine residues in the HlyA protein as a post-translational modification. To cause pathogenicity, the HlyA toxin must first bind Ca2+ ions to the set of glycine-rich repeats and then be activated by HlyC. This has been demonstrated both in vitro and in vivo.
Protein Domain
Name: Peptidase M13, N-terminal domain
Type: Domain
Description: This entry represents the N-terminal domain of M13 peptidases. This group of metallopeptidases belongs to the MEROPS peptidase family M13 (neprilysin family, clan MA(E)). The M13 family includes neprilysin (neutral endopeptidase, NEP, enkephalinase, CD10, CALLA), endothelin-converting enzyme I (ECE-1), erythrocyte surface antigen KELL (ECE-3), phosphate-regulating gene on the X chromosome (PHEX), soluble secreted endopeptidase (SEP), and damage-induced neuronal endopeptidase (DINE)/X-converting enzyme (XCE). These proteins consist of a short N-terminal cytoplasmic domain, a single transmembrane helix, and a larger C-terminal extracellular domain containing the active site. The cytoplasmic domain contains a conformationally-restrained octapeptide, which is thought to act as a stop transfer sequence that prevents proteolysis and secretion. Proteins in this family fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity. The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA, and the predicted active site residues for members of this family and thermolysin occur in the motif HEXXH. M13 peptidases are well-studied proteases found in a wide range of organisms including mammals and bacteria. In mammals they participate in processes such as cardiovascular development, blood-pressure regulation, nervous control of respiration, and regulation of the function of neuropeptides in the central nervous system. In bacteria they may be used for digestion of milk. The family includes eukaryotic and prokaryotic oligopeptidases, as well as some of the proteins responsible for the molecular basis of the blood group antigens, e.g. Kell.
Neprilysin (NEP) is expressed in a variety of tissues including kidney and brain, and is involved in many physiological and pathological processes, including blood pressure regulation and the inflammatory response. It is a plasma membrane-bound mammalian enzyme that is able to digest biologically-active peptides, including enkephalins, substance P, cholecystokinin, neurotensin and somatostatin. It is an important enzyme in the regulation of the amyloid-beta (Abeta) protein that forms the amyloid plaques associated with Alzheimer's disease (AD). The zinc ligands of neprilysin are known and are analogous to those in thermolysin, a related peptidase. Neprilysins, like thermolysin, are inhibited by phosphoramidon, which appears to selectively inhibit this family in mammals. The enzymes are all oligopeptidases, digesting oligo- and polypeptides, but not proteins.
Protein Domain
Name: Peptidase M13, C-terminal domain
Type: Domain
Description: This entry represents the C-terminal domain of M13 peptidases. This group of metallopeptidases belongs to the MEROPS peptidase family M13 (neprilysin family, clan MA(E)). The M13 family includes neprilysin (neutral endopeptidase, NEP, enkephalinase, CD10, CALLA), endothelin-converting enzyme I (ECE-1), erythrocyte surface antigen KELL (ECE-3), phosphate-regulating gene on the X chromosome (PHEX), soluble secreted endopeptidase (SEP), and damage-induced neuronal endopeptidase (DINE)/X-converting enzyme (XCE). These proteins consist of a short N-terminal cytoplasmic domain, a single transmembrane helix, and a larger C-terminal extracellular domain containing the active site. The cytoplasmic domain contains a conformationally-restrained octapeptide, which is thought to act as a stop transfer sequence that prevents proteolysis and secretion. Proteins in this family fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity. The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA, and the predicted active site residues for members of this family and thermolysin occur in the motif HEXXH. M13 peptidases are well-studied proteases found in a wide range of organisms including mammals and bacteria. In mammals they participate in processes such as cardiovascular development, blood-pressure regulation, nervous control of respiration, and regulation of the function of neuropeptides in the central nervous system. In bacteria they may be used for digestion of milk. The family includes eukaryotic and prokaryotic oligopeptidases, as well as some of the proteins responsible for the molecular basis of the blood group antigens, e.g. Kell.
Neprilysin (NEP) is expressed in a variety of tissues including kidney and brain, and is involved in many physiological and pathological processes, including blood pressure regulation and the inflammatory response. It is a plasma membrane-bound mammalian enzyme that is able to digest biologically-active peptides, including enkephalins, substance P, cholecystokinin, neurotensin and somatostatin. It is an important enzyme in the regulation of the amyloid-beta (Abeta) protein that forms the amyloid plaques associated with Alzheimer's disease (AD). The zinc ligands of neprilysin are known and are analogous to those in thermolysin, a related peptidase. Neprilysins, like thermolysin, are inhibited by phosphoramidon, which appears to selectively inhibit this family in mammals. The enzymes are all oligopeptidases, digesting oligo- and polypeptides, but not proteins.
Protein Domain
Name: Interleukin-1 receptor type 1
Type: Family
Description: Interleukin-1 alpha and interleukin-1 beta (IL-1 alpha and IL-1 beta) are cytokines that participate in the regulation of immune responses, inflammatory reactions, and hematopoiesis. Two types of IL-1 receptor, each with three extracellular immunoglobulin (Ig)-like domains, limited sequence similarity (28%) and different pharmacological characteristics, have been cloned from mouse and human cell lines: these have been termed type I and type II receptors. The receptors both exist in transmembrane (TM) and soluble forms: the soluble IL-1 receptor is thought to be post-translationally derived from cleavage of the extracellular portion of the membrane receptors. Both IL-1 receptors appear to be well conserved in evolution, and map to the same chromosomal location. The receptors can both bind all three forms of IL-1 (IL-1 alpha, IL-1 beta and IL-1RA). The crystal structures of IL1A and IL1B have been solved, showing them to share the same 12-stranded β-sheet structure as both the heparin binding growth factors and the Kunitz-type soybean trypsin inhibitors. The β-sheets are arranged in 3 similar lobes around a central axis, 6 strands forming an anti-parallel β-barrel. Several regions, especially the loop between strands 4 and 5, have been implicated in receptor binding. The Vaccinia virus genes B15R and B18R each encode proteins with N-terminal hydrophobic sequences, possible sites for attachment of N-linked carbohydrate and a short C-terminal hydrophobic domain. These properties are consistent with the mature proteins being either virion, cell surface or secretory glycoproteins. Protein sequence comparisons reveal that the gene products are related to each other (20% identity) and to the Ig superfamily. The highest degree of similarity is to the human and murine interleukin-1 receptors, although both proteins are related to a wide range of Ig superfamily members, including the interleukin-6 receptor.
A novel method for virus immune evasion has been proposed in which the products of one or both of these genes may bind interleukin-1 and/or interleukin-6, preventing these cytokines from reaching their natural receptors. A similar gene product from Cowpox virus (CPV) has also been shown to specifically bind murine IL-1 beta. This entry represents Interleukin-1 receptor type 1 (IL1R1). The crystal structure of the soluble extracellular part of type-I IL1R complexed with IL1RA has been determined to 2.7 Å resolution. The receptor structure is characterised by three Ig-like domains, of which domains 1 and 2 are tightly linked, while domain 3 is completely separate and connected by a flexible linker.
Protein Domain
Name: Interleukin-1 receptor type I/II
Type: Family
Description: Interleukin-1 alpha and interleukin-1 beta (IL-1 alpha and IL-1 beta) are cytokines that participate in the regulation of immune responses, inflammatory reactions, and hematopoiesis. Two types of IL-1 receptor, each with three extracellular immunoglobulin (Ig)-like domains, limited sequence similarity (28%) and different pharmacological characteristics, have been cloned from mouse and human cell lines: these have been termed type I and type II receptors. The receptors both exist in transmembrane (TM) and soluble forms: the soluble IL-1 receptor is thought to be post-translationally derived from cleavage of the extracellular portion of the membrane receptors. Both IL-1 receptors appear to be well conserved in evolution, and map to the same chromosomal location. The receptors can both bind all three forms of IL-1 (IL-1 alpha, IL-1 beta and IL-1RA). The crystal structures of IL1A and IL1B have been solved, showing them to share the same 12-stranded β-sheet structure as both the heparin binding growth factors and the Kunitz-type soybean trypsin inhibitors. The β-sheets are arranged in 3 similar lobes around a central axis, 6 strands forming an anti-parallel β-barrel. Several regions, especially the loop between strands 4 and 5, have been implicated in receptor binding. The Vaccinia virus genes B15R and B18R each encode proteins with N-terminal hydrophobic sequences, possible sites for attachment of N-linked carbohydrate and a short C-terminal hydrophobic domain. These properties are consistent with the mature proteins being either virion, cell surface or secretory glycoproteins. Protein sequence comparisons reveal that the gene products are related to each other (20% identity) and to the Ig superfamily. The highest degree of similarity is to the human and murine interleukin-1 receptors, although both proteins are related to a wide range of Ig superfamily members, including the interleukin-6 receptor.
A novel method for virus immune evasion has been proposed in which the products of one or both of these genes may bind interleukin-1 and/or interleukin-6, preventing these cytokines from reaching their natural receptors. A similar gene product from Cowpox virus (CPV) has also been shown to specifically bind murine IL-1 beta. The crystal structure of the soluble extracellular part of type-I IL1R complexed with IL1RA has been determined to 2.7 Å resolution. The receptor structure is characterised by three Ig-like domains, of which domains 1 and 2 are tightly linked, while domain 3 is completely separate and connected by a flexible linker.
Protein Domain
Name: Fibrinogen, conserved site
Type: Conserved_site
Description: Fibrinogen plays key roles in both blood clotting and platelet aggregation. During blood clot formation, the conversion of soluble fibrinogen to insoluble fibrin is triggered by thrombin, resulting in the polymerisation of fibrin, which forms a soft clot; this is then converted to a hard clot by factor XIIIA, which cross-links fibrin molecules. Platelet aggregation involves the binding of the platelet protein receptor integrin alpha(IIb)-beta(3) to the C-terminal D domain of fibrinogen. In addition to platelet aggregation, platelet-fibrinogen interaction mediates both adhesion and fibrin clot retraction. Fibrinogen occurs as a dimer, where each monomer is composed of three non-identical chains, alpha, beta and gamma, linked together by several disulphide bonds. The N termini of all six chains come together to form the centre of the molecule (E domain), from which the monomers extend in opposite directions as coiled coils, followed by C-terminal globular domains (D domains). Therefore, the domain composition is: D-coil-E-coil-D. At each end, the C terminus of the alpha chain extends beyond the D domain as a protuberance that is important for cross-linking the molecule. During clot formation, the N-terminal fragments of the alpha and beta chains (within the E domain) in fibrinogen are cleaved by thrombin, releasing fibrinopeptides A and B, respectively, and producing fibrin. This cleavage results in the exposure of four binding sites on the E domain, each of which can bind to a D domain from a different fibrin molecule. The binding of fibrin molecules produces a polymer consisting of a lattice network of fibrins that form a long, branching, flexible fibre.
Fibrin fibres interact with platelets to increase the size of the clot, as well as with several different proteins and cells, thereby promoting the inflammatory response and concentrating the cells required for wound repair at the site of damage. This entry represents a conserved site in the C-terminal globular D domain of the alpha, beta and gamma chains. These domains are related to domains in other proteins: in the Parastichopus parvimensis (Sea cucumber) fibrinogen-like FreP-A and FreP-B proteins; in the C terminus of the Drosophila scabrous protein that is involved in the regulation of neurogenesis, possibly through the inhibition of R8 cell differentiation; and in ficolin proteins, which display lectin activity towards N-acetylglucosamine through their fibrinogen-like domains.
Protein Domain
Name: Leghaemoglobin
Type: Family
Description: Globins are haem-containing proteins involved in binding and/or transporting oxygen. They belong to a very large and well studied family that is widely distributed in many organisms. Globins have evolved from a common ancestor and can be divided into three groups: single-domain globins, and two types of chimeric globins, flavohaemoglobins and globin-coupled sensors. Bacteria have all three types of globins, while archaea lack flavohaemoglobins, and eukaryotes lack globin-coupled sensors. Several functionally different haemoglobins can coexist in the same species. The major types of globins include:
  • Haemoglobin (Hb): tetramer of two alpha and two beta chains, although embryonic and foetal forms can substitute the alpha or beta chain for ones with higher oxygen affinity, such as gamma, delta, epsilon or zeta chains. Hb transports oxygen from lungs to other tissues in vertebrates. Hb proteins are also present in unicellular organisms where they act as enzymes or sensors.
  • Myoglobin (Mb): monomeric protein responsible for oxygen storage in vertebrate muscle.
  • Neuroglobin: a myoglobin-like haemoprotein expressed in vertebrate brain and retina, where it is involved in neuroprotection from damage due to hypoxia or ischemia. Neuroglobin belongs to a branch of the globin family that diverged early in evolution.
  • Cytoglobin: an oxygen sensor expressed in multiple tissues. Related to neuroglobin.
  • Erythrocruorin: highly cooperative extracellular respiratory proteins found in annelids and arthropods that are assembled from as many as 180 subunits into hexagonal bilayers.
  • Leghaemoglobin (legHb or symbiotic Hb): occurs in the root nodules of leguminous plants, where it facilitates the diffusion of oxygen to symbiotic bacteroids in order to promote nitrogen fixation.
  • Non-symbiotic haemoglobin (NsHb): occurs in non-leguminous plants, and can be over-expressed in stressed plants.
  • Flavohaemoglobins (FHb): chimeric, with an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD/FAD-binding domain. FHb provides protection against nitric oxide via its C-terminal domain, which transfers electrons to haem in the globin.
  • Globin-coupled sensors: chimeric, with an N-terminal myoglobin-like domain and a C-terminal domain that resembles the cytoplasmic signalling domain of bacterial chemoreceptors. They bind oxygen, and act to initiate an aerotactic response or regulate gene expression.
  • Protoglobin: a single domain globin found in archaea that is related to the N-terminal domain of globin-coupled sensors.
  • Truncated 2/2 globin: lack the first helix, giving them a 2-over-2 instead of the canonical 3-over-3 α-helical sandwich fold. Can be divided into three main groups (I, II and III) based on structural features.
Leghaemoglobins are haem-proteins, first identified in root nodules of leguminous plants, where they are crucial for supplying sufficient oxygen to root nodule bacteria for nitrogen fixation to occur. Although leghaemoglobin and myoglobin share a common fold, and both regulate the facilitated diffusion of oxygen, leghaemoglobins regulate oxygen affinity through a mechanism different from that of myoglobin, using a novel combination of haem pocket amino acids that lower the oxygen affinity. The structure of leghaemoglobins is similar to that of haemoglobins and myoglobins, although there is little sequence conservation. The protein is largely α-helical, eight helices providing the scaffold for a well-defined haem-binding pocket. By contrast with the tetrameric mammalian globin assembly, the plant form is monomeric. The structural similarity of leghaemoglobins and haemoglobins has suggested a common evolutionary origin. It was thought that haemoglobins might be found in plants other than legumes, and indeed globins have now been identified in the roots of non-leguminous plants, where they have a role in respiratory metabolism in the root cells. This entry also represents non-symbiotic haemoglobins (NsHb), which play important roles in a variety of cellular processes. A class I NsHb from cotton plants can be induced in plant roots as a defence mechanism against pathogen invasions, possibly by modulating nitric oxide (NO) levels. Several NsHbs appear to play a role in NO scavenging in plants, indicating that the primordial function of haemoglobins may well be to protect against nitrosative stress and to modulate NO signalling functions.
Protein Domain
Name: Plant PDR ABC transporter associated
Type: Domain
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as dimers and therefore comprise four domains: two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium binding site (Walker B).
Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer, the two ABC cassettes are related by a two-fold axis and contact each other through hydrophobic interactions at the antiparallel β-sheet of armI. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families, with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification. This domain is found on the C terminus of ABC-2 type transporter domains. It seems to be associated with the plant pleiotropic drug resistance (PDR) protein family of ABC transporters.
As in yeast, plant PDR ABC transporters may also play a role in the transport of antifungal agents. The PDR family is characterised by a configuration in which the ABC domain is nearer the N terminus of the protein than the transmembrane domain.
Protein Domain
Name: Endothelin receptor A
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. Endothelins are able to activate a number of signal transduction processes, including phospholipase A2, phospholipase C and phospholipase D, as well as cytosolic protein kinase activation.
They play an important role in the regulation of the cardiovascular system and are the most potent vasoconstrictors identified, stimulating cardiac contraction, regulating the release of vasoactive substances, and stimulating mitogenesis in blood vessels. As a result, endothelins are implicated in a number of vascular diseases, including those of the heart, general circulation and brain. Endothelins stimulate contraction in almost all other smooth muscles (e.g., uterus, bronchus, vas deferens, stomach) and stimulate secretion in several tissues (e.g., kidney, liver and adrenals). Endothelins have also been implicated in a variety of pathophysiological conditions associated with stress, including hypertension, myocardial infarction, subarachnoid haemorrhage and renal failure. Two endothelin receptor subtypes have been isolated and identified, endothelin A receptor (ETA) and endothelin B receptor (ETB); both are members of the seven transmembrane rhodopsin-like G-protein coupled receptor family (GPCRA), which stimulate multiple effectors via several types of G protein. ETA and ETB receptors are both widely distributed: ETA receptors are mainly located on vascular smooth muscle cells, whereas ETB receptors are present on endothelial cells lining the vessel wall. Endothelin receptors have also been found in the brain, e.g. cerebral cortex, cerebellum and glial cells. ETA receptors are considered to be the primary vasoconstrictor and growth-promoting receptor, and the binding of endothelin to ETA increases vasoconstriction (contraction of the blood vessel walls) and the retention of sodium, leading to increased blood pressure. Endothelin B receptor, on the other hand, not only inhibits cell growth and vasoconstriction in the vascular system but also functions as a "clearance receptor".
This receptor-mediated clearance mechanism is particularly important in the lung, which clears about 80% of circulating endothelin-1. Both receptors are localised to non-vascular structures such as epithelial cells as well as occurring in the central nervous system (CNS) on glial cells and neurones, where they are thought to mediate neurotransmission and vascular functions. This entry represents the endothelin A receptor.
Protein Domain
Name: Nickel ABC transporter, permease subunit NikB
Type: Family
Description: This family consists of the NikB family of nickel ABC transporter permeases. The NikABCDE uptake system that contains this protein also contains a homologous permease subunit, NikC. Based on sequence similarity, NikA is the periplasmic binding protein, NikB and NikC are the membrane components, and NikD and NikE are the ATP-binding components of the ABC transporter. Nickel is used in cells as part of urease or certain hydrogenases or superoxide dismutases. To avoid nickel toxicity, the synthesis of the Nik system is tightly controlled by the nickel-responsive repressor NikR. ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain.
The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the α-helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions.
Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification.
Protein Domain
Name: Nickel ABC transporter, permease subunit NikC
Type: Family
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B.
Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the α-helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification. This family consists of the NikC family of nickel ABC transporter permeases. Operons that contain this protein also contain a homologous permease subunit, NikB. NikC is one of the two integral membrane proteins of the E. coli ABC-type Ni2+ transporter. Nickel is used in cells as part of urease or certain hydrogenases or superoxide dismutases.
Protein Domain
Name: Peptidase A3A, cauliflower mosaic virus-type
Type: Domain
Description: This aspartic peptidase domain is found in viral enzymatic polyproteins. It belongs to MEROPS peptidase family A3, subfamily A3A (cauliflower mosaic virus-type endopeptidase, clan AA). Cauliflower mosaic virus belongs to the Retro-transcribing viruses, which have a double-stranded DNA genome. The genome includes an open reading frame (ORF V) that shows similarities to the pol gene of retroviruses. This ORF codes for a polyprotein that includes a reverse transcriptase, which, on the basis of a DTG triplet near the N terminus, was suggested to include an aspartic protease. The presence of an aspartic protease has been confirmed by mutational studies, implicating Asp-45 in catalysis. The protease releases itself from the polyprotein and is involved in reactions required to process the ORF IV polyprotein, which includes the viral coat protein. The viral aspartic peptidase domain has also been found associated with a polyprotein encoded by integrated pararetrovirus-like sequences in the genome of Nicotiana tabacum (Common tobacco). Aspartic peptidases, also known as aspartyl proteases (EC 3.4.23.-), are widely distributed proteolytic enzymes known to exist in vertebrates, fungi, plants, protozoa, bacteria, archaea, retroviruses and some plant viruses. All known aspartic peptidases are endopeptidases. A water molecule, activated by two aspartic acid residues, acts as the nucleophile in catalysis. Aspartic peptidases can be grouped into five clans, each of which shows a unique structural fold. Peptidases in clan AA are either bilobed (family A1 or the pepsin family) or are a homodimer (all other families in the clan, including retropepsin from HIV-1/AIDS). Each lobe consists of a single domain with a closed β-barrel and each lobe contributes one Asp to form the active site.
Most peptidases in the clan are inhibited by the naturally occurring small-molecule inhibitor pepstatin. Clan AC contains the single family A8: the signal peptidase 2 family. Members of the family are found in all bacteria. Signal peptidase 2 processes the premurein precursor, removing the signal peptide. The peptidase has four transmembrane domains and the active site is on the periplasmic side of the cell membrane. Cleavage occurs on the amino side of a cysteine where the thiol group has been substituted by a diacylglyceryl group. Site-directed mutagenesis has identified two essential aspartic acid residues which occur in the motifs GNXXDRX and FNXAD (where X is a hydrophobic residue). No tertiary structures have been solved for any member of the family, but because of the intramembrane location, the structure is assumed not to be pepsin-like. Clan AD contains two families of transmembrane endopeptidases: A22 and A24. These are also known as "GXGD peptidases" because of a common GXGD motif which includes one of the pair of catalytic aspartic acid residues. Structures are known for members of both families and show a unique, common fold with up to nine transmembrane regions. The active site aspartic acids are located within a large cavity in the membrane into which water can gain access. Clan AE contains two families, A25 and A31. Tertiary structures have been solved for members of both families and show a common fold consisting of an α-β-α sandwich, in which the β-sheet is five stranded. Clan AF contains the single family A26. Members of the clan are membrane proteins with a unique fold. Homologues are known only from bacteria.
The structure of omptin (also known as OmpT) shows a cylindrical barrel containing ten β-strands inserted in the membrane with the active site residues on the outer surface. There are two families of aspartic peptidases for which neither structure nor active site residues are known and these are not assigned to clans. Family A5 includes thermopsin, an endopeptidase found only in thermophilic archaea. Family A36 contains sporulation factor SpoIIGA, which is known to process and activate sigma factor E, one of the transcription factors that controls sporulation in bacteria.
Protein Domain
Name: Endothelin receptor B
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. Endothelins are able to activate a number of signal transduction processes including phospholipase A2, phospholipase C and phospholipase D, as well as cytosolic protein kinase activation.
They play an important role in the regulation of the cardiovascular system and are the most potent vasoconstrictors identified, stimulating cardiac contraction, regulating the release of vasoactive substances, and stimulating mitogenesis in blood vessels. As a result, endothelins are implicated in a number of vascular diseases, including those of the heart, general circulation and brain. Endothelins stimulate contraction in almost all other smooth muscles (e.g., uterus, bronchus, vas deferens, stomach) and stimulate secretion in several tissues (e.g., kidney, liver and adrenals). Endothelins have also been implicated in a variety of pathophysiological conditions associated with stress, including hypertension, myocardial infarction, subarachnoid haemorrhage and renal failure. Two endothelin receptor subtypes have been isolated and identified, endothelin A receptor (ETA) and endothelin B receptor (ETB); both are members of the seven transmembrane rhodopsin-like G-protein coupled receptor family (GPCRA), which stimulate multiple effectors via several types of G protein. ETA and ETB receptors are both widely distributed: ETA receptors are mainly located on vascular smooth muscle cells, whereas ETB receptors are present on endothelial cells lining the vessel wall. Endothelin receptors have also been found in the brain, e.g. cerebral cortex, cerebellum and glial cells. ETA receptors are considered to be the primary vasoconstrictor and growth-promoting receptor, and the binding of endothelin to ETA increases vasoconstriction (contraction of the blood vessel walls) and the retention of sodium, leading to increased blood pressure. Endothelin B receptor, on the other hand, not only inhibits cell growth and vasoconstriction in the vascular system but also functions as a "clearance receptor".
This receptor-mediated clearance mechanism is particularly important in the lung, which clears about 80% of circulating endothelin-1. Both receptors are localised to non-vascular structures such as epithelial cells as well as occurring in the central nervous system (CNS) on glial cells and neurones, where they are thought to mediate neurotransmission and vascular functions. This entry represents the endothelin B receptor.
Protein Domain
Name: Nitrate transport permease
Type: Family
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B.
Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the α-helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification. This entry comprises the nitrate transport permease in bacteria, the gene product of ntrB.
The nitrate transport permease is the integral membrane component of the nitrate transport system and belongs to the ATP-binding cassette (ABC) superfamily. At least in photosynthetic bacteria, nitrate assimilation is aided by other proteins derived from the operon, which include, among others, the products of ntrA, ntrB, ntrC, ntrD and narB. Functionally, ntrC and ntrD resemble the ATP binding components of the binding protein-dependent transport systems. Mutational studies have shown that ntrB and ntrC are mandatory for nitrate accumulation. Nitrate reductase is encoded by narB.
Protein Domain
Name: Peptidase C27, rubella virus endopeptidase
Type: Domain
Description: This group of cysteine peptidases belongs to the MEROPS peptidase family C27 (clan CA). The type example is the rubella virus endopeptidase (Rubella virus), which is required for processing of the rubella virus replication protein. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them.
One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases.
Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: Peptidase C36, beet necrotic yellow vein furovirus-type papain-like endopeptidase
Type: Domain
Description: A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp).
There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel. The active site consists of a His/Cys catalytic dyad. This group of cysteine peptidases corresponds to MEROPS peptidase family C36 (clan CA). The type example is beet necrotic yellow vein furovirus-type papain-like endopeptidase (beet necrotic yellow vein virus), which is involved in processing the viral polyprotein.
Protein Domain
Name: Peptidase C53, pestivirus Npro
Type: Domain
Description: A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp).
There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases [].Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues [ ]. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds.Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other.Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel [ ]. The active site consists of a His/Cys catalytic dyad.This group of cysteine peptidases belong to MEROPS peptidase family C53 (clan C-). The active site residues occur in the order E, H, C in the sequence which is unlike that in any other family. They are unique to pestiviruses. 
The N-terminal cysteine peptidase (Npro) encoded by the bovine viral diarrhoea virus genome is responsible for the self-cleavage that releases the N terminus of the core protein. This unique protease is dispensable for viral replication, and its coding region can be replaced by a ubiquitin gene directly fused in frame to the core [ , , , ].
Protein Domain
Name: Lysophosphatidic acid receptor EDG-2
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins.
Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. Lysophospholipids (LPs), such as lysophosphatidic acid (LPA), sphingosine 1-phosphate (S1P) and sphingosylphosphorylcholine (SPC), have long been known to act as signalling molecules in addition to their roles as intermediates in membrane biosynthesis. They have roles in the regulation of cell growth, differentiation, apoptosis and development, and have been implicated in a wide range of pathophysiological conditions, including: blood clotting, corneal wounding, subarachnoid haemorrhage, inflammation and colitis. A number of G protein-coupled receptors bind members of the lysophospholipid family; these include: the cannabinoid receptors; platelet activating factor receptor; OGR1, an SPC receptor identified in ovarian cancer cell lines; PSP24, an orphan receptor that has been proposed to bind LPA; and at least 8 closely related receptors, the EDG family, that bind LPA and S1P. LPA is found in all cell types in small quantities (associated with membrane biosynthesis) but is produced in significant quantities by some cellular sources, accounting for the levels of LPA in serum. LPA is also found in elevated levels in ovarian cancer ascites, and acts to stimulate proliferation and promote survival of the cancer cells. The effects of LPA on the proliferation and morphology of a number of other cell types have been well documented. However, identification of the mechanisms by which these effects are accomplished has been complicated by a number of factors, such as: a lack of antagonists, difficulty in ligand-binding experiments and the responsiveness of many cell types to LPA. The G protein-coupled receptors EDG-2, EDG-4 and EDG-7 have now been identified as high affinity receptors for LPA.
It has been suggested that these receptors should now be referred to as lpA1, lpA2 and lpA3 respectively. EDG-2 was originally identified as a gene involved in neuron production from embryonic cerebral cortex. EDG-2 is widely distributed, with highest levels in the brain (in which expression correlates with development of oligodendrocytes and Schwann cells). In the periphery, EDG-2 is found in many tissues, including the heart, kidney, testis, spleen and muscle in both human and mouse. The receptor is also expressed in a number of cancers. Upon binding of LPA, EDG-2 couples to G proteins of the Gi, Gq and G12/13 classes, to mediate a range of effects including: inhibition of adenylyl cyclase; activation of phospholipase C, serum response element and MAP kinases; and actomyosin stimulation. These processes lead to cell rounding and proliferation.
Protein Domain
Name: Lysophosphatidic acid receptor EDG-4
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins.
Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. Lysophospholipids (LPs), such as lysophosphatidic acid (LPA), sphingosine 1-phosphate (S1P) and sphingosylphosphorylcholine (SPC), have long been known to act as signalling molecules in addition to their roles as intermediates in membrane biosynthesis. They have roles in the regulation of cell growth, differentiation, apoptosis and development, and have been implicated in a wide range of pathophysiological conditions, including: blood clotting, corneal wounding, subarachnoid haemorrhage, inflammation and colitis. A number of G protein-coupled receptors bind members of the lysophospholipid family; these include: the cannabinoid receptors; platelet activating factor receptor; OGR1, an SPC receptor identified in ovarian cancer cell lines; PSP24, an orphan receptor that has been proposed to bind LPA; and at least 8 closely related receptors, the EDG family, that bind LPA and S1P. LPA is found in all cell types in small quantities (associated with membrane biosynthesis) but is produced in significant quantities by some cellular sources, accounting for the levels of LPA in serum. LPA is also found in elevated levels in ovarian cancer ascites, and acts to stimulate proliferation and promote survival of the cancer cells. The effects of LPA on the proliferation and morphology of a number of other cell types have been well documented. However, identification of the mechanisms by which these effects are accomplished has been complicated by a number of factors, such as: a lack of antagonists, difficulty in ligand-binding experiments and the responsiveness of many cell types to LPA. The G protein-coupled receptors EDG-2, EDG-4 and EDG-7 have now been identified as high affinity receptors for LPA.
It has been suggested that these receptors should now be referred to as lpA1, lpA2 and lpA3 respectively. EDG-4 is expressed at high levels in the testis and peripheral blood leukocytes of humans, and the testis, kidney and embryonic brain in mouse. Lower levels of expression are found in human pancreas, spleen, thymus and prostate, and mouse heart, lung, spleen, thymus, stomach and brain. Variant forms of the receptor are also expressed in cancer cells. Binding of LPA to EDG-4 results in increased calcium levels, inhibition of adenylyl cyclase, activation of MAP kinases and cell rounding, through coupling to Gi/o, Gq/11 and G12/13 proteins.
Protein Domain
Name: Phosphotyrosyl phosphatase activator, PTPA
Type: Family
Description: Phosphotyrosyl phosphatase activator (PTPA, also known as protein phosphatase 2A activator) proteins stimulate the phosphotyrosyl phosphatase (PTPase) activity of the dimeric form of protein phosphatase 2A (PP2A). PTPase activity in PP2A (in vitro) is relatively low when compared to the better recognised phosphoserine/threonine protein phosphatase activity. PTPA also reactivates the serine/threonine phosphatase activity of an inactive form of PP2A. The specific biological role of PTPA is unknown. PTPA has been suggested to play a role in the insertion of metals into the PP2A catalytic subunit (PP2Ac) active site, to act as a chaperone, and more recently, to have peptidyl prolyl cis/trans isomerase activity that specifically targets human PP2Ac. Together, PTPA and PP2A constitute an ATPase and it has been suggested that PTPA alters the relative specificity of PP2A from phosphoserine/phosphothreonine substrates to phosphotyrosine substrates in an ATP-hydrolysis-dependent manner. Basal expression of PTPA depends on the activity of a ubiquitous transcription factor, Yin Yang 1 (YY1). The tumour suppressor protein p53 can inhibit PTPA expression through an unknown mechanism that negatively controls YY1.
Protein Domain
Name: Sulfur carrier ThiS/MoaD-like
Type: Family
Description: ThiS (thiaminS) is a 66 aa protein involved in sulphur transfer. ThiS is encoded in the thiCEFSGH operon in Escherichia coli. This family of proteins has two conserved glycines at the COOH terminus. Thiocarboxylate is formed at the last Gly in the activation process. Sulphur is transferred from ThiI to ThiS in a reaction catalysed by IscS. MoaD, a protein involved in sulphur transfer during molybdopterin synthesis, is about the same length and shows limited sequence similarity to ThiS. Both have the conserved GG at the COOH end. ThiS/MoaD proteins serve as sulfur carriers in thiamine and tungsten/molybdenum cofactor biosynthesis. Proteins in this entry also include TtuB from Thermus thermophilus. TtuB functions as the sulfur donor in the sulfurtransferase reaction catalyzed by TtuA. It is also required for the 2-thiolation of the 5-methyluridine residue at position 54 in the T loop of tRNAs, leading to 5-methyl-2-thiouridine (m5s2U or s2T). This modification allows thermal stabilization of tRNAs in thermophilic microorganisms, and is essential for cell growth at high temperatures.
Protein Domain
Name: Inositol monophosphatase-like
Type: Family
Description: It has been shown that several proteins share two sequence motifs. Two of these proteins, vertebrate and plant inositol monophosphatase, and vertebrate inositol polyphosphate 1-phosphatase, are enzymes of the inositol phosphate second messenger signalling pathway, and share similar enzyme activity. Both enzymes exhibit an absolute requirement for metal ions (Mg2+ is preferred), and their amino acid sequences contain a number of conserved motifs, which are also shared by several other proteins related to MPTASE (including products of fungal QaX and qutG, bacterial suhB and cysQ, and yeast hal2). The function of the other proteins is not yet clear, but it is suggested that they may act by enhancing the synthesis or degradation of phosphorylated messenger molecules. Structural analysis of these proteins has revealed a common core of 155 residues, which includes residues essential for metal binding and catalysis. An interesting property of the enzymes of this family is their sensitivity to Li+. The targets and mechanism of action of Li+ are unknown, but overactive inositol phosphate signalling may account for symptoms of manic depression.
Protein Domain
Name: Photosystem I PsaA/PsaB
Type: Family
Description: Photosystem I (PSI) is an integral membrane protein complex that uses light energy to mediate electron transfer from plastocyanin to ferredoxin. PSI is found in the chloroplasts of plants and in cyanobacteria. The electron transfer components of the reaction centre of PSI are a primary electron donor P-700 (chlorophyll dimer) and five electron acceptors: A0 (chlorophyll), A1 (a phylloquinone) and three 4Fe-4S iron-sulphur centres: Fx, Fa, and Fb. PsaA and psaB, two closely related proteins, are involved in the binding of P700, A0, A1, and Fx. psaA and psaB are both integral membrane proteins of 730 to 750 amino acids that seem to contain 11 transmembrane segments. The Fx 4Fe-4S iron-sulphur centre is bound by four cysteines; two of these cysteines are provided by the psaA protein and the two others by psaB. The two cysteines in both proteins are proximal and located in a loop between the ninth and tenth transmembrane segments. A leucine zipper motif seems to be present downstream of the cysteines and could contribute to dimerisation of psaA/psaB.
Protein Domain
Name: Ubiquitin-like modifier-activating enzyme Atg7
Type: Family
Description: This is a family of eukaryotic proteins found in animals, plants, and yeasts that includes Atg7p (YHR171W) from Saccharomyces cerevisiae (Baker's yeast) and ATG7 from Pichia angusta. Members are about 650 to 700 residues in length and include a central domain of about 150 residues shared with the ThiF/MoeB/HesA family of proteins. A low level of similarity to ubiquitin-activating enzyme E1 is described in a paper on peroxisome autophagy mediated by ATG7, and is the basis of the name ubiquitin activating enzyme E1-like protein. Members of the family are involved in protein lipidation events analogous to ubiquitination and required for membrane fusion events during autophagy. This protein is important for several processes. It plays a key role in the maintenance of axonal homeostasis, the prevention of axonal degeneration, the maintenance of hematopoietic stem cells, the formation of Paneth cell granules, as well as in adipose differentiation. It is involved in circadian clock regulation in the liver and glucose metabolism through the autophagic degradation of CRY1 (clock repressor) in a time-dependent manner.
Protein Domain
Name: Acetyl-CoA biotin carboxyl carrier
Type: Family
Description: The proteins in this family are a component of the acetyl coenzyme A carboxylase complex and are involved in the first step in long-chain fatty acid synthesis. In plants this is usually located in the chloroplast. In the first step, biotin carboxylase catalyses the carboxylation of the carrier protein to form an intermediate. Next, the transcarboxylase complex transfers the carboxyl group from the intermediate to acetyl-CoA, forming malonyl-CoA. This protein functions in the transfer of CO2 from one site to another; the biotin binding site is located at the C terminus of this protein. The biotin is specifically attached to a lysine residue in the sequence AMKM. The structure of the C-terminal domain of the biotin carboxyl carrier (BCC) protein was shown to be a flattened β-barrel structure comprising two four-stranded β-sheets interrupted by a structural loop forming a thumb structure. The biotinyl-lysine is located on a tight β-turn on the opposite end of the molecule. The thumb structure has been shown to interact with the attached biotin, thus stabilising the structure.
Protein Domain
Name: FeS cluster insertion, C-terminal, conserved site
Type: Conserved_site
Description: The proteins in this entry are small (106 to 135 amino-acid residues in bacteria, about 200 residues in fungi) and contain a number of conserved regions. They appear to be associated with the process of FeS-cluster assembly. The HesB proteins are associated with the nif gene cluster and the Rhizobium gene IscN has been shown to be required for nitrogen fixation. Nitrogenase includes multiple FeS clusters and many genes for their assembly. The Escherichia coli SufA protein is associated with SufS, a NifS homologue, and SufD, which are involved in the FeS cluster assembly of the FhnF protein. The Azotobacter protein IscA (homologues of which are also found in E. coli) is associated with IscS, another NifS homologue, and IscU, a nifU homologue, as well as other factors consistent with a role in FeS cluster chemistry. A homologue from Geobacter contains a selenocysteine in place of an otherwise invariant cysteine, further suggesting a role in redox chemistry. This entry represents a conserved site in the C-terminal extremity; it contains two conserved cysteines.
Protein Domain
Name: ICln
Type: Family
Description: ICln, also known as methylosome subunit pICln or chloride conductance regulatory protein ICln, owes these different names to its function in multiple regulatory pathways as different as ion permeation, ribonucleoprotein biosynthesis and cytoskeletal organisation. ICln can be identified both in the cytosol and in the cellular membrane, where it functions as a chloride current regulator and is important in regulating volume decrease after cellular swelling. pICln also functions as an Sm chaperone in the stepwise snRNP assembly process. snRNPs are RNA-protein complexes essential to the removal of introns from pre-mRNA. In humans, the core of snRNPs is composed of seven Sm proteins bound to snRNA. pICln tethers the hetero-oligomers SmD1/D2 and SmE/F/G into a ring-shaped 6S complex, which subsequently docks onto the SMN complex. The SMN complex then removes pICln and enables the transfer of pre-assembled Sm proteins onto snRNA. Consistent with the role of human pICln, the orthologue from S. pombe is required for optimal production of the spliceosomal snRNPs and for efficient splicing.
Protein Domain
Name: Frataxin/CyaY
Type: Family
Description: The eukaryotic proteins in this entry include frataxin, the protein that is mutated in Friedreich's ataxia, and related sequences. Friedreich's ataxia is a progressive neurodegenerative disorder caused by loss of function mutations in the gene encoding frataxin (FRDA). Frataxin mRNA is predominantly expressed in tissues with a high metabolic rate (including liver, kidney, brown fat and heart). Mouse and yeast frataxin homologues contain a potential N-terminal mitochondrial targeting sequence, and human frataxin has been observed to co-localise with a mitochondrial protein. Furthermore, disruption of the yeast gene has been shown to result in mitochondrial dysfunction. Friedreich's ataxia is thus believed to be a mitochondrial disease caused by a mutation in the nuclear genome (specifically, expansion of an intronic GAA triplet repeat). The bacterial proteins in this entry are iron-sulphur cluster (FeS) metabolism CyaY proteins homologous to eukaryotic frataxin. Partial Phylogenetic Profiling suggests that CyaY most likely functions as part of the ISC system for FeS cluster biosynthesis, and this is supported by experimental data in some species.
Protein Domain
Name: Regulator of chromosome condensation, RCC1
Type: Repeat
Description: The regulator of chromosome condensation (RCC1) is a eukaryotic protein which binds to chromatin and interacts with ran, a nuclear GTP-binding protein, to promote the loss of bound GDP and the uptake of fresh GTP, thus acting as a guanine-nucleotide dissociation stimulator (GDS). The interaction of RCC1 with ran probably plays an important role in the regulation of gene expression. RCC1, known as PRP20 or SRM1 in yeast, pim1 in fission yeast and BJ1 in Drosophila, is a protein that contains seven tandem repeats of a domain of about 50 to 60 amino acids. As shown in the following schematic representation, the repeats make up the major part of the length of the protein. Outside the repeat region, there is just a small N-terminal domain of about 40 to 50 residues and, in the Drosophila protein only, a C-terminal domain of about 130 residues.
+----+-------+-------+-------+-------+-------+-------+-------+-------------+
|N-t.|Rpt. 1 |Rpt. 2 |Rpt. 3 |Rpt. 4 |Rpt. 5 |Rpt. 6 |Rpt. 7 | C-terminal  |
+----+-------+-------+-------+-------+-------+-------+-------+-------------+
The RCC1-type of repeat is also found in the X-linked retinitis pigmentosa GTPase regulator. The RCC repeats form a β-propeller structure.
Protein Domain
Name: D-galactoside/L-rhamnose binding SUEL lectin domain
Type: Domain
Description: The D-galactoside binding lectin purified from sea urchin (Anthocidaris crassispina) eggs exists as a disulphide-linked homodimer of two subunits; the dimeric form is essential for hemagglutination activity. The sea urchin egg lectin (SUEL) forms a new class of lectins. Although SUEL was first isolated as a D-galactoside binding lectin, it was later shown that it binds L-rhamnose preferentially. L-rhamnose and D-galactose share the same hydroxyl group orientation at C2 and C4 of the pyranose ring structure. A cysteine-rich domain homologous to the SUEL protein has been identified in the following proteins:
  • Plant beta-galactosidases (lactases).
  • Mammalian latrophilin, the calcium independent receptor of alpha-latrotoxin (CIRL). The galactose-binding lectin domain is not required for alpha-latrotoxin binding.
  • Human lectomedin-1.
  • Rhamnose-binding lectin (SAL) from catfish (Silurus asotus, Namazu) eggs. This protein is composed of three tandem repeat domains homologous to the SUEL lectin domain. All cysteine positions of each domain are completely conserved.
  • The hypothetical B0457.1, F32A7.3A and F32A7.3B proteins from Caenorhabditis elegans.
  • The human KIAA0821 protein.
Protein Domain
Name: H/ACA ribonucleoprotein complex, subunit Gar1/Naf1
Type: Family
Description: H/ACA ribonucleoprotein particles (RNPs) are a family of RNA pseudouridine synthases that specify modification sites through guide RNAs. The function of these H/ACA RNPs is essential for biogenesis of the ribosome, splicing of precursor mRNAs (pre-mRNAs), maintenance of telomeres and probably for additional cellular processes. All H/ACA RNPs contain a specific RNA component (snoRNA or scaRNA) and at least four proteins common to all such particles: Cbf5, Gar1, Nhp2 and Nop10. These proteins are highly conserved from yeast to mammals and homologues are also present in archaea. The H/ACA protein complex contains a stable core composed of Cbf5 and Nop10, to which Gar1 and Nhp2 subsequently bind. Naf1 is an RNA-binding protein required for the maturation of the box H/ACA snoRNP complex and for ribosome biogenesis. During assembly of the H/ACA snoRNP complex, Naf1 associates with the complex; it disappears during maturation and is replaced by Gar1 to yield the mature H/ACA snoRNP complex. The core domain of Naf1 is homologous to the core domain of Gar1, suggesting that they share a common Cbf5 binding surface.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom