Search results 15601 to 15700 out of 30763 for seed protein

Category restricted to ProteinDomain

Category: ProteinDomain
Type Details Score
Protein Domain
Name: IRAK2, death domain
Type: Domain
Description: This entry represents the death domain found in interleukin-1 receptor-associated kinase-like 2 (IRAK2). IRAK2 is an essential component of several signaling pathways, including the NF-kappaB and IL-1 signaling pathways. It is an inactive kinase that participates in septic shock mediated by TLR4 and TLR9. It plays a redundant role with IRAK1 in early NF-kB and MAPK responses, and remains present at later stages whereas IRAK1 disappears. Interleukin-1 receptor-associated kinases (IRAKs) are essential components of innate immunity and inflammation in mammals and other vertebrates. They are involved in signal transduction pathways involving IL-1 and IL-18 receptors, Toll-like receptors, nuclear factor-kappaB (NF-kB), and mitogen-activated protein kinases (MAPKs). IRAKs contain an N-terminal death domain (DD) and a C-terminal kinase domain. Death domains (DDs) are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.
Protein Domain
Name: Phasin, subfamily 1
Type: Family
Description: Phasins (or granule-associated proteins) are surface proteins found covering polyhydroxyalkanoate (PHA) storage granules in bacteria. Polyhydroxyalkanoates are linear polyesters produced by bacterial fermentation of sugar or lipids for the purpose of storing carbon and energy, and are accumulated as intracellular granules by many bacteria under unfavorable conditions, enhancing their fitness and stress resistance. The layer of phasins stabilises the granules and prevents coalescence of separated granules in the cytoplasm and nonspecific binding of other proteins to the hydrophobic surfaces of the granules. For example, in Ralstonia eutropha (strain ATCC 17699/H16/DSM 428/Stanier 337) (Cupriavidus necator (strain ATCC 17699 / H16 / DSM 428 / Stanier 337)), the major surface protein of polyhydroxybutyrate (PHB) granules is phasin PhaP1(Reu), which occurs along with three homologues (PhaP2, PhaP3, and PhaP4) that have the capacity to bind to PHB granules but are present at minor levels. These four phasins lack a highly conserved domain but share homologous hydrophobic regions. This entry describes a group of phasins associated with polyhydroxyalkanoate (PHA) inclusions, the most common of which consist of polyhydroxybutyrate (PHB). However, the member from Magnetospirillum sp. (strain AMB-1) is called a magnetic particle membrane-specific GTPase.
Protein Domain
Name: Vitamin K epoxide reductase-like VKOR/LOT1
Type: Domain
Description: Proteins containing this domain (also known as VKOR domain) are from bacteria, plants and archaea. They are homologous to mammalian Vitamin K epoxide reductases (VKORC1). In some plant and bacterial homologues, the VKOR domain is fused with domains of the thioredoxin family of oxidoreductases which may function as redox partners in initiating the reduction cascade. Proteins containing this domain include Thiol-disulfide oxidoreductase LTO1 (also known as Vitamin K reductase, AtVKOR) from Arabidopsis and Vitamin K epoxide reductase homologue (VKOR) from Synechococcus sp. In general, they are disulfide bond-forming enzymes which control disulfide bond formation. All homologues of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. In Arabidopsis, LTO1 catalyses disulfide bond formation of chloroplast proteins and is involved in thylakoid redox regulation and photosynthetic electron transport. It is required for the assembly of photosystem II (PSII) through the formation of a disulfide bond in PSBO, a subunit of the PSII oxygen-evolving complex in the thylakoid lumen. Bacterial VKOR homologues catalyse disulphide bridge formation in secreted proteins by cooperating with a periplasmic, Trx-like redox partner.
Protein Domain
Name: PTK6, SH2 domain
Type: Domain
Description: Human protein-tyrosine kinase-6 (PTK6, also known as breast tumor kinase (Brk)) is a member of the non-receptor protein-tyrosine kinase family and is expressed in two-thirds of all breast tumours. PTK6 contains an SH3 domain, an SH2 domain, and catalytic domains. In non-receptor protein-tyrosine kinases, the SH2 domain is typically involved in negative regulation of kinase activity by binding to a phosphorylated tyrosine residue near the C terminus. The C-terminal sequence of PTK6 (PTSpYENPT where pY is phosphotyrosine) is thought to be a self-ligand for the SH2 domain. This entry represents the SH2 domain of PTK6. The structure of this domain resembles other SH2 domains except for a centrally located four-stranded antiparallel β-sheet (strands betaA, betaB, betaC, and betaD). There are also differences in the loop length which might be responsible for PTK6 ligand specificity. There are two possible means of regulation of PTK6: autoinhibitory with the phosphorylation of Tyr playing a role in its negative regulation and autophosphorylation at this site, though it has been shown that PTK6 might phosphorylate signal transduction-associated proteins Sam68 and signal transducing adaptor family member 2 (STAP/BKS) in vivo.
Protein Domain
Name: Carboxypeptidase M, N-terminal domain
Type: Domain
Description: Carboxypeptidase M (CPM; MEROPS identifier M14.006) is an extracellular glycoprotein, bound to cell membranes via a glycosyl-phosphatidylinositol on the C terminus of the protein. It specifically removes C-terminal basic residues (Arg or Lys) from peptides and proteins. The highest levels of CPM have been found in human lung and placenta, but significant amounts are present in kidney, blood vessels, intestine, brain, and peripheral nerves. CPM has also been found in soluble form in various body fluids, including amniotic fluid, seminal plasma and urine. Due to its wide distribution in a variety of tissues, it is believed that it plays an important role in the control of peptide hormones and growth factor activity on the cell surface and in the membrane-localized degradation of extracellular proteins. For example, it hydrolyses the C-terminal arginine of epidermal growth factor (EGF) resulting in des-Arg-EGF which binds to the EGF receptor (EGFR) with an equal or greater affinity than native EGF. CPM is a required processing enzyme that generates specific agonists for the B1 receptor. This entry represents the carboxypeptidase (N-terminal) domain of carboxypeptidase M.
Protein Domain
Name: Torsin
Type: Family
Description: Torsins are membrane-associated ATPases. They belong to the AAA+ (ATPase associated with a variety of cellular activities) superfamily of ATPases, but they lack conserved catalytic residues typically found in related ATPases. Accordingly, Torsins do not display ATPase activity unless they are engaged by their regulatory cofactors lamina-associated polypeptide 1 (LAP1) or luminal domain-like LAP1 (LULL1), which are type II transmembrane proteins located in the nuclear envelope and endoplasmic reticulum (ER). LAP1 and LULL1 integrate into the Torsin ring to produce the biologically active ATPase machine. Torsion dystonia is an autosomal dominant movement disorder characterised by involuntary, repetitive muscle contractions and twisted postures. The most severe early-onset form of dystonia has been linked to mutations in the human DYT1 (TOR1A) gene encoding a protein termed torsinA. While causative genetic alterations have been identified, the function of torsin proteins and the molecular mechanism underlying dystonia remain unknown. It has been suggested that torsins play a role in effectively managing protein folding and that possible breakdown in a neuroprotective mechanism that is, in part, mediated by torsins may be responsible for the neuronal dysfunction associated with dystonia.
Protein Domain
Name: Rho guanine nucleotide exchange factor 3, PH domain
Type: Domain
Description: PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3, which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes. ARHGEF3 (also known as XPLN) is a guanine nucleotide exchange factor (GEF) for RhoA and RhoB GTPases. It contains a tandem Dbl homology and a PH domain. This entry represents the PH domain.
Protein Domain
Name: Olduvai domain
Type: Domain
Description: Proteins of the neuroblastoma breakpoint family (NBPF) contain a highly conserved domain of unknown function, which is known as NBPF, Olduvai or DUF1220. The NBPF/DUF1220 domain is present in multiple copies in NBPF proteins and once, with lower homology, in mammalian myomegalin, a protein localised in the Golgi/centrosomal area which functions as an anchor to localise components of the cyclic adenosine monophosphate-dependent pathway to this region. The implications of the resemblance of NBPF proteins to myomegalin remain obscure. NBPF domains are typically built of two exons. The number of NBPF repeat copies is highly expanded in humans, reduced in African great apes, further reduced in orangutan and Old World monkeys, single-copy in nonprimate mammals, and absent in nonmammalian species. The NBPF domain that is found as a single copy in nonprimate mammals is the likely ancestral domain. Studies suggest an association between NBPF/DUF1220 copy number and brain size, and more specifically neocortex volume. An association has been established between DUF1220 subtype CON1 copy number and autism severity, and between subtype CON2 copy number and cognitive function.
Protein Domain
Name: Folliculin, DENN domain
Type: Domain
Description: Folliculin (FLCN) is a tumor suppressor that enables nutrient-dependent activation of the mechanistic target of rapamycin complex 1 (mTORC1) protein kinase via its guanosine triphosphatase (GTPase) Activating Protein (GAP) activity. It belongs to the DENN module family of proteins and contains a divergent DENN module comprising an N-terminal longin domain (also known as upstream DENN domain, u-DENN), followed by a DENN domain. It forms a complex with its partners, FNIP1 or FNIP2 (Folliculin interacting protein 1 or 2), which directly contacts the Rag GTPases RagC/D to stimulate GTP hydrolysis and thus promote the conversion to the GDP-bound state. FLCN-FNIP2 adopts an extended conformation with two pairs of heterodimerized domains. They contain longin domains that heterodimerize and contact both nucleotide binding domains of the Rag heterodimer, and C-terminal DENN domains which interact at the distal end of the structure. This is the DENN domain found at the C terminus of folliculin. This domain shares structural similarity with the DENN domain of DENN1B (a Rab GEF). It mediates contact with the longin domain in the heterodimers to ensure a strong intersubunit interaction.
Protein Domain
Name: Synaptojanin-1, RNA recognition motif
Type: Domain
Description: This entry represents the RNA recognition motif (RRM) of synaptojanin-1. Synaptojanin-1 was originally identified as one of the major Grb2-binding proteins that may participate in synaptic vesicle endocytosis. It also acts as an Src homology 3 (SH3) domain-binding brain-specific inositol 5-phosphatase, with a putative role in clathrin-mediated endocytosis. Synaptojanin-1 contains an N-terminal domain homologous to the cytoplasmic portion of the yeast protein Sac1p, a central inositol 5-phosphatase domain followed by a putative RNA recognition motif (RRM), and a C-terminal proline-rich region mediating the binding of synaptojanin-1 to various SH3 domain-containing proteins including amphiphysin, SH3p4, SH3p8, SH3p13, and Grb2. Synaptojanin-1 has two tissue-specific alternative splicing isoforms, synaptojanin-145 expressed in brain and synaptojanin-170 expressed in peripheral tissues. Synaptojanin-145 is very abundant in nerve terminals and may play an essential role in the clathrin-mediated endocytosis of synaptic vesicles. In contrast to synaptojanin-145, synaptojanin-170 contains three unique asparagine-proline-phenylalanine (NPF) motifs in the C-terminal region, and may function as a potential binding partner for Eps15, a clathrin coat-associated protein acting as a major substrate for the tyrosine kinase activity of the epidermal growth factor receptor.
Protein Domain
Name: SinR repressor/SinI anti-repressor, dimerisation domain
Type: Domain
Description: The SinR repressor is part of a group of Sin (sporulation inhibition) proteins in Bacillus subtilis that regulate the commitment to sporulation in response to extreme adversity. SinR is a tetrameric repressor protein that binds to the promoters of genes essential for entry into sporulation and prevents their transcription. This repression is overcome through the activity of SinI, which disrupts the SinR tetramer through the formation of a SinI-SinR heterodimer, thereby allowing sporulation to proceed. The SinR structure consists of two domains: a dimerisation domain stabilised by a hydrophobic core, and a DNA-binding domain that is identical to domains of the bacteriophage 434 CI and Cro proteins that regulate prophage induction. The dimerisation domain is a four-helical bundle formed from two helices from the C-terminal residues of SinR and two helices from the central residues of SinI. These regions in SinR and SinI are similar in both structure and sequence. The interaction of SinR monomers to form tetramers is weaker than between SinR and SinI, since SinI can effectively disrupt SinR tetramers. This entry represents the dimerisation domain in both SinI and SinR proteins.
Protein Domain
Name: Globin-sensor domain
Type: Domain
Description: This domain is found at the N terminus of globin-containing proteins mainly from bacteria, archaea and fungi. It is found in protoglobin, a single-domain globin of as yet unknown biological function, which has specific loops and an amino-terminal extension which leads to the burying of the heme within the matrix of the protein. Protoglobin-specific apolar tunnels allow the access of O2, CO and NO to the heme distal site. This domain can also recognise cyanide (Matilla et al., FEMS Microbiology Reviews, fuab043, 45, 2021, 1. https://doi.org/10.1093/femsre/fuab043). Proteins containing this domain include protoglobins from the strictly anaerobic methanogen Methanosarcina acetivorans and from the obligate aerobic hyperthermophile Aeropyrum pernix. This domain is also found in the N-terminal regions of HemAT from the archaeon Halobacterium salinarum (HemAT-Hs) and from the Gram-positive bacterium Bacillus subtilis (HemAT-Bs). It contains a myoglobin-like motif, displays characteristic heme-protein absorption spectra, and binds oxygen reversibly. This domain is present in Diguanylate cyclase DosC (also known as YddV) from E. coli, where it is coupled with a C-terminal diguanylate cyclase (DGC/GGDEF) domain, which likely functions as a c-di-GMP cyclase in the synthesis of the second messenger cyclic-di-GMP (c-di-GMP).
Protein Domain
Name: Ribosomal RNA small subunit methyltransferase E, methyltransferase domain
Type: Domain
Description: Methyltransferases (Mtases) are responsible for the transfer of methyl groups between two molecules. The transfer of the methyl group from the ubiquitous S-adenosyl-L-methionine (AdoMet) to either nitrogen, oxygen or carbon atoms is frequently employed in diverse organisms. The reaction is catalysed by Mtases and modifies DNA, RNA, proteins or small molecules, such as catechol, for regulatory purposes. Proteins in this entry belong to the RsmE family of Mtases; this is supported by crystal structural studies, which show a close structural homology to other known methyltransferases. This group of proteins includes Ribosomal RNA small subunit methyltransferase E (RsmE) from Escherichia coli, which specifically methylates the uridine in position 1498 of 16S rRNA in the fully assembled 30S ribosomal subunit. This enzyme has two distinct but structurally related domains: the N-terminal PUA domain and the conserved MTase domain at the C-terminal end. This protein adopts a dimeric configuration that is functionally critical for substrate binding and catalysis. This entry represents the C-terminal methyltransferase domain (MTase domain) found in RsmE, which shows a deep trefoil knot. This domain is responsible for binding of one AdoMet molecule and for the catalytic process.
Protein Domain
Name: Neutrophil cytosol factor 1, PX domain
Type: Domain
Description: The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions such as cell signaling, vesicular trafficking, protein sorting, and lipid modification, among others. p47phox is a cytosolic subunit of the phagocytic NADPH oxidase complex (also called Nox2 or gp91phox), which plays a key role in the ability of phagocytes to defend against bacterial infections. NADPH oxidase catalyzes the transfer of electrons from NADPH to oxygen during phagocytosis, forming superoxide and reactive oxygen species. p47phox is required for activation of NADPH oxidase and plays a role in translocation. Neutrophil cytosol factor 1 (also known as p47phox) contains an N-terminal PX domain, two Src Homology 3 (SH3) domains, and a C-terminal domain that contains PxxP motifs for binding SH3 domains. The PX domain of p47phox is unique in that it contains two distinct basic pockets on the membrane-binding surface: one preferentially binds phosphatidylinositol-3,4-bisphosphate [PI(3,4)P2] and is analogous to the PI3P-binding pocket of p40phox, while the other binds anionic phospholipids such as phosphatidic acid or phosphatidylserine. Simultaneous binding in the two pockets results in increased membrane affinity. The PX domain of p47phox is also involved in protein-protein interaction.
Protein Domain
Name: Macrophage colony-stimulating factor 1 receptor
Type: Family
Description: This entry represents macrophage colony-stimulating factor 1 receptor (CSF1R), also known as macrophage colony-stimulating factor receptor (M-CSFR) or CD115 (Cluster of Differentiation 115). CSF1R is a tyrosine-protein kinase that acts as a cell-surface receptor for CSF1 and IL34 and plays an essential role in the regulation of survival, proliferation and differentiation of hematopoietic precursor cells, especially mononuclear phagocytes, such as macrophages and monocytes. CSF-1 and CSF-1R play an important role in the development of the mammary gland and may be involved in the process of mammary gland carcinogenesis. Mutations in the CSF1R gene cause hereditary diffuse leukoencephalopathy with spheroids, which is an autosomal-dominant central nervous system white-matter disease with variable clinical presentations, including personality and behavioral changes, dementia, depression, parkinsonism, seizures and other phenotypes. This entry also includes the tyrosine-protein kinase transforming protein v-fms from Feline sarcoma virus. CSF1R (the c-fms product) is its cellular counterpart. It differs in some amino acid substitutions and the replacement of 50 C-terminal amino acids by 11 unrelated residues in v-fms. v-fms can bind CSF-1 and has constitutive tyrosine-specific protein kinase activity, providing growth stimulatory signals in the absence of ligand.
Protein Domain
Name: Colipase, conserved site
Type: Conserved_site
Description: Colipase is a small protein cofactor needed by pancreatic lipase for efficient dietary lipid hydrolysis. It also binds to the bile-salt covered triacylglycerol interface, thus allowing the enzyme to anchor itself to the water-lipid interface. Efficient absorption of dietary fats is dependent on the action of pancreatic triglyceride lipase. Colipase binds to the C-terminal, non-catalytic domain of lipase, thereby stabilising an active conformation and considerably increasing the overall hydrophobic binding site. Structural studies of the complex and of colipase alone have revealed the functionality of its architecture. Colipase is a small protein with five conserved disulphide bonds. Structural analogies have been recognised between a developmental protein (Dickkopf), the pancreatic lipase C-terminal domain, the N-terminal domains of lipoxygenases and the C-terminal domain of alpha-toxin. These non-catalytic domains in the latter enzymes are important for interaction with the membrane. It has not been established if these domains are also involved in eventual protein cofactor binding, as is the case for pancreatic lipase. This entry represents a conserved site within colipase, covering two of the cysteines involved in disulphide bond formation, as well as three tyrosine residues which seem to be involved in the interfacial binding.
Protein Domain
Name: N-acetylgalactosaminyltransferase
Type: Domain
Description: This entry represents a domain found in N-acetylgalactosaminyltransferases (also known as pp-GalNAc-T). They initiate the formation of mucin-type, O-linked glycans by catalysing the transfer of alpha-N-acetylgalactosamine (GalNAc) from UDP-GalNAc to hydroxyl groups of Ser or Thr residues of core proteins to form the Tn antigen (GalNAc-a-1-O-Ser/Thr). These enzymes are type II membrane proteins with a GT-A type catalytic domain and a lectin domain located on the lumen side of the Golgi apparatus. In humans, there are 15 isozymes of pp-GalNAc-Ts, representing the largest of all glycosyltransferase families. Each isozyme has unique but partially redundant substrate specificity for glycosylation sites on acceptor proteins. Interestingly, some members, such as human GALNTL5, lack the C-terminal ricin B-type lectin domain, which contributes to the glycopeptide specificity. No glycosyltransferase activity has been detected for human GALNTL5 in an in vitro assay. This entry also includes Putative inactive polypeptide N-acetylgalactosaminyltransferase 11/12 from Drosophila. Although these are strongly related to polypeptide N-acetylgalactosaminyltransferase proteins, they lack the conserved His at position 211 which is part of the Asp-X-His motif that binds the cofactor Mn2+, suggesting that they may have lost their activity.
Protein Domain
Name: RIMS-binding protein, second SH3 domain
Type: Domain
Description: RIMs binding proteins (RBPs, RIM-BPs) associate with calcium channels present in photoreceptors, neurons, and hair cells; they interact simultaneously with specific calcium channel subunits and the active zone proteins RIM1 and RIM2. RIMs are part of the matrix at the presynaptic active zone and are associated with synaptic vesicles through their interaction with the small GTPase Rab3. RIM-BPs play a role in regulating synaptic transmission by serving as adaptors and linking calcium channels with the synaptic vesicle release machinery. RIM-BPs contain three SH3 domains and two to three fibronectin III repeats. Invertebrates contain one, while vertebrates contain at least two RIM-BPs, RIM-BP1 and RIM-BP2. RIM-BP1 is also called peripheral-type benzodiazepine receptor associated protein 1 (PRAX-1). Mammals contain a third protein, RIM-BP3. RIM-BP1 and RIM-BP2 are predominantly expressed in the brain where they display overlapping but distinct expression patterns, while RIM-BP3 is almost exclusively expressed in the testis and is essential in spermiogenesis. The SH3 domains of RIM-BPs bind to the PxxP motifs of RIM1, RIM2, and L-type (alpha1D) and N-type (alpha1B) calcium channel subunits. This entry represents the second SH3 domain of RBPs.
Protein Domain
Name: RIMS-binding protein, third SH3 domain
Type: Domain
Description: RIMs binding proteins (RBPs, RIM-BPs) associate with calcium channels present in photoreceptors, neurons, and hair cells; they interact simultaneously with specific calcium channel subunits and the active zone proteins RIM1 and RIM2. RIMs are part of the matrix at the presynaptic active zone and are associated with synaptic vesicles through their interaction with the small GTPase Rab3. RIM-BPs play a role in regulating synaptic transmission by serving as adaptors and linking calcium channels with the synaptic vesicle release machinery. RIM-BPs contain three SH3 domains and two to three fibronectin III repeats. Invertebrates contain one, while vertebrates contain at least two RIM-BPs, RIM-BP1 and RIM-BP2. RIM-BP1 is also called peripheral-type benzodiazepine receptor associated protein 1 (PRAX-1). Mammals contain a third protein, RIM-BP3. RIM-BP1 and RIM-BP2 are predominantly expressed in the brain where they display overlapping but distinct expression patterns, while RIM-BP3 is almost exclusively expressed in the testis and is essential in spermiogenesis. The SH3 domains of RIM-BPs bind to the PxxP motifs of RIM1, RIM2, and L-type (alpha1D) and N-type (alpha1B) calcium channel subunits. This entry represents the third SH3 domain of RBPs.
Protein Domain
Name: Autotransporter-associated beta strand repeat
Type: Repeat
Description: This Autotransporter-associated beta strand repeat model represents a core 32-residue region of a class of bacterial protein repeat found in one to 30 copies per protein. Most proteins with a copy of this repeat have domains associated with membrane autotransporters. The repeats occur with a periodicity of 60 to 100 residues. A pattern of sequence conservation is that every second residue is well-conserved across most of the domain. These repeats are likely to have a β-helical structure. This repeat plays a role in the efficient transport of autotransporter virulence factors to the bacterial surface during growth and infection. The repeat is always associated with the passenger domain of the autotransporter. For these reasons it has been named the Passenger-associated Transport Repeat (PATR). The mechanism by which the PATR motif promotes transport is uncertain, but it is likely that the conserved glycines (see HMM Logo) are required for flexibility of folding and that this folding drives secretion. Autotransporters that contain PATR(s) associate with distinct virulence traits such as subtilisin (S8) type protease domains and polymorphic outer-membrane protein repeats, whilst SPATE (S6) type protease and lipase-like autotransporters do not tend to contain PATR motifs.
Protein Domain
Name: Transcription factor AP-2, C-terminal
Type: Domain
Description: Activator protein-2 (AP-2) transcription factors constitute a family of closely related and evolutionarily conserved proteins that bind to the DNA consensus sequence 5'-GCCNNNGGC-3' and stimulate target gene transcription. Five different isoforms of AP-2 have been identified in mammals, termed AP-2 alpha, beta, gamma, delta and epsilon. Each family member shares a common structure, possessing a proline/glutamine-rich domain in the N-terminal region, which is responsible for transcriptional activation, and a helix-span-helix domain in the C-terminal region, which mediates dimerisation and site-specific DNA binding. The AP-2 family have been shown to be critical regulators of gene expression during embryogenesis. They regulate the development of facial prominence and limb buds, and are essential for cranial closure and development of the lens; they have also been implicated in tumorigenesis. AP-2 protein expression levels have been found to affect cell transformation, tumour growth and metastasis, and may predict survival in some types of cancer. Mutations in human AP-2 have been linked with branchio-oculo-facial syndrome and Char Syndrome, congenital birth defects characterised by craniofacial deformities and patent ductus arteriosus, respectively. This entry represents the C-terminal region of these proteins, including the helix-span-helix domain.
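As a rough illustration of the consensus sequence quoted in this entry, the sketch below scans a DNA string for 5'-GCCNNNGGC-3' with a regular expression. The function name `find_ap2_sites` and the example sequence are hypothetical and not part of the database; an exact consensus match is only a first approximation of real binding-site prediction.

```python
import re

# Consensus from the entry text: GCC, any three bases, then GGC.
# The zero-width lookahead allows overlapping sites to be reported.
AP2_CONSENSUS = re.compile(r"(?=(GCC[ACGT]{3}GGC))")

def find_ap2_sites(dna):
    """Return 0-based start positions of GCCNNNGGC matches (hypothetical helper)."""
    return [m.start() for m in AP2_CONSENSUS.finditer(dna.upper())]

# Made-up example sequence, for illustration only.
print(find_ap2_sites("ttGCCAAAGGCtt"))  # [2]
```

Because the reverse complement of GCCNNNGGC is again GCCNNNGGC, scanning one strand with this pattern also covers sites on the opposite strand.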
Protein Domain
Name: WRKY domain
Type: Domain
Description: The WRKY domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. The WRKY domain is found in one or two copies in a superfamily of plant transcription factors involved in the regulation of various physiological programs that are unique to plants, including pathogen defence, senescence, trichome development and the biosynthesis of secondary metabolites. The WRKY domain binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core of the W box is essential for function and WRKY binding. Some proteins known to contain a WRKY domain include Arabidopsis thaliana ZAP1 (Zinc-dependent Activator Protein-1) and AtWRKY44/TTG2, a protein involved in trichome development and anthocyanin pigmentation; and wild oat ABF1-2, two proteins involved in the gibberellic acid-induced expression of the alpha-Amy2 gene. Structural studies indicate that this domain is a four-stranded β-sheet with a zinc binding pocket, forming a novel zinc and DNA binding structure. The WRKYGQK residues correspond to the most N-terminal β-strand, which enables extensive hydrophobic interactions, contributing to the structural stability of the β-sheet.
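The W box motif described in this entry can be written as a regular expression. The sketch below is a minimal example, assuming that the parenthesised T's in (T)(T)TGAC(C/T) are optional; the helper name `find_w_boxes` and the test sequence are hypothetical, and only the strand supplied is scanned.

```python
import re

# W box from the entry text: (T)(T)TGAC(C/T), read here as up to two
# optional leading T's, the invariant TGAC core, then C or T.
W_BOX = re.compile(r"T{0,2}TGAC[CT]")

def find_w_boxes(dna):
    """Return (start, matched_text) pairs found on the given strand only."""
    return [(m.start(), m.group()) for m in W_BOX.finditer(dna.upper())]

# Made-up example sequence, for illustration only.
print(find_w_boxes("AAATTTGACCGGGTGACTAA"))
# [(3, 'TTTGACC'), (13, 'TGACT')]
```

To cover both strands, the same function could be run on the reverse complement of the sequence.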
Protein Domain
Name: 3'-5' exonuclease domain
Type: Domain
Description: This entry represents the domain that is responsible for the 3'-5' exonuclease proofreading activity of Escherichia coli DNA polymerase I (polI) and other enzymes which catalyse the hydrolysis of unpaired or mismatched nucleotides. This domain consists of the amino-terminal half of the Klenow fragment in E. coli polI and is also found in the Bifunctional 3'-5' exonuclease/ATP-dependent helicase WRN (also known as Werner syndrome helicase), focus forming activity 1 protein (FFA-1) and ribonuclease D (RNase D). Werner syndrome is a human genetic disorder causing premature ageing; the WRN protein has helicase activity in the 3'-5' direction. The FFA-1 protein is required for formation of replication foci and also has helicase activity; it is a homologue of the WRN protein. RNase D is a 3'-5' exonuclease involved in tRNA processing. Also found in this family is the autoantigen PM/Scl, thought to be involved in polymyositis-scleroderma overlap syndrome. This domain is also found in some DNA polymerases from phages, including the DNA polymerase from Escherichia phage T5, which has exonucleolytic activity, and the DNA polymerase DpoZ from Acinetobacter phage SH-Ab 15497, which preferentially incorporates the non-canonical base aminoadenine/dZTP instead of adenine into the synthesized DNA.
Protein Domain
Name: LsmAD domain
Type: Domain
Description: This domain can be found in eukaryotic ataxin-2. Ataxin-2 is predicted to consist of mostly non-globular domains. This domain has been shown to interact with RNA helicase DDX6. Ataxin-2 has many functions, such as endocytic receptor cycling, translational regulation, embryonic development, energy metabolism and weight regulation. Mutations of the Ataxin-2 gene cause spinocerebellar ataxia 2 (SCA2), a neurodegenerative disorder leading to predominant loss of Purkinje cells in the cerebellum and impairment of motor coordination. In SCA2, expansion of a CAG repeat in exon 1 of the Ataxin-2 (ATXN2) gene causes expansion of a polyQ domain in the ATXN2 protein. ATXN2 has been shown to interact with many proteins. It interacts with multiple RNA-binding proteins (RBPs), staufen, IP3R, RGS8 mRNA, endophilins and CIN85. Proteins containing this domain also include Pbp1 from budding yeasts. Pbp1 interacts with Pab1 to regulate mRNA polyadenylation. It promotes mating-type switching in mother cells by positively regulating HO mRNA translation and forms a condensate in response to respiratory status to regulate TORC1 signaling. It is also involved in P-body-dependent granule assembly.
Protein Domain
Name: Cypovirus polyhedrin
Type: Family
Description: This family is found in polyhedrin proteins of Cypoviruses. These viruses possess a single capsid layer with turrets and are commonly embedded in crystalline occlusion bodies called polyhedra, which are formed in the cell cytoplasm and mainly composed of a single virus-encoded protein, polyhedrin. Cypoviruses have been classified into 21 distinct types. Within each type the amino acid sequences of polyhedrins are highly conserved, whilst between types there is little conservation. Structural analysis and comparison of the different polyhedrins reveals five variable regions: the N-terminal loop, connections between secondary structures (H2 and H3, beta-E and beta-F, beta-F and beta-G, beta-G and beta-H), and the C-terminal loop, which are designated V1-V5 respectively. V2 forms a 'cap' at one end of the protein and is subdivided across two sections of the polypeptide, V2n and V2c. Differences in these regions give each polyhedrin its characteristic appearance. The base domain (residues 74-110) is a region that is required neither for proper folding of the protein nor for crystal assembly, but fine-tunes the crystal, 'locking-down' the structure, often in conjunction with NTPs. This region is also implicated in virion recognition and packaging.
Protein Domain
Name: Prickle/Espinas/Testin
Type: Family
Description: This entry represents a family of proteins with a LIM-type zinc finger that are found in animals and some fungal species, including Protein prickle (Pk) and Protein espinas (Esn) from Drosophila melanogaster and human Testin (Tes). Pk has been implicated in regulation of cell movement in the planar cell polarity (PCP) pathway, which requires the conserved Frizzled/Dishevelled (Dsh). Prickle interacts with Dishevelled, thereby modulating the activity of Frizzled/Dishevelled and the PCP signalling. Two forms of prickle have been identified, namely prickle 1 and prickle 2. These are differentially expressed; prickle 1 is found in fetal heart and haematological malignancies, while prickle 2 is expressed in fetal brain, adult cartilage, pancreatic islet, and some types of tumorous cells. Esn binds to the seven-pass transmembrane cadherin Flamingo. The interplay elicits repulsion between dendritic branches of sensory neurons. Tes is a cytoskeleton-associated focal adhesion protein that localizes along actin stress fibres, at cell-cell contact areas, and at focal adhesion plaques. It plays a role in the regulation of cell proliferation; it may act as a tumour suppressor.
Protein Domain
Name: Spartin-like
Type: Family
Description: This entry includes senescence/dehydration-associated proteins from plants and Spartin from animals. This group of proteins shares the C-terminal AAA ATPase domain. Besides its C-terminal AAA ATPase domain, Spartin contains an N-terminal MIT (contained within microtubule-interacting and transport molecules) domain and a -P-P-x-Y- motif. Spastin monomers assemble into hexameric, ring-shaped ATPases that sever microtubules along their lengths. In flies, Spartin has been shown to regulate both synaptic development and neuronal survival by controlling microtubule stability via the BMP-dFMRP-Futsch pathway. In humans, Spartin (also known as SPG20) has been linked to Troyer syndrome, characterised by spastic dysarthria, cognitive impairment, short stature, and distal muscle wasting in addition to lower extremity spastic weakness. SPG20 has been shown to be recruited to the midbody and to participate in cytokinesis. In plants, this group of proteins has been linked to senescence and dehydration. Unlike their animal homologues, they do not have the MIT domain. In Hemerocallis, petals have a genetically based program that leads to senescence and cell death approximately 24 hours after the flower opens, and it is believed that senescence proteins produced around that time have a role in this program.
Protein Domain
Name: C2 domain superfamily
Type: Homologous_superfamily
Description: The C2 domain is a Ca2+-dependent membrane-targeting module found in many cellular proteins involved in signal transduction or membrane trafficking. C2 domains are unique among membrane targeting domains in that they show a wide range of lipid selectivity for the major components of cell membranes, including phosphatidylserine and phosphatidylcholine. This C2 domain is about 116 amino-acid residues and is located between the two copies of the C1 domain in Protein Kinase C and the protein kinase catalytic domain. Regions with significant homology to the C2 domain have been found in many proteins. The C2 domain is thought to be involved in calcium-dependent phospholipid binding and in membrane targeting processes such as subcellular localisation. The 3D structure of the C2 domain of synaptotagmin has been reported; the domain forms an eight-stranded β-sandwich constructed around a conserved four-stranded motif, designated a C2 key. Calcium binds in a cup-shaped depression formed by the N- and C-terminal loops of the C2-key motif. Structural analyses of several C2 domains have shown them to consist of similar ternary structures in which three Ca2+-binding loops are located at the end of an eight-stranded antiparallel β-sandwich.
Protein Domain
Name: p53-like transcription factor, DNA-binding domain superfamily
Type: Homologous_superfamily
Description: This domain superfamily is found in a number of transcription factors, including p53, NFATC, TonEBP, STAT-1, and NFkappaB, where it is responsible for DNA-binding. These transcription factors play diverse roles in the regulation of cellular functions: the p53 tumour suppressor upregulates the expression of genes involved in cell cycle arrest and apoptosis; NFATC regulates the production of effector proteins involved in coordinating the immune response; TonEBP regulates gene expression induced by osmotic stress and helps regulate intracellular volume during cell growth; STAT-1 plays an important role in B lymphocyte growth and function; and NFkappaB is involved in the inflammatory response. The DNA-binding domain acts to clamp, or in the case of TonEBP, encircle the DNA target in order to stabilise the protein-DNA complex. Protein interactions may also serve to stabilise the protein-DNA complex; for example, in the STAT-1 dimer the SH2 (Src homology 2) domain in each monomer is coupled to the DNA-binding domain to increase stability. The DNA-binding domain consists of a β-sandwich formed of 9 strands in 2 sheets with a Greek-key topology. This structure is found in many transcription factors, often within the DNA-binding domain.
Protein Domain
Name: ArsR-type transcription regulator, HTH motif
Type: Conserved_site
Description: Bacterial transcription regulatory proteins that bind DNA via a helix-turn-helix (HTH) motif can be grouped into families on the basis of sequence similarities. One such group, termed arsR, includes several proteins that appear to dissociate from DNA in the presence of metal ions: arsR, which functions as a transcriptional repressor of an arsenic resistance operon; smtB from Synechococcus sp. (strain PCC 7942), which acts as a transcriptional repressor of the smtA gene that codes for a metallothionein; and cadC, a transcription regulator of the cadmium resistance (cad) operon, which encodes a Cd/Pb-specific efflux ATPase. The HTH motif is thought to be located in the central part of these proteins. The motif is characterised by a number of well-conserved residues: at its N-terminal extremity is a cysteine residue; a second Cys is found in arsR and cadC, but not in smtA; and at the C terminus lie one or two histidines. These residues may be involved in metal-binding (Zn in smtB; metal-oxyanions such as arsenite, antimonite and arsenate for arsR; and cadmium for cadC). It is believed that binding of a metal ion could induce a conformational change that would prevent the protein from binding DNA.
Protein Domain
Name: Hemopexin, conserved site
Type: Conserved_site
Description: Hemopexin is a serum glycoprotein that binds haem and transports it to the liver for breakdown and iron recovery, after which the free hemopexin returns to the circulation. Hemopexin prevents haem-mediated oxidative stress. Structurally hemopexin consists of two similar halves of approximately two hundred amino acid residues connected by a histidine-rich hinge region. Each half is itself formed by the repetition of a basic unit of some 35 to 45 residues. Hemopexin-like domains have been found in two other types of proteins: vitronectin, a cell adhesion and spreading factor found in plasma and tissues, and matrixins MMP-1, MMP-2, MMP-3, MMP-9, MMP-10, MMP-11, MMP-12, MMP-14, MMP-15 and MMP-16, members of the matrix metalloproteinase family that cleave extracellular matrix constituents. These zinc endopeptidases, which belong to MEROPS peptidase subfamily M10A, have a single hemopexin-like domain in their C-terminal section. It is suggested that the hemopexin domain facilitates binding to a variety of molecules and proteins; for example, the HX repeats of some matrixins bind tissue inhibitor of metallopeptidases (TIMPs). This entry represents a conserved sequence region located at the beginning of the second repeat in the hemopexin domain.
Protein Domain
Name: NusG-like
Type: Family
Description: Bacterial transcription antitermination protein NusG is a component of the transcription complex and interacts with the termination factor Rho and RNA polymerase. NusG is a bacterial transcriptional elongation factor involved in transcription termination and antitermination. RfaH is a transcription antitermination protein that enhances transcription elongation of distal genes in a specialized subset of operons that encode extracytoplasmic components. It is most closely related to the transcriptional termination/antitermination protein NusG and contains the KOW motif. This protein appears to be limited to the proteobacteria. RfaH is recruited into a multi-component RNA polymerase complex by the ops element, which is a short conserved DNA sequence located downstream of the main promoter of these operons. Once bound, RfaH suppresses pausing and inhibits Rho-dependent and intrinsic termination at a subset of sites. Termination signals are bypassed, which allows complete synthesis of long RNA chains. RfaH enhances expression of several operons involved in synthesis of lipopolysaccharides, exopolysaccharides, hemolysin, and sex factor. It also negatively controls expression and surface presentation of AG43 and possibly another AG43-independent factor that mediates cell-cell interactions and biofilm formation. This entry includes NusG and its paralogue RfaH.
Protein Domain
Name: Four-carbon acid sugar kinase, nucleotide binding domain
Type: Domain
Description: This is the C-terminal domain found in proteins in a range of Proteobacteria as well as the Gram-positive Oceanobacillus iheyensis. Structural analysis of the whole protein indicates the N- and C-termini act together to produce a surface into which a threonate-ADP complex is bound, demonstrating that a sugar binding site is on the N-terminal domain, and a nucleotide binding site is in the C-terminal domain. There is a critical motif, DDXTG, at approximately residues 22-25. Proteins containing this domain have been predicted as kinases. Some members are associated with PdxA2 by physical clustering and gene fusion with PdxA2. Some members that are fused with PdxA2 have been shown to be involved in L-4-hydroxythreonine (4HT) phosphorylation, part of the alternative pathway to make PLP (pyridoxal 5'-phosphate) out of a toxic metabolite, 4HT. However, 4HT phosphorylation might not be the main function of this group of proteins. Moreover, some members that are not associated with pdxA2, and even one that is associated with pdxA2, have lost 4HT kinase activity. Functional analysis demonstrates that family members include D-Threonate kinases (DtnK), D-Erythronate kinases (DenK) and 3-Oxo-tetronate kinases (OtnK).
Protein Domain
Name: Amelogenin
Type: Family
Description: Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth. They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain β-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive β-turn segment and a "β-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The β-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.
Protein Domain
Name: Axin-like
Type: Family
Description: This entry includes Axin-1 and Axin-2 from vertebrates and related proteins from invertebrates. Axin is a central component of the canonical Wnt signaling pathway that interacts with the adenomatous polyposis coli protein APC and the kinase GSK3beta to downregulate the effector beta-catenin. Axin-1 (axis inhibition protein 1) is a scaffold protein that is involved in many signalling pathways, including the Wnt, transforming growth factor-beta and MAP kinase pathways, as well as p53 activation cascades. It controls many biological processes ranging from sugar intake, cell proliferation, and organ development to cell death. Mutations in the Axin-1 gene cause hepatocellular carcinoma (HCC), a primary malignant neoplasm of epithelial liver cells, and caudal duplication anomaly (CADUA), a condition characterised by the occurrence of duplications of different organs in the caudal region. Like its mammalian counterpart, the Drosophila Axin homologue, Daxin, acts as a negative regulator of wg/Wnt signaling. In C. elegans, AXL-1 functions redundantly with its ortholog, PRY-1, in negatively regulating BAR-1/beta-catenin signaling in the developing vulva and the Q neuroblast lineage. AXL-1 also functions independently of PRY-1 in negatively regulating canonical Wnt signaling during excretory cell development.
Protein Domain
Name: C-20 methyltransferase CrtF-related
Type: Family
Description: Members of this protein family include bacteriochlorophyllide d C-20 methyltransferase (also known as S-adenosylmethionine-dependent C-20 methyltransferase or BchU), part of the pathway of bacteriochlorophyll c production in photosynthetic green sulphur bacteria. The position modified by this enzyme represents the difference between bacteriochlorophylls c and d; strains lacking this protein can only produce bacteriochlorophyll d. Methyltransferases (EC 2.1.1.-) constitute an important class of enzymes present in every life form. They transfer a methyl group, most frequently from S-adenosyl L-methionine (SAM or AdoMet), to a nucleophilic acceptor such as oxygen, leading to S-adenosyl-L-homocysteine (AdoHcy) and a methylated molecule. All these enzymes have in common a conserved region of about 130 amino acid residues that allows them to bind SAM. The substrates that are methylated by these enzymes cover virtually every kind of biomolecule, ranging from small molecules to lipids, proteins and nucleic acids. Methyltransferases are therefore involved in many essential cellular processes including biosynthesis, signal transduction, protein repair, chromatin regulation and gene silencing. More than 230 families of methyltransferases have been described so far, of which more than 220 use SAM as the methyl donor.
Protein Domain
Name: Flagellin, C-terminal domain, subdomain 2
Type: Homologous_superfamily
Description: Bacterial flagella are responsible for motility and chemotaxis. They comprise a basal body, a hook and a filament, the latter accounting for 98% of the mass. Flagellin is the subunit protein that polymerises to form the flagella, the subunits being transported through the centre of the filament to the tip, where they then polymerise. Both the N- and C-termini of the subunit protein, which are α-helical in structure, are required to mediate polymerisation. Although no export or assembly consensus sequences have been identified, Ala, Val, Leu, Ile, Gly, Ser, Thr, Asn, Gln and Asp tend to make up around 90% of the sequence, Cys and Trp being absent. Flagellin plays a role in the activation of innate and adaptive immunity. It is a specific ligand for Toll-like receptor 5 (TLR5) in the host, which has led to great interest in using it as an adjuvant for vaccines. The protein is also recognised by the intracellular NAIP5/NLRC4 inflammasome receptor. This superfamily represents the subdomain 2 found at the C terminus of flagellin proteins.
Protein Domain
Name: Cell-cell fusogen EFF/AFF
Type: Family
Description: Cell fusion is fundamental for reproduction and organ formation. Fusion between most Caenorhabditis elegans epithelial cells is mediated by the EFF1 fusogen. AFF was first identified in EFF1 mutants. Cell fusion in all epidermal and vulval epithelia was blocked in EFF1 mutants. However, fusion between the anchor cell and the utse syncytium that establishes a continuous uterine-vulval tube proceeded normally. AFF1 was established as necessary for this and for the fusion of heterologous cells in C. elegans. The transmembrane forms of FF proteins, like most viral fusogens, possess an N-terminal signal sequence followed by a long extracellular portion, a predicted transmembrane domain, and a short intracellular tail. A striking conservation in the position and number of all 16 cysteines in the extracellular portion of FF proteins from different nematode species suggests that these proteins are folded in a similar 3D structure that is essential for their fusogenic activity. C. elegans AFF1 and EFF1 proteins are essential for developmental cell-to-cell fusion and can merge insect cells. Thus FFs comprise an ancient family of cellular fusogens that can promote fusion when expressed on a viral particle.
Protein Domain
Name: HSPA4, nucleotide-binding domain
Type: Domain
Description: Human HSPA4 (also known as 70kDa heat shock protein 4, APG-2, HS24/P52, hsp70 RY, and HSPH2) responds to acidic pH stress, is involved in the radioadaptive response, is required for normal spermatogenesis and is overexpressed in hepatocellular carcinoma. It participates in a pathway along with NBS1 (Nijmegen breakage syndrome 1, also known as p85 or nibrin), heat shock transcription factor 4b (HSF4b), and HSPA14 (belonging to a different HSP70 subfamily) that induces tumor migration, invasion, and transformation. HSPA4 expression in sperm was increased in men with oligozoospermia, especially in those with varicocele. HSPA4 belongs to the 105/110kDa heat shock protein (HSP105/110) subfamily of the HSP70-like family. HSP105/110s are believed to function generally as co-chaperones of HSP70 chaperones, acting as nucleotide exchange factors (NEFs) to remove ADP from their HSP70 chaperone partners during the ATP hydrolysis cycle. HSP70 chaperones assist in protein folding and assembly, and can direct incompetent 'client' proteins towards degradation. Like HSP70 chaperones, HSP105/110s have an N-terminal nucleotide-binding domain (NBD) and a C-terminal substrate-binding domain (SBD). This entry represents the N-terminal nucleotide-binding domain of HSPA4.
Protein Domain
Name: Malonate utilization transcriptional regulator, PBP2 domain
Type: Domain
Description: This entry represents the C-terminal substrate binding domain of the LysR-type transcriptional regulator (LTTR) MdcR, which controls the expression of the malonate decarboxylase (mdc) genes. Like other members of the LTTRs, MdcR is a positive regulatory protein for its target promoter and is composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins (PBP2). The PBP2 are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane, energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.
Protein Domain
Name: MMS19, N-terminal
Type: Domain
Description: This entry represents the N-terminal domain of MMS19. MMS19 is a key component of the cytosolic iron-sulfur protein assembly (CIA) complex, a multiprotein complex that mediates the incorporation of iron-sulfur clusters into apoproteins specifically involved in DNA metabolism and genomic integrity. In humans, MMS19 acts as an adapter between early-acting CIA components and a subset of cellular target iron-sulfur proteins such as ERCC2/XPD, FANCJ and RTEL1, thereby playing a key role in nucleotide excision repair (NER) and RNA polymerase II (POL II) transcription. It is also part of the MMXD (MMS19-MIP18-XPD) complex, which plays a role in chromosome segregation, probably by facilitating iron-sulfur cluster assembly into ERCC2/XPD. In budding yeasts, the mms19 mutants were originally isolated in a screen for mutants hypersensitive to the alkylating agent methyl methanesulfonate (MMS). Unlike human MMS19, Mms19 in budding yeasts (also known as Met18) does not participate directly in NER. In fission yeast, Mms19 is part of a silencing complex named the Rik1-Dos2 complex, which contains Dos2, Rik1, Mms19 and Cdc20. This complex regulates RNA Pol II activity in heterochromatin, and is required for DNA replication and heterochromatin assembly.
Protein Domain
Name: Four-carbon acid sugar kinase, nucleotide binding domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents the C-terminal domain found in proteins in a range of Proteobacteria as well as the Gram-positive Oceanobacillus iheyensis. Structural analysis of the whole protein indicates the N- and C-termini act together to produce a surface into which a threonate-ADP complex is bound, demonstrating that a sugar binding site is on the N-terminal domain, and a nucleotide binding site is in the C-terminal domain. There is a critical motif, DDXTG, at approximately residues 22-25. Proteins containing this domain have been predicted as kinases. Some members are associated with PdxA2 by physical clustering and gene fusion with PdxA2. Some members that are fused with PdxA2 have been shown to be involved in L-4-hydroxythreonine (4HT) phosphorylation, part of the alternative pathway to make PLP (pyridoxal 5'-phosphate) out of a toxic metabolite, 4HT. However, 4HT phosphorylation might not be the main function of this group of proteins. Moreover, some members that are not associated with pdxA2, and even one that is associated with pdxA2, have lost 4HT kinase activity. Functional analysis demonstrates that family members include D-Threonate kinases (DtnK), D-Erythronate kinases (DenK) and 3-Oxo-tetronate kinases (OtnK).
Protein Domain
Name: DNA glycosylase/AP lyase, zinc finger domain, DNA-binding site
Type: Binding_site
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the DNA-binding site found in the C-terminal zinc finger domain of DNA glycosylase/AP lyase enzymes. These enzymes are involved in base excision repair of DNA damaged by oxidation or by mutagenic agents. These enzymes are primarily from bacteria, and have both DNA glycosylase activity and AP lyase activity. Examples include formamidopyrimidine-DNA glycosylases (Fpg; MutM) and endonuclease VIII (Nei). 
These enzymes contain a zinc finger domain that is important for DNA-binding. Formamidopyrimidine-DNA glycosylase (Fpg, MutM) is a trifunctional DNA base excision repair enzyme that removes a wide range of oxidation-damaged bases (N-glycosylase activity) and cleaves both the 3'- and 5'-phosphodiester bonds of the resulting apurinic/apyrimidinic site (AP lyase activity). Fpg has a preference for oxidised purines, excising oxidized purine bases such as 7,8-dihydro-8-oxoguanine (8-oxoG). Its AP (apurinic/apyrimidinic) lyase activity introduces nicks in the DNA strand, cleaving the DNA backbone by beta-delta elimination to generate a single-strand break at the site of the removed base with both 3'- and 5'-phosphates. Fpg is a monomer composed of 2 domains connected by a flexible hinge. The two DNA-binding motifs (a zinc finger and the helix-two-turns-helix motifs) suggest that the oxidized base is flipped out from double-stranded DNA in the binding mode and excised by a catalytic mechanism similar to that of bifunctional base excision repair enzymes. Fpg binds one ion of zinc at the C terminus, which contains four conserved and essential cysteines. Endonuclease VIII (Nei) has the same enzyme activities as Fpg above, but with a preference for oxidized pyrimidines, such as thymine glycol, 5,6-dihydrouracil and 5,6-dihydrothymine.
Protein Domain
Name: Melanocortin 3 receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. 
Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. Adrenocorticotrophin (ACTH), melanocyte-stimulating hormones (MSH) and beta-endorphin are peptide products of pituitary pro-opiomelanocortin. ACTH regulates synthesis and release of glucocorticoids and aldosterone in the adrenal cortex; it also has a trophic action on these cells. ACTH and beta-endorphin are synthesised and released in response to corticotrophin-releasing factor at times of stress (heat, cold, infections, etc.); their release leads to increased metabolism and analgesia. MSH has a trophic action on melanocytes, and regulates pigment production in fish and amphibia. The ACTH receptor is found in high levels in the adrenal cortex; binding sites are present in lower levels in the CNS. The MSH receptor is expressed in high levels in melanocytes, melanomas and their derived cell lines. Receptors are found in low levels in the CNS. MSH regulates temperature control in the septal region of the brain and releases prolactin from the pituitary. A further gene, which encodes a melanocortin receptor that is functionally distinct from the ACTH and MSH receptors, has also been characterised. The protein contains ~323 amino acids, with a calculated molecular mass of 35,800 Da, and potential N-linked glycosylation and phosphorylation sites. The melanocortin 3 receptor (MC3-R) is found in neurons of the arcuate nucleus known to express proopiomelanocortin and in a subset of the nuclei to which these neurons send projections. The MC3-R is 43% identical to the MSH receptor present in melanocytes and is strongly coupled to adenylyl cyclase.
Protein Domain
Name: Prostanoid EP3 receptor type 3
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [ ]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [, , , , ]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligands they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [ , , ]. Prostanoids (prostaglandins (PG) and thromboxanes (TX)) mediate a wide variety of actions and play important physiological roles in the cardiovascular and immune systems, and in pain sensation in peripheral systems.
PGI2 and TXA2 have opposing actions, involving regulation of the interaction of platelets with the vascular endothelium, while PGE2, PGI2 and PGD2 are powerful vasodilators and potentiate the action of various autacoids to induce plasma extravasation and pain sensation. To date, evidence for at least 5 classes of prostanoid receptor has been obtained. However, identification of subtypes and their distribution is hampered by expression of more than one receptor within a tissue, coupled with poor selectivity of available agonists and antagonists. EP3 receptors mediate contraction in a wide range of smooth muscles, including gastrointestinal and uterine. They also inhibit neurotransmitter release in central and autonomic nerves through a presynaptic action, and inhibit secretion in glandular tissues (e.g., acid secretion from gastric mucosa, and sodium and water reabsorption in the kidney). mRNA is found in high levels in the kidney and uterus, and in lower levels in the brain, thymus, lung, heart, stomach and spleen. The receptors activate adenylate cyclase via an uncharacterised G-protein, probably of the Gi/Go class. Sequence analysis shows the EP3 receptors to fall into distinct classes, based on their N- and C-terminal and loop signatures. For convenience, we have designated these classes types 1 to 3.
Protein Domain
Name: Prostanoid EP3 receptor, type 2
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [ ]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [, , , , ]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligands they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [ , , ]. Prostanoids (prostaglandins (PG) and thromboxanes (TX)) mediate a wide variety of actions and play important physiological roles in the cardiovascular and immune systems, and in pain sensation in peripheral systems.
PGI2 and TXA2 have opposing actions, involving regulation of the interaction of platelets with the vascular endothelium, while PGE2, PGI2 and PGD2 are powerful vasodilators and potentiate the action of various autacoids to induce plasma extravasation and pain sensation. To date, evidence for at least 5 classes of prostanoid receptor has been obtained. However, identification of subtypes and their distribution is hampered by expression of more than one receptor within a tissue, coupled with poor selectivity of available agonists and antagonists. EP3 receptors mediate contraction in a wide range of smooth muscles, including gastrointestinal and uterine. They also inhibit neurotransmitter release in central and autonomic nerves through a presynaptic action, and inhibit secretion in glandular tissues (e.g., acid secretion from gastric mucosa, and sodium and water reabsorption in the kidney). mRNA is found in high levels in the kidney and uterus, and in lower levels in the brain, thymus, lung, heart, stomach and spleen. The receptors activate adenylate cyclase via an uncharacterised G-protein, probably of the Gi/Go class. Sequence analysis shows the EP3 receptors to fall into distinct classes, based on their N- and C-terminal and loop signatures. For convenience, we have designated these classes types 1 to 3.
Protein Domain
Name: Glutathione S-transferase, Mu class
Type: Family
Description: Glutathione S-transferases (GSTs) are soluble proteins with typical molecular masses of around 50kDa, each composed of two polypeptide subunits. GSTs catalyse the transfer of the tripeptide glutathione (gamma-glutamyl-cysteinyl-glycine; GSH) to a co-substrate (R-X) containing a reactive electrophilic centre to form a polar S-glutathionylated reaction product (R-SG). Each soluble GST is a dimer of approximately 26kDa subunits, typically forming a hydrophobic 50kDa protein with an isoelectric point in the pH range 4-5. The ability to form heterodimers greatly increases the diversity of the GSTs, but the functional significance of this mixing and matching of subunits has yet to be determined. Each GST subunit of the protein dimer contains an independent catalytic site composed of two components. The first is a binding site specific for GSH or a closely related homologue (the G site) formed from a conserved group of amino-acid residues in the amino-terminal domain of the polypeptide. The second component is a site that binds the hydrophobic substrate (the H site), which is much more structurally variable and is formed from residues in the carboxy-terminal domain. Between the two domains is a short variable linker region of 5-10 residues. The GST proteins have evolved by gene duplication to perform a range of functional roles. GSTs also have non-catalytic roles, binding flavonoid natural products in the cytosol prior to their deposition in the vacuole. Recent studies have also implicated GSTs as components of ultraviolet-inducible cell signalling pathways and as potential regulators of apoptosis. The mammalian GSTs active in drug metabolism are now classified into the alpha, mu and pi classes. Additional classes of GSTs have been identified in animals that do not have major roles in drug metabolism; these include the sigma GSTs, which function as prostaglandin synthases.
In cephalopods, however, sigma GSTs are lens S-crystallins, giving an indication of the functional diversity of these proteins. The soluble glutathione transferases can be divided into the phi, tau, theta, zeta and lambda classes. The theta and zeta GSTs have counterparts in animals, whereas the other classes are plant-specific. In the case of phi and tau GSTs, only subunits from the same class will dimerise. Within a class, however, the subunits can dimerise even if they are quite different in amino-acid sequence. An insect-specific delta class has also been described, and bacteria contain a prokaryote-specific beta class of GST. Human mu-class GSTs have been subdivided into 5 isoforms based on differing substrate specificities [ ]. Mu-class GSTs are thought to be involved in the detoxification of reactive oxygen species (cyclised o-quinones) produced via oxidative metabolism of catecholamines. These toxins are thought to be involved in neurological disorders of the nigrostriatal and mesolimbic systems (Parkinson's disease and schizophrenia, respectively). Indeed, mu-class GSTs are expressed in the substantia nigra and have preferential substrate specificity for the cyclised o-quinones formed by catecholamine metabolism [ ]. Mu-class GSTs possess the so-called "mu-loop", which occurs between strand beta-2 and helix alpha-3. This is a consequence of an insertion in the primary sequence and the loop allows the overall domain I topology to remain [ ].
Protein Domain
Name: Glutathione S-transferase, Pi class
Type: Family
Description: Glutathione S-transferases (GSTs) are soluble proteins with typical molecular masses of around 50kDa, each composed of two polypeptide subunits. GSTs catalyse the transfer of the tripeptide glutathione (gamma-glutamyl-cysteinyl-glycine; GSH) to a co-substrate (R-X) containing a reactive electrophilic centre to form a polar S-glutathionylated reaction product (R-SG). Each soluble GST is a dimer of approximately 26kDa subunits, typically forming a hydrophobic 50kDa protein with an isoelectric point in the pH range 4-5. The ability to form heterodimers greatly increases the diversity of the GSTs, but the functional significance of this mixing and matching of subunits has yet to be determined. Each GST subunit of the protein dimer contains an independent catalytic site composed of two components. The first is a binding site specific for GSH or a closely related homologue (the G site) formed from a conserved group of amino-acid residues in the amino-terminal domain of the polypeptide. The second component is a site that binds the hydrophobic substrate (the H site), which is much more structurally variable and is formed from residues in the carboxy-terminal domain. Between the two domains is a short variable linker region of 5-10 residues. The GST proteins have evolved by gene duplication to perform a range of functional roles. GSTs also have non-catalytic roles, binding flavonoid natural products in the cytosol prior to their deposition in the vacuole. Recent studies have also implicated GSTs as components of ultraviolet-inducible cell signalling pathways and as potential regulators of apoptosis. The mammalian GSTs active in drug metabolism are now classified into the alpha, mu and pi classes. Additional classes of GSTs have been identified in animals that do not have major roles in drug metabolism; these include the sigma GSTs, which function as prostaglandin synthases.
In cephalopods, however, sigma GSTs are lens S-crystallins, giving an indication of the functional diversity of these proteins. The soluble glutathione transferases can be divided into the phi, tau, theta, zeta and lambda classes. The theta and zeta GSTs have counterparts in animals, whereas the other classes are plant-specific. In the case of phi and tau GSTs, only subunits from the same class will dimerise. Within a class, however, the subunits can dimerise even if they are quite different in amino-acid sequence. An insect-specific delta class has also been described, and bacteria contain a prokaryote-specific beta class of GST. Pi-class GSTs are recognised by ethacrynic acid substrate specificity [ ]. The pi-class H subsite has been found to be comparatively open [], perhaps explaining specificity towards less hydrophobic substrates. This class has received particular interest in relation to carcinogenesis. Pi-class GSTs have been found to be markedly increased in the early stages of rat liver carcinogenesis. Expression levels of GST-P have also been found to be elevated in many human tumours. In addition, tumourigenesis induced by polycyclic aromatic hydrocarbons (carcinogens found in cigarette smoke) is increased in mice lacking this enzyme [].
Protein Domain
Name: GPCR, family 2, growth hormone-releasing hormone receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [ ]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [, , , , ]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligands they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The secretin-like GPCRs include secretin [ ], calcitonin [], parathyroid hormone/parathyroid hormone-related peptides [] and vasoactive intestinal peptide [], all of which activate adenylyl cyclase and the phosphatidyl-inositol-calcium pathway. These receptors contain seven transmembrane regions, in a manner reminiscent of the rhodopsins and other receptors believed to interact with G-proteins (however, there is no significant sequence identity between these families; the secretin-like receptors thus bear their own unique '7TM' signature). Their N terminus is probably located on the extracellular side of the membrane and potentially glycosylated.
This N-terminal region contains a long conserved region which allows the binding of large peptidic ligands such as glucagon, secretin, VIP and PACAP; this region contains five conserved cysteine residues, which could be involved in disulphide bonds. The C-terminal region of these receptors is probably cytoplasmic. Every receptor gene in this family is encoded on multiple exons, and several of these genes are alternatively spliced to yield functionally distinct products. Growth hormone (GH)-releasing hormone (GHRH) belongs to the family of gut-neuropeptide hormones that includes glucagon, secretin and vasoactive intestinal peptide (VIP) [ ]. The receptors for this peptide family involve similar signal transduction pathways - on hormone binding, they interact with G protein and cause stimulation of adenylate cyclase []. Acting through the GHRH receptor (GHRHR), GHRH plays a pivotal role in the regulation of GH synthesis and secretion in the pituitary, possibly serving other roles in different tissues []. Cryo-electron microscopy shows a hormone recognition pattern where an α-helical GHRH forms interactions involving all the extracellular loops, most TM helices, and a linker from GHRHR []. The human pituitary GHRHR is a 423-amino acid protein that has the characteristic 7TM signature of the secretin-like GPCR superfamily, sharing 47%, 42%, 35%, and 28% identity with receptors for VIP, secretin, calcitonin and PTH, respectively [].
Protein Domain
Name: Voltage-dependent calcium channel, gamma-2 subunit
Type: Family
Description: Ca2+ ions are unique in that they not only carry charge but they are also the most widely used of diffusible second messengers. Voltage-dependent Ca2+ channels (VDCC) are a family of molecules that allow cells to couple electrical activity to intracellular Ca2+ signalling. The opening and closing of these channels by depolarizing stimuli, such as action potentials, allows Ca2+ ions to enter neurons down a steep electrochemical gradient, producing transient intracellular Ca2+ signals. Many of the processes that occur in neurons, including transmitter release, gene transcription and metabolism are controlled by Ca2+ influx occurring simultaneously at different cellular locales. The pore is formed by the alpha-1 subunit which incorporates the conduction pore, the voltage sensor and gating apparatus, and the known sites of channel regulation by second messengers, drugs, and toxins [ ]. The activity of this pore is modulated by four tightly-coupled subunits: an intracellular beta subunit; a transmembrane gamma subunit; and a disulphide-linked complex of alpha-2 and delta subunits, which are proteolytically cleaved from the same gene product. Properties of the protein including gating voltage-dependence, G protein modulation and kinase susceptibility can be influenced by these subunits. Voltage-gated calcium channels are classified as T, L, N, P, Q and R, and are distinguished by their sensitivity to pharmacological blocks, single-channel conductance kinetics, and voltage-dependence. On the basis of their voltage activation properties, the voltage-gated calcium classes can be further divided into two broad groups: the low (T-type) and high (L, N, P, Q and R-type) threshold-activated channels. The voltage-dependent calcium channel gamma (VDCCG) subunit family consists of at least 8 members, which share a number of common structural features [ ]. Each member is predicted to possess 4 transmembrane domains, with intracellular N- and C-termini.
The first extracellular loop contains a highly conserved N-glycosylation site and a pair of conserved cysteine residues. The C-terminal 7 residues of VDCCG-2, -3, -4 and -8 are also conserved and contain a consensus site for phosphorylation by cAMP- and cGMP-dependent protein kinases, and a target site for binding by PDZ domain proteins []. The VDCCG-2 subunit (also known as stargazin) was isolated by identifying the locus of the genetic disruption in the epileptic mouse mutant line known as stargazer []. VDCCG-2 subunits are brain specific and enriched in synaptic plasma membranes. In vitro studies using recombinant P/Q-type calcium channels show that VDCCG-2 subunit expression increases steady-state channel inactivation, leading to the suggestion that, in stargazer mutants, inappropriate calcium entry may contribute to the seizure phenotype. VDCCG-2 subunits are also implicated in cellular trafficking. They interact with ionotropic glutamate AMPA receptor subunits, a process that has been shown to be essential in delivering functional AMPA receptors to the surface membranes of cerebellar granule cells []. In addition, VDCCG-2 subunits are capable of associating with PDZ proteins, such as PSD-95, through their C-terminal PDZ binding domains. This interaction is required to target AMPA receptors to cerebellar synapses.
Protein Domain
Name: WLM domain
Type: Domain
Description: The WLM (WSS1-like metalloprotease) domain is a globular domain related to the zincin-like superfamily of Zn-dependent peptidases. Since the WLM domain contains all known active site residues of zincins, it is predicted to be a catalytically active peptidase domain. The WLM domain is a eukaryotic domain represented in plants, fungi, Plasmodium, and kinetoplastids. By contrast, it is absent in animals, Cryptosporidium, and Microsporidia, suggesting that it has been lost on multiple occasions during the evolution of eukaryotes. The WLM domain is found either in stand-alone form or in association with other domains such as the RanBP2 zinc finger, the ubiquitin domain, or the PUB/PUG domain. This domain could function as a specific de-SUMOylating domain of distinct protein complexes in the nucleus and the cytoplasm [ ]. It has been suggested to form a segregated alpha/beta structure with eight helices and five strands. Proteins containing this domain include yeast WSS1 (also known as weak suppressor of SMT3), which is involved in the repair of toxic DNA-protein cross-links (DPCs) such as covalently trapped topoisomerase 1 (TOP1) adducts on DNA lesions or DPCs induced by reactive compounds [ ], WSS1 homologues and various putative metalloproteases from plant and fungal species. This domain is also found in an uncharacterised protein from Acanthamoeba polyphaga mimivirus.
Protein Domain
Name: Carbohydrate binding module family 25
Type: Domain
Description: A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose [ , ]. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classifications of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with Roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. For a detailed review on the structure and binding modes of CBMs see [ ]. This entry represents CBM family 25, which has been shown to bind starch [ ].
Protein Domain
Name: MyTH4 domain
Type: Domain
Description: The microtubule-based kinesin motors and actin-based myosin motors generate movements required for intracellular trafficking, cell division, and muscle contraction. In general, these proteins consist of a motor domain that generates movement and a tail region that varies widely from class to class and is thought to mediate many of the regulatory or cargo binding functions specific to each class of motor [ ]. The Myosin Tail Homology 4 (MyTH4) domain has been identified as a conserved domain in the tail domains of several different unconventional myosins [] and a plant kinesin-like protein [], but has more recently been found in several non-motor proteins []. Although the function is not yet fully understood, there is evidence that the MyTH4 domain of Myosin-X (Myo10) binds to microtubules and thus could provide a link between an actin-based motor protein and the microtubule cytoskeleton []. The MyTH4 domain is found in one or two copies associated with other domains, such as myosin head, kinesin motor, FERM, PH, SH3 and IQ. The domain is predicted to be largely α-helical, interrupted by three or four turns. The MyTH4 domain contains four highly conserved regions designated MGD (consensus sequence L(K/R)(F/Y)MGDhP), LRDE (consensus LRDEhYCQhhKQHxxxN), RGW (consensus RGWxLh), and ELEA (consensus RxxPPSxhELEA), where h indicates a hydrophobic residue and x is any residue [].
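The four consensus strings quoted above can be turned into regular expressions for a rough scan of a protein sequence. The following is a minimal sketch, not a validated domain detector: the set of residues treated as hydrophobic (h) is an assumption, since the description does not list them, and the helper names consensus_to_regex and scan_motifs are hypothetical.

```python
import re

# Assumption: residues counted as "hydrophobic" (h); the description does not list them.
HYDROPHOBIC = "AILMFVWYC"

# Consensus strings as quoted in the description; "(K/R)" means K or R.
CONSENSUS = {
    "MGD": "L(K/R)(F/Y)MGDhP",
    "LRDE": "LRDEhYCQhhKQHxxxN",
    "RGW": "RGWxLh",
    "ELEA": "RxxPPSxhELEA",
}

def consensus_to_regex(consensus):
    """Translate a consensus string into a compiled regular expression."""
    # Turn alternatives such as "(K/R)" into character classes such as "[KR]".
    pattern = re.sub(
        r"\(([A-Z](?:/[A-Z])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )
    # Lower-case h = hydrophobic residue, lower-case x = any residue.
    pattern = pattern.replace("h", "[" + HYDROPHOBIC + "]").replace("x", ".")
    return re.compile(pattern)

def scan_motifs(sequence):
    """Return the 1-based start positions of each motif in a one-letter sequence."""
    sequence = sequence.upper()
    return {
        name: [m.start() + 1 for m in consensus_to_regex(cons).finditer(sequence)]
        for name, cons in CONSENSUS.items()
    }

if __name__ == "__main__":
    # Hypothetical toy sequence built for illustration; not a real protein.
    print(scan_motifs("AAALKFMGDLPAAARGWALV"))
```

Exact consensus matching of this kind is strict; real family members may deviate from the consensus at individual positions, which profile-based methods tolerate and a plain regular expression does not.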
Protein Domain
Name: SOS response associated peptidase (SRAP)
Type: Family
Description: The SRAP (SOS-response associated peptidase) family is characterised by the SRAP domain with a novel thiol autopeptidase activity, whose active site in human HMCES is comprised of the catalytic triad residues C2, E127, and H210 [ ]. SRAP proteins are evolutionarily conserved in all domains of life. For instance, human HMCES and E. coli YedK are similar in both sequence and structure []. HMCES was originally identified as a possible reader of 5hmC in embryonic stem cell extracts using a double-stranded DNA molecule containing 5hmC as bait []. The bacterial members have operonic associations with the SOS DNA damage response, mutagenic translesion DNA polymerases, non-homologous DNA-end-joining networks that employ Ku and an ATP-dependent ligase, and other repair systems []. Abasic (AP) sites are one of the most common DNA lesions that block replicative polymerases. SRAP proteins shield the AP site from endonucleases and error-prone polymerases [ ]. Both HMCES and YedK have been found to preferentially bind ssDNA and efficiently form DNA-protein crosslinks (DPCs) to AP sites in ssDNA. They crosslink to AP sites via a stable thiazolidine DNA-protein linkage formed with the N-terminal cysteine and the aldehyde form of the AP deoxyribose []. In B cells, HMCES has also been shown to mediate microhomology-mediated alternative-end-joining through its SRAP domain [ ].
Protein Domain
Name: Peptidase M17, leucine aminopeptidase/peptidase B
Type: Family
Description: The majority of members of this family are zinc-dependent exopeptidases belonging to MEROPS peptidase family M17 (leucyl aminopeptidase, clan MF). Leucyl aminopeptidase (LAP) selectively releases N-terminal amino acid residues from polypeptides and proteins; in general, these enzymes are involved in the processing, catabolism and degradation of intracellular proteins [ , , ]. Leucyl aminopeptidase forms a homohexamer containing two trimers stacked on top of one another []. Each monomer binds two zinc ions. The zinc-binding and catalytic sites are located within the C-terminal catalytic domain []. Leucine aminopeptidase has been shown to be identical with prolyl aminopeptidase in mammals [ ]. Interestingly, members of this group are also implicated in transcriptional regulation and are thought to combine catalytic and regulatory properties [ ]. The N-terminal domain of these proteins has been shown in Escherichia coli PepA to function as a DNA-binding protein in Xer site-specific recombination and in transcriptional control of the carAB operon [, ]. It is not well conserved and in some members can be found only by PSI-BLAST (after 4-6 iterations). It is not clear if the DNA binding function is preserved in all or even in most of the members. For additional information please see [ , , , ].
Protein Domain
Name: Splicing factor SF3a60 /Prp9 subunit, C-terminal
Type: Domain
Description: The mature U2 snRNP (small nuclear ribonucleoprotein particle) relies on spliceosome assembly for its function, which involves splicing factor 3a (SF3a), an evolutionarily conserved eukaryotic heterotrimeric complex that is essential for pre-mRNA splicing [ ]. This complex comprises three subunits: SF3a60, SF3a66 and SF3a120 in humans, whose counterparts in Saccharomyces cerevisiae are Prp9, Prp11 and Prp21. SF3a60 possesses a highly conserved C2H2-type zinc-finger domain at the C terminus, while Prp9 has two such domains; SF3a66 and Prp11 contain one C2H2-type zinc-finger domain, and SF3a120/Prp21 has two suppressor-of-white-apricot and prp21/spp91 (SURP) domains, followed by a short segment of charged residues. The SURP2 domain of SF3a120/Prp21 has a role in SF3a60/Prp9 binding; however, the function of the SURP1 domain is unknown. The yeast structure shows that Prp9 interacts with Prp21 via a bidentate-binding mode, and Prp21 is wrapped around Prp11 [ ]. This entry represents the C-terminal region of the SF3a60 subunit (also known as SF3a3 and Spliceosome-associated protein 61, or Prp9 in yeast) of the SF3a complex, found in eukaryotes. This domain has two conserved sequence motifs: PIP and CEICG. It contains a zinc-finger domain of the C2H2-type which might be important for RNA binding and protein-protein interactions with components of the SF3b complex [ ].
Protein Domain
Name: Proteasome beta-type subunit, conserved site
Type: Conserved_site
Description: The proteasome (or macropain) [ , , , , ] is a multicatalytic proteinase complex in eukaryotes and archaea, and in some bacteria, that seems to be involved in an ATP/ubiquitin-dependent nonlysosomal proteolytic pathway. In eukaryotes the proteasome is composed of 28 distinct subunits which form a highly ordered ring-shaped structure (20S ring) of about 700kDa. Most proteasome subunits can be classified, on the basis of sequence similarities, into two groups, alpha (A) and beta (B). These are arranged in four rings of seven proteins, consisting of a ring of alpha subunits, two rings of beta subunits, and a ring of alpha subunits. In eukaryotes, each alpha and each beta ring consists of different proteins. Three of the beta subunits are peptidases in subfamily T1A, and each has a distinctive specificity (trypsin-like, chymotrypsin-like and glutamyl peptidase-like). The peptidases are N-terminal nucleophile hydrolases in which the N-terminal threonine is the nucleophile in the hydrolytic reaction []. In the immunoproteasome, the catalytic components are replaced by three specialist, catalytic beta subunits []. In bacteria and archaea there is only one alpha subunit and one beta subunit, and each ring is a homoheptamer. This entry represents a conserved sequence region found in the N-terminal region of these proteins.
Protein Domain
Name: Importin-beta, N-terminal domain
Type: Domain
Description: This entry represents the N-terminal domain of importin-beta (also known as karyopherins-beta) that is important for the binding of the Ran GTPase protein [ ]. Members of the importin-beta (karyopherin-beta) family can bind and transport cargo by themselves, or can form heterodimers with importin-alpha. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. Importin-beta is a helicoidal molecule constructed from 19 HEAT repeats. Many nuclear pore proteins contain FG sequence repeats that can bind to HEAT repeats within importins [ , ], which is important for importin-beta mediated transport. Ran GTPase helps to control the unidirectional transfer of cargo. The cytoplasm contains primarily RanGDP and the nucleus RanGTP through the actions of RanGAP and RanGEF, respectively. In the nucleus, RanGTP binds to importin-beta within the importin/cargo complex, causing a conformational change in importin-beta that releases it from importin-alpha-bound cargo. As a result, the N-terminal auto-inhibitory region on importin-alpha is free to loop back and bind to the major NLS-binding site, causing the cargo to be released [ ]. There are additional release factors as well.
Protein Domain
Name: Patatin-like phospholipase domain
Type: Domain
Description: The patatin glycoprotein is a nonspecific lipid acyl hydrolase that is found in high concentrations in mature potato tubers. Patatin is reported to play a role in plant signaling, to cleave fatty acids from membrane lipids, and to act as defense against plant parasites. Proteins containing a patatin-like phospholipase (PNPLA) domain are ubiquitously distributed across all life forms, including eukaryotes and prokaryotes, and are observed to participate in a miscellany of biological roles, including sepsis induction, host colonization, triglyceride metabolism, and membrane trafficking. PNPLA domain containing proteins display lipase and transacylase properties and appear to have major roles in lipid and energy homeostasis [, , ]. The ~180-amino acid PNPLA domain harbors the evolutionarily conserved consensus serine lipase motif Gly-X-Ser-X-Gly. It displays an alpha/beta class protein fold with approximately three layers, basically alpha/beta/alpha in content, in which a central six-stranded β-sheet is sandwiched essentially between α-helices front and back. The central β-sheet contains five parallel strands and an antiparallel strand at the edge of the sheet. The PNPLA domain has a Ser-Asp catalytic dyad. The catalytic Ser resides in a sharp nucleophile elbow turn loop which follows a β-strand (beta5) of the central β-sheet and precedes a helix (helix C) [ , ].
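As a rough illustration of the Gly-X-Ser-X-Gly motif mentioned above, the following Python sketch scans a one-letter amino acid sequence for GXSXG pentapeptides. The function name and the example sequence are hypothetical and made up for illustration; a motif hit alone does not establish that a protein carries a PNPLA domain.

```python
import re

# Gly-X-Ser-X-Gly in one-letter code, where X (".") is any residue.
# The lookahead lets overlapping occurrences be reported as well.
GXSXG = re.compile(r"(?=(G.S.G))")

def find_gxsxg(sequence):
    """Return (0-based start, matched pentapeptide) for each GXSXG occurrence."""
    return [(m.start(), m.group(1)) for m in GXSXG.finditer(sequence.upper())]

# Made-up sequence, not a real patatin sequence.
hits = find_gxsxg("MKTGASAGLLGTSTGAV")
print(hits)
```

Positions are 0-based offsets into the sequence string; add 1 to obtain conventional residue numbering.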
Protein Domain
Name: Chaperonin TCP-1, conserved site
Type: Conserved_site
Description: The TCP-1 protein [ , ] (Tailless Complex Polypeptide 1) was first identified in mice where it is especially abundant in testis but present in all cell types. It has since been found and characterised in many other animal species, as well as in yeast, plants and protists. TCP-1 is a highly conserved protein of about 60kDa (556 to 560 residues) which participates in a hetero-oligomeric 900kDa double-torus shaped particle [] with 6 to 8 other different subunits. These subunits, the chaperonin containing TCP-1 (CCT) subunits beta, gamma, delta, epsilon, zeta and eta, are evolutionarily related to TCP-1 itself [, ]. The CCT is known to act as a molecular chaperone for tubulin, actin and probably some other proteins. The CCT subunits are highly related to archaebacterial counterparts:
  • TF55 and TF56 [ ], a molecular chaperone from Sulfolobus shibatae. TF55 has ATPase activity, is known to bind unfolded polypeptides and forms an oligomeric complex of two stacked nine-membered rings.
  • Thermosome [ ], from Thermoplasma acidophilum. The thermosome is composed of two subunits (alpha and beta) and also seems to be a chaperone with ATPase activity. It forms an oligomeric complex of eight-membered rings.
The TCP-1 family of proteins are weakly, but significantly [ ], related to the cpn60/groEL chaperonin family (see ).
Protein Domain
Name: Glycoside hydrolase family 18, catalytic domain
Type: Domain
Description: O-Glycosyl hydrolases ( ) are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [ , ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. The glycosyl hydrolases family 18 (GH18) is widely distributed in all kingdoms and contains hydrolytic enzymes with chitinase or endo-N-acetyl-beta-D-glucosaminidase (ENGase) activity as well as chitinase-like lectins/proteins (chi-lectins or CLPs). Chitinases ( ) are hydrolytic enzymes that cleave the beta-1,4-bond releasing oligomeric, dimeric (chitobiose) or monomeric (N-acetylglucosamine, GlcNAc) products. ENGases ( ) hydrolyze the beta-1,4 linkage in the chitobiose core of N-linked glycans from glycoproteins leaving one GlcNAc residue on the substrate. CLPs do not display chitinase activity but some of them have been reported to have specific functions and carbohydrate binding property [ ]. This family also includes glycoproteins from mammals, such as oviduct-specific glycoproteins. The catalytic domain of GH18s has a common (beta/alpha)8 triosephosphate isomerase (TIM)-barrel structure, which consists of a barrel-like framework made from eight internal parallel β-strands that are alternately connected by eight exterior α-helices. The active site motif DxxDxDxE is essential for the activity of the GH18 catalytic domain [ , , ].
Protein Domain
Name: Flagellin, N-terminal domain
Type: Domain
Description: Bacterial flagella are responsible for motility and chemotaxis [ ]. They comprise a basal body, a hook and a filament, the latter accounting for 98% of the mass []. Flagellin is the subunit protein that polymerises to form the flagella [], the subunits being transported through the centre of the filament to the tip, where they then polymerise []. Both the N- and C-termini of the subunit protein, which are α-helical in structure [ ], are required to mediate polymerisation. Although no export or assembly consensus sequences have been identified, Ala, Val, Leu, Ile, Gly, Ser, Thr, Asn, Gln and Asp tend to make up around 90% of the sequence, Cys and Trp being absent [ ]. Flagellin plays a role in the activation of innate and adaptive immunity. It is a specific ligand for Toll-like receptor 5 (TLR5) in the host, which has led to great interest in using it as an adjuvant for vaccines [ , , ]. The protein is also recognised by the intracellular NAIP5/NLRC4 inflammasome receptor []. This entry represents the N-terminal domain of Flagellin and similar bacterial proteins. This domain comes together with the C-terminal domain ( ) to form the D0 and D1 structural domains [ ]. These domains are responsible for the activation of TLR5 [, ].
Protein Domain
Name: Antibiotic biosynthesis monooxygenase domain
Type: Domain
Description: The antibiotic biosynthesis monooxygenase (ABM) domain is found in proteins involved in a diverse range of biological processes, including metabolism, transcription, translation and biosynthesis of secondary metabolites:
  • Streptomyces coelicolor ActVA-Orf6 monooxygenase, plays a role in the biosynthesis of aromatic polyketides, specifically the antibiotic actinorhodin, by oxidizing phenolic groups to quinones [ ].
  • Escherichia coli probable quinol monooxygenase YgiN, can oxidize menadiol to menadione [].
  • Staphylococcus aureus heme-degrading enzymes IsdG and IsdI [ , ].
  • Staphylococci signal transduction protein TRAP (target of RNAIII-activating protein) [].
  • Mycobacterium tuberculosis heme-degrading monooxygenase MhuD (or Rv3592) [].
  • Mycobacterium tuberculosis putative monooxygenase Rv0793, might be involved in antibiotic biosynthesis, or may act as a reactive oxygen species scavenger that could help in evading host defenses [ ].
  • Thermus thermophilus hypothetical protein TT1380 [ ].
The ABM domain has only moderate sequence homology while sharing a high degree of structural similarity. The ABM domain crystallizes as a homodimer. Each monomer is composed of three α-helices (H1-3) and four β-strands (S1-4) and has a ferredoxin-like split BetaAlphaBeta-fold with an antiparallel β-sheet []. The β-sheets of two monomers form a 10-strand, anti-parallel β-barrel. The barrel is built of two smaller sheets that are connected by long C-terminal strands crossing over from one monomer to the other, providing important interactions within the dimer. The core of the barrel is mainly hydrophobic [, , , , , ].
Protein Domain
Name: CheW-like domain
Type: Domain
Description: The CheW-like domain is an around 150-residue domain that is found in proteins involved in the two-component signaling systems regulating bacterial chemotaxis. Two-component systems are composed of a receptor kinase, which monitors the environmental conditions, and its substrate, the response regulator, which acts as a binary switch depending on the phosphorylation state. In Escherichia coli, the signal transduction pathway for chemotaxis consists of specialised membrane receptors, termed chemotaxis transducers; a CheA-CheY two-component system, which transmits the signal from transducers to flagellar motors; and a docking protein, CheW, which couples the CheA histidine kinase to transducers. Whereas CheW is only made of a CheW-like domain, CheA additionally contains an HPt domain and a histidine kinase domain. The CheW-like domain has been shown to mediate the interaction between CheA and the adaptor protein CheW. Some bacteria contain another bifunctional protein, CheV, consisting of an N-terminal CheW-like domain and a C-terminal response regulatory domain. Although its precise function in chemotaxis is unknown, CheV probably acts in adaptation to attractants [ , , , ]. The CheW-like domain is composed of two β-sheet subdomains, each of which forms a loose five-stranded β-barrel around an internal hydrophobic core. The interactions between the subdomains are contributed by a third hydrophobic core sandwiched between the two β-sheet subdomains. The CheW-like structure is stabilised by extensive hydrophobic interactions [, ].
Protein Domain
Name: Kazal domain
Type: Domain
Description: This entry represents the Kazal domain. Canonical serine proteinase inhibitors are distributed in a wide range of organisms from all kingdoms of life and play a crucial role in various physiological mechanisms [ ]. They interact through the canonical proteinase-inhibitor binding loop, where the P1 residue has a predominant role (the residue at the P1 position contributing the carbonyl portion to the reactive-site peptide bond). These so-called canonical inhibitors bind to their cognate enzymes in the same manner as a good substrate, but are cleaved extremely slowly. Kazal-type inhibitors represent the most studied canonical proteinase inhibitors. Kazal inhibitors are extremely variable at their reactive sites. However, some regularity prevails such as the presence of lysine at position P1 indicating strong inhibition of trypsin []. The Kazal inhibitor has six cysteine residues engaged in disulfide bonds arranged as shown in the following schematic representation:
               +------------------+
               |                  |
               *******************|***
xxxxxxxxCxxxxxxCx#xxxxxCxxxxxxxxxxCxxCxxxxxxxxxxxxxxxxxC
        |              |             |                 |
        |              +-------------|-----------------+
        +----------------------------+
'C': conserved cysteine involved in a disulfide bond.
'#': active site residue.
'*': position of the pattern.
The structure of classical Kazal domains consists of a central α-helix, which is inserted between two β-strands and a third that is toward the C terminus []. The reactive site P1 and the conformation of the reactive site loop are structurally highly conserved, similar to the canonical conformation of small serine proteinase inhibitors.
Protein Domain
Name: Cytochrome c, class ID
Type: Family
Description: Cytochromes c (cytC) can be defined as electron-transfer proteins having one or several haem c groups, bound to the protein by one or, more generally, two thioether bonds involving sulphydryl groups of cysteine residues. The fifth haem iron ligand is always provided by a histidine residue. CytC possess a wide range of properties and function in a large number of different redox processes. Ambler [ ] recognised four classes of cytC. Class I includes the low-spin soluble cytC of mitochondria and bacteria, with the haem-attachment site towards the N terminus, and the sixth ligand provided by a methionine residue about 40 residues further on towards the C terminus. On the basis of sequence similarity, class I cytC were further subdivided into five classes, IA to IE. Class ID (cyt c8) includes such bacterial proteins as Pseudomonas spp. cyt c-551, Hydrogenobacter thermophilus cyt c-552 and Rhodocyclus tenuis (Rhodospirillum tenue) cyt c-553 []. Sequence characteristics include several Pro residues around the sixth ligand Met, and a conserved Trp residue near the C terminus. The 3D structures of cyt c-551 from Pseudomonas aeruginosa and Pseudomonas stutzeri have been determined [ ]. The proteins consist of 5 α-helices; three 'core' helices form a 'basket' around the haem group, with one haem edge exposed to the solvent.
Protein Domain
Name: MyTH4 domain superfamily
Type: Homologous_superfamily
Description: The microtubule-based kinesin motors and actin-based myosin motors generate movements required for intracellular trafficking, cell division, and muscle contraction. In general, these proteins consist of a motor domain that generates movement and a tail region that varies widely from class to class and is thought to mediate many of the regulatory or cargo binding functions specific to each class of motor [ ]. The Myosin Tail Homology 4 (MyTH4) domain has been identified as a conserved domain in the tail domains of several different unconventional myosins [] and a plant kinesin-like protein [], but has more recently been found in several non-motor proteins []. Although the function is not yet fully understood, there is evidence that the MyTH4 domain of Myosin-X (Myo10) binds to microtubules and thus could provide a link between an actin-based motor protein and the microtubule cytoskeleton []. The MyTH4 domain is found in one or two copies associated with other domains, such as myosin head, kinesin motor, FERM, PH, SH3 and IQ. The domain is predicted to be largely α-helical, interrupted by three or four turns. The MyTH4 domain contains four highly conserved regions designated MGD (consensus sequence L(K/R)(F/Y)MGDhP), LRDE (consensus LRDEhYCQhhKQHxxxN), RGW (consensus RGWxLh), and ELEA (consensus RxxPPSxhELEA), where h indicates a hydrophobic residue and x is any residue [].
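The consensus notation used above (alternatives in parentheses, h for a hydrophobic residue, x for any residue) can be translated mechanically into regular expressions. The sketch below is one hedged way to do so in Python; the helper name and, in particular, the set of residues counted as hydrophobic are assumptions and are not taken from the entry.

```python
import re

# Assumption: residues treated as "hydrophobic" for the h positions.
# The entry does not define this set; adjust it as needed.
HYDROPHOBIC = "AVLIMFWY"

def consensus_to_regex(consensus):
    """Translate e.g. 'L(K/R)(F/Y)MGDhP' into a regular expression string."""
    # (K/R) -> [KR]
    pattern = re.sub(r"\(([A-Z/]+)\)",
                     lambda m: "[" + m.group(1).replace("/", "") + "]",
                     consensus)
    # h -> hydrophobic class, x -> any residue
    return pattern.replace("h", "[" + HYDROPHOBIC + "]").replace("x", ".")

# The four conserved regions named in the description.
MYTH4_REGIONS = {
    "MGD": "L(K/R)(F/Y)MGDhP",
    "LRDE": "LRDEhYCQhhKQHxxxN",
    "RGW": "RGWxLh",
    "ELEA": "RxxPPSxhELEA",
}

for name, consensus in MYTH4_REGIONS.items():
    print(name, consensus_to_regex(consensus))
```

Real MyTH4 domains may deviate from these consensus strings, so exact regular-expression matching is likely to miss genuine family members; profile-based methods are generally better suited to domain detection.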
Protein Domain
Name: SpoVA
Type: Family
Description: Members of this family are all transcribed from the spoVA operon [ ]. Bacillus and Clostridium are two well-studied endospore-forming bacteria. Spore formation provides a resistance mechanism in response to extreme or unfavourable environmental conditions such as heat, radiation, and chemical agents or nutrient deprivation. The reverse process, termed germination, takes place where spores develop into growing cells in response to nutrient availability or stress reduction. Nutrient germinant receptors (GRs) and the SpoVA proteins are important players in the germination process. In B. subtilis SpoVAC and SpoVAEb, belonging to this family, are predicted to be membrane proteins, with two to five membrane-spanning regions. Biophysical and biochemical studies suggest that SpoVAC acts as a mechano-sensitive channel with properties that would allow the release of Ca-DPA (dipicolinic acid) and amino acids during germination of the spore. The release of Ca-DPA is a crucial event during spore germination. When expressed in E. coli SpoVAC provides protection against osmotic downshift. Furthermore, SpoVAC acts as a channel that facilitates the efflux down the concentration gradient of osmolytes up to a mass of at least 600 Da [ ]. Another conserved SpoVA protein in all spore-forming bacteria is SpoVAEb, which appears to be an integral membrane protein with no known function [].
Protein Domain
Name: Glycosyl transferase, family 11
Type: Family
Description: The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates ([intenz:2.4.1.-]) and related proteins into distinct sequence based families has been described []. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'.Glycosyltransferase family 11 comprises enzymes with only one known activity; galactoside 2-L-fucosyltransferase ( ). Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Galactoside 2-L-fucosyltransferase 1 () and Galactoside 2-L-fucosyltransferase 2 ( ) belong to the Hh blood group system and are associated with H/h and Se/se antigens.
Protein Domain
Name: Proteinase, regulatory CLIP domain superfamily
Type: Homologous_superfamily
Description: The CLIP domain is a regulatory domain which controls the proteinase action of various proteins of the trypsin family, e.g. easter and pap2. The domain is restricted to the Arthropoda and found in varying copy numbers (from one to five in Drosophila proteins). It is always found N-terminal to the chymotrypsin serine protease domain, which belongs to MEROPS peptidase family S1A. The CLIP domain remains linked to the protease domain after cleavage of a conserved residue which retains the protein in zymogen form. It is named CLIP because it can be drawn in the shape of a paper clip. It has many disulphide bonds and highly conserved cysteine residues, and so it folds extensively [ , ]. The clip domain adopts an α/β mixed fold consisting of two helices and an antiparallel distorted β-sheet made of four strands. The two helices are antiparallel and are almost perpendicular to the β-sheet. Three disulfide bridges (C1-C5, C2-C4, C3-C6) stabilize the β-sheet, C3 being the only cysteine that is not located on a β-strand. The clip domain is located opposite the activation loop and contacts the C-terminal α-helix of the SP domain [ ]. The CLIP domain is present in silkworm prophenoloxidase-activating enzyme [ ].
Protein Domain
Name: Monellin, A chain superfamily
Type: Homologous_superfamily
Description: Monellin is an intensely sweet-tasting protein derived from African berries. The protein has a very high specificity for the sweet receptors, making it ~100,000 times sweeter than sugar on a molar basis and several thousand times sweeter on a weight basis. Like the sweet-tasting protein thaumatin, it neither contains carbohydrates nor modified amino acids. Although there is no sequence similarity between the proteins, antibodies for thaumatin compete for monellin (and other sweet compounds, but not for chemically modified non-sweet monellin) and vice versa [ ]. It is thought that native conformations are important for the sweet taste. Monellin is a heterodimer, comprising an A chain of 44 amino acid residues, and a B chain of 50 residues. The individual subunits are not sweet, nor do they block the sweet sensation of sucrose or monellin. However, blocking the single SH of monellin abolishes its sweetness, as does reaction of its methionyl residue with CNBr []. The cysteinyl and methionyl residues are adjacent, and it has therefore been suggested that this part of the molecule is essential for its sweetness []. The structure of monellin belongs to the alpha/beta class, a 5-stranded β-sheet sequestering a single α-helix. The A chain contributes 3 strands to the sheet.This entry represents the Monellin chain A.
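The relation between the molar and weight figures quoted above can be checked with a short calculation. The Python sketch below assumes that "sugar" means sucrose (molar mass 342.30 g/mol) and estimates the mass of monellin from its 94 residues using a rough average of 110 Da per residue; both are assumptions made for illustration, not values from the entry.

```python
# Assumption: "sugar" is sucrose, C12H22O11.
SUCROSE_MOLAR_MASS = 342.30   # g/mol

# Assumption: rough average mass per amino acid residue.
AVG_RESIDUE_MASS = 110.0      # Da

# A chain (44 residues) plus B chain (50 residues), as stated in the description.
MONELLIN_RESIDUES = 44 + 50

def weight_basis_sweetness(molar_ratio, protein_mass, sugar_mass=SUCROSE_MOLAR_MASS):
    """Convert a per-mole sweetness ratio into a per-gram ratio.

    One gram holds 1/mass moles, so the per-gram ratio is the per-mole
    ratio scaled by sugar_mass / protein_mass.
    """
    return molar_ratio * sugar_mass / protein_mass

monellin_mass = MONELLIN_RESIDUES * AVG_RESIDUE_MASS
ratio = weight_basis_sweetness(100_000, monellin_mass)
print(f"estimated monellin mass: {monellin_mass:.0f} Da")
print(f"weight-basis sweetness relative to sucrose: about {ratio:.0f}x")
```

Under these assumptions the weight-basis figure comes out at a few thousand, in line with the "several thousand times sweeter on a weight basis" statement.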
Protein Domain
Name: Terminase small subunit, N-terminal DNA-binding domain, HTH motif superfamily
Type: Homologous_superfamily
Description: Packaging of double-stranded viral DNA concatemers requires interaction of the prohead with virus DNA. This process is mediated by a phage-encoded DNA recognition and terminase protein. The terminase enzymes described so far, which are hetero-oligomers composed of a small and a large subunit, do not have a significant level of sequence homology. The small terminase subunit is thought to form a nucleoprotein structure that helps to position the terminase large subunit at the packaging initiation site [ ]. The small terminase protein is essential for the initial recognition of viral DNA and regulates the motor's ATPase and nuclease activities during DNA translocation [] and for switching between viral DNA replication and packaging. DNA packaging in tailed bacteriophages and in evolutionarily related herpesviruses is controlled by a viral-encoded terminase. The terminase complex characterised in Bacillus subtilis bacteriophages SF6 and SPP1 consists of two proteins: G1P and G2P [, ].This entry represents the N-terminal domain of the terminase small subunit, which contains a HTH DNA-binding motif [ ]. The first three helices of G1P form the typical helix-turn-helix DNA-binding motif, which is followed by a fourth helix. The fourth helix acts as a linker between the DNA-binding domain and the oligomerization domain [].
Protein Domain
Name: Calpain C2 domain
Type: Domain
Description: A single C2 domain is found in calpains (EC 3.4.22.52, EC 3.4.22.53), calcium-dependent, non-lysosomal cysteine proteases. The C2 domain is a Ca2+-dependent membrane-targeting module found in many cellular proteins involved in signal transduction or membrane trafficking. C2 domains are unique among membrane targeting domains in that they show a wide range of lipid selectivity for the major components of cell membranes, including phosphatidylserine and phosphatidylcholine. This C2 domain is about 116 amino-acid residues and is located between the two copies of the C1 domain in Protein Kinase C and the protein kinase catalytic domain [ ]. Regions with significant homology [] to the C2-domain have been found in many proteins. The C2 domain is thought to be involved in calcium-dependent phospholipid binding [] and in membrane targeting processes such as subcellular localisation. The 3D structure of the C2 domain of synaptotagmin has been reported []; the domain forms an eight-stranded β-sandwich constructed around a conserved 4-stranded motif, designated a C2 key []. Calcium binds in a cup-shaped depression formed by the N- and C-terminal loops of the C2-key motif. Structural analyses of several C2 domains have shown them to consist of similar tertiary structures in which three Ca2+-binding loops are located at the end of an eight-stranded antiparallel β-sandwich.
Protein Domain
Name: Uteroglobin
Type: Family
Description: Uteroglobin (blastokinin or Clara cell protein CC10) is a mammalian steroid-inducible secreted protein originally isolated from the uterus of rabbits during early pregnancy [ ]. The mucosal epithelia of several organs that communicate with the external environment express uteroglobin. Its tissue-specific expression is regulated by steroid hormones, and is augmented in the uterus by non-steroidal prolactin. Uteroglobin may be a multi-functional protein with anti-inflammatory/immunomodulatory properties, acting to inhibit phospholipase A2 activity [, ], and binding to (and possibly sequestering) several hydrophobic ligands such as progesterone, retinols, polychlorinated biphenyls, phospholipids and prostaglandins [, ]. In addition, uteroglobin has anti-chemotactic, anti-allergic, anti-tumourigenic and embryo growth-stimulatory properties. Uteroglobin may have a homeostatic role against oxidative damage, inflammation, autoimmunity and cancer [, , , ]. However, the true biological function of uteroglobin is poorly understood. Uteroglobin consists of a disulphide-linked homodimer with a large hydrophobic pocket located between the two monomers []. Each monomer is composed of four helices that do not form a canonical four helix-bundle motif but rather a boomerang-shaped structure in which helices H1, H3, and H4 are able to bind a homodimeric partner []. The hydrophobic pocket binds steroids, particularly progesterone, with high specificity. It is a member of the secretoglobin superfamily. This entry represents uteroglobin proteins from several mammalian species [ ].
Protein Domain
Name: Olfactory receptor
Type: Family
Description: The olfactory system is a highly-specialised chemical recognition system that, like the immune system, is capable of discriminating with tremendous sensitivity between numerous foreign molecules in the environment. Olfactory transduction is believed to be initiated by the binding of odorants to specific receptor proteins in the cilia of olfactory receptor cells. Although little is known about the precise mechanism by which odorant binding might initiate membrane depolarisation, it is believed that cyclic AMP may serve as an intracellular messenger for olfactory transduction [].Olfactory receptors are integral membrane proteins that belong to the seven transmembrane (TM), rhodopsin-like G-protein coupled receptor family. Although the sequences of these proteins are very diverse, reflecting to some extent their broad range of activating ligands, nevertheless, motifs have been identified in the TM regions that are characteristic of virtually the entire superfamily [ , ]. Amongst the exceptions are the olfactory receptors, which cluster together in a subfamily that lacks significant matches with TM domains 2, 4 and 6 [].Olfactory receptor genes form the largest known multigene family in the human genome [ ]. Each olfactory receptor does not seem to detect a single odour. Instead, different odorants are recognised by different combinations of olfactory receptors [].
Protein Domain
Name: Gp9-like superfamily
Type: Homologous_superfamily
Description: Members of this entry are similar to gene products 9 (gp9) and 10 (gp10) of bacteriophage T4. Both proteins are components of the viral baseplate [ ]. Gp9 connects the long tail fibres of the virus to the baseplate and triggers tail contraction after viral attachment to a host cell. The protein is active as a trimer, with each monomer being composed of three domains. The N-terminal domain consists of an extended polypeptide chain and two alpha helices. The alpha1 helix from each of the three monomers in the trimer interacts with its counterparts to form a coiled-coil structure. The middle domain is a seven-stranded β-sandwich that is thought to be a novel protein fold. The C-terminal domain is thought to be essential for gp9 trimerisation and is organised into an eight-stranded antiparallel β-barrel, which was found to resemble the 'jelly roll' fold found in many viral capsid proteins. The long flexible region between the N-terminal and middle domains may be required for the function of gp9 to transmit signals from the long tail fibres []. Together with gp11, gp10 initiates the assembly of wedges that then go on to associate with a hub to form the viral baseplate [].
Protein Domain
Name: SinR repressor/SinI anti-repressor, dimerisation domain superfamily
Type: Homologous_superfamily
Description: The SinR repressor is part of a group of Sin (sporulation inhibition) proteins in Bacillus subtilis that regulate the commitment to sporulation in response to extreme adversity [ ]. SinR is a tetrameric repressor protein that binds to the promoters of genes essential for entry into sporulation and prevents their transcription. This repression is overcome through the activity of SinI, which disrupts the SinR tetramer through the formation of a SinI-SinR heterodimer, thereby allowing sporulation to proceed. The SinR structure consists of two domains: a dimerisation domain stabilised by a hydrophobic core, and a DNA-binding domain that is identical to domains of the bacteriophage 434 CI and Cro proteins that regulate prophage induction. The dimerisation domain is a four-helical bundle formed from two helices from the C-terminal residues of SinR and two helices from the central residues of SinI. These regions in SinR and SinI are similar in both structure and sequence. The interaction of SinR monomers to form tetramers is weaker than between SinR and SinI, since SinI can effectively disrupt SinR tetramers.This entry represents the dimerisation domain in both SinI and SinR proteins.The structure of the SinR dimerisation domain consists of an intertwined heterodimer of two homologous chains.
Protein Domain
Name: Nucleolin, RNA recognition motif 1
Type: Domain
Description: This entry represents the RNA recognition motif 1 (RRM1) of ubiquitously expressed protein nucleolin.Nucleolin is a multifunctional major nucleolar phosphoprotein that has been implicated in various metabolic processes, such as ribosome biogenesis, cytokinesis, nucleogenesis, cell proliferation and growth, cytoplasmic-nucleolar transport of ribosomal components, transcriptional repression, replication, signal transduction, inducing chromatin decondensation, etc [ , ]. Nucleolin exhibits intrinsic self-cleaving, DNA helicase, RNA helicase and DNA-dependent ATPase activities []. It can be phosphorylated by many protein kinases, such as the major mitotic kinase Cdc2, casein kinase 2 (CK2), and protein kinase C-zeta [, ]. Nucleolin shares similar domain architecture with gar2 from Schizosaccharomyces pombe [ ] and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of nucleolin is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of nucleolin contains four closely adjacent N-terminal RNA recognition motifs (RRMs), which suggests that nucleolin is potentially able to interact with multiple RNA targets. The C-terminal RGG (or GAR) domain of nucleolin is rich in glycine, arginine and phenylalanine residues, and contains high levels of NG, NG-dimethylarginines. RRM1, together with RRM2, binds specifically to RNA stem-loops containing the sequence (U/G)CCCG(A/G) in the loop [].
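The loop sequence (U/G)CCCG(A/G) quoted above can be located in a nucleotide string with a simple pattern search. The Python sketch below is illustrative only: the function name and test sequences are made up, DNA input is accepted by converting T to U as a convenience, and the code checks sequence alone, not whether the motif actually sits in the loop of a stem-loop structure.

```python
import re

# (U/G)CCCG(A/G) written as a regular expression over RNA letters.
LOOP_MOTIF = re.compile(r"[UG]CCCG[AG]")

def find_loop_motif(sequence):
    """Return 0-based start positions of (U/G)CCCG(A/G) in an RNA or DNA string."""
    rna = sequence.upper().replace("T", "U")  # tolerate DNA-style input
    return [m.start() for m in LOOP_MOTIF.finditer(rna)]

# Made-up sequences for illustration.
print(find_loop_motif("AAUCCCGAGG"))
print(find_loop_motif("ggtcccgg"))
```

Confirming a genuine nucleolin recognition element would additionally require secondary-structure prediction to verify the flanking stem, which this sketch does not attempt.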
Protein Domain
Name: Nucleolin, RNA recognition motif 2
Type: Domain
Description: This entry represents the RNA recognition motif 2 (RRM2) of ubiquitously expressed protein nucleolin.Nucleolin is a multifunctional major nucleolar phosphoprotein that has been implicated in various metabolic processes, such as ribosome biogenesis, cytokinesis, nucleogenesis, cell proliferation and growth, cytoplasmic-nucleolar transport of ribosomal components, transcriptional repression, replication, signal transduction, inducing chromatin decondensation, etc [ , ]. Nucleolin exhibits intrinsic self-cleaving, DNA helicase, RNA helicase and DNA-dependent ATPase activities []. It can be phosphorylated by many protein kinases, such as the major mitotic kinase Cdc2, casein kinase 2 (CK2), and protein kinase C-zeta [, ]. Nucleolin shares similar domain architecture with gar2 from Schizosaccharomyces pombe [ ] and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of nucleolin is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of nucleolin contains four closely adjacent N-terminal RNA recognition motifs (RRMs), which suggests that nucleolin is potentially able to interact with multiple RNA targets. The C-terminal RGG (or GAR) domain of nucleolin is rich in glycine, arginine and phenylalanine residues, and contains high levels of NG, NG-dimethylarginines. RRM1, together with RRM2, binds specifically to RNA stem-loops containing the sequence (U/G)CCCG(A/G) in the loop [].
Protein Domain
Name: Nucleolin, RNA recognition motif 3
Type: Domain
Description: This entry represents the RNA recognition motif 3 (RRM3) of ubiquitously expressed protein nucleolin.Nucleolin is a multifunctional major nucleolar phosphoprotein that has been implicated in various metabolic processes, such as ribosome biogenesis, cytokinesis, nucleogenesis, cell proliferation and growth, cytoplasmic-nucleolar transport of ribosomal components, transcriptional repression, replication, signal transduction, inducing chromatin decondensation, etc [ , ]. Nucleolin exhibits intrinsic self-cleaving, DNA helicase, RNA helicase and DNA-dependent ATPase activities []. It can be phosphorylated by many protein kinases, such as the major mitotic kinase Cdc2, casein kinase 2 (CK2), and protein kinase C-zeta [, ]. Nucleolin shares similar domain architecture with gar2 from Schizosaccharomyces pombe [ ] and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of nucleolin is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of nucleolin contains four closely adjacent N-terminal RNA recognition motifs (RRMs), which suggests that nucleolin is potentially able to interact with multiple RNA targets. The C-terminal RGG (or GAR) domain of nucleolin is rich in glycine, arginine and phenylalanine residues, and contains high levels of NG, NG-dimethylarginines. RRM1, together with RRM2, binds specifically to RNA stem-loops containing the sequence (U/G)CCCG(A/G) in the loop [].
Protein Domain
Name: Nucleolin, RNA recognition motif 4
Type: Domain
Description: This entry represents the RNA recognition motif 4 (RRM4) of the ubiquitously expressed protein nucleolin. Nucleolin is a multifunctional major nucleolar phosphoprotein that has been implicated in various metabolic processes, such as ribosome biogenesis, cytokinesis, nucleogenesis, cell proliferation and growth, cytoplasmic-nucleolar transport of ribosomal components, transcriptional repression, replication, signal transduction, and the induction of chromatin decondensation. Nucleolin exhibits intrinsic self-cleaving, DNA helicase, RNA helicase and DNA-dependent ATPase activities. It can be phosphorylated by many protein kinases, such as the major mitotic kinase Cdc2, casein kinase 2 (CK2), and protein kinase C-zeta. Nucleolin shares a similar domain architecture with gar2 from Schizosaccharomyces pombe and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of nucleolin is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of nucleolin contains four closely adjacent N-terminal RNA recognition motifs (RRMs), which suggests that nucleolin is potentially able to interact with multiple RNA targets. The C-terminal RGG (or GAR) domain of nucleolin is rich in glycine, arginine and phenylalanine residues, and contains high levels of NG,NG-dimethylarginines. RRM1, together with RRM2, binds specifically to RNA stem-loops containing the sequence (U/G)CCCG(A/G) in the loop.
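The loop consensus (U/G)CCCG(A/G) quoted in the nucleolin descriptions can be read as a simple pattern: U or G, then CCCG, then A or G. The following is a minimal sketch of scanning an RNA sequence for that pattern. The function name `find_loop_motifs` is hypothetical, and the sketch only checks the linear sequence; it makes no attempt to verify that a hit actually sits in the loop of a stem-loop, which the description requires for specific binding.

```python
import re

# (U/G)CCCG(A/G): U or G, then CCCG, then A or G.
LOOP_MOTIF = re.compile(r"[UG]CCCG[AG]")


def find_loop_motifs(rna):
    """Return (start, match) pairs for every non-overlapping motif hit.

    Input is upper-cased and any T is read as U, so DNA-style
    transcripts can be scanned as well (an assumption of this sketch).
    """
    rna = rna.upper().replace("T", "U")
    return [(m.start(), m.group()) for m in LOOP_MOTIF.finditer(rna)]


if __name__ == "__main__":
    print(find_loop_motifs("AAUCCCGAAG"))
```

A hit from this scan is only a candidate site; secondary-structure prediction would be needed to confirm the loop context.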
Protein Domain
Name: RBM8, RNA recognition motif
Type: Domain
Description: This entry corresponds to the RNA recognition motif of RBM8. RNA-binding protein RBM8, also termed binder of OVCA1-1 (BOV-1) or RNA-binding protein Y14, is one of the components of the exon-exon junction complex (EJC). It has two isoforms, RBM8A and RBM8B, which are identical except that RBM8B is 16 amino acids shorter at its N terminus. RBM8, together with other EJC components (such as Magoh, Aly/REF, RNPS1, SRm160, and Upf3), plays critical roles in postsplicing processing, including nuclear export and cytoplasmic localization of the mRNA, and the nonsense-mediated mRNA decay (NMD) surveillance process. RBM8 binds to mRNA 20-24 nucleotides upstream of a spliced exon-exon junction. It is also involved in spliced mRNA nuclear export, and the process of nonsense-mediated decay of mRNAs with premature stop codons. RBM8 forms a specific heterodimer complex with the EJC protein Magoh, which then associates with Aly/REF, RNPS1, DEK, and SRm160 on the spliced mRNA, and inhibits ATP turnover by eIF4AIII, thereby trapping the EJC core onto RNA. RBM8 contains an N-terminal putative bipartite nuclear localization signal, one RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), in the central region, and a C-terminal serine-arginine rich region (SR domain) and glycine-arginine rich region (RG domain).
Protein Domain
Name: WRKY domain superfamily
Type: Homologous_superfamily
Description: This entry represents the WRKY domain superfamily. The WRKY domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. The WRKY domain is found in one or two copies in a superfamily of plant transcription factors involved in the regulation of various physiological programs that are unique to plants, including pathogen defence, senescence, trichome development and the biosynthesis of secondary metabolites. The WRKY domain binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core of the W box is essential for function and WRKY binding. Some proteins known to contain a WRKY domain include Arabidopsis thaliana ZAP1 (Zinc-dependent Activator Protein-1) and AtWRKY44/TTG2, a protein involved in trichome development and anthocyanin pigmentation; and wild oat ABF1-2, two proteins involved in the gibberellic acid-induced expression of the alpha-Amy2 gene. Structural studies indicate that this domain is a four-stranded β-sheet with a zinc binding pocket, forming a novel zinc and DNA binding structure. The WRKYGQK residues correspond to the most N-terminal β-strand, which enables extensive hydrophobic interactions, contributing to the structural stability of the β-sheet.
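The W box consensus (T)(T)TGAC(C/T) above can be expressed as a regular expression for scanning promoter sequences. The sketch below assumes that the parenthesised leading T residues are optional and that (C/T) means either base; the function name `find_w_boxes` is hypothetical. Only the given strand is scanned, so reverse-complement sites would need a second pass.

```python
import re

# (T)(T)TGAC(C/T): up to two optional leading T's (an assumption of this
# sketch), the invariant TGAC core, then C or T.
W_BOX = re.compile(r"T?T?TGAC[CT]")


def find_w_boxes(dna):
    """Return (start, match) pairs for non-overlapping W box hits."""
    dna = dna.upper()
    return [(m.start(), m.group()) for m in W_BOX.finditer(dna)]


if __name__ == "__main__":
    print(find_w_boxes("AATTTGACCGG"))
```

Because the TGAC core is described as essential, a stricter or looser treatment of the flanking bases changes only how much context is reported, not whether the core is required.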
Protein Domain
Name: Cytochrome b5-like heme/steroid binding domain superfamily
Type: Homologous_superfamily
Description: Cytochrome b5 is a membrane-bound hemoprotein which acts as an electron carrier for several membrane-bound oxygenases. There are two homologous forms of b5, one found in microsomes and one found in the outer membrane of mitochondria. Two conserved histidine residues serve as axial ligands for the heme group. The structure of a number of oxidoreductases consists of the juxtaposition of a heme-binding domain homologous to that of b5 and either a flavodehydrogenase or a molybdopterin domain. These enzymes are:
  • Lactate dehydrogenase (EC 1.1.2.3), an enzyme that consists of a flavodehydrogenase domain and a heme-binding domain called cytochrome b2.
  • Nitrate reductase (EC 1.7.1.-), a key enzyme involved in the first step of nitrate assimilation in plants, fungi and bacteria. Consists of a molybdopterin domain, a heme-binding domain called cytochrome b557, as well as a cytochrome reductase domain.
  • Sulfite oxidase (EC 1.8.3.1), which catalyzes the terminal reaction in the oxidative degradation of sulfur-containing amino acids. Also consists of a molybdopterin domain and a heme-binding domain.
  • Yeast acyl-CoA desaturase 1 (EC 1.14.19.1; gene OLE1). This enzyme contains a C-terminal heme-binding domain.
  • Yeast Scs7 (YMR272c), a sphingolipid alpha-hydroxylase.
Proteins containing a cytochrome b5-like domain also include:
  • Fission yeast hypothetical protein SpAC1F12.10c (C1F12.10c).
  • Yeast Irc21 (YMR073c), a putative protein with unknown function.
Protein Domain
Name: NPSN/SNAP25-like, N-terminal SNARE domain
Type: Domain
Description: Soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE) proteins are a family of membrane-associated proteins characterised by an α-helical coiled-coil domain called the SNARE motif. These proteins are classified as v-SNAREs and t-SNAREs based on their localisation on vesicle or target membrane; another classification scheme defines R-SNAREs and Q-SNAREs, based on the conserved arginine or glutamine residue in the centre of the SNARE motif. SNAREs are localised to distinct membrane compartments of the secretory and endocytic trafficking pathways, and contribute to the specificity of intracellular membrane fusion processes. The t-SNARE domain consists of a 4-helical bundle with a coiled-coil twist. The SNARE motif contributes to the fusion of two membranes. SNARE motifs fall into four classes: homologues of syntaxin 1a (t-SNARE), VAMP-2 (v-SNARE), and the N- and C-terminal SNARE motifs of SNAP-25. This entry represents the N-terminal SNARE motif found in plant SNAP25 homologues, SNAP29, SNAP30, SNAP32 and SNAP33, and in the novel plant SNAREs (NPSNs) 11, 12 and 13 from Arabidopsis thaliana. SNAP33 and NPSN11 play a key role in cytokinesis. SNAP33, the most studied SNAP25 homologue in Arabidopsis, is also involved in triggering innate immune responses. SNAP29 and SNAP30 have not yet been well characterised.
Protein Domain
Name: Peptidylglycine alpha-hydroxylating monooxygenase/peptidyl-hydroxyglycine alpha-amidating lyase
Type: Family
Description: In vertebrates, peptidylglycine alpha-amidating monooxygenase (PAM) is a multifunctional protein found in secretory granules. The protein contains two enzymes, peptidylglycine alpha-hydroxylating monooxygenase (PHM) and peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL), that act sequentially to catalyse the alpha-amidation of neuroendocrine peptides: peptidylglycine + ascorbate + O2 = peptidyl-(2-hydroxyglycine) + dehydroascorbate + H2O. The product is unstable and dismutates to glyoxylate and the corresponding desglycine peptide amide. The first step of the reaction is catalysed by peptidylglycine alpha-hydroxylating monooxygenase (PHM), and is dependent on copper, ascorbate and molecular oxygen; peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) catalyses the second step of the reaction. PHM shares protein sequence similarity with dopamine-beta-monooxygenases (DBH), a class of ascorbate-dependent enzymes that requires copper as a cofactor and uses ascorbate as an electron donor. PHM and DBH share a few regions of sequence similarity, some of which contain clusters of conserved histidine residues that may be involved in copper binding. Interestingly, in Drosophila, the PHM and PAL enzymes are not fused. The Drosophila genome predicts expression of one monofunctional PHM gene and two monofunctional PAL genes. Drosophila PHM encodes an active enzyme that is required for peptide amidation in vivo, while both PAL proteins display PAL enzymatic activity and are involved in neuroendocrine biosynthesis.
Protein Domain
Name: Streptococcal non-M secreted SibA
Type: Family
Description: At present, Streptococci are amongst the most medically important bacterial species, causing a variety of diseases across a wide range of age groups. They are non-motile, Gram-positive cocci that are facultative or obligate anaerobes, and occur in pairs or chains. These microbes can be separated into groups according to their serological specificity, group A Streptococci (GAS) being amongst the most virulent and showing the most antibiotic resistance. To combat the sudden surge of infections caused by GAS, namely Streptococcus pyogenes and Streptococcus pneumoniae, researchers have turned to the wealth of information contained within streptococcal genomes as a source of novel drug and vaccine targets. This, coupled with other techniques such as microarray analysis and comparative genomics, is providing insights into the variations between clinical strains and avirulent commensal bacteria. Virulence factors, like the well-characterised glucan-binding protein from Streptococcus mutans, are useful in assigning function to novel ORFs in new genomes. An example of this approach is the recent discovery of the SibA secreted protein of S. pyogenes, an immunoglobulin binding moiety that is completely different from the classical Ig-binding M protein of other GAS isolates. It shows similarity to other GAS glucan-binding proteins and secreted antigens, suggesting a common ancestor. Deletional studies in vitro showed that it is essential for virulence, and has several homologues in other Gram-positive pathogens.
Protein Domain
Name: Notch domain
Type: Domain
Description: The Notch domain is also called the 'DSL' domain or the Lin-12/Notch repeat (LNR). The LNR region is present only in Notch related proteins, C-terminal to the EGF repeats. The lin-12/Notch proteins act as transmembrane receptors for intercellular signals that specify cell fates during animal development. In response to a ligand, proteolytic cleavages release the intracellular domain of Notch, which then gains access to the nucleus and acts as a transcriptional co-activator. The LNR region is thought to negatively regulate the activity of the Lin-12/Notch proteins. It is a triplication of a module of around 35-40 amino acids present on the extracellular part of the protein. Each module contains six cysteine residues engaged in three disulphide bonds and three conserved aspartate and asparagine residues. The biochemical characterisation of a recombinantly expressed LIN-12.1 module from the human Notch1 receptor indicates that the disulphide bonds are formed between the first and fifth, second and fourth, and third and sixth cysteines. The formation of this particular disulphide isomer is favored by the presence of Ca2+, which is also required to maintain the structural integrity of the rLIN-12.1 module. The conserved aspartate and asparagine residues are likely to be important for Ca2+ binding, and thereby contribute to the native fold.
Protein Domain
Name: VAV1, SH2 domain
Type: Domain
Description: This entry represents the SH2 domain of VAV1 from vertebrates. VAV1 (also known as proto-oncogene vav) is expressed predominantly in the hematopoietic system and it plays an important role in the development and activation of B and T cells. It is activated by tyrosine phosphorylation to function as a guanine nucleotide exchange factor (GEF) for Rho GTPases following cell surface receptor activation, triggering various effects such as cytoskeletal reorganization, transcription regulation, cell cycle progression, and calcium mobilization. It also serves as a scaffold protein and has been shown to interact with Ku70, Socs1, Janus kinase 2, SIAH2, S100B, Abl gene, ZAP-70, SLP76, and Syk, among others. The VAV protein family members are multiple domain proteins, including Vav from flies and VAV1/2/3 from mammals. VAV1 predominates in hematopoietic cells, whereas VAV2 and VAV3 are more broadly expressed. They have a calponin homology (CH) domain, an acidic domain (AC), a Dbl homology (DH) domain, a pleckstrin homology (PH) domain, a cysteine-rich (CR) domain containing a zinc finger, and a complex region with SH2 and SH3 domains. Therefore they may participate in the activity of several pathways. They are signal transducer proteins that couple tyrosine kinase signals with the activation of the Rho/Rac GTPases.
Protein Domain
Name: STAT4, SH2 domain
Type: Domain
Description: Signal transducer and activator of transcription 4 (STAT4) transduces interleukin-12, interleukin-23, and type I interferon cytokine signals in T cells and monocytes. It plays an important role in CD4+ Th1 lineage differentiation and IFN-gamma protein expression by CD4+ T cells. It is crucial for both innate and adaptive immune responses to viral infection. Variations of the STAT4 gene affect the susceptibility to autoimmune diseases, such as systemic lupus erythematosus 11 (SLEB11) and rheumatoid arthritis (RA). STAT proteins have a dual function: signal transduction and activation of transcription. When cytokines are bound to cell surface receptors, the associated Janus kinases (JAKs) are activated, leading to tyrosine phosphorylation of the given STAT proteins. Phosphorylated STATs form dimers, translocate to the nucleus, and bind specific response elements to activate transcription of target genes. STAT proteins contain an N-terminal domain (NTD), a coiled-coil domain (CCD), a DNA-binding domain (DBD), an α-helical linker domain (LD), an SH2 domain, and a transactivation domain (TAD). The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Seven mammalian STAT family members have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. This entry represents the SH2 domain of STAT4.
Protein Domain
Name: Monellin, A chain
Type: Family
Description: Monellin is an intensely sweet-tasting protein derived from African berries. The protein has a very high specificity for the sweet receptors, making it ~100,000 times sweeter than sugar on a molar basis and several thousand times sweeter on a weight basis. Like the sweet-tasting protein thaumatin, it contains neither carbohydrates nor modified amino acids. Although there is no sequence similarity between the proteins, antibodies for thaumatin compete for monellin (and other sweet compounds, but not for chemically modified non-sweet monellin) and vice versa. It is thought that native conformations are important for the sweet taste. Monellin is a heterodimer, comprising an A chain of 44 amino acid residues, and a B chain of 50 residues. The individual subunits are not sweet, nor do they block the sweet sensation of sucrose or monellin. However, blocking the single sulfhydryl (SH) group of monellin abolishes its sweetness, as does reaction of its methionyl residue with CNBr. The cysteinyl and methionyl residues are adjacent, and it has therefore been suggested that this part of the molecule is essential for its sweetness. The structure of monellin belongs to the alpha/beta class, a 5-stranded β-sheet sequestering a single α-helix. The A chain contributes 3 strands to the sheet.
Protein Domain
Name: Proteinase, regulatory CLIP domain
Type: Domain
Description: The CLIP domain is a regulatory domain which controls the proteinase action of various proteins of the trypsin family, e.g. easter and pap2. The domain is restricted to the Arthropoda and found in varying copy numbers (from one to five in Drosophila proteins). It is always found N-terminal to the chymotrypsin serine protease domain, which belongs to MEROPS peptidase family S1A. The CLIP domain remains linked to the protease domain after cleavage of a conserved residue which retains the protein in zymogen form. It is named CLIP because it can be drawn in the shape of a paper clip. It has many disulphide bonds and highly conserved cysteine residues, and so it folds extensively. The clip domain adopts an α/β mixed fold consisting of two helices and an antiparallel distorted β-sheet made of four strands. The two helices are antiparallel and are almost perpendicular to the β-sheet. Three disulfide bridges (C1-C5, C2-C4, C3-C6) stabilize the β-sheet, C3 being the only cysteine that is not located on a β-strand. The clip domain is located opposite the activation loop and contacts the C-terminal α-helix of the SP domain. The CLIP domain is present in silkworm prophenoloxidase-activating enzyme.
Protein Domain
Name: Signal transducer and activator of transcription 4, coiled-coil domain
Type: Domain
Description: Signal transducer and activator of transcription 4 (STAT4) transduces interleukin-12, interleukin-23, and type I interferon cytokine signals in T cells and monocytes. It plays an important role in CD4+ Th1 lineage differentiation and IFN-gamma protein expression by CD4+ T cells. It is crucial for both innate and adaptive immune responses to viral infection. Variations of the STAT4 gene affect the susceptibility to autoimmune diseases, such as systemic lupus erythematosus 11 (SLEB11) and rheumatoid arthritis (RA). STAT proteins have a dual function: signal transduction and activation of transcription. When cytokines are bound to cell surface receptors, the associated Janus kinases (JAKs) are activated, leading to tyrosine phosphorylation of the given STAT proteins. Phosphorylated STATs form dimers, translocate to the nucleus, and bind specific response elements to activate transcription of target genes. STAT proteins contain an N-terminal domain (NTD), a coiled-coil domain (CCD), a DNA-binding domain (DBD), an α-helical linker domain (LD), an SH2 domain, and a transactivation domain (TAD). The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. Seven mammalian STAT family members have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. This entry represents the coiled-coil (CCD or alpha) domain of STAT4.
Protein Domain
Name: Amphiphysin
Type: Family
Description: Amphiphysins belong to the expanding BAR (Bin-Amphiphysin-Rvsp) family of proteins, all members of which share a highly conserved N-terminal BAR domain, which has predicted coiled-coil structures required for amphiphysin dimerisation and plasma membrane interaction. Almost all members also share a conserved C-terminal Src homology 3 (SH3) domain, which mediates their interactions with the GTPase dynamin and the inositol-5'-phosphatase synaptojanin 1 in vertebrates and with actin in yeast. The central region of all these proteins is the most variable. In mammals, the central region of amphiphysin I and amphiphysin IIa contains a proline-arginine-rich region for endophilin binding and a CLAP domain for binding to clathrin and AP-2. The interactions mediated by both the central and C-terminal domains are believed to be modulated by protein phosphorylation. Amphiphysins are key players in the control of plasma membrane curvature, membrane shaping and membrane remodeling, and are involved in clathrin-mediated endocytosis, the endosomal sorting of membrane proteins, actin function, and signalling pathways. In vertebrates, amphiphysins may regulate, but are not essential for, clathrin-mediated endocytosis of SVs. However, in Drosophila amphiphysin is not involved at all in SV endocytosis but is required for T-tubule structure and excitation-contraction coupling in muscles, and plays a role in membrane morphogenesis in developing photoreceptors and a variety of other cells.
Protein Domain
Name: Dedicator of cytokinesis C, C2 domain
Type: Domain
Description: DOCK family members are evolutionarily conserved guanine nucleotide exchange factors (GEFs) for Rho-family GTPases. DOCK proteins are required during several cellular processes, such as cell motility and phagocytosis. The N-terminal SH3 domain of the DOCK proteins functions as an inhibitor of GEF activity; this inhibition can be relieved upon its binding to the ELMO1-3 adaptor proteins, after their binding to active RhoG at the plasma membrane. DOCK family proteins are categorised into four subfamilies based on their sequence homology: DOCK-A subfamily (DOCK1/180, 2, 5), DOCK-B subfamily (DOCK3, 4), DOCK-C subfamily (DOCK6, 7, 8), and DOCK-D subfamily (DOCK9, 10, 11). All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). This entry represents the C2 domain found in the Dock-C members. In addition to the C2 domain (also known as DHR-1 domain) and the DHR-2 domain, Dock-C members contain a functionally uncharacterised domain upstream of the C2 domain. DHR-2 has the catalytic activity for Rac and/or Cdc42, but is structurally unrelated to the DH domain. The C2/DHR-1 domains of Dock1 (also known as Dock180) and Dock4 have been shown to bind phosphatidylinositol-3,4,5-trisphosphate (PtdIns(3,4,5)P3).
Protein Domain
Name: Disease resistance protein, RPW8-like
Type: Family
Description: Arabidopsis RPW8.1 and RPW8.2 genes confer broad-spectrum resistance to powdery mildew. RPW8.2 is specifically targeted to the extrahaustorial membrane (EHM), where it activates haustorium-targeted resistance against powdery mildew. This family includes RPW8.2 and homologues of RPW8 (HR), also known as RPW8-like proteins. HRs also contribute to basal resistance to powdery mildew, and HR1 to HR3 have been shown to localize to the EHM, suggesting that this could be a feature of the family. Plants are attacked by a range of phytopathogenic organisms, including viruses, mycoplasma, bacteria, fungi, nematodes, protozoa and parasites. Resistance to a pathogen is manifested in several ways and is often correlated with a hypersensitive response (HR), localised induced cell death in the host plant at the site of infection. The induction of the plant defence response that leads to HR is initiated by the plant's recognition of specific signal molecules (elicitors) produced by the pathogen; R genes are thought to encode receptors for these elicitors. RPS2, N and L6 genes confer resistance to bacterial, viral and fungal pathogens. Sequence analysis has shown that they contain C-terminal leucine-rich repeats, which are characteristic of plant and animal proteins involved in protein-protein interactions. In addition, the sequences contain a conserved nucleotide-binding site towards their N terminus.
Protein Domain
Name: PKHA4-7, PH domain
Type: Domain
Description: This entry represents the pleckstrin homology (PH) domain found in the Pleckstrin homology domain-containing family A members 4-7 (PKHA4-7) from humans. This domain is involved in targeting these proteins to appropriate cellular compartments or enabling them to interact with other components of the signal transduction pathways. Some PH domains are responsible for the protein binding to phosphoinositide phosphates (PIPs) with high affinity and specificity; others display strong specificity in lipid binding. Specificity is usually determined by loop regions or insertions in the N terminus of the domain, which are not conserved across all PH domains. Proteins included in this entry are predominantly found in chordates. Some members also contain WW (also known as WWP) domains, which also occur in proteins involved in signal transduction processes. PKHA4 (PEPP-1) binds specifically to phosphatidylinositol 3-phosphate (PtdIns3P) and was reported to be involved in ubiquitination. In humans, PKHA6 (PEPP-3) has been related to the pathophysiology of schizophrenia and the therapy response towards antipsychotics. PKHA7, required for zonula adherens biogenesis and maintenance, has been identified as one of the host factors mediating death by S. aureus alpha-toxin and related to hypertension, glaucoma and cancer.
Protein Domain
Name: Carboxyltransferase domain, subdomain C and D
Type: Domain
Description: Urea carboxylase (UC) catalyses a two-step, ATP- and biotin-dependent carboxylation reaction of urea. It is composed of biotin carboxylase (BC), carboxyltransferase (CT), and biotin carboxyl carrier protein (BCCP) domains. The CT domain of UC consists of four subdomains, named A, B, C and D. This domain covers the C and D subdomains of the CT domain. It also covers the whole length of KipI (kinase A inhibitor) from Bacillus subtilis. It can also be found in S. cerevisiae urea amidolyase Dur1,2, which is a multifunctional biotin-dependent enzyme with domains for urea carboxylase and allophanate (urea carboxylate) hydrolase activity. KipI forms a complex with KipA, which is covered by the A and B subdomains of the CT. The KipI-KipA complex shares protein structure and sequence similarity with the CT domain of urea amidolyase from K. lactis, but residues that are important for CT catalysis are not conserved in KipA and KipI. Therefore, the KipA-KipI complex is unlikely to have CT activity. The CT domain is homologous to the Thermus thermophilus protein TTHA0988. However, the subdomain order of TTHA0988 is different compared with that of CT, suggesting distinct fusion events in the evolution of these proteins.
Protein Domain
Name: Carboxyltransferase domain, subdomain A and B
Type: Domain
Description: Urea carboxylase (UC) catalyses a two-step, ATP- and biotin-dependent carboxylation reaction of urea. It is composed of biotin carboxylase (BC), carboxyltransferase (CT), and biotin carboxyl carrier protein (BCCP) domains. The CT domain of UC consists of four subdomains, named A, B, C and D. This domain covers the A and B subdomains of the CT domain. It also covers the whole length of KipA (kinase A) from Bacillus subtilis. It can also be found in S. cerevisiae urea amidolyase Dur1,2, which is a multifunctional biotin-dependent enzyme with domains for urea carboxylase and allophanate (urea carboxylate) hydrolase activity. KipA forms a complex with KipI, which is covered by the C and D subdomains of the CT. The KipI-KipA complex shares protein structure and sequence similarity with the CT domain of urea amidolyase from K. lactis, but residues that are important for CT catalysis are not conserved in KipA and KipI. Therefore, the KipA-KipI complex is unlikely to have CT activity. The CT domain is homologous to the Thermus thermophilus protein TTHA0988. However, the subdomain order of TTHA0988 is different compared with that of CT, suggesting distinct fusion events in the evolution of these proteins.
Protein Domain
Name: HMW glutenin
Type: Family
Description: Gluten is the protein component of wheat flour. It consists of numerous proteins, which are of two different types responsible for different physical properties of dough: the glutenins, which are primarily responsible for the elasticity, and the gliadins, which contribute to the extensibility. The glutenins are of two different types, termed low (LMW) and high molecular weight (HMW) subunits. The glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterised by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping β-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom