Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, pathways, authors, ontology terms, etc. (e.g. eve, embryo, zen, allele)
  • Use OR to search for either of two terms (e.g. fly OR drosophila) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean search syntax is supported: e.g. dros* for partial matches or fly AND NOT embryo to exclude a term

Search results 2201 to 2300 out of 38750 for *

Category restricted to ProteinDomain

Categories

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Annexin repeat
Type: Repeat
Description: The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner. The 12 annexins common to vertebrates are classified in the annexin A family and named annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists). Annexins are absent from yeasts and prokaryotes. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which are in turn wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions. The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication.
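The 'type 2' motif quoted above ('GxGT-[38 residues]-D/E') can be turned into a simple pattern search. The following is a minimal sketch, assuming that "x" means any residue and that "[38 residues]" means exactly 38 arbitrary residues; the names TYPE2_MOTIF and find_type2_motifs are hypothetical and are not part of this database or of any annexin tool.

```python
import re

# Assumption: "x" is any residue and "[38 residues]" is a run of exactly
# 38 arbitrary residues, so 'GxGT-[38 residues]-D/E' maps to the regex below.
# The zero-width lookahead allows overlapping candidates to be reported.
TYPE2_MOTIF = re.compile(r"(?=G.GT.{38}[DE])")


def find_type2_motifs(sequence: str) -> list[int]:
    """Return 0-based start positions of candidate type 2 motifs."""
    return [m.start() for m in TYPE2_MOTIF.finditer(sequence.upper())]


# Synthetic sequence (not a real annexin) with one motif starting at index 2.
toy = "AA" + "GAGT" + "A" * 38 + "D" + "AA"
print(find_type2_motifs(toy))
```

A hit from such a pattern is only a candidate site; whether it actually binds calcium depends on the structural context described above.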
Protein Domain
Name: Munc13 homology 1
Type: Domain
Description: Munc13 proteins constitute a family of three highly homologous molecules (Munc13-1, Munc13-2 and Munc13-3) with homology to Caenorhabditis elegans Unc-13. Munc13 proteins contain a phorbol ester-binding C1 domain and two C2 domains, which are Ca2+/phospholipid binding domains. Sequence analyses have uncovered two regions called Munc13 homology domains 1 (MHD1) and 2 (MHD2) that are arranged between two flanking C2 domains. MHD1 and MHD2 domains are present in a wide variety of proteins from Arabidopsis thaliana, C. elegans, Drosophila melanogaster (Fruit fly), Mus musculus (Mouse), Rattus norvegicus (Rat) and Homo sapiens (Human), some of which may function in a Munc13-like manner to regulate membrane trafficking. The MHD1 and MHD2 domains are predicted to be α-helical. Some proteins known to contain MHD1 and MHD2 domains are listed below:
  • Mammalian Munc13-1. It is specifically targeted to presynaptic active zones and has a central priming function in synaptic vesicle exocytosis from glutamatergic synapses.
  • Mammalian Munc13-2. It plays a role in vesicle maturation during exocytosis as a target of the diacylglycerol second messenger pathway.
  • Mammalian Munc13-3. It probably plays a role in vesicle maturation during exocytosis as a target of the diacylglycerol second messenger pathway.
  • Mammalian Munc13-4. It is predominantly expressed in lung, where it is localized to goblet cells of the bronchial epithelium and to alveolar type II cells, both of which are cell types with secretory function.
  • C. elegans Unc-13. It may form part of a signal transduction pathway, transducing the signal from diacylglycerol to effector functions.
  • Mammalian BAI1-associated protein 3 (BAP3), which exhibits the typical Munc13-like domain structure with two C2 domains flanking the MHD1 and MHD2 domains, but which lacks the long N terminus with the C1 domain.
  • Animal calcium-dependent activator proteins for secretion (CAPSs), regulators of large dense-core vesicle secretion. They contain only an MHD1 domain and are otherwise unrelated to Munc13 proteins.
  • A. thaliana hypothetical proteins with MHD1 and MHD2 domains but without C1 and C2 domains.
  • Saccharomyces cerevisiae uncharacterised protein YOR296W, where MHD1 and MHD2 enclose a central C2 domain. YOR296W is presumably involved in bud formation.
  • Schizosaccharomyces pombe hypothetical protein C11E3.02c in chromosome I, where MHD1 and MHD2 enclose a central C2 domain.
This entry represents the Munc13 homology domain 1.
Protein Domain
Name: Mammalian uncoordinated homology 13, domain 2
Type: Domain
Description: Mammalian uncoordinated homology 13 (Munc13) proteins constitute a family of three highly homologous molecules (Munc13-1, Munc13-2 and Munc13-3) with homology to Caenorhabditis elegans Unc-13. Munc13 proteins contain a phorbol ester-binding C1 domain and two C2 domains, which are Ca2+/phospholipid binding domains. Sequence analyses have uncovered two regions called Munc13 homology domains 1 (MHD1) and 2 (MHD2) that are arranged between two flanking C2 domains. MHD1 and MHD2 domains are present in a wide variety of proteins from Arabidopsis thaliana, C. elegans, Drosophila melanogaster (Fruit fly), Mus musculus (Mouse), Rattus norvegicus (Rat) and Homo sapiens (Human), some of which may function in a Munc13-like manner to regulate membrane trafficking. The MHD1 and MHD2 domains are predicted to be α-helical. Some proteins known to contain MHD1 and MHD2 domains are listed below:
  • Mammalian Munc13-1. It is specifically targeted to presynaptic active zones and has a central priming function in synaptic vesicle exocytosis from glutamatergic synapses.
  • Mammalian Munc13-2. It plays a role in vesicle maturation during exocytosis as a target of the diacylglycerol second messenger pathway.
  • Mammalian Munc13-3. It probably plays a role in vesicle maturation during exocytosis as a target of the diacylglycerol second messenger pathway.
  • Mammalian Munc13-4. It is predominantly expressed in lung, where it is localized to goblet cells of the bronchial epithelium and to alveolar type II cells, both of which are cell types with secretory function.
  • C. elegans Unc-13. It may form part of a signal transduction pathway, transducing the signal from diacylglycerol to effector functions.
  • Mammalian BAI1-associated protein 3 (BAP3), which exhibits the typical Munc13-like domain structure with two C2 domains flanking the MHD1 and MHD2 domains, but which lacks the long N terminus with the C1 domain.
  • Animal calcium-dependent activator proteins for secretion (CAPSs), regulators of large dense-core vesicle secretion. They contain only an MHD1 domain and are otherwise unrelated to Munc13 proteins.
  • A. thaliana hypothetical proteins with MHD1 and MHD2 domains but without C1 and C2 domains.
  • Saccharomyces cerevisiae uncharacterised protein YOR296W, where MHD1 and MHD2 enclose a central C2 domain. YOR296W is presumably involved in bud formation.
  • Schizosaccharomyces pombe hypothetical protein C11E3.02c in chromosome I, where MHD1 and MHD2 enclose a central C2 domain.
This entry represents the Munc13 homology domain 2.
Protein Domain
Name: Protein unc-13 homologue
Type: Family
Description: This family consists mainly of uncharacterized proteins from plants. The unc-13 homologue from Arabidopsis thaliana has been shown to control tethering of the proton ATPase AHA1 to the plasma membrane, and to be essential for the opening of stomata in response to low levels of carbon dioxide and light. The unc13-like protein contains two Munc13 homology domains, which are known to mediate synaptic priming in neuronal exocytosis in animals, and may act similarly for plant stomata.
Protein Domain
Name: JmjC domain
Type: Domain
Description: The JmjN and JmjC domains are two non-adjacent domains which have been identified in the jumonji family of transcription factors. Although it was originally suggested that the JmjN and JmjC domains always co-occur and might form a single functional unit within the folded protein, the JmjC domain was later found without the JmjN domain in organisms from bacteria to human. Proteins containing the JmjC domain are predicted to be metalloenzymes that adopt the cupin fold and are candidates for enzymes that regulate chromatin remodelling. The cupin fold is a flattened β-barrel structure containing two sheets of five antiparallel β-strands that form the walls of a zinc-binding cleft. Based on the crystal structures of the JmjC domain-containing proteins FIH and JHDM3A/JMJD2A, the JmjC domain forms an enzymatically active pocket that coordinates Fe(II) and alpha-ketoglutarate (alphaKG). Three amino-acid residues within the JmjC domain bind to the Fe(II) cofactor and two additional residues bind to alphaKG. JmjC domains were identified in numerous eukaryotic proteins containing domains typical of transcription factors, such as PHD, C2H2, ARID/BRIGHT and zinc fingers. The JmjC domain has been shown to function in a histone demethylation mechanism that is conserved from yeast to human. JmjC domain proteins may be protein hydroxylases that catalyse a novel histone modification. The human JmjC protein named Tyw5p unexpectedly acts in the biosynthesis of a hypermodified nucleoside, hydroxy-wybutosine, in tRNA-Phe by catalysing hydroxylation.
Protein Domain
Name: Ribosomal protein S26e
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. One of these families, the S26E family, includes mammalian S26; Octopus S26; Drosophila S26 (DS31); plant cytoplasmic S26; and fungal S26. S26 may be involved in the attachment of eIF3 and poly(U). Disruption of RPS26, the gene encoding a homologue of ribosomal protein small subunit S26 in yeast (S. cerevisiae), resulted in the formation of micro-colonies, suggesting that it is important for the normal cell growth of S. cerevisiae.
Protein Domain
Name: Transmembrane Fragile-X-F-associated protein
Type: Family
Description: This entry represents conserved transmembrane proteins that in humans are expressed from a region upstream of the FragileXF site and appear to be intimately linked with Fragile-X syndrome. The absence of the human TMEM185A protein does not necessarily lead to developmental delay, but might do so in combination with other, currently unknown, factors. Alternatively, the TMEM185A protein is either redundant, or its function can be complemented by the highly similar chromosome 2 retro-pseudogene product, TMEM185B.
Protein Domain
Name: DNA replication helicase domain
Type: Domain
Description: This entry represents a domain found in viral DNA replication helicases, bacterial ATP-dependent DNA helicase Pif1 and eukaryotic ATP-dependent DNA helicase PIF7.
Protein Domain
Name: DENND6
Type: Family
Description: The DENND6 family of proteins includes DENND6A and DENND6B. They act as guanine nucleotide exchange factors (GEFs) for RAB14.
Protein Domain
Name: Protein of unknown function DUF936, plant
Type: Family
Description: This family consists of several hypothetical proteins from plants. The function of this family is unknown.
Protein Domain
Name: Ribosome recycling factor
Type: Family
Description: The ribosome recycling factor or ribosome release factor (RRF) dissociates ribosomes from mRNA after termination of translation, and is essential for bacterial growth. Thus ribosomes are 'recycled' and ready for another round of protein synthesis. RRF is believed to bind the ribosome at the A-site in a manner that mimics tRNA, but the specific mechanisms remain unclear.
Protein Domain
Name: Ribosome recycling factor domain
Type: Domain
Description: The ribosome recycling factor or ribosome release factor (RRF) dissociates ribosomes from mRNA after termination of translation, and is essential for bacterial growth. Thus ribosomes are 'recycled' and ready for another round of protein synthesis. This entry represents a domain found in ribosome recycling factors.
Protein Domain
Name: Glyceraldehyde-3-phosphate dehydrogenase, type I
Type: Family
Description: This group of sequences represents glyceraldehyde-3-phosphate dehydrogenase (GAPDH), the enzyme responsible for the interconversion of 1,3-diphosphoglycerate and glyceraldehyde-3-phosphate, a central step in glycolysis and gluconeogenesis. Forms exist which utilise NAD, NADP or either. In some species, both NAD- and NADP-utilising forms exist, generally being responsible for reactions in the catabolic and anabolic directions respectively. An additional form of gap gene is found in gamma proteobacteria and is responsible for the conversion of erythrose-4-phosphate (E4P) to 4-phospho-erythronate in the biosynthesis of pyridoxine. This pathway of pyridoxine biosynthesis appears to be limited, however, to a relatively small number of bacterial species, although it is prevalent among the gamma-proteobacteria. This enzyme, erythrose-4-phosphate dehydrogenase (E4PDH), is described in a separate entry. These two groups of sequences exhibit a close evolutionary relationship. Some forms of GAPDH may be bifunctional and act on E4P in species which make pyridoxine via hydroxythreonine and lack a separate E4PDH enzyme (for instance, the GAPDH from Bacillus stearothermophilus has been shown to possess a limited E4PDH activity as well as a robust GAPDH activity).
Protein Domain
Name: GYF domain 2
Type: Domain
Description: This domain is found in bacteria, archaea and eukaryotes, and is approximately 50 amino acids in length. It contains an evolutionarily conserved signature, W-X-Y-X6-11-GPF-X4-M-X2-W-X3-GYF, the site of interaction with proline-rich peptides. Proteins containing this domain include RME-8 (Required for receptor-mediated endocytosis 8), a DNAJC13 protein. RME-8 was first identified as a protein that is required for endocytosis in Caenorhabditis elegans. It coordinates the activity of the WASH complex with the function of the retromer SNX dimer to control endosomal tubulation. Proteins containing this domain also include Arabidopsis trithorax-related3 (Atxr3) and Tic56. Atxr3 is the major enzyme responsible for H3K4me3, which is critical for regulating gene expression and plant development. Tic56 is an essential subunit of a 1-MDa protein complex at the inner chloroplast envelope membrane. Tic56 also plays important roles in rRNA processing and chloroplast ribosome assembly.
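The signature W-X-Y-X6-11-GPF-X4-M-X2-W-X3-GYF quoted above includes a variable-length gap, which is easy to misread. As a rough sketch, assuming that "X" is any residue, "X6-11" is a run of 6 to 11 residues and "Xn" is a run of exactly n residues, it could be checked as follows; GYF_SIGNATURE, has_gyf_signature and toy_sequence are hypothetical names used only for illustration.

```python
import re

# Assumption: "X" is any residue, "X6-11" is a run of 6 to 11 residues,
# and "Xn" is a run of exactly n residues.
GYF_SIGNATURE = re.compile(r"W.Y.{6,11}GPF.{4}M.{2}W.{3}GYF")


def has_gyf_signature(sequence: str) -> bool:
    """True if the sequence contains a match to the GYF signature."""
    return GYF_SIGNATURE.search(sequence.upper()) is not None


def toy_sequence(gap: int) -> str:
    """Build a synthetic sequence with a chosen length for the X6-11 gap."""
    return "WAY" + "A" * gap + "GPF" + "A" * 4 + "M" + "AA" + "W" + "AAA" + "GYF"


# Synthetic sequences only: gaps of 6 to 11 match, shorter or longer do not.
for gap in (5, 6, 11, 12):
    print(gap, has_gyf_signature(toy_sequence(gap)))
```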
Protein Domain
Name: KHA domain
Type: Domain
Description: Potassium channels take part in important processes of higher plants, including opening and closing of stomatal pores and leaf movement. Inward rectifying potassium (K(+)in) channels play an important role in turgor regulation and ion uptake in higher plants. All of them comprise, from their N-terminal to their C-terminal ends: a short hydrophilic region, a hydrophobic region structurally analogous and partially homologous to the transmembrane domain of voltage-gated animal channels from the Shaker superfamily, a putative cyclic nucleotide-binding domain, and a conserved C-terminal KHA domain. Between these last two regions, some of them (AKT1, AKT2 and SKT1) contain an ankyrin-repeat domain with six repeats homologous to those of human erythrocyte ankyrin. This entry represents the KHA domain, which is unique to plant K(+)in channels. The KHA domain contains two high-homology blocks enriched for hydrophobic and acidic residues, respectively. It is essential for interaction between plant K(+)in channels and mediates tetramerization and/or stabilisation of the heteromers.
Protein Domain
Name: Potassium channel, voltage-dependent, EAG/ELK/ERG
Type: Family
Description: Potassium channels are the most diverse group of the ion channel family. They are important in shaping the action potential, and in neuronal excitability and plasticity. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group. These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis. All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK). The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels. The first EAG K+ channel was identified in Drosophila melanogaster (Fruit fly), following a screen for mutations giving rise to behavioural abnormalities. Disruption of the Eag gene caused an ether-induced, leg-shaking behaviour. Subsequent studies have revealed a conserved multi-gene family of EAG-like K+ channels, which are present in human and many other species. Based on the varying functional properties of the channels, the family has been divided into 3 subfamilies: EAG, ELK and ERG. Interestingly, Caenorhabditis elegans appears to lack the ELK type.
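The K+ selectivity sequence (T/SxxTxGxG) mentioned in the description above can be located in a protein sequence with a short pattern search. This is a minimal sketch, assuming that "T/S" means threonine or serine and that "x" means any residue; K_SELECTIVITY and selectivity_hits are hypothetical names, and the example fragment is made up for illustration.

```python
import re

# Assumption: "T/S" is threonine or serine and "x" is any residue.
K_SELECTIVITY = re.compile(r"[TS]..T.G.G")


def selectivity_hits(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group())
            for m in K_SELECTIVITY.finditer(sequence.upper())]


# Made-up fragment used only to exercise the pattern.
fragment = "AAATMTTVGYGDAA"
print(selectivity_hits(fragment))
```

Counting the hits per sequence gives a crude way to separate candidate one-P-domain subunits from two-P-domain subunits, although a pattern this short will also match unrelated proteins by chance.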
Protein Domain
Name: ENT domain
Type: Domain
Description: The EMSY N-terminal (ENT) domain is a ~90-residue module, which is unique in the human proteome, although multiple copies are found in Arabidopsis proteins. In the plant proteins, the ENT domains are accompanied by Agenet domains, plant-specific homologues of Tudor domains. The ENT domain consists of a unique arrangement of five α-helices that fold into a helical bundle arrangement. Overall, the three-dimensional structure adopts a club-like shape that consists of an extended N-terminal α-helix that connects to a helical bundle substructure. The ENT domain forms a homodimer via the anti-parallel packing of the long N-terminal α-helix from each subunit.
Protein Domain
Name: Glycerol-3-phosphate dehydrogenase, NAD-dependent, C-terminal
Type: Domain
Description: NAD-dependent glycerol-3-phosphate dehydrogenase (GPD) catalyzes the reversible reduction of dihydroxyacetone phosphate to glycerol-3-phosphate. It is a cytoplasmic protein, active as a homodimer, each monomer containing an N-terminal NAD binding site. In insects, it acts in conjunction with a mitochondrial alpha-glycerophosphate oxidase in the alpha-glycerophosphate cycle, which is essential for the production of energy used in insect flight.
Protein Domain
Name: Glycerol-3-phosphate dehydrogenase, NAD-dependent
Type: Family
Description: NAD-dependent glycerol-3-phosphate dehydrogenase (GPD) catalyzes the reversible reduction of dihydroxyacetone phosphate to glycerol-3-phosphate. It is a cytoplasmic protein, active as a homodimer, each monomer containing an N-terminal NAD binding site. In insects, it acts in conjunction with a mitochondrial alpha-glycerophosphate oxidase in the alpha-glycerophosphate cycle, which is essential for the production of energy used in insect flight.
Protein Domain
Name: CFA20 domain
Type: Domain
Description: This domain is characteristic of cilia- and flagella-associated protein 20 (CFA20). CFA20 is a cilium- and flagellum-specific protein that plays a role in axonemal structure organisation and motility. In Chlamydomonas reinhardtii, it stabilises outer doublet microtubules (DMTs) of the axoneme and may work as a scaffold for intratubular proteins, such as tektin and PACRG, to produce the beak structures in DMT1. Other proteins contain a domain with homology to CFA20. WDR90/POC16 contains such a domain in its N terminus, followed by a large C-terminal domain with multiple WD40 repeats. This domain is also present in the N terminus of uncharacterised protein C3orf67.
Protein Domain
Name: Protein MIZU-KUSSEI 1-like, plant
Type: Family
Description: This entry includes Arabidopsis MIZU-KUSSEI 1 (MIZ1), which is an essential protein for hydrotropism in roots. It can be regulated by light and ABA signalling.
Protein Domain
Name: FCP1-like phosphatase, phosphatase domain
Type: Domain
Description: This entry represents the phosphatase domain of the human RNA polymerase II subunit A C-terminal domain phosphatase (FCP1) and closely related phosphatases from eukaryotes including plants, fungi and slime mold. This domain is a member of the haloacid dehalogenase (HAD) superfamily by virtue of a conserved set of three catalytic motifs and a conserved fold as predicted by PSIPRED. The third motif in this family is distinctive (hhhhDDppphW). This domain is classified as a "Class III" HAD, since there is no large "cap" domain found between motifs 1 and 2 or motifs 2 and 3. This domain is related to domains found in the human NLI interacting factor-like phosphatases.
Protein Domain
Name: Retinoblastoma-associated protein, A-box
Type: Domain
Description: Retinoblastoma-like and retinoblastoma-associated proteins may have a function in cell cycle regulation. They form a complex with adenovirus E1A and Simian virus 40 (SV40) large T antigen, and may bind and modulate the function of certain cellular proteins with which T and E1A compete for pocket binding. The proteins may act as tumor suppressors, and are potent inhibitors of E2F-mediated trans-activation. This domain has the cyclin fold. The crystal structure of the Rb pocket bound to a nine-residue E7 peptide containing the LxCxE motif, shared by other Rb-binding viral and cellular proteins, shows that the LxCxE peptide binds a highly conserved groove on the B-box portion of the pocket; the A-box portion appears to be required for the stable folding of the B-box. Also highly conserved is the extensive A-B interface, suggesting that it may be an additional protein-binding site. The A and B boxes each contain the cyclin-fold structural motif, with the LxCxE-binding site on the B-box cyclin fold being similar to a Cdk2-binding site of cyclin A and to a TBP-binding site of TFIIB. The A and B boxes are found at the C-terminal end of the protein; the A-box is on the N-terminal side of the B-box.
Protein Domain
Name: Retinoblastoma-associated protein, B-box
Type: Domain
Description: This entry includes retinoblastoma-associated protein (RB, also known as pRb, RB, p1051), retinoblastoma-like protein 1 (RBL1, also known as p107) and retinoblastoma-like protein 2 (RBL2, also known as RB2 or p130). Members of this entry contain a conserved domain named the 'pocket' that interacts with the LxCxE motif found in viral proteins, such as SV40 large T antigen. This pocket consists of A- and B-boxes, which are found at the C-terminal end of the protein. This entry represents the B-box, which is on the C-terminal side of the A-box. The crystal structure of the RB pocket bound to a nine-residue E7 peptide containing the LxCxE motif, shared by other RB-binding viral and cellular proteins, shows that the LxCxE peptide binds a highly conserved groove on the B-box portion of the pocket; the A-box portion appears to be required for the stable folding of the B-box. Also highly conserved is the extensive A-B interface, suggesting that it may be an additional protein-binding site. The A and B boxes each contain the cyclin-fold structural motif, with the LxCxE-binding site on the B-box cyclin fold being similar to a Cdk2-binding site of cyclin A and to a TBP-binding site of TFIIB. In humans, RB is a tumour suppressor linked to several major cancers. RB forms complexes with E2Fs and represses gene expression by recruiting chromatin remodeling factors, such as histone deacetylases (HDACs), to E2F-responsive promoters. Apart from E2Fs, RB also interacts with other transcription factors that govern cell differentiation. RBL1 and RBL2 are components of the DREAM complex, which represses cell cycle-dependent genes in quiescent cells and plays a role in the cell cycle-dependent activation of G2/M genes.
Protein Domain
Name: Retinoblastoma protein family
Type: Family
Description: This entry represents the retinoblastoma protein family, including retinoblastoma-associated protein (RB, also known as pRb, RB, p1051), retinoblastoma-like protein 1 (RBL1, also known as p107) and retinoblastoma-like protein 2 (RBL2, also known as RB2 or p130). Members of this family contain a conserved domain named the 'pocket' that interacts with the LxCxE motif found in viral proteins, such as SV40 large T antigen. Therefore, this family is also called the pocket protein family. In humans, RB is a tumour suppressor linked to several major cancers. RB forms complexes with E2Fs and represses gene expression by recruiting chromatin remodeling factors, such as histone deacetylases (HDACs), to E2F-responsive promoters. This interaction regulates genes necessary for DNA replication and the cell cycle. Phosphorylation of RB family members by CDK and cyclin complexes leads to release of the repressor complex and enables E2F-dependent gene expression. Apart from E2Fs, RB also interacts with other transcription factors that govern cell differentiation. RBL1 and RBL2 are components of the DREAM complex, which represses cell cycle-dependent genes in quiescent cells and plays a role in the cell cycle-dependent activation of G2/M genes.
Protein Domain
Name: Retinoblastoma-associated protein, N-terminal
Type: Domain
Description: This domain is found at the N terminus of the retinoblastoma-associated protein. It is found in association with and . This domain is typically between 124 and 150 amino acids in length and has a single completely conserved tryptophan (W) residue that may be functionally important.
Protein Domain
Name: Zinc finger, GRF-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This presumed zinc-binding domain is found in a variety of DNA-binding proteins. It seems likely that this domain is involved in nucleic acid binding. It is named GRF after three conserved residues in the centre of the alignment of the domain. This zinc finger may be related to .
Protein Domain
Name: Ribosomal protein S3Ae
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of proteins that have from 220 to 250 amino acids and represents Rps1 (eukaryotic) and Rps3Ae (archaeal and eukaryotic).
Protein Domain
Name: 40S ribosomal protein S1/3, eukaryotes
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ]. This entry represents the 40S ribosomal protein S1/S3 from eukaryotes.
Protein Domain
Name: Ribosomal protein S3Ae, conserved site
Type: Conserved_site
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ]. A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of proteins that have from 220 to 250 amino acids.
Protein Domain
Name: Protein Networked (NET), actin-binding (NAB) domain
Type: Domain
Description: This entry represents the NAB domain found in the Networked proteins. The Networked (NET) proteins are a superfamily of plant-specific actin-binding proteins which localize simultaneously to the actin cytoskeleton and specific membrane compartments and are suggested to couple these membranes to the actin cytoskeleton in plant cells. The minimal actin-binding region, referred to as the NET actin-binding (NAB) domain, represents a new actin-binding motif unique to plants with no apparent primary sequence homology to previously identified actin-binding domains. In Arabidopsis, the NAB domain always starts with three conserved tryptophan residues, WWW, a motif whose worldwide web connection gives added significance to the NET family name. The C-terminal half of the domain is very highly conserved, more so than the N-terminal half [ , ]. The predicted secondary structure of the domain includes three major alpha helices connected by beta turns, with the WWW motif predicted to form a beta sheet [ , ].
Protein Domain
Name: RNA polymerases, subunit N, zinc binding site
Type: Binding_site
Description: In eukaryotes, there are three different forms of DNA-dependent RNA polymerases ( ) transcribing different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. In archaebacteria, there is generally a single form of RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides. Archaebacterial subunit N (gene rpoN) [] is a small protein of about 8kDa; it is evolutionarily related [ ] to an 8.3kDa component shared by all three forms of eukaryotic RNA polymerases (gene RPB10 in yeast and POLR2J in mammals) as well as to African swine fever virus (ASFV) protein CP80R []. This signature spans the conserved region, which is located at the N-terminal extremity of these polymerase subunits; this region contains two cysteines that bind a zinc ion [ ].
Protein Domain
Name: RNA polymerase subunit RPB10
Type: Homologous_superfamily
Description: The RNA polymerase subunit RPB10 displays a high level of conservation across archaea and eukaryota. Structure determination of this subunit reveals a zinc-bundle topology, consisting of three α-helices stabilised by a zinc ion [ ].
Protein Domain
Name: DNA-directed RNA polymerase subunit RPABC5/Rpb10
Type: Family
Description: In eukaryotes, there are three different forms of DNA-dependent RNA polymerases ( ) transcribing different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. In archaebacteria, there is generally a single form of RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides. Archaebacterial subunit Rpo10 (gene rpoN) [] is a small protein of about 8kDa; it is evolutionarily related [] to an 8.3kDa component shared by all three forms of eukaryotic RNA polymerases (RPABC5, gene Rpb10 in yeast and POLR2J in mammals) as well as to African swine fever virus (ASFV) protein CP80R []. This family includes both archaeal subunit N, also known as Rpo10 following the eukaryotic nomenclature [ ], and eukaryotic Rpb10. There is a conserved region which is located at the N-terminal extremity of these polymerase subunits; this region contains two cysteines that bind a zinc ion [].
Protein Domain
Name: Albumin I
Type: Family
Description: The albumin I protein, a hormone-like peptide, stimulates kinase activity upon binding a membrane-bound 43kDa receptor. The structure of this region reveals a knottin-like fold, comprising three beta strands [ ].
Protein Domain
Name: Dymeclin
Type: Family
Description: Dymeclin (Dyggve-Melchior-Clausen syndrome protein) contains a large number of leucine and isoleucine residues and a total of 17 repeated dileucine motifs. It is characteristically about 700 residues long and present in plants and animals. Human dymeclin is necessary for correct organisation of Golgi apparatus and is involved in bone development [ ]. Mutations in the dymeclin gene cause Dyggve-Melchior-Clausen syndrome (DMC, MIM 223800), an autosomal-recessive disorder characterised by the association of spondylo-epi-metaphyseal dysplasia and mental retardation [].
Protein Domain
Name: Early nodulin 93 ENOD93 protein
Type: Family
Description: The expression of early nodulin (ENOD) genes has been well characterised in several legume species. Based on their biochemical attributes and expression patterns, they are postulated to have roles in cell structure, in the control of nodule ontogeny by the degradation of Nod factor, and in carbon metabolism [].
Protein Domain
Name: Acireductone dioxygenase ARD family
Type: Family
Description: The two acireductone dioxygenase enzymes (ARD and ARD', previously known as E-2 and E-2') from Klebsiella pneumoniae share the same amino acid sequence, but bind different metal ions: ARD binds Ni2+, ARD' binds Fe2+ []. ARD and ARD' can be experimentally interconverted by removal of the bound metal ion and reconstitution with the appropriate metal ion. The two enzymes share the same substrate, 1,2-dihydroxy-3-keto-5-(methylthio)pentene, but yield different products. ARD' yields the alpha-keto precursor of methionine (and formate), thus forming part of the ubiquitous methionine salvage pathway that converts 5'-methylthioadenosine (MTA) to methionine. This pathway is responsible for the tight control of the concentration of MTA, which is a powerful inhibitor of polyamine biosynthesis and transmethylation reactions [ ]. ARD yields methylthiopropanoate, carbon monoxide and formate, and thus prevents the conversion of MTA to methionine. The role of the ARD-catalysed reaction is unclear: methylthiopropanoate is cytotoxic, and carbon monoxide can activate guanylyl cyclase, leading to increased intracellular cGMP levels [, ]. Eukaryotic aci-reductone dioxygenase (ARD), also known as 1,2-dihydroxy-3-keto-5-methylthiopentene dioxygenase, acts in the methionine salvage pathway []. Several homologous ARD genes have been identified in plants [].
Protein Domain
Name: Acireductone dioxygenase, eukaryotes
Type: Family
Description: Acireductone dioxygenase (ARD/MTND, also known as 1,2-dihydroxy-3-keto-5-methylthiopentene dioxygenase) ( ) is a eukaryotic enzyme that catalyses the formation of formate and 2-keto-4-methylthiobutyrate (KMTB) from 1,2-dihydroxy-3-keto-5-methylthiopentene (DHK-MTPene). It shows significant homology to the bacterial acireductone dioxygenase (ARD), which is an enzyme in the methionine salvage pathway (MTA cycle) [ , , ]. MTCBP-1 is necessary for hepatitis C virus replication in an otherwise non-permissive cell line []. MTCBP-1 interacts with and inhibits the activity of membrane-type 1 matrix metalloproteinase (MT1-MMP/MMP-14) in promoting tumor cell migration and invasion []. MTCBP-1 contains the cupin sequence motif and belongs to the Cupin superfamily composed of proteins with diverse functions [ ].
Protein Domain
Name: Mediator complex, subunit Med17
Type: Family
Description: The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins. The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11. The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation. The head module contains: MED6, MED8, MED11, SRB4/MED17, SRB5/MED18, ROX3/MED19, SRB2/MED20 and SRB6/MED22. The middle module contains: MED1, MED4, NUT1/MED5, MED7, CSE2/MED9, NUT2/MED10, SRB7/MED21 and SOH1/MED31. CSE2/MED9 interacts directly with MED4. The tail module contains: MED2, PGD1/MED3, RGR1/MED14, GAL11/MED15 and SIN4/MED16. The CDK8 module contains: MED12, MED13, CCNC and CDK8. 
Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP. This entry represents subunit Med17 of the Mediator complex. The Med17 subunit is located within the head domain and is essential for cell viability to the extent that a mutant strain of Saccharomyces cerevisiae (Baker's yeast) lacking it shows all RNA polymerase II-dependent transcription ceasing at non-permissive temperatures.
Protein Domain
Name: CCAAT-binding factor
Type: Domain
Description: This domain is present in the CCAAT-binding protein, which is essential for growth and necessary for 60S ribosomal subunit biogenesis. Other proteins containing this domain stimulate transcription from the HSP70 promoter.
Protein Domain
Name: Serine incorporator/TMS membrane protein
Type: Family
Description: The serine incorporator/TMS membrane protein (TDE1/TMS) family includes SERINC1-5 from mammals and membrane protein Tms1 from budding yeasts. Members of this family contain eleven transmembrane helices. SERINC1-5 function in incorporating serine into membranes and facilitating the synthesis of two serine-derived lipids, phosphatidylserine and sphingolipids [ ]. Serinc3 (also known as TDE1) is overexpressed in tumours []. The function of Tms1 is not clear.
Protein Domain
Name: Arf GTPase activating protein
Type: Domain
Description: Proteins containing this domain include ARF1-directed GTPase-activating protein and the cycle control GTPase-activating protein (GAP) GCS1, which is important for the regulation of the ADP ribosylation factor ARF, a member of the Ras superfamily of GTP-binding proteins [ ]. The GTP-bound form of ARF is essential for the maintenance of normal Golgi morphology; it participates in the recruitment of coat proteins, which are required for budding and fission of membranes. Before fusion with an acceptor compartment, the membrane must be uncoated. This step requires the hydrolysis of GTP associated with ARF. These proteins contain a characteristic zinc finger motif (Cys-x2-Cys-x(16,17)-Cys-x2-Cys) which displays some similarity to the C4-type GATA zinc finger. The ARFGAP domain displays no obvious similarity to other GAP proteins. The 3D structure of the ARFGAP domain of the PYK2-associated protein beta has been solved [ ]. It consists of a three-stranded β-sheet surrounded by 5 alpha helices. The domain is organised around a central zinc atom which is coordinated by 4 cysteines. The ARFGAP domain is clearly unrelated to the other GAP protein structures, which are exclusively helical. Classical GAP proteins accelerate GTPase activity by supplying an arginine finger to the active site. The crystal structure of ARFGAP bound to ARF revealed that the ARFGAP domain does not supply an arginine to the active site, which suggests a more indirect role of the ARFGAP domain in GTP hydrolysis [].
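The zinc finger motif described for the ARFGAP domain can be illustrated with a simple pattern scan. The sketch below is only an illustration and not part of the database entry: it assumes the four-cysteine reading of the motif, C-x2-C-x(16,17)-C-x2-C (consistent with the four coordinating cysteines mentioned in the description), and the function name and toy sequence are hypothetical. A match indicates only that the cysteine spacing is present, not that the sequence is a genuine ARFGAP domain.

```python
import re

# Assumed reading of the motif: C-x2-C-x(16,17)-C-x2-C, where x is any residue.
# This is an illustrative pattern, not an official InterPro/PROSITE signature.
ARFGAP_ZNF = re.compile(r"C[A-Z]{2}C[A-Z]{16,17}C[A-Z]{2}C")


def find_arfgap_zinc_fingers(sequence):
    """Return (start, matched_text) for each non-overlapping motif match.

    `sequence` is a protein sequence in one-letter code; start is 0-based.
    """
    seq = sequence.upper()
    return [(m.start(), m.group()) for m in ARFGAP_ZNF.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence with a 16-residue spacer between the cysteine pairs.
    toy = "MA" + "CAAC" + "G" * 16 + "CAAC" + "KK"
    for start, hit in find_arfgap_zinc_fingers(toy):
        print(start, hit)
```

Real domain assignment relies on profile-based methods rather than a bare spacing pattern, so a scan like this is best treated as a quick sanity check.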
Protein Domain
Name: Protein FAM135
Type: Family
Description: This family is found in eukaryotes, and is approximately 60 amino acids in length. Proteins in this family contain a domain found in a group of putative lipases ( ).
Protein Domain
Name: Major pollen allergen Lol pI
Type: Family
Description: Grass pollens are a major cause of type I allergy. Lol pI, the major rye grass (Lolium perenne) allergen, is a 240-amino acid protein [ ]. Analysis of the amino acid sequence has revealed a determinant within the Lol pI molecule that is recognised by human leukocyte antigen class II-restricted T cells obtained from patients allergic to rye grass pollen []. Sequence analysis of a pollen-specific cDNA from maize has revealed a homologue (Zea mI) of the Lol pI gene [ ]. The protein is ~70% similar to the reported amino acid sequence of Lol pIA. Southern analysis indicates Zea mI to be a member of a small multigene family in maize. Northern analysis indicates expression only in pollen, not in vegetative or female floral tissues. The timing of expression is developmentally regulated, occurring at a low level prior to the first pollen mitosis, and at a high level after post-meiotic division [ ].
Protein Domain
Name: SWEET sugar transporter
Type: Family
Description: This family contains specific sugar efflux transporters that are essential for the maintenance of animal blood glucose levels, plant nectar production, and plant seed and pollen development. In many organisms it mediates glucose transport; in Arabidopsis it is necessary for pollen viability; and two of the rice homologues are specifically exploited by bacterial pathogens for virulence by means of direct binding of a bacterial effector to the SWEET promoter [ ]. Homologues of SWEETs have been identified in bacteria []. The founding member of the SWEET family, MtN3, was identified as a nodulin-specific EST in the legume Medicago truncatula [ ]. Another protein in this family may be involved in activation and expression of recombination activation genes (RAGs) []. This family contains a region of two transmembrane helices that is found in two copies in most members of the family.
Protein Domain
Name: Fibronectin type III
Type: Domain
Description: Fibronectin is a dimeric glycoprotein composed of disulfide-linked subunits with a molecular weight of 220-250kDa each. It is involved in cell adhesion, cell morphology, thrombosis, cell migration, and embryonic differentiation. Fibronectin is a modular protein composed of homologous repeats of three prototypical types of domains known as types I, II, and III [ ]. Fibronectin type-III (FN3) repeats are both the largest and the most common of the fibronectin subdomains. Domains homologous to FN3 repeats have been found in various animal protein families including other extracellular-matrix molecules, cell-surface receptors, enzymes, and muscle proteins [ ]. Structures of individual FN3 domains have revealed a conserved β-sandwich fold with one β-sheet containing four strands and the other sheet containing three strands (see for example ) [ ]. This fold is topologically very similar to that of Ig-like domains, with a notable difference being the lack of a conserved disulfide bond in FN3 domains. Distinctive hydrophobic core packing and the lack of detectable sequence homology between immunoglobulin and FN3 domains suggest, however, that these domains are not evolutionarily related []. FN3 exhibits functional as well as structural modularity. Sites of interaction with other molecules have been mapped to short stretches of amino acids such as the Arg-Gly-Asp (RGD) sequence found in various FN3 domains. The RGD sequence is involved in interactions with integrin. Small peptides containing the RGD sequence can modulate a variety of cell adhesion events associated with thrombosis, inflammation, and tumour metastasis. These properties have led to the investigation of RGD peptides and RGD peptide analogues as potential therapeutic agents [ ].
Protein Domain
Name: Pinin/SDK/MemA protein
Type: Domain
Description: This conserved region is located adjacent and C-terminal to an N-terminal pinin/SDK domain. Members of this family have very varied localisations within the eukaryotic cell. Pinin is known to localise at the desmosomes and is implicated in anchoring intermediate filaments to the desmosomal plaque [ ]. SDK2/3 is a dynamically localised nuclear protein thought to be involved in modulation of alternative pre-mRNA splicing []. MemA is a tumour marker preferentially expressed in human melanoma cell lines. A common feature of the members of this family is that they may all participate in regulating protein-protein interactions [].
Protein Domain
Name: Type II CAAX prenyl endopeptidase Rce1-like
Type: Family
Description: This family (also known as the ABI (abortive infection) family) contains putative intramembrane proteases (IMPs) and has homologues in all three domains of life, including Rce1 from S. cerevisiae [ ]. Rce1 is a type II CAAX prenyl protease that processes all farnesylated and geranylgeranylated CAAX proteins. It is an integral membrane endoprotease localized to the endoplasmic reticulum that mediates the cleavage of the carboxyl-terminal three amino acids from CAAX proteins. It is involved in processing the Ras family of small GTPases, the gamma-subunit of heterotrimeric GTPases, nuclear lamins, and protein kinases and phosphatases []. Three residues of S. cerevisiae Rce1 (E156, H194 and H248) are critical for catalysis []. The structure of Rce1 from the archaeon Methanococcus (MmRce1) suggests that this group of proteins represents a novel IMP family, the glutamate IMPs [].
Protein Domain
Name: Glycoside hydrolase, family 29
Type: Family
Description: O-Glycosyl hydrolases ( ) are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [ , ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. O-Glycosyl hydrolases family 29 ( ) encompasses alpha-L-fucosidases ( ) [ ], lysosomal enzymes responsible for hydrolysing the alpha-1,6-linked fucose joined to the reducing-end N-acetylglucosamine of the carbohydrate moieties of glycoproteins. Fucosylated glycoconjugates are involved in numerous biological events, making alpha-L-fucosidases, the enzymes responsible for their processing, critically important. Deficiency in alpha-L-fucosidase activity is associated with fucosidosis, a lysosomal storage disorder characterised by rapid neurodegeneration, resulting in severe mental and motor deterioration [ ]. The enzyme is a hexamer and displays a two-domain fold, composed of a catalytic (beta/alpha)(8)-like domain and a C-terminal β-sandwich domain []. Drosophila melanogaster spermatozoa contain an alpha-L-fucosidase that might be involved in fertilisation by interacting with alpha-L-fucose residues on the micropyle of the eggshell [ ]. In human sperm, membrane-associated alpha-L-fucosidase is stable for extended periods of time, which is made possible by membrane domains and compartmentalisation. These help preserve protein integrity [].
Protein Domain
Name: Alpha-L-fucosidase, metazoa-type
Type: Family
Description: O-Glycosyl hydrolases family 29 ( ) encompasses alpha-L-fucosidases ( ) [ ], lysosomal enzymes responsible for hydrolysing the alpha-1,6-linked fucose joined to the reducing-end N-acetylglucosamine of the carbohydrate moieties of glycoproteins. Fucosylated glycoconjugates are involved in numerous biological events, making alpha-L-fucosidases, the enzymes responsible for their processing, critically important. Deficiency in alpha-L-fucosidase activity is associated with fucosidosis, a lysosomal storage disorder characterised by rapid neurodegeneration, resulting in severe mental and motor deterioration [ ]. The enzyme is a hexamer and displays a two-domain fold, composed of a catalytic (beta/alpha)(8)-like domain and a C-terminal β-sandwich domain []. Drosophila melanogaster spermatozoa contain an alpha-L-fucosidase that might be involved in fertilisation by interacting with alpha-L-fucose residues on the micropyle of the eggshell [ ]. In human sperm, membrane-associated alpha-L-fucosidase is stable for extended periods of time, which is made possible by membrane domains and compartmentalisation. These help preserve protein integrity []. This entry represents a subgroup of alpha-L-fucosidases found in metazoa, fungi and bacteria.
Protein Domain
Name: Protein of unknown function DUF538
Type: Family
Description: This family consists of several plant proteins of unknown function.
Protein Domain
Name: AAA domain
Type: Domain
Description: This entry represents a wide variety of AAA domains, including some that have lost essential nucleotide binding residues in the P-loop.
Protein Domain
Name: Gamma-butyrobetaine hydroxylase-like, N-terminal
Type: Domain
Description: This domain is found in gamma-butyrobetaine dioxygenase, Fe-S cluster assembly factor HCF101 and trimethyllysine dioxygenase proteins. Gamma-butyrobetaine hydroxylase (GBBH) is an alpha-ketoglutarate-dependent dioxygenase that catalyzes the biosynthesis of L-carnitine by hydroxylation of gamma-butyrobetaine (GBB). GBBH is a dimeric enzyme. The monomer consists of a catalytic double-stranded β-helix domain and a smaller N-terminal domain. The N-terminal domain has a bound Zn ion, which is coordinated by three cysteines and one histidine. The N-terminal domain could facilitate dimer formation, but its precise function is not known [ ].
Protein Domain
Name: Mrp, conserved site
Type: Conserved_site
Description: The Escherichia coli protein mrp is a 41kDa ATP-binding protein of unknown function. Homologs are present in other bacteria including Bacillus subtilis ybaL, Haemophilus influenzae HI1277, Synechocystis sp. (strain PCC 6803) slr0067; and also in eukaryotes, for example human NBP, yeast NBP35 and YIL003w, Caenorhabditis elegans F10G8.6; and in archaebacteria, for example Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ0283.
Protein Domain
Name: Mrp/NBP35 ATP-binding protein
Type: Family
Description: This entry contains cytosolic Fe-S cluster assembling factors NBP35 and CFD1. The NBP35-CFD1 heterotetramer forms a Fe-S scaffold complex, mediating the de novo assembly of an Fe-S cluster and its transfer to target apoproteins. Nucleotide binding and hydrolysis seem to be critical for loading of Fe-S clusters onto CFD1 and NBP35 [ , , ]. In higher eukaryotes NBP35 and CFD1 are known as NUBP1 and NUBP2, and NUBP1 is also involved in iron regulation []. Bacterial homologues ApbC and MRP (Multiple Resistance and pH adaptation in E. coli) have been shown to contain an ATP-binding domain at the N terminus and have ATPase activity. MRP is a membrane-spanning protein and functions as a Na+/H+ antiporter [, ]. Archaeal homologues function as iron-sulfur cluster carriers [].
Protein Domain
Name: MIP18 family-like
Type: Domain
Description: This domain (previously known as DUF59) is found in proteins that are mostly defined as members of the MIP18 family. This includes iron-sulfur cluster carrier proteins, where the domain is found in the N terminus. This domain is also found in protein AE7 from Arabidopsis and its homologues. Protein AE7 is thought to be a central member of the cytosolic iron-sulfur (Fe-S) protein assembly (CIA) pathway; however, protein AE7-like 1 and 2 (also containing this domain) are probably not involved in this pathway []. MIP18 family protein YHR122W (CIA2) from S. cerevisiae is a component of the CIA machinery, and acts at a late step of Fe-S cluster assembly []. The SufT protein from Staphylococcus aureus is composed solely of this domain and is involved in the maturation of FeS proteins [].
Protein Domain
Name: Glycosyl transferase, family 48
Type: Family
Description: The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates ([intenz:2.4.1.-]) and related proteins into distinct sequence based families has been described []. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. This is the glycosyltransferase 48 family, which consists of various 1,3-beta-glucan synthase components including Gls1, Gls2 and Gls3 from yeast. 1,3-beta-glucan synthase ( ), also known as callose synthase, catalyses the formation of a beta-1,3-glucan polymer that is a major component of the fungal cell wall [ ]. The reaction catalysed is: UDP-glucose + {1,3-beta-D-glucosyl}(N) = UDP + {1,3-beta-D-glucosyl}(N+1).
Protein Domain
Name: Alpha-D-phosphohexomutase, conserved site
Type: Conserved_site
Description: The alpha-D-phosphohexomutase superfamily is composed of four related enzymes, each of which catalyses a phosphoryl transfer on its sugar substrate: phosphoglucomutase (PGM), phosphoglucomutase/phosphomannomutase (PGM/PMM), phosphoglucosamine mutase (PNGM), and phosphoacetylglucosamine mutase (PAGM) [ ]. PGM () converts D-glucose 1-phosphate into D-glucose 6-phosphate, and participates in both the breakdown and synthesis of glucose [ ]. PGM/PMM (; ) are primarily bacterial enzymes that use either glucose or mannose as substrate, participating in the biosynthesis of a variety of carbohydrates such as lipopolysaccharides and alginate [ , ]. Both PNGM () and PAGM ( ) are involved in the biosynthesis of UDP-N-acetylglucosamine [ , ]. Despite differences in substrate specificity, these enzymes share a similar catalytic mechanism, converting 1-phospho-sugars to 6-phospho-sugars via a biphosphorylated 1,6-phospho-sugar. The active enzyme is phosphorylated at a conserved serine residue and binds one magnesium ion; residues around the active site serine are well conserved among family members. The reaction mechanism involves phosphoryl transfer from the phosphoserine to the substrate to create a biphosphorylated sugar, followed by a phosphoryl transfer from the substrate back to the enzyme [ ]. The structures of PGM and PGM/PMM have been determined, and were found to be very similar in topology. These enzymes are both composed of four domains and a large central active site cleft, where each domain contains residues essential for catalysis and/or substrate recognition.
Domain I contains the catalytic phosphoserine, domain II contains a metal-binding loop to coordinate the magnesium ion, domain III contains the sugar-binding loop that recognises the two different binding orientations of the 1- and 6-phospho-sugars, and domain IV contains a phosphate-binding site required for orienting the incoming phospho-sugar substrate. This entry represents the conserved site at the N-terminal region of alpha-D-phosphohexomutase enzymes.
Protein Domain
Name: Alpha-D-phosphohexomutase, C-terminal
Type: Domain
Description: The alpha-D-phosphohexomutase superfamily is composed of four related enzymes, each of which catalyses a phosphoryl transfer on its sugar substrate: phosphoglucomutase (PGM), phosphoglucomutase/phosphomannomutase (PGM/PMM), phosphoglucosamine mutase (PNGM), and phosphoacetylglucosamine mutase (PAGM) [ ]. PGM () converts D-glucose 1-phosphate into D-glucose 6-phosphate, and participates in both the breakdown and synthesis of glucose [ ]. PGM/PMM (; ) are primarily bacterial enzymes that use either glucose or mannose as substrate, participating in the biosynthesis of a variety of carbohydrates such as lipopolysaccharides and alginate [ , ]. Both PNGM () and PAGM ( ) are involved in the biosynthesis of UDP-N-acetylglucosamine [ , ]. Despite differences in substrate specificity, these enzymes share a similar catalytic mechanism, converting 1-phospho-sugars to 6-phospho-sugars via a biphosphorylated 1,6-phospho-sugar. The active enzyme is phosphorylated at a conserved serine residue and binds one magnesium ion; residues around the active site serine are well conserved among family members. The reaction mechanism involves phosphoryl transfer from the phosphoserine to the substrate to create a biphosphorylated sugar, followed by a phosphoryl transfer from the substrate back to the enzyme [ ]. The structures of PGM and PGM/PMM have been determined, and were found to be very similar in topology. These enzymes are both composed of four domains and a large central active site cleft, where each domain contains residues essential for catalysis and/or substrate recognition.
Domain I contains the catalytic phosphoserine, domain II contains a metal-binding loop to coordinate the magnesium ion, domain III contains the sugar-binding loop that recognises the two different binding orientations of the 1- and 6-phospho-sugars, and domain IV contains a phosphate-binding site required for orienting the incoming phospho-sugar substrate. This entry represents the C-terminal domain of alpha-D-phosphohexomutase enzymes.
Protein Domain
Name: Ribosomal protein L7/L12, C-terminal/adaptor protein ClpS-like
Type: Homologous_superfamily
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ]. This superfamily represents a domain found at the C terminus of ribosomal proteins L7 and L12, and also in the adaptor protein ClpS, forming an alpha/beta sandwich [ ]. The L7 and L12 ribosomal proteins are part of the large 50S ribosomal subunit, and occur in four copies organised as two dimers. The L7/L12 dimer probably interacts with EF-Tu. L7 and L12 differ only in a single post-translational modification, the addition of an acetyl group to the N terminus of L7 [ ]. ClpS is an adaptor protein that influences protein degradation through its binding to the N-terminal domain of the chaperone ClpA in the ClpAP chaperone-protease pair. The degradation of ClpAP substrates, both SsrA-tagged proteins and ClpA itself, is specifically inhibited by ClpS. ClpS modifies ClpA substrate specificity, potentially redirecting degradation by ClpAP toward aggregated proteins [ ].
Protein Domain
Name: Adaptor protein ClpS, core
Type: Domain
Description: In the bacterial cytosol, ATP-dependent protein degradation is performed by several different chaperone-protease pairs, including ClpAP. ClpS directly influences the ClpAP machine by binding to the N-terminal domain of the chaperone ClpA. The degradation of ClpAP substrates, both SsrA-tagged proteins and ClpA itself, is specifically inhibited by ClpS. ClpS modifies ClpA substrate specificity, potentially redirecting degradation by ClpAP toward aggregated proteins [ ]. ClpS is a small alpha/beta protein that consists of three α-helices connected to three antiparallel β-strands [ ]. The protein has a globular shape, with a curved layer of three antiparallel α-helices over a twisted antiparallel β-sheet. Dimerization of ClpS may occur through its N-terminal domain. This short extended N-terminal region in ClpS is followed by the central seven-residue β-strand, which is flanked by two other β-strands in a small β-sheet.
Protein Domain
Name: SWIRM domain
Type: Domain
Description: The SWIRM domain is a small α-helical domain of about 85 amino acid residues found in eukaryotic chromosomal proteins. It is named after the proteins SWI3, RSC8 and MOIRA in which it was first recognised. This domain mediates protein-protein interactions in the assembly of chromatin-protein complexes [ , ]. The yeast SWI3 SWIRM structure revealed that it forms a four-helix globular domain containing a helix-turn-helix motif []. The SWIRM domain can be linked to different domains, such as the ZZ-type zinc finger ( ), the Myb DNA-binding domain ( ), the HORMA domain ( ), the amino-oxidase domain, the chromo domain ( ), and the JAB1/PAD1 domain.
Protein Domain
Name: HAUS augmin-like complex subunit 2
Type: Family
Description: This entry represents HAUS augmin-like complex subunit 2 from animals (HAUS2) and plants (AUG2) [ , ]. The individual HAUS (Homologous to AUgmin Subunits) subunits have been designated HAUS1 to HAUS8 [ ]. In animals, HAUS augmin-like complex subunit 2 is a component of the HAUS augmin-like complex, which localises to the centrosomes and interacts with the gamma-tubulin ring complex (gamma-TuRC) []. The interaction between augmin and gamma-TuRC is important for spindle microtubule generation and affects mitotic progression and cytokinesis []. HAUS2 may also increase the tension between spindle and kinetochore, allowing for chromosome segregation during mitosis []. The HAUS augmin-like complex subunit 2 was previously known as centrosomal protein of 27kDa (Cep27). In plants, the augmin complex contains 8 subunits, including two plant-specific subunits [ ]. Despite the lack of centrosomes in plants, the augmin complex plays an important part in gamma-tubulin-dependent microtubule nucleation and the assembly of microtubule arrays during mitosis [].
Protein Domain
Name: DNA recombination and repair protein RecA, monomer-monomer interface
Type: Domain
Description: The recA gene product is a multifunctional enzyme that plays a role in homologous recombination, DNA repair and induction of the SOS response [ ]. In homologous recombination, the protein functions as a DNA-dependent ATPase, promoting synapsis, heteroduplex formation and strand exchange between homologous DNAs []. RecA also acts as a protease cofactor that promotes autodigestion of the lexA product and phage repressors. The proteolytic inactivation of the lexA repressor by an activated form of recA may cause a derepression of the 20 or so genes involved in the SOS response, which regulates DNA repair, induced mutagenesis, delayed cell division and prophage induction in response to DNA damage []. RecA is a protein of about 350 amino acid residues. Its sequence is very well conserved [ , , ] among eubacterial species. It is also found in the chloroplast of plants []. RecA-like proteins are found in archaea and diverse eukaryotic organisms, like fission yeast, mouse or human. In the filament visualised by X-ray crystallography, β-strand 3, the loop C-terminal to β-strand 2, and α-helix D of the core domain form one surface that packs against α-helix A and β-strand 0 (the N-terminal domain) of an adjacent monomer during polymerisation []. The core ATP-binding site domain is well conserved, with 14 invariant residues. It contains the nucleotide binding loop between β-strand 1 and α-helix C. The Escherichia coli sequence GPESSGKT matches the consensus sequence of amino acids (G/A)XXXXGK(T/S) for the Walker A box (also referred to as the P-loop) found in a number of nucleoside triphosphate (NTP)-binding proteins. Another nucleotide binding motif, the Walker B box, is found at β-strand 4 in the RecA structure. The Walker B box is characterised by four hydrophobic amino acids followed by an acidic residue (usually aspartate).
Nucleotide specificity and additional ATP-binding interactions are contributed by the amino acid residues at β-strand 2 and the loop C-terminal to that strand, all of which are greater than 90% conserved among bacterial RecA proteins. The signature in this entry spans the entire monomer-monomer interface in RecA proteins.
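The Walker A consensus quoted above, (G/A)XXXXGK(T/S), translates directly into a regular expression. The following is a minimal Python sketch, not part of the database entry: the function name find_walker_a is hypothetical, X is assumed to mean any uppercase one-letter residue code, and the flanking residues in the usage example are invented for illustration. Only the octapeptide GPESSGKT comes from the description.

```python
import re

# Walker A (P-loop) consensus from the description: (G/A)XXXXGK(T/S).
# Assumption: X is any uppercase one-letter amino acid code.
# The lookahead wrapper lets overlapping candidate sites be reported too.
WALKER_A = re.compile(r"(?=([GA][A-Z]{4}GK[TS]))")


def find_walker_a(sequence):
    """Return (0-based start, matched octapeptide) for each Walker A match."""
    return [(m.start(), m.group(1)) for m in WALKER_A.finditer(sequence.upper())]


if __name__ == "__main__":
    # The E. coli RecA octapeptide quoted in the description.
    print(find_walker_a("GPESSGKT"))
    # Same octapeptide with invented flanking residues (illustration only).
    print(find_walker_a("MAAGPESSGKTLL"))
```

A pattern this short will also match by chance in unrelated proteins, so a hit is only a candidate site, not evidence of nucleotide binding.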
Protein Domain
Name: DNA repair Rad51/transcription factor NusA, alpha-helical
Type: Homologous_superfamily
Description: This superfamily represents an α-helical bundle domain, which has a SAM domain-like fold. This compact domain consists of a 4-5 helical bundle of two orthogonally packed alpha-hairpins, and contains one classic and one pseudo HhH (helix-hairpin-helix) motif. This domain is found at the N-terminal of the DNA repair protein Rad51, at the C-terminal of the transcription elongation protein NusA, and at the C-terminal of the hypothetical protein AF1548. Human Rad51 protein is a homologue of Escherichia coli RecA protein, and functions in DNA repair and recombination [ ]. In higher eukaryotes, Rad51 protein is essential for cell viability. The N-terminal region of Rad51 is highly conserved among eukaryotic Rad51 proteins but is absent from RecA, suggesting a Rad51-specific function for this region. The N-terminal domain is involved in interactions with DNA and proteins; DNA binding may be regulated via phosphorylation within the N-terminal domain. NusA (N utilisation substance A) from E. coli is an essential transcription factor that associates with the RNA polymerase (RNAP) core enzyme, where it modulates transcriptional pausing, termination and anti-termination [ ]. The C-terminal of NusA consists of two repeat units, and is responsible for the interaction of NusA with the C-terminal of RNAP, and for its interaction with protein N from phage lambda during anti-termination [].
Protein Domain
Name: Meiotic recombination protein Dmc1
Type: Family
Description: Dmc1 is a meiosis-specific RecA homologue [ ]. It is a recombinase required for interhomologue recombination and double-strand break repair during meiosis [, , , ].
Protein Domain
Name: Sulphate adenylyltransferase catalytic domain
Type: Domain
Description: This domain is the catalytic domain of ATP-sulfurylase or sulphate adenylyltransferase ( ). ATP-sulfurylase catalyses the synthesis of adenosine-phosphosulphate (APS) from ATP and inorganic sulphate [ ]. It is sometimes found as part of a bifunctional polypeptide chain associated with adenylylsulphate kinase (). Both enzymes are required for PAPS (phosphoadenosine-phosphosulphate) synthesis from inorganic sulphate [ ].
Protein Domain
Name: Sulphate adenylyltransferase
Type: Domain
Description: Sulphate adenylyltransferase or ATP-sulfurylase ( ) forms adenosine 5'-phosphosulphate (APS) from ATP and free sulphate, the first step in the formation of the activated sulphate donor 3'-phosphoadenylylsulphate (PAPS) [ ]. In some cases, it is found in a bifunctional protein in which the other domain, adenosyl phosphosulphate (APS) kinase, catalyses the second and final step, the phosphorylation of APS to PAPS [ ]. The combined ATP-sulfurylase/APS kinase may be called PAPS synthase. In some organisms the enzyme is used to generate APS from sulphate and ATP, while in others it proceeds in the opposite direction to generate ATP from APS and pyrophosphate. ATP-sulfurylase can be a monomer, a homodimer, or a homo-oligomer, depending on the organism. It belongs to a large superfamily of nucleotidyltransferases that includes pantothenate synthetase (PanC), phosphopantetheine adenylyltransferase (PPAT), the amino-acyl tRNA synthetases, and the dissimilatory sulphate adenylyltransferase (sat) of the sulphate reducer Archaeoglobus fulgidus. The enzymes of this family are structurally similar and share a dinucleotide-binding domain [, , , , , , , , ].
Protein Domain
Name: ATP-sulfurylase PUA-like domain
Type: Domain
Description: This PUA-like domain is found at the N terminus of ATP-sulfurylase enzymes.
Protein Domain
Name: Emopamil-binding protein
Type: Family
Description: Emopamil binding protein (EBP) is a nonglycosylated type I integral membrane protein of the endoplasmic reticulum and shows high-level expression in epithelial tissues. The EBP protein has emopamil binding domains, including the sterol acceptor site and the catalytic centre, which show Delta8-Delta7 sterol isomerase activity. Human sterol isomerase, a homologue of mouse EBP, is suggested not only to play a role in cholesterol biosynthesis, but also to affect lipoprotein internalisation. In humans, mutations of EBP are known to cause the genetic disorder X-linked dominant chondrodysplasia punctata (CDPX2). This syndrome is lethal in most males, and affected females display asymmetric hyperkeratotic skin and skeletal abnormalities [ ].
Protein Domain
Name: CASP, C-terminal
Type: Domain
Description: This domain is the C-terminal region of the CASP family of proteins. These are Golgi membrane proteins which are thought to have a role in vesicle transport [ ].
Protein Domain
Name: Plant specific mitochondrial import receptor subunit TOM20
Type: Family
Description: This family consists of several plant specific mitochondrial import receptor subunit TOM20 (translocase of outer membrane 20kDa subunit) proteins. Most mitochondrial proteins are encoded by the nuclear genome, and are synthesised in the cytosol. TOM20 is a general import receptor that binds to mitochondrial pre-sequences in the early step of protein import into the mitochondria [ ].
Protein Domain
Name: TMEM33/Pom33 family
Type: Family
Description: This entry represents the TMEM33/Pom33 family. Budding yeast Pom33 is a transmembrane nucleoporin that contributes to proper distribution and/or efficient assembly of nuclear pores [ ]. Proteins in this entry also include Tts1 from fission yeasts [], Kr-h2 (krueppel homologue 2) from flies [] and TMEM33 from vertebrates []. Tts1 is required for the correct positioning of the cellular division plane by delimiting the actomyosin ring assembly at the cell equator [ ]. Kr-h2 is a member of the dosage-dependent hierarchy effective upon white gene expression [ ]. TMEM33 regulates the tubular structure of endoplasmic reticulum by suppressing the membrane-shaping activity of reticulons [ ]. It was also demonstrated that TMEM33 regulates the unfolded protein response which is activated during endoplasmic reticulum stress [].
Protein Domain
Name: Protein of unknown function DUF2039
Type: Family
Description: This entry is a region of approximately 100 residues containing three pairs of cysteine residues. The region is conserved from plants to humans but its function is unknown.
Protein Domain
Name: FAD-binding domain
Type: Domain
Description: This domain is involved in FAD binding in a number of enzymes, including Kynurenine 3-monooxygenase from humans, which is related to neuroinflammatory conditions [ ].
Protein Domain
Name: Zeaxanthin epoxidase
Type: Family
Description: This entry represents the enzyme zeaxanthin epoxidase ( ), which is involved in the epoxidation of zeaxanthin as part of the biosynthesis of the plant hormone abscisic acid (ABA). ABA is a sesquiterpenoid (15-carbon) which is partially produced via the mevalonic pathway in chloroplasts and other plastids (therefore its biosynthesis primarily occurs in the leaves). The production of ABA is accentuated by stresses such as water loss and freezing temperatures. The enzyme zeaxanthin epoxidase converts zeaxanthin into antheraxanthin and subsequently into violaxanthin. This enzyme also acts on beta-cryptoxanthin. Zeaxanthin epoxidase plays an important role in resistance to stresses, seed development and dormancy [ ].
Protein Domain
Name: Phosphatidylinositol-4-phosphate 5-kinase
Type: Family
Description: This entry represents the type I phosphatidylinositol-4-phosphate 5-kinase (PIP5K) family as described in [ ]. PIP5K catalyses the formation of phosphatidylinositol-4,5-bisphosphate via the phosphorylation of phosphatidylinositol-4-phosphate, a precursor in the phosphoinositide signalling pathway [, ].
Protein Domain
Name: RuvB-like P-loop domain
Type: Domain
Description: The RuvB protein makes up part of the RuvABC resolvasome, which catalyses the resolution of Holliday junctions that arise during genetic recombination and DNA repair. Branch migration is catalysed by the RuvB protein, which is targeted to the Holliday junction by the structure-specific RuvA protein [ ]. This entry represents the N-terminal domain of the protein.
Protein Domain
Name: Initiation factor 2B-related
Type: Family
Description: Initiation factor 2 binds to Met-tRNA, GTP and the small ribosomal subunit. The eukaryotic translation initiation factor EIF-2B is a complex made up of five different subunits, alpha, beta, gamma, delta and epsilon, and catalyses the exchange of EIF-2-bound GDP for GTP. This family includes initiation factor 2B alpha, beta and delta subunits from eukaryotes, related proteins from archaebacteria and IF-2 from prokaryotes. It also contains a subfamily of proteins in eukaryotes, archaea (e.g. Pyrococcus furiosus), or eubacteria such as Bacillus subtilis and Thermotoga maritima. Many of these proteins were initially annotated as putative translation initiation factors despite the fact that there is no evidence for the requirement of an IF2 recycling factor in prokaryotic translation initiation. Recently, one of these proteins from B. subtilis has been functionally characterised as a 5-methylthioribose-1-phosphate isomerase (MTNA) [ ]. This enzyme participates in the methionine salvage pathway, catalysing the isomerisation of 5-methylthioribose-1-phosphate to 5-methylthioribulose-1-phosphate []. The methionine salvage pathway leads to the synthesis of methionine from methylthioadenosine, the end product of spermidine and spermine anabolism in many species.
Protein Domain
Name: SNARE-complex protein Syntaxin-18, N-terminal
Type: Domain
Description: This is the conserved N-terminal of Syntaxin-18. Syntaxin-18 is found in the SNARE complex of the endoplasmic reticulum and functions in the trafficking between the ER intermediate compartment and the cis-Golgi vesicle. In particular, the N-terminal region is important for the formation of ER aggregates [ ]. More specifically, syntaxin-18 is involved in endoplasmic reticulum-mediated phagocytosis, presumably by regulating the specific and direct fusion of the ER with the plasma or phagosomal membranes [].
Protein Domain
Name: CHORD domain
Type: Domain
Description: Cysteine- and histidine-rich domains (CHORDs) are 60-amino acid modules that bind two zinc ions. They are usually arranged in tandem and are found in all tested eukaryotes, with the exception of yeast, where they are involved in processes ranging from pressure sensing in the heart to maintenance of diploidy in fungi, and exhibit distinct protein-protein interaction specificity. Six cysteine and two histidine residues are invariant within the CHORD domain. Three other residues are also invariant and some positions are confined to positive, negative, or aromatic amino acids [ , ]. Silencing of the Caenorhabditis elegans CHORD-containing gene results in semisterility and embryo lethality, suggesting an essential function of the wild-type gene in nematode development. The CHORD domain is sometimes found N-terminal to the CS domain, , in metazoan proteins, but occurs separately from the CS domain in plants. This association is thought to be indicative of a functional interaction between CS and CHORD domains [ ].
Protein Domain
Name: GUCT
Type: Domain
Description: This is the C-terminal domain found in the RNA helicase II / Gu protein family [ ].
Protein Domain
Name: Dehydrin
Type: Family
Description: LEA (late embryogenesis abundant) proteins were first identified in land plants. Plant LEA proteins have been found to accumulate to high levels during the last stage of seed formation (when a natural desiccation of the seed tissues takes place) and during periods of water deficit in vegetative organs. Later, LEA homologues have also been found in various species [ , ]. They have been classified into several subgroups in Pfam and according to Bray and Dure []. Dehydrin has been classified as part of the LEA family (D-11 from Dure, or group 2 from Bray) [ ]. Dehydrins contribute to freezing stress tolerance in plants and it was suggested that this could be partly due to their protective effect on membranes []. Dehydrins share a number of structural features. One of the most notable features is the presence, in their central region, of a continuous run of five to nine serines followed by a cluster of charged residues. Such a region has been found in all known dehydrins so far with the exception of pea dehydrins. A second conserved feature is the presence of two copies of a lysine-rich octapeptide; the first copy is located just after the cluster of charged residues that follows the poly-serine region and the second copy is found at the C-terminal extremity.
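The "continuous run of five to nine serines" can be located with a bounded regular expression. The following is a rough Python sketch under stated assumptions: the function name serine_runs and the test sequence are hypothetical, and the charged cluster that follows the run is not modelled because the description does not define its length or composition.

```python
import re

# A serine run of exactly 5 to 9 residues. The lookbehind and lookahead
# stop the pattern from matching a slice of a longer poly-serine stretch.
SERINE_RUN = re.compile(r"(?<!S)S{5,9}(?!S)")


def serine_runs(sequence):
    """Return (0-based start, run length) for each 5-9 residue serine run."""
    return [(m.start(), len(m.group(0))) for m in SERINE_RUN.finditer(sequence.upper())]


if __name__ == "__main__":
    # Invented sequence: six serines followed by charged residues.
    print(serine_runs("AAASSSSSSEEKKAA"))
```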
Protein Domain
Name: Embryo-specific ATS3
Type: Family
Description: This is a family of plant seed-specific proteins identified in Arabidopsis thaliana (Mouse-ear cress). ATS3 (Arabidopsis thaliana seed gene 3) is expressed in a pattern similar to the Arabidopsis seed storage protein genes [ ].
Protein Domain
Name: Pentapeptide repeat
Type: Repeat
Description: These repeats were first identified in many cyanobacterial proteins but they are also found in bacterial as well as in plant proteins [ ]. The repeats were first identified in hglK [ ]. Pentapeptide repeat proteins (PRPs) are characterised by the repetition of the pentapeptide repeat motif [S,T,A,V][D,N][L,F][S,T,R][G], which allows them to adopt a right-handed β-helical conformation [ ]. The function of these repeats is unknown, but it has been shown that members of this family share the ability to interact with DNA-binding proteins, such as DNA gyrase. For example, McbG (from Escherichia coli) protects the DNA gyrase from microcin B17 toxicity, while MfpAMt (from Mycobacterium tuberculosis) and Qnr (from Klebsiella pneumoniae and other enterobacteria) are involved in resistance to fluoroquinolones [].
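The repeat motif [S,T,A,V][D,N][L,F][S,T,R][G] maps onto one regex character class per position, and tandem copies can be counted by repeating the group. The following is a minimal Python sketch: the function name tandem_runs and the example sequence are hypothetical, and real pentapeptide repeat proteins often carry imperfect repeats that this strict pattern would miss.

```python
import re

# One character class per motif position: [S,T,A,V][D,N][L,F][S,T,R][G].
MOTIF = r"[STAV][DN][LF][STR]G"
# One or more back-to-back copies of the motif.
TANDEM = re.compile(rf"(?:{MOTIF})+")


def tandem_runs(sequence):
    """Return (0-based start, number of consecutive repeats) for each run."""
    return [(m.start(), len(m.group(0)) // 5) for m in TANDEM.finditer(sequence.upper())]


if __name__ == "__main__":
    # Invented sequence with three exact repeats: ADLSG, TNFRG, SDLTG.
    print(tandem_runs("MKADLSGTNFRGSDLTGQQ"))
```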
Protein Domain
Name: Potassium channel tetramerisation-type BTB domain
Type: Domain
Description: This domain can be found at the N terminus of voltage-gated potassium channel proteins, where it represents a cytoplasmic tetramerisation domain (T1) involved in assembly of alpha-subunits into functional tetrameric channels [ ]. This domain can also be found in proteins that are not potassium channels, like KCTD1 (potassium channel tetramerisation domain-containing protein 1). KCTD1 is thought to be a nuclear protein that functions as a transcriptional repressor. In KCTD1, the T1-type BTB domain mediates homomeric protein-protein interactions [, ].
Protein Domain
Name: Glycosyl hydrolases family 2, sugar binding domain
Type: Domain
Description: Proteins containing this domain include beta-galactosidases, beta-mannosidases and beta-glucuronidases. The domain has a jelly-roll fold [ ].
Protein Domain
Name: Ribosomal protein L10P
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [, ]. On the basis of sequence similarities the following prokaryotic and eukaryotic ribosomal proteins can be grouped: Bacterial 50S ribosomal protein L10; Archaebacterial acidic ribosomal protein P0 homologue (L10E); Eukaryotic 60S ribosomal protein P0 (L10E). This entry represents the ribosomal protein L10P family, which includes the above-mentioned ribosomal proteins.
Protein Domain
Name: Peptidase C50, separase
Type: Family
Description: This group of cysteine peptidases belongs to MEROPS peptidase family C50 (separase family, clan CD). The active site residues for members of this family and family C14 occur in the same order in the sequence: H,C. The separases are caspase-like proteases, which play a central role in chromosome segregation. In yeast they cleave the rad21 subunit of the cohesin complex at the onset of anaphase. During most of the cell cycle, separase is inactivated by the securin/cut2 protein, which probably covers its active site. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families [ ]. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold.
There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases []. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues [ ]. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides.
The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel [ ]. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: Kinetochore protein Ndc80
Type: Family
Description: Members of this family are components of the mitotic spindle. Ndc80 acts as a component of the NMS (Ndc80-MIND-Spc7) super complex which has a role in kinetochore function during late meiotic prophase and throughout the mitotic cell cycle. It has been shown that Ndc80 from yeast is part of a complex called the Ndc80p complex. The four Ndc80 complex subunits associate as two rod-like heterodimers (Ndc80:Nuf2 and Spc24:Spc25). This complex is thought to bind to the microtubules of the spindle.
Protein Domain
Name: Quinolinate phosphoribosyl transferase, C-terminal
Type: Domain
Description: Quinolinate phosphoribosyl transferase (QPRTase) or nicotinate-nucleotide pyrophosphorylase is involved in the de novo synthesis of NAD in both prokaryotes and eukaryotes. It catalyses the reaction of quinolinic acid with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to give rise to nicotinic acid mononucleotide (NaMN), pyrophosphate and carbon dioxide. Unlike , this domain also includes the molybdenum transport system protein ModD.
Protein Domain
Name: Nicotinate phosphoribosyltransferase family
Type: Family
Description: Nicotinate phosphoribosyltransferase is the rate-limiting enzyme that catalyses the first reaction in the NAD salvage synthesis. Members of this family can be split into two further subfamilies. Members of one subfamily have a different (longer) spacing of several key motifs and have an additional C-terminal domain of up to 100 residues. However, one argument suggesting that this family represents the same enzyme is that no species has a member of both subfamilies. Another is that the gene encoding this protein is located near other NAD salvage biosynthesis genes in Nostoc and in at least four different Gram-positive bacteria.
Protein Domain
Name: Nicotinate phosphoribosyltransferase pncB-type
Type: Family
Description: A deep split separates two related families of proteins, one of which includes experimentally characterised examples of nicotinate phosphoribosyltransferase, the first enzyme of NAD salvage biosynthesis. This entry represents the other family. Members have a different (longer) spacing of several key motifs and have an additional C-terminal domain of up to 100 residues. One argument suggesting that this family represents the same enzyme is that no species has a member of both families. Another is that the gene encoding this protein is located near other NAD salvage biosynthesis genes in Nostoc and in at least four different Gram-positive bacteria. NAD and NADP are ubiquitous in life. Most members of this family are from Gram-positive bacteria. An additional set of mutually closely related archaeal sequences score between the trusted and noise cut-offs. This entry includes pncB1 and pncB2 from Mycobacterium tuberculosis, which also play a role in NAD salvage synthesis.
Protein Domain
Name: Lanthionine synthetase C-like
Type: Family
Description: The LanC-like protein superfamily encompasses a highly divergent group of peptide-modifying enzymes, including the eukaryotic and bacterial lanthionine synthetase C-like proteins (LanC); subtilin biosynthesis protein SpaC from Bacillus subtilis; epidermin biosynthesis protein EpiC from Staphylococcus epidermidis; nisin biosynthesis protein NisC from Lactococcus lactis; GCR2 from Arabidopsis thaliana; and many others. The 3D structure of the lantibiotic cyclase from L. lactis has been determined by X-ray crystallography to 2.5 Å resolution. The globular structure is characterised by an all-α fold, in which an outer ring of helices envelops an inner toroid composed of 7 shorter, hydrophobic helices. This 7-fold hydrophobic periodicity has led several authors to claim various members of the family, including eukaryotic LanC-1 and GCR2, to be novel G protein-coupled receptors; some of these claims have since been corrected. The C terminus of the lantibiotic biosynthesis protein LanM is homologous to LanC. LanC-like protein 1 (included in this family) has been shown to function as a glutathione transferase, and as such has been renamed Glutathione S-transferase LANCL1.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom