Search results 3101 to 3200 out of 38750 for *

Category restricted to ProteinDomain

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Syndetin, C-terminal
Type: Domain
Description: The function of this domain is not known, but it is found at the C terminus of syndetin (VPS50), a unique component of the endosome-associated retrograde protein (EARP) complex. The EARP complex otherwise shares four of its five subunits with the Golgi-associated retrograde protein (GARP) complex. The EARP complex is localized to recycling endosomes where it acts as a tethering complex for recycling of plasma membrane receptors [ ].
Protein Domain
Name: Vacuolar protein sorting-associated protein 54, N-terminal
Type: Domain
Description: This entry represents a domain found in vacuolar protein sorting-associated protein 54 (VPS54), which acts as a component of the GARP complex that is involved in retrograde transport from early and late endosomes to the trans-Golgi network (TGN). VPS54 is required to tether the complex to the TGN. However, it is not involved in endocytic recycling [ ]. Proteins containing this domain also include VPS50 (also known as Syndetin), which is a component of the EARP complex that is involved in endocytic recycling. It is required to tether the EARP complex to recycling endosomes. However, it is not involved in retrograde transport from early and late endosomes to the TGN [ ].
Protein Domain
Name: Ribonuclease H1, N-terminal
Type: Domain
Description: This entry represents the N-terminal domain of RNase HI, which has a 3-layer alpha/beta/alpha structure [ ]. This domain is lacking in retroviral and prokaryotic enzymes, but shows a striking structural similarity to the ribosomal protein L9 N-terminal domain, and may function as a regulatory RNA-binding module. However, the topology of this domain differs from structures of known RNA binding domains such as the double-stranded RNA binding domain (dsRBD), the hnRNP K homology (KH) domain and the RNP motif. Eukaryotic RNases HI possess either one or two copies of this small N-terminal domain, in addition to the well-conserved catalytic RNase H domain. RNase HI belongs to the family of ribonuclease H enzymes that recognise RNA:DNA hybrids and degrade the RNA component.
Protein Domain
Name: Transmembrane protein 147
Type: Family
Description: TMEM147 is a component of the Nicalin-NOMO protein complex, which catalyzes the proteolytic cleavage of the transmembrane domain of various proteins including the beta-amyloid precursor protein and Notch [ ].
Protein Domain
Name: Cyclin-dependent kinase, regulatory subunit
Type: Family
Description: In eukaryotes, cyclin-dependent protein kinases interact with cyclins to regulate cell cycle progression, and are required for the G1 and G2 stages of cell division []. The proteins bind to a regulatory subunit, cyclin-dependent kinase regulatory subunit (CKS), which is essential for their function. This regulatory subunit is a small protein of 79 to 150 residues. In budding yeast (gene CKS1) and in fission yeast (gene suc1) a single isoform is known, while mammals have two highly related isoforms. The regulatory subunits exist as hexamers, formed by the symmetrical assembly of 3 interlocked homodimers, creating an unusual 12-stranded β-barrel structure []. Through the barrel centre runs a 12 Å diameter tunnel, lined by 6 exposed helix pairs [ ]. Six kinase units can be modelled to bind the hexameric structure, which may thus act as a hub for cyclin-dependent protein kinase multimerisation [, ].
Protein Domain
Name: Pheophorbide a oxygenase
Type: Domain
Description: This domain is found in bacterial and plant proteins to the C terminus of a Rieske 2Fe-2S domain ( ). One of the proteins the domain is found in is Pheophorbide a oxygenase (PaO) which seems to be a key regulator of chlorophyll catabolism. Arabidopsis PaO (AtPaO) is a Rieske-type 2Fe-2S enzyme that is identical to Arabidopsis accelerated cell death 1 and homologous to lethal leaf spot 1 (LLS1) of maize [ ], in which the domain described here is also found.
Protein Domain
Name: Glycosyl transferase, family 28, C-terminal
Type: Domain
Description: The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates (EC 2.4.1.-) and related proteins into distinct sequence based families has been described [ ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'.
Glycosyltransferase family 28 comprises enzymes with a number of known activities: 1,2-diacylglycerol 3-beta-galactosyltransferase ( ); 1,2-diacylglycerol 3-beta-glucosyltransferase ( ); beta-N-acetylglucosamine transferase ( ). Structural analysis suggests the C-terminal domain contains the UDP-GlcNAc binding site.
Protein Domain
Name: Diacylglycerol glucosyltransferase, N-terminal
Type: Domain
Description: This entry represents a conserved region of approximately 180 residues found towards the N terminus of a number of glycosyltransferases, such as plant chloroplastic monogalactosyldiacylglycerol synthases [ ] and bacterial diacylglycerol glucosyltransferases [].
Protein Domain
Name: TMEM85/ER membrane protein complex subunit 4
Type: Family
Description: This entry includes TMEM85 from mammals and Emc4 from budding yeasts. They inhibit hydrogen peroxide mediated cell death in yeast [ ]. Emc4 is part of the ER membrane complex (EMC) that may play a role in protein folding [].
Protein Domain
Name: Uncharacterised domain CHP00451
Type: Domain
Description: This uncharacterised domain is found in a number of enzymes and uncharacterised proteins, often at the C terminus. It is found in some but not all members of a family of related tRNA-guanine transglycosylases (tgt), which exchange a guanine base for some modified base without breaking the phosphodiester backbone of the tRNA. It is also found in rRNA pseudouridine synthase, another enzyme of RNA base modification not otherwise homologous to tgt. It is found, again at the C terminus, in two putative glutamate 5-kinases. It is also found in a family of small, uncharacterised archaeal proteins consisting mostly of this domain.
Protein Domain
Name: Dyskerin-like
Type: Domain
Description: This is an N-terminal domain of dyskerin-like proteins, which is often associated with the TruB N-terminal ( ) and PUA ( ) domains [ ].
Protein Domain
Name: tRNA pseudouridine synthase B family
Type: Family
Description: This family, found in archaea and eukaryotes, includes the only archaeal proteins markedly similar to bacterial TruB, the tRNA pseudouridine 55 synthase. However, among two related yeast proteins, the archaeal set matches yeast YLR175w far better than YNL292w. The first, termed centromere/microtubule binding protein 5 (CBF5), is an apparent rRNA pseudouridine synthase, while the second is the exclusive tRNA pseudouridine 55 synthase for both cytosolic and mitochondrial compartments. It is unclear whether archaeal proteins found by this entry modify tRNA, rRNA, or both. Yeast CBF5 plays a central role in ribosomal RNA processing. It is a probable catalytic subunit of the H/ACA small nucleolar ribonucleoprotein (H/ACA snoRNP) complex, which catalyzes pseudouridylation of rRNA. This involves the isomerization of uridine such that the ribose is subsequently attached to C5, instead of the normal N1. The resulting pseudouridine ('psi') residues may serve to stabilise the conformation of rRNAs. It may function as a pseudouridine synthase. It is also a centromeric DNA-CBF3-binding factor which is involved in mitotic chromosome segregation [ , , , ]. The human CBF5 homologue, DKC1 (also called Dyskerin), has been implicated in a variety of disparate cellular functions. DKC1 isoform 1 is required for correct processing or intranuclear trafficking of TERC, the RNA component of the telomerase reverse transcriptase (TERT) holoenzyme [ ]. In HeLa cells, overexpression of DKC1 isoform 3 promotes cell to cell and cell to substratum adhesion, increases the cell proliferation rate and leads to cytokeratin hyper-expression []. Mutations in the human DKC1 gene cause the X-linked form of dyskeratosis congenita (DC), a bone marrow failure syndrome characterised by mucosal leukoplakia, nail dystrophy, abnormal skin pigmentation, premature aging, stem cell dysfunction and increased susceptibility to cancer. DKC1 loss of function also causes the Hoyeraal-Hreidarsson syndrome, recognised as a severe X-DC allelic variant [, , , , , ].
Protein Domain
Name: 5'(3')-deoxyribonucleotidase
Type: Family
Description: This family consists of several 5' nucleotidase, deoxy (Pyrimidine), and cytosolic type C (NT5C) proteins. 5'(3')-deoxyribonucleotidase is a ubiquitous enzyme in mammalian cells whose physiological function is not known [ ].
Protein Domain
Name: Sas10/Utp3/C1D
Type: Family
Description: This entry represents Something about silencing protein 10 (SAS10, also known as UTP3) and U3 small nucleolar ribonucleoprotein protein LCP5 which are components of the U3 ribonucleoprotein complex []. It also includes Nuclear nucleic acid-binding protein C1D from Mus musculus (Mouse), which plays a role in the recruitment of the RNA exosome complex to pre-rRNA to mediate the 3'-5' end processing of the 5.8S rRNA [], Protein THALLO from Arabidopsis thaliana, essential during embryogenesis [, ] and the human protein Neuroguidin, an initiation factor 4E (eIF4E)-binding protein [].
Protein Domain
Name: Sas10 C-terminal domain
Type: Domain
Description: Sas10 is an essential subunit of the U3-containing small subunit (SSU) processome complex, which is involved in the production of the 18S rRNA and assembly of the small ribosomal subunit [ ].
Protein Domain
Name: Aspartate-tRNA synthetase, type 2
Type: Family
Description: Aspartyl tRNA synthetase is an alpha2 dimer that belongs to class IIb. Structural analysis combined with mutagenesis and enzymology data on the yeast enzyme point to a tRNA binding process that starts by a recognition event between the tRNA anticodon loop and the synthetase anticodon binding module [ ]. This family represents a group of aspartyl-tRNA synthetases from the eukaryotic cytosol, from archaea and from some bacteria. In some species, this enzyme aminoacylates tRNA for both Asp and Asn; Asp-tRNA (asn) is subsequently transamidated to Asn-tRNA (asn).
Protein Domain
Name: Romo1/Mgr2
Type: Family
Description: This entry includes a group of mitochondrial proteins, including reactive oxygen species modulator 1 (Romo1) from animals and Mgr2 from fungi. Budding yeast Mgr2 is a subunit of the TIM23 translocase complex, which translocates preproteins into and across the membrane and associates with the matrix-localized import motor. It is required for binding of Tim21 to TIM23(CORE). Mgr2 is essential for cell growth at elevated temperature and for efficient protein import [ ]. Romo1 is responsible for increasing the level of ROS in cells. In various cancer cell lines with elevated levels of ROS there is also an increased abundance of Romo1 []. Increased Romo1 expression can have a number of other effects, including inducing premature senescence of cultured human fibroblasts [, ] and increased resistance to 5-fluorouracil [].
Protein Domain
Name: RNA polymerase, Rpb5, N-terminal
Type: Domain
Description: Prokaryotes contain a single DNA-dependent RNA polymerase (RNAP; ) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) that transcribe different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. Certain subunits of RNAPs, including RPB5 (POLR2E in mammals), are common to all three eukaryotic polymerases. RPB5 plays a role in the transcription activation process. Eukaryotic RPB5 has a bipartite structure consisting of a unique N-terminal region, plus a C-terminal region that is structurally homologous to the prokaryotic RPB5 homologue, subunit H (gene rpoH) ( ) [ , , , ].
This entry represents the N-terminal domain of eukaryotic RPB5, which has a core structure consisting of 3 layers alpha/beta/alpha [ ]. The N-terminal domain is involved in DNA binding and is part of the jaw module in the RNA pol II structure []. This module is important for positioning the downstream DNA.
Protein Domain
Name: DNA-directed RNA polymerase subunit Rpo5/Rpb5
Type: Family
Description: Eukaryotic Rpb5 is a common component of RNA polymerases I, II and III. Archaeal Rpo5 (also known as subunit H) is a homologue of Rpb5 [ ]. Rpb5 has a bipartite structure consisting of a unique N-terminal region, plus a C-terminal region that is structurally homologous to Rpo5 [, , , ].
DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme []. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length [ ]. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.
RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:
  • RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs.
  • RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors.
  • RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs.
Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.
Protein Domain
Name: Atg6/Beclin
Type: Family
Description: In yeasts, vacuolar protein sorting-associated protein 30 (Vps30), also known as autophagy-related protein 6 (Atg6), is a common component of two distinct phosphatidylinositol 3-kinase complexes. In complex I, Atg14 links Vps30 to Vps34 lipid kinase and plays a specific role in autophagy, while in complex II, Vps38 links Vps30 to Vps34 and plays an important role in vacuolar protein sorting [ ]. The C-terminal of Vps30 contains a globular fold comprised of three β-sheet-α-helix repeats (also known as beta-alpha repeated, autophagy-specific (BARA) domain) and is required for autophagy through the targeting of complex I to the pre-autophagosomal structure. The N-terminal of Vps30 is required for vacuolar protein sorting []. Beclin, the mammalian homologue of yeast Atg6/Vps30, is a tumour suppressor that coordinately regulates the autophagy and membrane trafficking involved in several physiological and pathological processes [ , ].
Protein Domain
Name: Sterile alpha motif/pointed domain superfamily
Type: Homologous_superfamily
Description: Sterile alpha motif (SAM) domains are known to be involved in diverse protein-protein interactions, associating with both SAM-containing and non-SAM-containing proteins [ ]. SAM domains exhibit a conserved structure, consisting of a 4-5-helical bundle of two orthogonally packed alpha-hairpins. However, SAM domains display a diversity of function, being involved in interactions with proteins, DNA and RNA []. The name sterile alpha motif arose from its presence in proteins that are essential for yeast sexual differentiation. The SAM domain has had various names, including SPM, PTN (pointed), SEP (yeast sterility, Ets-related, PcG proteins), NCR (N-terminal conserved region) and HLH (helix-loop-helix) domain, all of which are related and can be classified as SAM domains.
SAM domains occur in eukaryotic and in some bacterial proteins. Structures have been determined for several proteins that contain SAM domains, including Ets-1 transcription factor, which plays a role in the development and invasion of tumour cells by regulating the expression of matrix-degrading proteases [ ]; Etv6 transcription factor, gene rearrangements of which have been demonstrated in several malignancies []; EphA4 receptor tyrosine kinase, which is believed to be important for the correct localization of a motoneuron pool to a specific position in the spinal cord []; EphB2 receptor, which is involved in spine morphogenesis via intersectin, Cdc42 and N-Wasp []; p73, a p53 homologue involved in neuronal development []; and polyhomeotic, which is a member of the Polycomb group of genes (Pc-G) required for the maintenance of the spatial expression pattern of homeotic genes [].
Protein Domain
Name: Protein OS9-like domain
Type: Domain
Description: This entry represents a domain found in the OS9 protein, which is a lectin that functions in endoplasmic reticulum (ER) quality control and ER-associated degradation (ERAD) [ ]. The sequences of this domain are similar to a region found in the beta-subunit of glucosidase II (), which is also known as protein kinase C substrate 80K-H (PRKCSH).
Protein Domain
Name: Phosphoinositide-specific phospholipase C, EF-hand-like domain
Type: Domain
Description: This domain is predominantly found in the enzyme phosphoinositol-specific phospholipase C. It adopts a structure consisting of a core of four α-helices, in an EF like fold, and is required for functioning of the enzyme [ ].
Protein Domain
Name: Phosphoinositide phospholipase C family
Type: Family
Description: This entry represents phosphoinositol-specific phospholipase C (PLC) from eukaryotes. Proteins in this entry include PLC-beta, gamma, delta, epsilon, eta, zeta and inactive phospholipase C-like protein 2 (PLC-L2). Phosphoinositol-specific phospholipase C (PLC) ( ) plays an important role in signal transduction processes [ ], mediating the cellular actions of a variety of hormones, neurotransmitters and growth factors. Upon agonist-dependent activation, PLC catalyses the hydrolysis of membrane phosphatidylinositol 4,5-bisphosphate (PIP2), generating the second messengers inositol 1,4,5-trisphosphate (IP3) and diacylglycerol (DAG). IP3 binds specific intracellular receptors to trigger Ca2+ mobilisation, while DAG mediates activation of a family of protein kinase C isozymes. This catalytic process is tightly regulated by reversible phosphorylation and binding of regulatory proteins [ , , ]. Based on molecular size, immunoreactivity and amino acid sequence, several subtypes have been classified. Overall, sequence identity between sub-types is low, yet all isoforms share a split TIM barrel containing two conserved domains, designated X and Y []. The core eukaryotic PLC enzyme is composed of a pleckstrin homology (PH) domain, four tandem EF hand domains, a split TIM barrel, and a C2 domain [ ]. The presence of an insert in the TIM barrel led to the naming of the N- and C-terminal halves of the TIM barrel as 'X-box' and 'Y-box'. The order of these two regions is always the same (NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance between these two regions is only 50-100 residues; for example, in PLC-beta subtypes, X and Y domains are separated by a stretch of 70-120 amino acids rich in Ser, Thr and acidic residues (their C terminus is rich in basic residues). However, in PLC-gammas, there is an insert of more than 400 residues containing a PH domain, two SH2 domains, and one SH3 domain. The two conserved X and Y domains have been shown to be important for the catalytic activity. C-terminal to the Y-box, there is a C2 domain, possibly involved in Ca-dependent membrane attachment.
Protein Domain
Name: Bifunctional purine biosynthesis protein PurH-like
Type: Family
Description: This is a family of bifunctional enzymes catalysing the last two steps in de novo purine biosynthesis. The bifunctional enzyme is found in both prokaryotes and eukaryotes. The second-last step is catalysed by 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase (AICARFT); this enzyme catalyses the formylation of AICAR with 10-formyl-tetrahydrofolate to yield FAICAR and tetrahydrofolate [ ]. The last step is catalysed by IMP (inosine monophosphate) cyclohydrolase (IMPCHase), which cyclizes FAICAR (5-formylaminoimidazole-4-carboxamide ribonucleotide) to IMP [ ].
Protein Domain
Name: AICAR transformylase, duplicated domain superfamily
Type: Homologous_superfamily
Description: This domain is found in the enzyme 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase (AICARFT). AICARFT can be part of a bifunctional enzyme that catalyses the last two steps in de novo purine biosynthesis and also has an IMP (inosine monophosphate) cyclohydrolase (IMPCHase) functional domain. The bifunctional enzyme forms an intertwined dimer where each monomer is composed of two separate functional domains. This superfamily represents a duplicated domain in AICARFT consisting of a three-layer α-β-α sandwich [ ].
Protein Domain
Name: Cytochrome c oxidase, subunit VIb
Type: Family
Description: Cytochrome c oxidase ( ) is an oligomeric enzymatic complex that is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen [ ]. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane.
In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptide subunits. One of these subunits is the potentially haem-binding subunit, VIb, which is encoded in the nucleus [ ]. Subunit VIb is one of three mammalian subunits that lack a transmembrane region. It is located on the cytosolic side of the membrane and helps form the dimer interface with the corresponding subunit on the other monomer complex [, ].
Protein Domain
Name: Domain of unknown function DUF1995
Type: Domain
Description: This entry includes proteins from bacteria and plants. This domain can be found in a chloroplastic protein, LOW PSII ACCUMULATION 3 (LPA3, At1g73060), from Arabidopsis. LPA3 is involved in assisting chlorophyll a binding protein psbC assembly within photosystem II (PSII) [ ]. This domain can also be found in some putative adenylate kinases, such as At5g35170 [] from Arabidopsis and Os08g0288200 from rice.
Protein Domain
Name: MD-2-related lipid-recognition domain
Type: Domain
Description: The MD-2-related lipid-recognition (ML) domain is implicated in lipid recognition, particularly in the recognition of pathogen related products. It has an immunoglobulin-like β-sandwich fold similar to that of E-set Ig domains. This domain is present in proteins from plants, animals and fungi, including the following proteins:
  • Epididymal secretory protein E1 (also known as Niemann-Pick C2 protein - Npc2), which is known to bind cholesterol. Niemann-Pick disease type C2 is a fatal hereditary disease characterised by accumulation of low-density lipoprotein-derived cholesterol in lysosomes [ ].
  • House-dust mite allergen proteins such as Der f 2 from Dermatophagoides farinae and Der p 2 from Dermatophagoides pteronyssinus [ ].
Protein Domain
Name: GTP-binding protein, orthogonal bundle domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents an orthogonal bundle domain found in some GTP-binding proteins, including YlgF from Bacillus subtilis.
Protein Domain
Name: Serine/threonine protein phosphatase, BSU1
Type: Family
Description: This entry represents a group of plant serine/threonine protein phosphatases, including Arabidopsis BSU1 and BSU1-like proteins (BSLs) [ ]. AtBSU1 is a phosphatase that acts as a positive regulator of brassinosteroid (BR) signalling [, ]. This entry also includes putative serine/threonine-protein phosphatases from Plasmodium and green algae.
Protein Domain
Name: Kelch repeat type 2
Type: Repeat
Description: Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified [ ]. This sequence motif represents one β-sheet blade, and several of these repeats can associate to form a β-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein (also known as ring canal kelch protein), creating a 6-bladed β-propeller. The motif is also found in mouse protein MIPP [] and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin [, ], and in galactose oxidase from the fungus Dactylium dendroides [, ]. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded antiparallel β-sheet motif that forms the repeat unit in a super-barrel structural fold [].
The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila [ ]. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase [].
This entry represents a type of kelch sequence motif that comprises one β-sheet blade.
Protein Domain
Name: RLI1
Type: Family
Description: This entry represents the ABCE family of ATP-binding cassette (ABC) transporters, which comprises solely the ABCE1 gene product (also known as RNase L inhibitor, RLI1) [ , , ]. RLI1 contains 2 nucleotide-binding domains (NBDs) typical of the ABC transporter protein superfamily []; however, it lacks the transmembrane domains required for membrane transport functions [, ]. RLI1 was first identified as an endoribonuclease inhibitor that interacts directly with RNase L to prevent it from binding 2-5A (5'-phosphorylated 2',5'-linked oligoadenylates) []. RNase L plays a major role in the anti-viral and anti-proliferative activities of interferons, and its inhibition by RLI1 occurs in a concentration-dependent manner [, ]. Recently, RLI1 has been shown to be essential for the assembly of immature HIV-1 capsids in insect cells and higher eukaryotic cell types []. RLI1 expression is induced during HIV type I infection, and is understood to bind HIV-1 Gag (p55) polypeptides following their translation, and to promote their assembly into immature HIV-1 capsids [, , ].
Protein Domain
Name: RNase L inhibitor RLI-like, possible metal-binding domain
Type: Domain
Description: This is part of a possible metal-binding domain in endoribonuclease RNase L inhibitor. It is found at the N-terminal end of RNase L inhibitor proteins, adjacent to the 4Fe-4S binding domain ( ). It is also often found adjacent to ( ) in uncharacterised proteins. The RNase L system plays a major role in the anti-viral and anti-proliferative activities of interferons [ ]. It is a component of the multifactor complex (MFC) involved in translation initiation and it is required for the processing and nuclear export of the 60S and 40S ribosomal subunits, playing a role in ribosome biogenesis [, ]. Inhibitory activity requires concentration-dependent association of RLI with RNase L [ ].
Protein Domain
Name: La protein, xRRM domain
Type: Domain
Description: This entry represents the atypical RRM, named xRRM, found in La and La-related proteins (LaRPs). They belong to an ancient superfamily of proteins that are conserved in nearly all eukaryotes, except Plasmodium. These proteins are broadly involved in critical processes of RNA use and metabolism in the nucleus and the cytoplasm. The LaRP superfamily is distinguished by a conserved bipartite RNA-binding unit called the La-module, composed of a Lupus antigen motif (LaM) followed by an RNA-Recognition motif (RRM). Beyond this, each LaRP family is characterized by distinct family specific domains and motifs that contribute to structure and function. Genuine La and La-related proteins group 7 (LARP7) bind to the non-coding RNAs transcribed by RNA polymerase III (RNAPIII), which end in UUU. The La-module of these proteins binds the UUU-3'OH, protecting the RNA from degradation, while other domains may be important for RNA folding or other functions. The La and LaRP7 protein families have a C-terminal domain that contains a novel class of atypical RRM, named xRRM (for atypical RRM with extended alpha3), which uses a unique mode of single- and double-strand RNA binding [ , , , , ]. The overall fold of the xRRM is an RRM, but with several atypical features. Unusual features of the xRRM include the absence of conserved RNP1 and RNP2 aromatic sequences on the beta3 and beta1 strands, respectively, typically involved in nucleotide recognition; the presence of an additional helix alpha3 that lies across the β-sheet surface, where single-stranded nucleotides usually bind; and a C-terminal tail required for RNA binding that is disordered in the free xRRM but forms an alpha3 extension (alpha3x) on binding RNA. The front face of the xRRM consists of an antiparallel β-sheet with helix alpha3 lying across the β-sheet perpendicular to the β-strand axis. The back side of the xRRM consists of alpha helices. The xRRM interacts with both single- and double-stranded RNA using the β-sheet surface and the C-terminal tail, which forms a helical extension of alpha3 (alpha3x) that binds to the RNA major groove [ , , , , ].
Protein Domain
Name: DFDF domain
Type: Domain
Description: Sm and Sm-like proteins of the RNA-binding Lsm (like Sm) domain family are found in all domains of life and are generally involved in important RNA-processing tasks. Lsm13-16 homologs share a domain organisation consisting of a divergent N-terminal Lsm domain and a central or C-terminal consensus motif DFDF-x(7)-F closely preceded and followed by further phenylalanines and charged aspartates/glutamates and arginines/lysines/histidines. The variable seven-residue tract of this consensus motif usually contains an asparagine at the third or fourth position, except for one sequence where the asparagine is replaced by a glycine. In a few other sequences, the DFDF box is replaced by a DYDF or EFDF box []. The DFDF domain is a heterodimerization domain, which adopts a helical conformation upon binding. It folds into two consecutive alpha helices that are preceded and connected by the FDF and a related FDK sequence [ ].
Two other strongly conserved FFD box and TFG box sequence motifs, Y-x-K-x(3)-FFD-x-[IL]-S and [RKH]-x(2,5)-E-x(0-2)-[RK]-x(3,4)-[DE]-TFG, contained in Lsm13-15, but not Lsm16, homologs succeed the DFDF-x(7)-F motif and are also predicted to be of helical nature [].
This entry represents the DFDF domain.
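The consensus motifs quoted in this description use a PROSITE-like notation (x for any residue, brackets for alternatives, parentheses for repeat counts). As a rough illustration only, the sketch below translates such a motif into a Python regular expression so that it can be tested against a sequence. The function name motif_to_regex and the toy sequence are hypothetical and not part of the database entry, and the translation assumes that repeat counts follow only x or a single residue.

```python
import re

# One motif element: a bracketed residue set or a run of letters,
# optionally followed by a repeat count such as (7), (2,5) or (0-2).
_TOKEN = re.compile(r"(\[[A-Z]+\]|[A-Za-z]+)(?:\((\d+)(?:[,-](\d+))?\))?")


def motif_to_regex(motif: str) -> str:
    """Translate a PROSITE-like motif string into a regular expression."""
    parts = []
    # Hyphens between elements are simply skipped by findall.
    for symbol, low, high in _TOKEN.findall(motif.replace(" ", "")):
        unit = "." if symbol == "x" else symbol  # x matches any residue
        if low and high:
            unit += "{%s,%s}" % (low, high)      # range, e.g. x(2,5)
        elif low:
            unit += "{%s}" % low                 # fixed count, e.g. x(7)
        parts.append(unit)
    return "".join(parts)


dfdf = motif_to_regex("DFDF-x(7)-F")
ffd = motif_to_regex("Y-x-K-x(3)-FFD-x-[IL]-S")
tfg = motif_to_regex("[RKH]-x(2,5)-E-x(0-2)-[RK]-x(3,4)-[DE]-TFG")

# Hypothetical toy sequence, used only to show the pattern in action.
toy_sequence = "MKDFDFAAANAAAFGG"
match = re.search(dfdf, toy_sequence)
print(dfdf, match.start() if match else None)
```

A plain regular expression ignores the flanking phenylalanines and charged residues mentioned above, so a match here is only a first-pass filter and not a domain assignment.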
Protein Domain
Name: FFD box
Type: Domain
Description: Sm and Sm-like proteins of the RNA-binding Lsm (like Sm) domain family are found in all domains of life and are generally involved in important RNA-processing tasks. Lsm13-16 homologues share a domain organisation consisting of a divergent N-terminal Lsm domain and a central or C-terminal consensus motif DFDF-x(7)-F. In a few other sequences, the DFDF box is replaced by a DYDF or EFDF box [ ]. The FFD box and TFG box are two other strongly conserved sequence motifs (Y-x-K-x(3)-FFD-x-[IL]-S and [RKH]-x(2,5)-E-x(0-2)-[RK]-x(3,4)-[DE]-TFG, respectively) contained in Lsm13-15, but not Lsm16, homologues. They succeed the DFDF-x(7)-F motif and are also predicted to be of helical nature [ ]. This entry represents the FFD box.
Protein Domain
Name: Herpesvirus UL92
Type: Family
Description: Members of this family are functionally uncharacterised proteins from herpesviruses. The N terminus of these proteins contain 6 conserved cysteines and histidines that might form a zinc binding domain.
Protein Domain
Name: Conserved oligomeric Golgi complex subunit 6
Type: Family
Description: COG6 is a component of the peripheral membrane COG (conserved oligomeric Golgi) complex that is involved in intra-Golgi protein trafficking. COG is located at the cis-Golgi, regulates tethering of retrograde intra-Golgi vesicles and is required for normal Golgi morphology and localisation [ , , ]. COG subunits belong to the CATCHR (complexes associated with tethering containing helical rods) family, which includes subunits of the GARP/EARP, exocyst, and Dsl1 complexes. All are evolutionarily related and share a conserved structural fold consisting of α-helical bundles in tandem at the C terminus and a coiled-coil region at the N terminus [, ].
Protein Domain
Name: Thioredoxin reductase
Type: Family
Description: Reactive oxygen species (ROS) are known mediators of intracellular signalling cascades. Excessive production of ROS may, however, lead to oxidative stress, loss of cell function, and ultimately apoptosis or necrosis. A balance between oxidant and antioxidant intracellular systems is hence vital for cell function, regulation, and adaptation to diverse growth conditions. Thioredoxin reductase (TrxR) in conjunction with thioredoxin (Trx) is a ubiquitous oxidoreductase system with antioxidant and redox regulatory roles. Thioredoxin reductase ( ) reduces oxidised thioredoxin in the presence of NADPH. Reduced thioredoxin serves as an electron donor for thioredoxin peroxidase, which consequently reduces H2O2 to H2O. In mammals, extracellular forms of Trx also have cytokine-like effects. Mammalian TrxR has a highly reactive active site selenocysteine residue resulting in a profound reductive capacity, reducing several substrates in addition to Trx [ ].
Protein Domain
Name: Pyridine nucleotide-disulphide oxidoreductase, class-II, active site
Type: Active_site
Description: The pyridine nucleotide-disulphide reductases (PNDR) use the isoalloxazine ring of FAD to shuttle reducing equivalents from NAD(P)H to a Cys residue that is usually a part of a redox-active disulphide bridge. In a second step, the reduced disulphide reduces the substrate. On the basis of sequence and structural similarities [ ], PNDR can be categorised into 2 groups. Class II includes: prokaryotic and eukaryotic thioredoxin reductases [ , ]; bacterial alkyl hydroperoxide reductases [ ]; bacterial NADH:dehydrogenases [ ]; a probable oxidoreductase encoded in the Clostridium pasteurianum rubredoxin operon []; and yeast hypothetical protein YHR106w. The 3D structure of Escherichia coli thioredoxin reductase (TR) has been solved [, ]. The protein exists as a homodimer, with 3 domains per monomer, which correspond to the FAD-binding, NAD(P)H-binding and central domains of glutathione reductase (GR) (cf. signature PNDRDTASEI). However, TR lacks the domain that provides the dimer interface in GR, and forms a completely different dimeric structure. The relative orientation of these domains is very different in the 2 enzymes: when the FAD-binding domains of TR and GR are superimposed, the NADPH-binding domain of one is rotated by 66 degrees with respect to the other. The FAD- and NAD(P)H-binding domains have a similar doubly-wound alpha/beta fold, suggesting they evolved by gene duplication []. While in GR the redox-active disulphide is located in the FAD-binding domain, in TR it lies in the NADPH-binding domain. This suggests that the enzymes diverged from an ancestral nucleotide-binding protein and acquired their disulphide reductase activities independently [ ]. The sequence around the two cysteines involved in the redox-active disulphide bond is conserved, and is covered by this pattern.
Protein Domain
Name: Man1/Src1, C-terminal
Type: Domain
Description: MAN1 is an integral protein of the inner nuclear membrane which binds to chromatin associated proteins and plays a role in nuclear organisation. The C-terminal nucleoplasmic region forms a DNA binding winged helix and binds to Smad [ ]. This C-terminal tail is also found in S. cerevisiae and is thought to consist of three conserved helices followed by two downstream strands [].
Protein Domain
Name: RNA polymerase Rpb4/RPC9, core
Type: Domain
Description: DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme []. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length [ ]. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I, located in the nucleoli, synthesises precursors of most ribosomal RNAs; RNA polymerase II, which occurs in the nucleoplasm, synthesises mRNA precursors; RNA polymerase III, which also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs. Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases.
Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. A major role in the regulation of eukaryotic protein-coding genes is played by the gene-specific transcriptional regulators, which recruit the RNA polymerase II holoenzyme to the specific promoter. The Rpb4 and Rpb7 subunits of yeast RNA polymerase II form a heterodimeric complex essential for promoter-directed transcription initiation. The Rpb4-Rpb7 complex is not required for stable recruitment of polymerase to active preinitiation complexes, suggesting that Rpb4-Rpb7 mediates an essential step subsequent to promoter binding [ ]. This entry represents a domain present in DNA-directed RNA polymerase II subunit Rpb4 and DNA-directed RNA polymerase III subunit RPC9.
Protein Domain
Name: RNA polymerase subunit Rpb4/RPC9
Type: Family
Description: This entry includes RNA polymerase II subunit Rpb4 and its paralogue, RPC9 (also known as Rpc17 in budding yeasts), a subunit of RNA polymerase III [ ]. The Rpb4 and Rpb7 subunits of yeast RNA polymerase II form a heterodimeric complex essential for promoter-directed transcription initiation. The Rpb4-Rpb7 complex is not required for stable recruitment of polymerase to active preinitiation complexes, suggesting that Rpb4-Rpb7 mediates an essential step subsequent to promoter binding []. RPC9 may be involved in the recruitment of pol III by the preinitiation complex [, ].
Protein Domain
Name: SKP1 component, dimerisation
Type: Domain
Description: SKP1 (together with SKP2) was identified as an essential component of the cyclin A-CDK2 S phase kinase complex []. It was found to bind several F-box containing proteins (e.g., Cdc4, Skp2, cyclin F) and to be involved in the ubiquitin protein degradation pathway. A yeast homologue of SKP1 (P52286) was identified in the centromere bound kinetochore complex [] and is also involved in the ubiquitin pathway []. In Dictyostelium discoideum (Slime mold) FP21 was shown to be glycosylated in the cytosol and has homology to SKP1 []. This entry represents a dimerisation domain found at the C terminus of SKP1 proteins [ ], as well as in subunit D of the centromere DNA-binding protein complex Cbf3 []. This domain is multi-helical in structure, and consists of an interlocked heterodimer in F-box proteins.
Protein Domain
Name: S-phase kinase-associated protein 1
Type: Family
Description: This entry includes SKP1 from yeasts, animals and plants. Mammalian S-phase kinase-associated protein 1 (SKP1) is an essential component of the SCF (SKP1-CUL1-F-box protein) ubiquitin ligase complex, which mediates the ubiquitination of proteins involved in cell cycle progression, signal transduction and transcription [ ]. It is also part of the ubiquitin E3 ligase complex (Skp1-Pam-Fbxo45) that controls the core epithelial-to-mesenchymal transition-inducing transcription factors []. Budding yeast Skp1 is a kinetochore protein found in several complexes, including the SCF ubiquitin ligase complex, the CBF3 complex that binds centromeric DNA [ ], and the RAVE complex that regulates assembly of the V-ATPase []. In Dictyostelium discoideum (Slime mold) FP21 was shown to be glycosylated in the cytosol and has homology to SKP1 []. Arabidopsis Skp1 is part of the Skp1/Cullin1/F-box protein COI1 (SCFCOI1) E3 ubiquitin ligase complex required for vegetative and floral organ development as well as for male gametogenesis [ , ]. Twenty-one Skp1-related genes, called Arabidopsis-SKP1-like (ASK), have been uncovered in the Arabidopsis genome. They may collectively perform a range of functions and may regulate different developmental and physiological processes [, ].
Protein Domain
Name: SKP1 component, POZ domain
Type: Domain
Description: SKP1 (together with SKP2) was identified as an essential component of the cyclin A-CDK2 S phase kinase complex []. It was found to bind several F-box containing proteins (e.g., Cdc4, Skp2, cyclin F) and to be involved in the ubiquitin protein degradation pathway. A yeast homologue of SKP1 (P52286) was identified in the centromere bound kinetochore complex [] and is also involved in the ubiquitin pathway []. In Dictyostelium discoideum (Slime mold) FP21 was shown to be glycosylated in the cytosol and has homology to SKP1 []. This entry represents a POZ domain with a core structure consisting of beta(2)/alpha(2)/beta(2)/alpha(2) in two layers, alpha/beta. This domain is found at the N terminus of SKP1 proteins [ ] as well as in subunit D of the centromere DNA-binding protein complex Cbf3 [].
Protein Domain
Name: S-phase kinase-associated protein 1-like
Type: Family
Description: This entry includes SKP1 and the SKP1-like protein elongin-C (also known as TCEB1). SKP1 is part of the E3 ubiquitin ligase complexes. Elongin-C has dual functions: it works as a component of an RNA polymerase II (Pol II) transcription elongation factor and as the substrate recognition subunit of a Cullin-RING E3 ubiquitin ligase []. Mammalian S-phase kinase-associated protein 1 (SKP1) is an essential component of the SCF (SKP1-CUL1-F-box protein) ubiquitin ligase complex, which mediates the ubiquitination of proteins involved in cell cycle progression, signal transduction and transcription [ ]. It is also part of the ubiquitin E3 ligase complex (Skp1-Pam-Fbxo45) that controls the core epithelial-to-mesenchymal transition-inducing transcription factors []. Budding yeast Skp1 is a kinetochore protein found in several complexes, including the SCF ubiquitin ligase complex, the CBF3 complex that binds centromeric DNA [], and the RAVE complex that regulates assembly of the V-ATPase []. Elongin-C is a general transcription elongation factor that increases RNA polymerase II transcription elongation past template-encoded arresting sites [ ]. It forms a complex with SIII regulatory subunits B, which serves as an adapter protein in the proteasomal degradation of target proteins via different E3 ubiquitin ligase complexes []. Elongin-C forms a complex with Cul3 that polyubiquitylates monoubiquitylated RNA polymerase II to trigger its proteolysis [].
Protein Domain
Name: NSF attachment protein
Type: Family
Description: Regulated exocytosis of neurotransmitters and hormones, as well as intracellular traffic, requires fusion of two lipid bilayers. SNARE proteins are thought to form a protein bridge, the SNARE complex, between an incoming vesicle and the acceptor compartment. SNARE proteins contribute to the specificity of membrane fusion, implying that the mechanisms by which SNAREs are targeted to subcellular compartments are important for specific docking and fusion of vesicles. This mechanism involves a family of conserved proteins, members of which appear to function at all sites of constitutive and regulated secretion in eukaryotes [ ]. Among them are 2 types of cytosolic protein, NSF (N-ethyl-maleimide-sensitive protein) and the SNAPs (alpha-, beta- and gamma-soluble NSF attachment proteins). The yeast vesicular fusion protein, sec17, a cytoplasmic peripheral membrane protein involved in vesicular transport between the endoplasmic reticulum and the Golgi apparatus, shows a high degree of sequence similarity to the alpha-SNAP family. Alpha-SNAP is universally present in eukaryotes and acts as an adaptor protein between SNARE (integral membrane SNAP receptor) and NSF for recruitment to the 20S complex. Beta-SNAP is brain-specific and shares high sequence identity (about 85%) with alpha-SNAP. Gamma-SNAP is weakly related (about 20-25% identity) to the two other isoforms, and is ubiquitous. It may help regulate the activity of the 20S complex. The X-ray structures of vertebrate gamma-SNAP and Sec17 show similar all-helical structures consisting of an N-terminal extended twisted sheet of four tetratricopeptide repeat (TPR)-like helical hairpins and a C-terminal helical bundle [ , , , , , , , ]. SNAP-25 and its non-neuronal homologue Syndet/SNAP-23 are synthesized as soluble proteins in the cytosol.
Both SNAP-25 and Syndet/SNAP-23 are palmitoylated at cysteine residues clustered in a loop between two N- and C-terminal coils, and palmitoylation is essential for membrane binding and plasma membrane targeting. The C-terminal and the N-terminal helices of SNAP-25 are each targeted to the plasma membrane by two distinct cysteine-rich domains and appear to regulate the availability of SNAP to form complexes with SNARE [ ].
Protein Domain
Name: FY-rich, C-terminal
Type: Domain
Description: The "FY-rich" domain N-terminal (FYRN) and "FY-rich" domain C-terminal (FYRC) sequence motifs are two poorly characterised phenylalanine/tyrosine-rich regions of around 50 and 100 amino acids, respectively, that are found in a variety of chromatin-associated proteins [ , , , ]. They are particularly common in histone H3K4 methyltransferases, most notably in a family of proteins that includes human mixed lineage leukemia (MLL) and the Drosophila melanogaster protein trithorax. Both of these enzymes play a key role in the epigenetic regulation of gene expression during development, and the gene coding for MLL is frequently rearranged in infant and secondary therapy-related acute leukemias. They are also found in transforming growth factor beta regulator 1 (TBRG1), a growth inhibitory protein induced in cells undergoing arrest in response to DNA damage and transforming growth factor (TGF)-beta1. As TBRG1 has been shown to bind to both the tumor suppressor p14ARF and MDM2, a key regulator of p53, it is also known as nuclear interactor of ARF and MDM2 (NIAM). In most proteins, the FYRN and FYRC regions are closely juxtaposed; however, in MLL and its homologues they are far distant. To be fully active, MLL must be proteolytically processed by taspase1, which cleaves the protein between the FYRN and FYRC regions []. The N-terminal and C-terminal fragments remain associated after proteolysis, apparently as a result of an interaction between the FYRN and FYRC regions. How proteolytic processing regulates the activity of MLL is not known. Intriguingly, the FYRN and FYRC motifs of a second family of histone H3K4 methyltransferases, represented by MLL2 and MLL4 in humans and TRR in Drosophila melanogaster, are closely juxtaposed.
FYRN and FYRC motifs are found in association with modules that create or recognise histone modifications in proteins from a wide range of eukaryotes, and it is likely that in these proteins they have a conserved role related to some aspect of chromatin biology []. The FYRN and FYRC regions are not separate, independently folded domains, but are components of a single folded module (the FYR domain), which adopts an alpha+beta fold consisting of a six-stranded antiparallel β-sheet followed by four consecutive α-helices. The FYRN region corresponds to β-strands 1-4 and their connecting loops, whereas the FYRC motif maps to β-strand 5, β-strand 6 and helices alpha1 to alpha4. Most of the conserved tyrosine and phenylalanine residues, after which these motifs are named, are involved in interactions that stabilise the fold. Proteins such as MLL, in which the FYRN and FYRC regions are separated by hundreds of amino acids, are expected to contain FYR domains with a large insertion between two of the strands of the β-sheet (strands 4 and 5) [ ].
Protein Domain
Name: FY-rich, N-terminal
Type: Domain
Description: The "FY-rich" domain N-terminal (FYRN) and "FY-rich" domain C-terminal (FYRC) sequence motifs are two poorly characterised phenylalanine/tyrosine-rich regions of around 50 and 100 amino acids, respectively, that are found in a variety of chromatin-associated proteins [ , , , ]. They are particularly common in histone H3K4 methyltransferases, most notably in a family of proteins that includes human mixed lineage leukemia (MLL) and the Drosophila melanogaster protein trithorax. Both of these enzymes play a key role in the epigenetic regulation of gene expression during development, and the gene coding for MLL is frequently rearranged in infant and secondary therapy-related acute leukemias. They are also found in transforming growth factor beta regulator 1 (TBRG1), a growth inhibitory protein induced in cells undergoing arrest in response to DNA damage and transforming growth factor (TGF)-beta1. As TBRG1 has been shown to bind to both the tumor suppressor p14ARF and MDM2, a key regulator of p53, it is also known as nuclear interactor of ARF and MDM2 (NIAM). In most proteins, the FYRN and FYRC regions are closely juxtaposed; however, in MLL and its homologues they are far distant. To be fully active, MLL must be proteolytically processed by taspase1, which cleaves the protein between the FYRN and FYRC regions []. The N-terminal and C-terminal fragments remain associated after proteolysis, apparently as a result of an interaction between the FYRN and FYRC regions. How proteolytic processing regulates the activity of MLL is not known. Intriguingly, the FYRN and FYRC motifs of a second family of histone H3K4 methyltransferases, represented by MLL2 and MLL4 in humans and TRR in Drosophila melanogaster, are closely juxtaposed.
FYRN and FYRC motifs are found in association with modules that create or recognise histone modifications in proteins from a wide range of eukaryotes, and it is likely that in these proteins they have a conserved role related to some aspect of chromatin biology []. The FYRN and FYRC regions are not separate, independently folded domains, but are components of a single folded module (the FYR domain), which adopts an alpha+beta fold consisting of a six-stranded antiparallel β-sheet followed by four consecutive α-helices. The FYRN region corresponds to β-strands 1-4 and their connecting loops, whereas the FYRC motif maps to β-strand 5, β-strand 6 and helices alpha1 to alpha4. Most of the conserved tyrosine and phenylalanine residues, after which these motifs are named, are involved in interactions that stabilise the fold. Proteins such as MLL, in which the FYRN and FYRC regions are separated by hundreds of amino acids, are expected to contain FYR domains with a large insertion between two of the strands of the β-sheet (strands 4 and 5) [ ].
Protein Domain
Name: JmjN domain
Type: Domain
Description: This entry represents the JmjN domain. The JmjN and JmjC domains are two non-adjacent domains which have been identified in the jumonji family of transcription factors. Although it was originally suggested that the JmjN and JmjC domains always co-occur and might form a single functional unit within the folded protein, the JmjC domain was later found without the JmjN domain in organisms from bacteria to human [, ]. JmjC domains are predicted to be metalloenzymes that adopt the cupin fold, and are candidates for enzymes that regulate chromatin remodelling. The cupin fold is a flattened β-barrel structure containing two sheets of five antiparallel β-strands that form the walls of a zinc-binding cleft. JmjC domains were identified in numerous eukaryotic proteins containing domains typical of transcription factors, such as PHD, C2H2, ARID/BRIGHT and zinc fingers [ , ]. The JmjC domain has been shown to function in a histone demethylation mechanism that is conserved from yeast to human [].
Protein Domain
Name: Zinc finger, C5HC2-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [ , , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents a predicted zinc finger with eight potential zinc ligand binding residues. This domain is found in Jumonji [ ], and may have a DNA binding function. The mouse jumonji protein is required for neural tube formation, and is essential for normal heart development. It also plays a role in the down-regulation of cell proliferation signalling.
Protein Domain
Name: S-adenosylmethionine synthetase, N-terminal
Type: Domain
Description: The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold. This entry represents the N-terminal domain of S-adenosylmethionine synthetase and is found in association with and . S-adenosylmethionine synthetase (MAT, ) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP [ ]. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis. In bacteria there is a single isoform of AdoMet synthetase (gene metK); there are two in budding yeast (genes SAM1 and SAM2) and in mammals, while in plants there is generally a multigene family. The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits, resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance [ ].
Protein Domain
Name: S-adenosylmethionine synthetase
Type: Family
Description: S-adenosylmethionine synthetase (MAT, ) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP [ ]. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis. In bacteria there is a single isoform of AdoMet synthetase (gene metK); there are two in budding yeast (genes SAM1 and SAM2) and in mammals, while in plants there is generally a multigene family. The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits, resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance [ ].
Protein Domain
Name: S-adenosylmethionine synthetase, conserved site
Type: Conserved_site
Description: Two conserved site signatures are present in S-adenosylmethionine synthetase. The more N-terminal site represents a hexapeptide which is thought to be involved in ATP binding, whilst the C-terminal conserved site is an almost perfectly conserved glycine-rich nonapeptide. S-adenosylmethionine synthetase (MAT, ) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP [ ]. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis. In bacteria there is a single isoform of AdoMet synthetase (gene metK); there are two in budding yeast (genes SAM1 and SAM2) and in mammals, while in plants there is generally a multigene family. The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits, resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance [ ].
Protein Domain
Name: S-adenosylmethionine synthetase, central domain
Type: Domain
Description: The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold. This entry represents the central domain and is found in association with and . S-adenosylmethionine synthetase (MAT, ) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP [ ]. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis. In bacteria there is a single isoform of AdoMet synthetase (gene metK); there are two in budding yeast (genes SAM1 and SAM2) and in mammals, while in plants there is generally a multigene family. The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits, resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance [ ].
Protein Domain
Name: S-adenosylmethionine synthetase superfamily
Type: Homologous_superfamily
Description: S-adenosylmethionine synthetase (MAT, ) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP [ ]. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis. In bacteria there is a single isoform of AdoMet synthetase (gene metK); there are two in budding yeast (genes SAM1 and SAM2) and in mammals, while in plants there is generally a multigene family. The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits, resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance [ ].
Protein Domain
Name: S-adenosylmethionine synthetase, C-terminal
Type: Domain
Description: The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold. This entry represents the C-terminal domain of S-adenosylmethionine synthetase and is found in association with and . S-adenosylmethionine synthetase (MAT, ) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP [ ]. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis. In bacteria there is a single isoform of AdoMet synthetase (gene metK); there are two in budding yeast (genes SAM1 and SAM2) and in mammals, while in plants there is generally a multigene family. The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits, resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance [].
Protein Domain
Name: Periodic tryptophan protein 2
Type: Family
Description: Periodic tryptophan protein 2 (also known as UTP1) is involved in nucleolar processing of pre-18S ribosomal RNA [ ]. In budding yeast, it is a component of the ribosomal small subunit (SSU) processome composed of at least 40 protein subunits and snoRNA U3 [].
Protein Domain
Name: Small-subunit processome, Utp12
Type: Domain
Description: A large ribonucleoprotein complex, the small-subunit (SSU) processome, is required for the processing of the small-ribosomal-subunit rRNA [ , ]. This preribosomal complex contains the U3 snoRNA and at least 40 proteins, which have the following properties: they are nucleolar; they are able to coimmunoprecipitate with the U3 snoRNA and Mpp10 (a protein specific to the SSU processome); and they are required for 18S rRNA biogenesis. There appears to be a linkage between polymerase I transcription and the formation of the SSU processome, as some, but not all, of the SSU processome components are required for pre-rRNA transcription initiation. These SSU processome components have been termed t-Utps. They form a pre-complex with pre-18S rRNA in the absence of snoRNA U3 and other SSU processome components. It has been proposed that the t-Utp complex proteins are both rDNA and rRNA binding proteins that are involved in the initiation of pre-18S rRNA transcription, initially binding to rDNA and then associating with the 5' end of the nascent pre-18S rRNA. The t-Utp complex forms the nucleus around which the rest of the SSU processome components, including snoRNA U3, assemble [ ]. Electron microscopy suggests that the SSU processome may correspond to the terminal knobs visualized at the 5' ends of nascent 18S rRNA. This domain is found at the C terminus of proteins containing WD40 repeats. These proteins are part of the U3 ribonucleoprotein. In yeast, these proteins are called Utp5, Utp1 (or Pwp2) and Utp12 (or DIP2). They interact with snoRNA U3 and with MPP10 [ ]. Pwp2 is an essential Saccharomyces cerevisiae (Baker's yeast) protein involved in cell separation.
Protein Domain
Name: MEMO1 family
Type: Family
Description: This entry is composed of Memo 1 (mediator of ErbB2-driven cell motility 1) and related proteins from eukaryotes, archaea and bacteria whose molecular function is unclear. Memo 1 is an effector of the ErbB2 receptor tyrosine kinase involved in breast carcinoma cell migration [ ]. Its increased expression is associated with cancer aggressiveness. In breast cancer it regulates the insulin-like growth factor-I receptor-dependent signaling pathway []. It binds to a specific ErbB2-derived phosphopeptide []. It regulates the localisation of the small G protein RhoA and its effector mDia1 at the plasma membrane, and thereby coordinates the organisation of the lamellipodial actin network, adhesion site formation, and microtubule (MT) outgrowth within the cell leading edge to sustain cell motility []. In yeast, the homologue is known as Mho1, and inhibits haploid invasive growth when overexpressed [].
Protein Domain
Name: Vacuolar protein sorting-associated protein 54, C-terminal
Type: Domain
Description: This entry represents a domain found in vacuolar protein sorting-associated protein 54 (VPS54), which acts as component of the GARP complex that is involved in retrograde transport from early and late endosomes to the trans-Golgi network (TGN). VPS54 is required to tether the complex to the TGN. However, it is not involved in endocytic recycling [ ].
Protein Domain
Name: Nitrilase/cyanide hydratase, conserved site
Type: Conserved_site
Description: This family includes both nitrilases and cyanide hydratase. Nitrilases () are enzymes that convert nitriles into their corresponding acids and ammonia. They are widespread in microbes as well as in plants, where they convert indole-3-acetonitrile to the hormone indole-3-acetic acid. A conserved cysteine has been shown [, ] to be essential for enzyme activity; it seems to be involved in a nucleophilic attack on the nitrile carbon atom. Cyanide hydratase () converts HCN to formamide. In phytopathogenic fungi, it is used to avoid the toxic effect of cyanide released by wounded plants [ ].
Protein Domain
Name: Photosystem I reaction center subunit psaK, plant
Type: Family
Description: Photosystem I (PSI) [ ] is an integral membrane protein complex that uses light energy to mediate electron transfer from plastocyanin to ferredoxin. It is found in the chloroplasts of plants and in cyanobacteria. PSI is composed of at least 14 different subunits, two of which are small, evolutionarily related hydrophobic proteins of about 7 to 9 kDa, PsaG (also known as PSI-G) and PsaK (also known as PSI-K), both integral membrane proteins. Cyanobacteria contain only PsaK []. While cyanobacterial PSI uses phycobilisomes to harvest light, eukaryotic PSI has a membrane-embedded peripheral antenna []. This entry represents Photosystem I reaction center subunit psaK found in plants, predominantly in Streptophytes. PsaK is important for stable interaction and proper function of the antenna [ ]. The crystal structure of the plant PSI complex shows that this protein is closely related to the similar subunit PsaG [].
Protein Domain
Name: Phytochelatin synthase, N-terminal catalytic domain
Type: Domain
Description: Phytochelatins are well known as the heavy metal-detoxifying peptides in higher plants, eukaryotic algae, fungi, nematodes and cyanobacteria. Phytochelatin synthase (PCS, also known as glutathione gamma-glutamylcysteinyltransferase; ) is involved in the synthesis of phytochelatins (PC) and homophytochelatins (hPC). This enzyme is required for detoxification of heavy metals such as cadmium and arsenate. The N-terminal region of phytochelatin synthase contains the active site, as well as four highly conserved cysteine residues that appear to play an important role in heavy-metal-induced phytochelatin catalysis. The C-terminal region is rich in cysteines, and may act as a metal sensor, whereby the Cys residues bind cadmium ions to bring them into closer proximity and transfer them to the activation site in the N-terminal catalytic domain [ ]. The C-terminal region displays homology to the functional domains of metallothionein and metallochaperone. This entry represents the N-terminal catalytic PCS domain, which belongs to the peptidase family C83 of the papain superfamily of cysteine proteases, with a structurally conserved "catalytic triad" and oxyanion hole in the active site. It has an overall "crescent" shape with an alpha/beta fold containing eight α-helices and six β-strands [ ].
Protein Domain
Name: Ricin B, lectin domain
Type: Domain
Description: Ricin is a legume lectin from the seeds of the castor bean plant, Ricinus communis. The seeds are poisonous to people, animals and insects and just one milligram of ricin can kill an adult. Primary structure analysis has shown the presence of a similar domain in many carbohydrate-recognition proteins like plant and bacterial AB-toxins, glycosidases or proteases [ , , ]. This domain, known as the ricin B lectin domain, can be present in one or more copies and has been shown in some instances to bind simple sugars, such as galactose or lactose. The ricin B lectin domain is composed of three homologous subdomains of 40 amino acids (alpha, beta and gamma) and a linker peptide of around 15 residues (lambda). It has been proposed that the ricin B lectin domain arose by gene triplication from a primitive 40-residue galactoside-binding peptide [ , ]. The most characteristic, though not completely conserved, sequence feature is the presence of a Q-W pattern. Consequently, the ricin B lectin domain has also been referred to as the (QxW)3 domain and the three homologous regions as the QxW repeats [ , ]. A disulphide bond is also conserved in some of the QxW repeats []. The 3D structure of the ricin B chain has shown that the three QxW repeats pack around a pseudo threefold axis that is stabilised by the lambda linker [ ]. The ricin B lectin domain has no major segments of α-helix or β-sheet but each of the QxW repeats contains an ω-loop []. An idealized ω-loop is a compact, contiguous segment of polypeptide that traces a 'loop-shaped' path in three-dimensional space; the main chain resembles a Greek omega.
Protein Domain
Name: Amino acid permease/ SLC12A domain
Type: Domain
Description: Amino acid permeases are integral membrane proteins involved in the transport of amino acids into the cell. A number of such proteins have been found to be evolutionarily related [ , , ]. These proteins seem to contain up to 12 transmembrane segments. The best conserved region in this family is located in the second transmembrane segment. This domain is found in amino acid permeases, as well as in solute carrier family 12A (SLC12A) sequences.
Protein Domain
Name: Ribonuclease CAF1
Type: Family
Description: The major pathways of mRNA turnover in eukaryotes initiate with shortening of the poly(A) tail. CAF1 (also known as CCR4-associated factor 1) is an RNase of the DEDD superfamily, and a subunit of the CCR4-NOT complex that mediates 3' to 5' mRNA deadenylation [ , ]. In yeast, CAF1 () is also known as POP2, and encodes a critical component of the major cytoplasmic deadenylase [ , ]. It is required for normal mRNA deadenylation in vivo and localises to the cytoplasm. CAF1 copurifies with a CCR4-dependent poly(A)-specific exonuclease activity. The crystal structure of Saccharomyces cerevisiae POP2 has been resolved [ ]. Some members of this family contain a single-stranded nucleic acid binding domain, R3H, such as poly(A)-specific ribonuclease (PARN), which also contains an RRM domain [ ]. PARN is only conserved in vertebrates and may be important in regulated deadenylation such as early development and DNA damage response [, ].
Protein Domain
Name: ClpP, Ser active site
Type: Active_site
Description: Clp is an ATP-dependent protease that cleaves a number of proteins, such as casein and albumin [ ]. It exists as a heterodimer of ATP-binding regulatory A and catalytic P subunits, both of which are required for effective levels of protease activity in the presence of ATP [ ], although the P subunit alone does possess some catalytic activity. Proteases highly similar to ClpP have been found to be encoded in the genome of bacteria, metazoa, some viruses and in the chloroplast of plants. A number of the proteins in this family are classified as non-peptidase homologues as they have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for catalytic activity. This entry represents a conserved region containing a serine that is involved in the catalytic triad.
Protein Domain
Name: Up-frameshift suppressor 2, C-terminal
Type: Domain
Description: This entry represents the C-terminal domain found in Up-frameshift suppressor 2 (also known as Nonsense-mediated mRNA decay protein 2). Transcripts harbouring premature signals for translation termination are recognised and rapidly degraded by eukaryotic cells through a pathway known as nonsense-mediated mRNA decay. In Saccharomyces cerevisiae, three trans-acting factors (Upf1 to Upf3) are required for nonsense-mediated mRNA decay [ ].
Protein Domain
Name: Phytochrome, central region
Type: Domain
Description: Phytochrome belongs to a family of plant photoreceptors that mediate physiological and developmental responses to changes in red and far-red light conditions []. Besides plants, they are widely represented in both photosynthetic and non-photosynthetic bacteria and are known in a variety of fungi. The protein undergoes reversible photochemical conversion between a biologically-inactive red light-absorbing form and the active far-red light-absorbing form. Phytochrome is a dimer of identical 124 kDa subunits, each of which contains a linear tetrapyrrole chromophore, covalently-attached via a Cys residue. This domain is found in the central region of phytochrome proteins. It is structurally related to the GAF domain, which is generally located immediately N-terminal to this domain, but it carries an additional tongue-like hairpin loop between the fifth β-sheet and the sixth α-helix which functions to seal the chromophore pocket and stabilise the photoactivated far-red-absorbing state (Pfr) [ , ].
Protein Domain
Name: Phytochrome A/B/C/D/E
Type: Family
Description: This group represents phytochromes A to E.
Protein Domain
Name: PAS fold-2
Type: Domain
Description: The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs [1]. The PAS fold appears in archaea, eubacteria and eukarya.
Protein Domain
Name: Phytochrome chromophore attachment domain
Type: Domain
Description: Phytochrome [ , , ] is a plant protein that acts as a regulatory photoreceptor and mediates red-light effects on a wide variety of physiological and molecular responses. Phytochrome can undergo a reversible photochemical conversion between a biologically inactive red light-absorbing form and the active far-red light-absorbing form. Phytochrome is a dimer of identical 124 kDa subunits, each of which contains a covalently attached linear tetrapyrrole chromophore. The chromophore is attached to a cysteine which is located in a highly conserved region that can be used as a signature pattern. Synechocystis strain PCC 6803 hypothetical protein slr0473 contains a domain similar to that of plant phytochromes and seems to also bind a chromophore.
Protein Domain
Name: PAS fold
Type: Domain
Description: PAS domains are involved in many signalling proteins where they are used as a signal sensor domain [ ]. PAS domains appear in archaea, bacteria and eukaryotes. Several PAS-domain proteins are known to detect their signal by way of an associated cofactor. Heme, flavin, and a 4-hydroxycinnamyl chromophore are used in different proteins. The PAS domain was named after three proteins that it occurs in: Per (period circadian protein), Arnt (Ah receptor nuclear translocator protein) and Sim (single-minded protein). PAS domains are often associated with PAC domains. It appears that these domains are directly linked, and that together they form the conserved 3D PAS fold. The division between the PAS and PAC domains is caused by major differences in sequences in the region connecting these two motifs [ ]. In human PAS kinase, this region has been shown to be very flexible, and adopts different conformations depending on the bound ligand []. Probably the most surprising identification of a PAS domain was that in EAG-like K-channels [ ].
Protein Domain
Name: GAF domain
Type: Domain
Description: The GAF domain is named after some of the proteins it is found in, including cGMP-specific phosphodiesterases, adenylyl cyclases and FhlA. It is also found in guanylyl cyclases and phytochromes [ , ]. The structure of a GAF domain shows that the domain shares a similar fold with the PAS domain []. Adenylyl and guanylyl cyclases catalyse the conversion of ATP and GTP to the second messengers cAMP and cGMP respectively, with these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalysed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyses the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region []. The GAF domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator required for activation of most Nif operons, which are directly involved in nitrogen fixation. NifA interacts with sigma-54 [ ].
Protein Domain
Name: Phytochrome
Type: Family
Description: Phytochromes are a class of photoreceptor found in plants, bacteria and fungi, which are used to detect light. In plants, phytochromes mediate physiological and developmental responses to changes in red and far-red light conditions [ ]. The protein undergoes reversible photochemical conversion between a biologically-inactive red light-absorbing form and the active far-red light-absorbing form. Phytochrome is a dimer of identical 124 kDa subunits, each of which contains a linear tetrapyrrole chromophore, covalently-attached via a Cys residue. In Arabidopsis thaliana, there are genes for at least five phytochrome proteins [ ]. These photoreceptors control such responses as germination, stem elongation, flowering, gene expression, and chloroplast and leaf development. It is not yet known which red light responses are controlled by which phytochrome species, or whether the different phytochromes have overlapping functions []. Synechocystis sp. (strain PCC 6803) hypothetical protein slr0473 contains a domain similar to that of plant phytochromes and seems also to bind a chromophore.
Protein Domain
Name: SAM-dependent methyltransferase RsmB/NOP2-type
Type: Domain
Description: The C-terminal catalytic domain of ribosomal RNA cytosine methyltransferases is highly conserved in archaeal, bacterial and eukaryotic proteins [ ], such as ribosomal RNA methyltransferase B (RsmB, Sun, Fmu) and Nop2. Escherichia coli RsmB methylates cytosine C967 in 16S rRNA []. Nop2 methylates cytosine C2870 in the 25S rRNA of S. cerevisiae [] and is critical for 60S biogenesis [].
Protein Domain
Name: RNA (C5-cytosine) methyltransferase
Type: Family
Description: RNA (C5-cytosine) methyltransferases (RCMTs) catalyse the transfer of a methyl group to the 5th carbon of a cytosine base in RNA sequences to produce C5-methylcytosine. RCMTs use the cofactor S-adenosyl-L-methionine (SAM) as a methyl donor [ ]. The catalytic mechanism of RCMTs involves an attack by the thiolate of a Cys residue on position 6 of the target cytosine base to form a covalent link, thereby activating C5 for methyl-group transfer. Following the addition of the methyl group, a second Cys residue acts as a general base in the beta-elimination of the proton from the methylated cytosine ring. The free enzyme is restored and the methylated product is released []. Numerous putative RCMTs have been identified in archaea, bacteria and eukaryota [ , ]; most are predicted to be nuclear or nucleolar proteins []. The Escherichia coli Ribosomal RNA Small-subunit Methyltransferase Beta (RSMB) FMU (FirMicUtes) represents the first protein identified and characterised as a cytosine-specific RNA methyltransferase. RSMB was reported to catalyse the formation of C5-methylcytosine at position 967 of 16S rRNA [, ]. A classification of RCMTs has been proposed on the basis of sequence similarity [ ]. According to this classification, RCMTs are divided into 8 distinct subfamilies []. Recently, a new RCMT subfamily, termed RCMT9, was identified []. Members of the RCMT family contain a core domain, responsible for the cytosine-specific RNA methyltransferase activity. This 'catalytic' domain adopts the Rossmann fold for the accommodation of the cofactor SAM []. The RCMT subfamilies are also distinguished by N-terminal and C-terminal extensions, variable both in size and sequence [].
Protein Domain
Name: Endoplasmic reticulum oxidoreductin 1
Type: Family
Description: Ero1 and PDI form the disulfide relay system of the ER that supports correct disulfide bond formation of secretory proteins. This entry represents Ero1 (endoplasmic oxidoreductin-1) from yeasts and its homologues from mammals, Ero1-alpha and Ero1-beta. Ero1 is a flavoprotein that directly transfers disulfide bonds to disulfide isomerase PDI [ , , ]. Ero1 acts as a thiol oxidoreductase responsible for catalyzing disulfide bond formation in nascent polypeptide substrates via electron transfer through protein disulfide isomerase (PDI) with oxygen acting as the final electron acceptor []. Newly generated disulfides are transferred from a FAD (flavin adenine dinucleotide)-associated active site via a "shuttle disulfide" cysteine pair in Ero1 to PDI and from there on to substrate proteins [ , , ]. The activity of Ero1 is regulated by PDI (also known as Pdi1). This regulation of Ero1 through reduction and oxidation of regulatory bonds within Ero1 is essential for maintaining the proper redox balance in the ER [, ].
Protein Domain
Name: MOB kinase activator family
Type: Family
Description: The MOB kinase activator family includes MOB1, an essential Saccharomyces cerevisiae protein, identified from a two-hybrid screen, that binds Mps1p, a protein kinase essential for spindle pole body duplication and mitotic checkpoint regulation. Conditional alleles of MOB1 cause a late nuclear division arrest at restrictive temperature [ ]. This family also includes the MOB-like protein phocein, an intracellular protein that interacts with striatin and may play a role in membrane trafficking [].
Protein Domain
Name: AWPM-19-like
Type: Family
Description: Members of this family are 19 kDa membrane proteins. The levels of the plant protein AWPM-19 increase dramatically when there is an increased level of abscisic acid. The increased presence of this protein leads to greater tolerance of freezing [ ]. The rice homologue, OsPM19L1, is induced by osmotic stress and may be associated with stress tolerance through an ABA-dependent pathway [].
Protein Domain
Name: PAM68-like
Type: Family
Description: This entry includes chloroplastic protein PAM68 and some uncharacterised proteins from Cyanobacteria (blue-green algae). PAM68 is involved in early steps in photosystem II (PSII) biogenesis and in maturation and stability of newly synthesized psbA protein [ ].
Protein Domain
Name: Alkaline-phosphatase-like, core domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents a structural domain with a core 3-layer α/β/α structure, which can sometimes contain additional subdomains (also covered by this entry). These domains form the core domain of alkaline phosphatases. This structural domain is found in: alkaline phosphatase ( ), most of which use zinc and magnesium as cofactors [ , ]; arylsulphatase (has an additional C-terminal alpha+beta subdomain) ( ) [ , ]; phosphoglycerate mutase (catalytic domain) ( ) [ ]; phosphonoacetate hydrolase (contains an alpha+beta subdomain inserted near the C terminus) ( ), which uses zinc as a cofactor; and phosphoenolmutase ( ).
Protein Domain
Name: Protein OCTOPUS-like
Type: Family
Description: This family consists of several plant proteins, including protein OCTOPUS (OPS, At3g09070) and OPSL1 (At5g01170) from Arabidopsis [ ]. OPS is a membrane-associated protein that regulates phloem differentiation entry []. It is a positive regulator of the brassinosteroid (BR) signaling pathway and sequesters BIN2 to the plasma membrane to promote phloem differentiation [].
Protein Domain
Name: Acid phosphatase, plant
Type: Family
Description: This entry represents a family of acid phosphatase [ , ] from plants which are closely related to the class B non-specific acid phosphatase OlpA (, which is believed to be a 5'-nucleotide phosphatase) and somewhat more distantly to another class B phosphatase, AphA ( ). Together these three clades define a subfamily of Acid phosphatase (Class B), which corresponds to the IIIB subfamily of the haloacid dehalogenase (HAD) superfamily of aspartate nucleophile hydrolases. It has been reported that the best substrates were purine 5'-nucleoside phosphates [ ]. This is in concordance with the assignment of the Haemophilus influenzae hel protein (from ) as a 5'-nucleotidase, however there is presently no other evidence to support this specific function for this family of plant phosphatases. Many genes from this family have been annotated as vegetative storage proteins (VSPs) due to their close homology with these earlier-characterised gene products which are highly expressed in leaves. There are significant differences however, including expression levels and distribution [ ]. The most important difference is the lack in authentic VSPs of the nucleophilic aspartate residue, which is instead replaced by serine, glycine or asparagine. Thus these proteins can not be expected to be active phosphatases. This issue was confused by the publication in 1992 of an article claiming activity for the Glycine max (Soybean) VSP []. In 1994 this assertion was refuted by the separation of the activity from the VSP. This entry explicitly excludes the VSPs which lack the nucleophilic aspartate. The possibility exists, however, that some members of this family may, while containing all of the conserved HAD-superfamily catalytic residues, lack activity and have a function related to the function of the VSPs rather than the acid phosphatases.
Protein Domain
Name: Vegetative storage protein/acid phosphatase
Type: Family
Description: This entry includes vegetative storage protein (VSP) and acid phosphatase 1 (APS1) from plants, and some uncharacterised proteins from bacteria [ , ]. Arabidopsis VSPs, including VSP1 and VSP2, are acid phosphatases involved in plant defense and flower development. Their structures have been resolved [ ].
Protein Domain
Name: Acid phosphatase, class B-like
Type: Family
Description: This family of class B acid phosphatases also contains a number of vegetative storage proteins (VSP25) [ ]. The acid phosphatase activity of VSP has been experimentally demonstrated [].
Protein Domain
Name: Protein of unknown function DUF1138
Type: Family
Description: This family consists of several hypothetical short plant proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown.
Protein Domain
Name: CDC50/LEM3 family
Type: Family
Description: CDC50/LEM3 is a family of membrane proteins whose members include cell cycle control protein 50, alkylphosphocholine resistance protein LEM3, which is required for phospholipid translocation across the plasma membrane in Saccharomyces cerevisiae [ ], and several ALA-interacting subunits, which are plant proteins involved in lipid translocation and secretory vesicle formation [, ]. CDC50 (also known as P4-ATPase flippase complex beta subunit TMEM30A) is an accessory component of a P4-ATPase flippase complex which catalyzes the hydrolysis of ATP coupled to the transport of aminophospholipids from the outer to the inner leaflet of various membranes and ensures the maintenance of asymmetric distribution of phospholipids. It is required for the proper folding, assembly and ER to Golgi exit of the ATP8A2:CDC50A flippase complex, which may be involved in regulation of neurite outgrowth and which, when reconstituted into liposomes, predominantly transports phosphatidylserine (PS) and to a lesser extent phosphatidylethanolamine (PE). In complex with ATP8A1, it may play a role in regulation of cell migration, probably involving flippase-mediated translocation of phosphatidylethanolamine (PE) at the plasma membrane [].
Protein Domain
Name: Glycerol-3-phosphate dehydrogenase, NAD-dependent, N-terminal
Type: Domain
Description: NAD-dependent glycerol-3-phosphate dehydrogenase (GPDH) catalyses the interconversion of dihydroxyacetone phosphate and L-glycerol-3-phosphate. This family represents the N-terminal NAD-binding domain [ ].
Protein Domain
Name: Glycoside hydrolase family 1, active site
Type: Active_site
Description: Glycoside hydrolase family 1 comprises enzymes with a number of known activities: beta-glucosidase ( ); beta-galactosidase ( ); 6-phospho-beta-galactosidase ( ); 6-phospho-beta-glucosidase ( ); lactase-phlorizin hydrolase ( )/( ); beta-mannosidase ( ); myrosinase ( ). This entry represents a conserved region found in these enzymes. It is centred on a conserved glutamic acid residue which has been shown, in the beta-glucosidase from Agrobacterium, to be directly involved in glycosidic bond cleavage by acting as a nucleophile. The signature in this entry also picks up the last two domains of LPH (lactase-phlorizin hydrolase); the first two domains, which are removed from the LPH precursor by proteolytic processing, have lost the active site glutamate and may therefore be inactive [ ].
Protein Domain
Name: ATP phosphoribosyltransferase HisG
Type: Family
Description: ATP phosphoribosyltransferase ( ) is the enzyme that catalyzes the first step in the biosynthesis of histidine in bacteria, fungi and plants as shown below. It is a member of the larger phosphoribosyltransferase superfamily of enzymes which catalyse the condensation of 5-phospho-alpha-D-ribose 1-diphosphate with nitrogenous bases in the presence of divalent metal ions [ ].
ATP + 5-phospho-alpha-D-ribose 1-diphosphate = 1-(5-phospho-D-ribosyl)-ATP + diphosphate
Histidine biosynthesis is an energetically expensive process and ATP phosphoribosyltransferase activity is subject to control at several levels. Transcriptional regulation is based primarily on nutrient conditions and determines the amount of enzyme present in the cell, while feedback inhibition rapidly modulates activity in response to cellular conditions. The enzyme has been shown to be inhibited by 1-(5-phospho-D-ribosyl)-ATP, histidine, ppGpp (a signal associated with adverse environmental conditions) and ADP and AMP (which reflect the overall energy status of the cell). As this pathway of histidine biosynthesis is present only in prokaryotes, plants and fungi, this enzyme is a promising target for the development of novel antimicrobial compounds and herbicides. ATP phosphoribosyltransferase is found in two distinct forms: a long form containing two catalytic domains and a C-terminal regulatory domain, and a short form in which the regulatory domain is missing. The long form is catalytically competent, but in organisms with the short form, a histidyl-tRNA synthetase paralogue, HisZ, is required for enzyme activity [ ]. The structures of the long form enzymes from Escherichia coli ( ) and Mycobacterium tuberculosis ( ) have been determined [ , ]. The enzyme itself exists in equilibrium between an active dimeric form, an inactive hexameric form and higher aggregates.
Interconversion between the various forms is largely reversible and is influenced by the binding of the natural substrates and inhibitors of the enzyme. The two catalytic domains are linked by a two-stranded β-sheet and together form a "periplasmic binding protein fold". A crevice between these domains contains the active site. The C-terminal domain is not directly involved in catalysis but appears to be involved in the formation of hexamers, induced by the binding of inhibitors such as histidine to the enzyme, thus regulating activity.
Protein Domain
Name: ATP phosphoribosyltransferase, catalytic domain
Type: Domain
Description: ATP phosphoribosyltransferase ( ) is the enzyme that catalyzes the first step in the biosynthesis of histidine in bacteria, fungi and plants as shown below. It is a member of the larger phosphoribosyltransferase superfamily of enzymes which catalyse the condensation of 5-phospho-alpha-D-ribose 1-diphosphate with nitrogenous bases in the presence of divalent metal ions [ ].
ATP + 5-phospho-alpha-D-ribose 1-diphosphate = 1-(5-phospho-D-ribosyl)-ATP + diphosphate
Histidine biosynthesis is an energetically expensive process and ATP phosphoribosyltransferase activity is subject to control at several levels. Transcriptional regulation is based primarily on nutrient conditions and determines the amount of enzyme present in the cell, while feedback inhibition rapidly modulates activity in response to cellular conditions. The enzyme has been shown to be inhibited by 1-(5-phospho-D-ribosyl)-ATP, histidine, ppGpp (a signal associated with adverse environmental conditions) and ADP and AMP (which reflect the overall energy status of the cell). As this pathway of histidine biosynthesis is present only in prokaryotes, plants and fungi, this enzyme is a promising target for the development of novel antimicrobial compounds and herbicides. ATP phosphoribosyltransferase is found in two distinct forms: a long form containing two catalytic domains and a C-terminal regulatory domain, and a short form in which the regulatory domain is missing. The long form is catalytically competent, but in organisms with the short form, a histidyl-tRNA synthetase paralogue, HisZ, is required for enzyme activity [ ]. This entry represents the catalytic region of this enzyme. The structures of the long form enzymes from Escherichia coli ( ) and Mycobacterium tuberculosis ( ) have been determined [ , ]. The enzyme itself exists in equilibrium between an active dimeric form, an inactive hexameric form and higher aggregates.
Interconversion between the various forms is largely reversible and is influenced by the binding of the natural substrates and inhibitors of the enzyme. The two catalytic domains are linked by a two-stranded β-sheet and togther form a "periplasmic binding protein fold". A crevice between these domains contains the active site. The C-terminal domain is not directly involved in catalysis but appears to be involved the formation of hexamers, induced by the binding of inhibitors such as histidine to the enzyme, thus regulating activity.
Protein Domain
Name: ATP phosphoribosyltransferase, conserved site
Type: Conserved_site
Description: ATP phosphoribosyltransferase is the enzyme that catalyses the first step in the biosynthesis of histidine in bacteria, fungi and plants, as shown below. It is a member of the larger phosphoribosyltransferase superfamily of enzymes, which catalyse the condensation of 5-phospho-alpha-D-ribose 1-diphosphate with nitrogenous bases in the presence of divalent metal ions.

ATP + 5-phospho-alpha-D-ribose 1-diphosphate = 1-(5-phospho-D-ribosyl)-ATP + diphosphate

Histidine biosynthesis is an energetically expensive process, and ATP phosphoribosyltransferase activity is subject to control at several levels. Transcriptional regulation is based primarily on nutrient conditions and determines the amount of enzyme present in the cell, while feedback inhibition rapidly modulates activity in response to cellular conditions. The enzyme has been shown to be inhibited by 1-(5-phospho-D-ribosyl)-ATP, histidine, ppGpp (a signal associated with adverse environmental conditions), and ADP and AMP (which reflect the overall energy status of the cell). As this pathway of histidine biosynthesis is present only in prokaryotes, plants and fungi, this enzyme is a promising target for the development of novel antimicrobial compounds and herbicides.

ATP phosphoribosyltransferase is found in two distinct forms: a long form containing two catalytic domains and a C-terminal regulatory domain, and a short form in which the regulatory domain is missing. The long form is catalytically competent, but in organisms with the short form, a histidyl-tRNA synthetase paralogue, HisZ, is required for enzyme activity.

This entry represents the catalytic region of this enzyme. The structures of the long form enzymes from Escherichia coli and Mycobacterium tuberculosis have been determined. The enzyme itself exists in equilibrium between an active dimeric form, an inactive hexameric form and higher aggregates. Interconversion between the various forms is largely reversible and is influenced by the binding of the natural substrates and inhibitors of the enzyme. The two catalytic domains are linked by a two-stranded β-sheet and together form a "periplasmic binding protein fold". A crevice between these domains contains the active site. The C-terminal domain is not directly involved in catalysis but appears to be involved in the formation of hexamers, which is induced by the binding of inhibitors such as histidine to the enzyme, thus regulating activity.

This entry represents the conserved site of ATP phosphoribosyltransferase enzymes.
Protein Domain
Name: Proteinase inhibitor I13, potato inhibitor I
Type: Family
Description: This family of proteinase inhibitors belongs to MEROPS inhibitor family I13, clan IG. They inhibit peptidases of the S1 and S8 families. Potato inhibitor type I sequences are not restricted to potatoes but are found in other plant species, for example barley endosperm chymotrypsin inhibitor and pumpkin trypsin inhibitor. Apart from leeches, e.g. Hirudo medicinalis (medicinal leech), homologues are not found in metazoa. In general, the proteins have retained a specificity towards chymotrypsin-like and elastase-like proteases. Structurally, these inhibitors are small (60 to 90 residues) and, in contrast with other families of protease inhibitors, they lack disulphide bonds. The inhibitor is a wedge-shaped molecule, its pointed edge formed by the protease-binding loop, which contains the scissile bond. The loop binds tightly to the protease active site, and subsequent cleavage of the scissile bond causes inhibition of the enzyme.

The inhibitors (designated type I and II) are synthesised in potato tubers, increasing in concentration as the tuber develops. Synthesis of the inhibitors throughout the plant is also induced by leaf damage, this systemic response being triggered by the release of a putative plant hormone.

Examples found in bacteria and archaea are probable false positives.
Protein Domain
Name: Ribosomal protein L27
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites.

About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

L27 is a protein from the large (50S) subunit; it is essential for ribosome function, but its exact role is unclear. It belongs to a family of ribosomal proteins, examples of which are found in bacteria, chloroplasts of plants and red algae, and the mitochondria of fungi (e.g. MRP7 from yeast mitochondria). The schematic relationship between these groups of proteins is shown below.

Bacterial L27        Nxxxxxxxxx
Algal L27            Nxxxxxxxxx
Plant L27       tttttNxxxxxxxxxxxxx
Yeast MRP7        tttNxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

't': transit peptide. 'N': N-terminal of mature protein.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom