Search our database by keyword

Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, pathways, authors, ontology terms, etc. (e.g. eve, embryo, zen, allele)
  • Use OR to search for either of two terms (e.g. fly OR drosophila) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean search syntax is supported, e.g. dros* for partial matches or fly AND NOT embryo to exclude a term.

Search results 401 to 500 out of 38750 for *

Category restricted to ProteinDomain

Category: ProteinDomain
Type Details Score
Protein Domain
Name: F-box associated domain, type 1
Type: Domain
Description: This domain occurs in a diverse superfamily of genes in plants. Most examples are found C-terminal to an F-box, a 60 amino acid motif involved in ubiquitination of target proteins to mark them for degradation. Two-hybrid experiments support the idea that most members are interchangeable F-box subunits of SCF E3 complexes. Some members have two copies of this domain.
Protein Domain
Name: F-box associated interaction domain
Type: Domain
Description: This domain occurs in a diverse superfamily of genes in plants. Most examples are found C-terminal to an F-box, a 60 amino acid motif involved in ubiquitination of target proteins to mark them for degradation. Two-hybrid experiments support the idea that most members are interchangeable F-box subunits of SCF E3 complexes. Some members have two copies of this domain.
Protein Domain
Name: Hypoxanthine phosphoribosyl transferase
Type: Family
Description: Phosphoribosyltransferases (PRTs) are enzymes that catalyze the synthesis of beta-n-5'-monophosphates from phosphoribosylpyrophosphate (PRPP) and an enzyme-specific amine. A number of PRTs are involved in the biosynthesis of purine, pyrimidine, and pyridine nucleotides, or in the salvage of purines and pyrimidines. Purine nucleotides are synthesized via both the de novo pathway and the salvage pathway, and are vital for cell function and proliferation through DNA and RNA synthesis and ATP energy supply. This entry represents hypoxanthine phosphoribosyltransferase, which belongs to the phosphoribosyltransferase family and is involved in purine salvage.
Protein Domain
Name: Phosphoribosyltransferase domain
Type: Domain
Description: This entry refers to the phosphoribosyl transferase (PRT) type I domain. PRTases catalyze the displacement of the alpha-1'-pyrophosphate of 5-phosphoribosyl-alpha-1-pyrophosphate (PRPP) by a nitrogen-containing nucleophile. The reaction products are an alpha-1 substituted ribose-5'-phosphate and a free pyrophosphate (PP). PRPP, an activated form of ribose-5-phosphate, is a key metabolite connecting nucleotide synthesis and salvage pathways. The type I PRTases are identified by a conserved PRPP binding motif which features two adjacent acidic residues surrounded by one or more hydrophobic residues. This domain is found in a range of diverse phosphoribosyl transferase enzymes and regulatory proteins of the nucleotide synthesis and salvage pathways, including adenine phosphoribosyltransferase, hypoxanthine-guanine-xanthine phosphoribosyltransferase, hypoxanthine phosphoribosyltransferase, ribose-phosphate pyrophosphokinase, amidophosphoribosyltransferase, orotate phosphoribosyltransferase, uracil phosphoribosyltransferase, and xanthine-guanine phosphoribosyltransferase. In Arabidopsis, at the very N terminus of this domain is the P-Loop NTPase domain.
Protein Domain
Name: DNA replication complex GINS protein Psf2
Type: Family
Description: DNA replication in eukaryotes results from a highly coordinated interaction between proteins, often as part of protein complexes, and the DNA template. One of the key early steps leading to DNA replication is formation of the pre-replication complex, or pre-RC. The pre-RC is formed by the sequential binding of the origin recognition complex (ORC), Cdc6 and Cdt1 proteins, and the MCM complex. Activation of the pre-RC into the initiation complex (IC) is achieved via the action of S-phase kinases, eventually leading to the loading of the replication machinery. Recently, a novel replication complex, GINS (for Go, Ichi, Nii, and San; five, one, two, and three in Japanese), has been identified. The precise function of GINS is not known. However, genetic and two-hybrid interactions indicate that it mediates the loading of the enzymatic replication machinery at a step after the action of the S-phase kinases. Furthermore, GINS may be a part of the replication machinery itself, since it is found associated with replicating DNA. Electron microscopy of GINS shows that it forms a ring-like structure, reminiscent of the structure of PCNA, the DNA polymerase delta replication clamp. This observation, coupled with the observed interactions for GINS, indicates that the complex may represent the replication clamp for DNA polymerase epsilon. The GINS complex is essential for initiation of DNA replication in Xenopus egg extracts. This stable 100 kDa complex includes Sld5, Psf1, Psf2, and Psf3. Homologues of these components are also found in other eukaryotes. This family of proteins represents the Psf2 component.
Protein Domain
Name: GINS subunit, domain A
Type: Domain
Description: DNA replication in eukaryotes results from a highly coordinated interaction between proteins, often as part of protein complexes, and the DNA template. One of the key early steps leading to DNA replication is formation of the pre-replication complex, or pre-RC. The pre-RC is formed by the sequential binding of the origin recognition complex (ORC), Cdc6 and Cdt1 proteins, and the MCM complex. Activation of the pre-RC into the initiation complex (IC) is achieved via the action of S-phase kinases, eventually leading to the loading of the replication machinery. Recently, a novel replication complex, GINS (for Go, Ichi, Nii, and San; five, one, two, and three in Japanese), has been identified. The precise function of GINS is not known. However, genetic and two-hybrid interactions indicate that it mediates the loading of the enzymatic replication machinery at a step after the action of the S-phase kinases. Furthermore, GINS may be a part of the replication machinery itself, since it is found associated with replicating DNA. Electron microscopy of GINS shows that it forms a ring-like structure, reminiscent of the structure of PCNA, the DNA polymerase delta replication clamp. This observation, coupled with the observed interactions for GINS, indicates that the complex may represent the replication clamp for DNA polymerase epsilon. The GINS complex is essential for initiation of DNA replication in Xenopus egg extracts. This stable 100 kDa complex includes Sld5, Psf1, Psf2, and Psf3. Homologues of these components are also found in other eukaryotes. The archaeal GINS complex contains two subunits (SSO0772/gins23 and SO1049/gins15 in Sulfolobus) that are poorly conserved homologues of the eukaryotic GINS subunits. Only Gins23 is included in this entry. The eukaryotic GINS subunits are homologous. The four subunits of the complex consist of two domains each, termed the α-helical (A) and β-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf2/Psf3.
Protein Domain
Name: UvrD-like helicase, ATP-binding domain
Type: Domain
Description: Helicases have been classified in 5 superfamilies (SF1-SF5). All of the proteins bind ATP and, consequently, all of them carry the classical Walker A (phosphate-binding loop or P-loop) and Walker B (Mg2+-binding aspartic acid) motifs. For the two largest groups, commonly referred to as SF1 and SF2, a total of seven characteristic motifs have been identified, which are distributed over two structural domains, an N-terminal ATP-binding domain and a C-terminal domain. UvrD-like DNA helicases belong to SF1, but they differ from classical SF1/SF2 by a large insertion in each domain. UvrD-like DNA helicases unwind DNA with a 3'-5' polarity. Crystal structures of several UvrD-like DNA helicases have been solved. They are monomeric enzymes consisting of two domains with a common α-β RecA-like core. The ATP-binding site is situated in a cleft between the N terminus of the ATP-binding domain and the beginning of the C-terminal domain. The enzyme crystallizes in two different conformations (open and closed). The conformational difference between the two forms comprises a large rotation of the end of the C-terminal domain by approximately 130 degrees. This "domain swiveling" was proposed to be an important aspect of the mechanism of the enzyme. Some proteins that belong to the UvrD-like DNA helicase family are listed below: Bacterial UvrD helicase. It is involved in the post-incision events of nucleotide excision repair and methyl-directed mismatch repair. It unwinds DNA duplexes with 3'-5' polarity with respect to the bound strand and initiates unwinding most effectively when a single-stranded region is present. Gram-positive bacterial pcrA helicase, an essential enzyme involved in DNA repair and rolling circle replication. The Staphylococcus aureus pcrA helicase has both 5'-3' and 3'-5' helicase activities. Bacterial rep proteins, a single-stranded DNA-dependent ATPase involved in DNA replication which can initiate unwinding at a nick in the DNA. It binds to the single-stranded DNA and acts in a progressive fashion along the DNA in the 3' to 5' direction. Bacterial helicase IV (helD gene product). It catalyzes the unwinding of duplex DNA in the 3'-5' direction. Bacterial recB protein. RecBCD is a multi-functional enzyme complex that processes DNA ends resulting from a double-strand break. RecB is a helicase with a 3'-5' directionality. Fungal srs2 proteins, an ATP-dependent DNA helicase involved in DNA repair. The polarity of the helicase activity was determined to be 3'-5'. This domain is also found in the bacterial helicase-nuclease complex AddAB, in both subunits AddA and AddB. The AddA subunit is responsible for the helicase activity. AddB also harbors a putative ATP-binding domain which does not play a role as a secondary DNA motor, but may instead facilitate the recognition of the recombination hotspot sequences. This entry represents the ATP-binding domain found in AddA, AddB and UvrD-like helicases.
Protein Domain
Name: UvrD-like DNA helicase, C-terminal
Type: Domain
Description: Helicases have been classified in 5 superfamilies (SF1-SF5). All of the proteins bind ATP and, consequently, all of them carry the classical Walker A (phosphate-binding loop or P-loop) and Walker B (Mg2+-binding aspartic acid) motifs. For the two largest groups, commonly referred to as SF1 and SF2, a total of seven characteristic motifs have been identified, which are distributed over two structural domains, an N-terminal ATP-binding domain and a C-terminal domain. This entry represents the C-terminal domain. UvrD-like DNA helicases belong to SF1, but they differ from classical SF1/SF2 by a large insertion in each domain. UvrD-like DNA helicases unwind DNA with a 3'-5' polarity. Crystal structures of several UvrD-like DNA helicases have been solved. They are monomeric enzymes consisting of two domains with a common α-β RecA-like core. The ATP-binding site is situated in a cleft between the N terminus of the ATP-binding domain and the beginning of the C-terminal domain. The enzyme crystallizes in two different conformations (open and closed). The conformational difference between the two forms comprises a large rotation of the end of the C-terminal domain by approximately 130 degrees. This "domain swiveling" was proposed to be an important aspect of the mechanism of the enzyme. Some proteins that belong to the UvrD-like DNA helicase family are listed below: Bacterial UvrD helicase. It is involved in the post-incision events of nucleotide excision repair and methyl-directed mismatch repair. It unwinds DNA duplexes with 3'-5' polarity with respect to the bound strand and initiates unwinding most effectively when a single-stranded region is present. Gram-positive bacterial pcrA helicase, an essential enzyme involved in DNA repair and rolling circle replication. The Staphylococcus aureus pcrA helicase has both 5'-3' and 3'-5' helicase activities. Bacterial rep proteins, a single-stranded DNA-dependent ATPase involved in DNA replication which can initiate unwinding at a nick in the DNA. It binds to the single-stranded DNA and acts in a progressive fashion along the DNA in the 3' to 5' direction. Bacterial helicase IV (helD gene product). It catalyzes the unwinding of duplex DNA in the 3'-5' direction. Bacterial recB protein. RecBCD is a multi-functional enzyme complex that processes DNA ends resulting from a double-strand break. RecB is a helicase with a 3'-5' directionality. Fungal srs2 proteins, an ATP-dependent DNA helicase involved in DNA repair. The polarity of the helicase activity was determined to be 3'-5'.
Protein Domain
Name: Tetratricopeptide repeat 2
Type: Repeat
Description: The tetratricopeptide repeat (TPR) is a structural motif present in a wide range of proteins. It mediates protein-protein interactions and the assembly of multiprotein complexes. The TPR motif consists of 3-16 tandem repeats of 34 amino acid residues, although individual TPR motifs can be dispersed in the protein sequence. Sequence alignment of the TPR domains reveals a consensus sequence defined by a pattern of small and large amino acids. TPR motifs have been identified in a wide variety of organisms, ranging from bacteria to humans. Proteins containing TPRs are involved in a variety of biological processes, such as cell cycle regulation, transcriptional control, mitochondrial and peroxisomal protein transport, neurogenesis and protein folding. This repeat includes outlying Tetratricopeptide-like repeats (TPR) that are not matched by .
Protein Domain
Name: ATPase, V1 complex, subunit F, eukaryotic
Type: Family
Description: Transmembrane ATPases are membrane-bound enzyme complexes/ion transporters that use ATP hydrolysis to drive the transport of protons across a membrane. Some transmembrane ATPases also work in reverse, harnessing the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. There are several different types of transmembrane ATPases, which can differ in function (ATP hydrolysis and/or synthesis), structure (e.g., F-, V- and A-ATPases, which contain rotary motors) and in the type of ions they transport. The different types include: F-ATPases (ATP synthases, F1F0-ATPases), which are found in mitochondria, chloroplasts and bacterial plasma membranes where they are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). V-ATPases (V1V0-ATPases), which are primarily found in eukaryotes and function as proton pumps that acidify intracellular compartments and, in some cases, transport protons across the plasma membrane. They are also found in bacteria. A-ATPases (A1A0-ATPases), which are found in Archaea and function like F-ATPases, though with respect to their structure and some inhibitor responses, A-ATPases are more closely related to the V-ATPases. P-ATPases (E1E2-ATPases), which are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes. E-ATPases, which are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP. V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor-mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins. This entry represents subunit F found in the V1 complex of V-ATPases in eukaryotes. Subunit F is a 16 kDa protein that is required for the assembly and activity of V-ATPase, and has a potential role in the differential targeting and regulation of the enzyme for specific organelles. This subunit is not necessary for the rotation of the ATPase V1 rotor, but it does promote catalysis.
Protein Domain
Name: ATPase, V1 complex, subunit F
Type: Family
Description: Transmembrane ATPases are membrane-bound enzyme complexes/ion transporters that use ATP hydrolysis to drive the transport of protons across a membrane. Some transmembrane ATPases also work in reverse, harnessing the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. There are several different types of transmembrane ATPases, which can differ in function (ATP hydrolysis and/or synthesis), structure (e.g., F-, V- and A-ATPases, which contain rotary motors) and in the type of ions they transport. The different types include: F-ATPases (ATP synthases, F1F0-ATPases), which are found in mitochondria, chloroplasts and bacterial plasma membranes where they are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). V-ATPases (V1V0-ATPases), which are primarily found in eukaryotes and function as proton pumps that acidify intracellular compartments and, in some cases, transport protons across the plasma membrane. They are also found in bacteria. A-ATPases (A1A0-ATPases), which are found in Archaea and function like F-ATPases, though with respect to their structure and some inhibitor responses, A-ATPases are more closely related to the V-ATPases. P-ATPases (E1E2-ATPases), which are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes. E-ATPases, which are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP. V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor-mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins. This entry represents subunit F in the V1 complex of V-ATPases and Na(+)-translocating ATPase in Enterococcus hirae. Subunit F is a 16 kDa protein that is required for the assembly and activity of V-ATPase, and has a potential role in the differential targeting and regulation of the enzyme for specific organelles. This subunit is not necessary for the rotation of the ATPase V1 rotor, but it does promote catalysis. In Enterococcus hirae, Na(+)-translocating ATPase extrudes sodium ions from the cytoplasm and generates the Na+ electrochemical gradient by using the energy of ATP.
Protein Domain
Name: Myc-type, basic helix-loop-helix (bHLH) domain
Type: Domain
Description: A number of eukaryotic proteins, which are probably sequence-specific DNA-binding proteins that act as transcription factors, share a conserved domain of 40 to 50 amino acid residues. It has been proposed that this domain is formed of two amphipathic helices joined by a variable-length linker region that could form a loop. This 'helix-loop-helix' (HLH) domain mediates protein dimerization and has been found in the proteins listed below. Most of these proteins have an extra basic region of about 15 amino acid residues that is adjacent to the HLH domain and specifically binds to DNA. They are referred to as basic helix-loop-helix proteins (bHLH), and are classified in two groups: class A (ubiquitous) and class B (tissue-specific). Members of the bHLH family bind variations on the core sequence 'CANNTG', also referred to as the E-box motif. The homo- or heterodimerization mediated by the HLH domain is independent of, but necessary for, DNA binding, as two basic regions are required for DNA binding activity. The HLH proteins lacking the basic domain (Emc, Id) function as negative regulators, since they form heterodimers, but fail to bind DNA. The hairy-related proteins (hairy, E(spl), deadpan) also repress transcription although they can bind DNA. The proteins of this subfamily act together with co-repressor proteins, like groucho, through their C-terminal motif WRPW. Proteins containing a HLH domain include: The myc family of cellular oncogenes, which is currently known to contain four members: c-myc, N-myc, L-myc, and B-myc. The myc genes are thought to play a role in cellular differentiation and proliferation. Proteins involved in myogenesis (the induction of muscle cells). In mammals MyoD1 (Myf-3), myogenin (Myf-4), Myf-5, and Myf-6 (Mrf4 or herculin), in birds CMD1 (QMF-1), in Xenopus MyoD and MF25, in Caenorhabditis elegans CeMyoD, and in Drosophila nautilus (nau). Vertebrate proteins that bind specific DNA sequences ('E boxes') in various immunoglobulin chain enhancers: E2A or ITF-1 (E12/pan-2 and E47/pan-1), ITF-2 (tcf4), TFE3, and TFEB. Vertebrate neurogenic differentiation factor 1, which acts as a differentiation factor during neurogenesis. Vertebrate MAX protein, a transcription regulator that forms a sequence-specific DNA-binding protein complex with myc or mad. Vertebrate Max Interacting Protein 1 (MXI1 protein), which acts as a transcriptional repressor and may antagonize myc transcriptional activity by competing for max. Proteins of the bHLH/PAS superfamily, which are transcriptional activators. In mammals, AH receptor nuclear translocator (ARNT), single-minded homologues (SIM1 and SIM2), hypoxia-inducible factor 1 alpha (HIF1A), AH receptor (AHR), neuronal pas domain proteins (NPAS1 and NPAS2), endothelial pas domain protein 1 (EPAS1), mouse ARNT2, and human BMAL1. In Drosophila, single-minded (SIM), AH receptor nuclear translocator (ARNT), trachealess protein (TRH), and similar protein (SIMA). Mammalian transcription factors HES, which repress transcription by acting on two types of DNA sequences, the E box and the N box. Mammalian MAD protein (max dimerizer), which acts as a transcriptional repressor and may antagonize myc transcriptional activity by competing for max. Mammalian Upstream Stimulatory Factor 1 and 2 (USF1 and USF2), which bind to a symmetrical DNA sequence that is found in a variety of viral and cellular promoters. Human lyl-1 protein, which is involved, by chromosomal translocation, in T-cell leukemia. Human transcription factor AP-4. Mouse helix-loop-helix proteins MATH-1 and MATH-2, which activate E box-dependent transcription in collaboration with E47. Mammalian stem cell protein (SCL) (also known as tal1), a protein which may play an important role in hemopoietic differentiation. SCL is involved, by chromosomal translocation, in stem-cell leukemia. Mammalian proteins Id1 to Id4. Id (inhibitor of DNA binding) proteins lack a basic DNA-binding domain but are able to form heterodimers with other HLH proteins, thereby inhibiting binding to DNA. Drosophila extra-macrochaetae (emc) protein, which participates in sensory organ patterning by antagonizing the neurogenic activity of the achaete-scute complex. Emc is the homologue of mammalian Id proteins. Human Sterol Regulatory Element Binding Protein 1 (SREBP-1), a transcriptional activator that binds to the sterol regulatory element 1 (SRE-1) found in the flanking region of the LDLR gene and in other genes. Drosophila achaete-scute (AS-C) complex proteins T3 (l'sc), T4 (scute), T5 (achaete) and T8 (asense). The AS-C proteins are involved in the determination of the neuronal precursors in the peripheral nervous system and the central nervous system. Mammalian homologues of achaete-scute proteins, the MASH-1 and MASH-2 proteins. Drosophila atonal protein (ato), which is involved in neurogenesis.
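The E-box core sequence 'CANNTG' mentioned in the description above can be located in a DNA sequence with a short pattern match. The following is a minimal sketch, not part of the database record: the function name find_e_boxes and the example sequence are made up for illustration, and N is assumed to mean any of the four bases A, C, G or T.

```python
import re

# N stands for any base, so CANNTG expands to CA[ACGT]{2}TG.
# The pattern sits inside a lookahead so that overlapping sites are reported too.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched hexamer) for every CANNTG site."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in E_BOX.finditer(seq)]


# Hypothetical sequence fragment, invented purely for this example.
example = "ttgaCACGTGccatCAGCTGaa"
sites = find_e_boxes(example)
print(sites)  # [(4, 'CACGTG'), (14, 'CAGCTG')]
```

Because the reverse complement of CANNTG is again CANNTG, scanning one strand is enough to find sites on both strands.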
Protein Domain
Name: Pectinesterase, Tyr active site
Type: Active_site
Description: Pectinesterase (pectin methylesterase) catalyses the de-esterification of pectin into pectate and methanol. Pectin is one of the main components of the plant cell wall. In plants, pectinesterase plays an important role in cell wall metabolism during fruit ripening. In plant bacterial pathogens such as Erwinia carotovora and in fungal pathogens such as Aspergillus niger, pectinesterase is involved in maceration and soft-rotting of plant tissue. Plant pectinesterases are regulated by pectinesterase inhibitors, which are ineffective against microbial enzymes. Prokaryotic and eukaryotic pectinesterases share a few regions of sequence similarity. The crystal structure of pectinesterase from Erwinia chrysanthemi revealed a β-helix structure similar to that found in pectinolytic enzymes, though it is different from most structures of esterases. The putative catalytic residues are in a similar location to those of the active site and substrate-binding cleft of pectate lyase. This entry represents a region found in the N-terminal section of these enzymes; it contains a conserved tyrosine which may play a role in the catalytic mechanism.
Protein Domain
Name: Pectinesterase, catalytic
Type: Domain
Description: Pectinesterase (pectin methylesterase) catalyses the de-esterification of pectin into pectate and methanol. Pectin is one of the main components of the plant cell wall. In plants, pectinesterase plays an important role in cell wall metabolism during fruit ripening. In plant bacterial pathogens such as Erwinia carotovora and in fungal pathogens such as Aspergillus niger, pectinesterase is involved in maceration and soft-rotting of plant tissue. Plant pectinesterases are regulated by pectinesterase inhibitors, which are ineffective against microbial enzymes. Prokaryotic and eukaryotic pectinesterases share a few regions of sequence similarity. The crystal structure of pectinesterase from Erwinia chrysanthemi revealed a β-helix structure similar to that found in pectinolytic enzymes, though it is different from most structures of esterases. The putative catalytic residues are in a similar location to those of the active site and substrate-binding cleft of pectate lyase.
Protein Domain
Name: Protein of unknown function DUF1191
Type: Family
Description: This family contains hypothetical plant proteins of unknown function.
Protein Domain
Name: Ethylene insensitive 3-like protein, DNA-binding domain superfamily
Type: Homologous_superfamily
Description: Ethylene-insensitive3 (EIN3) and EIN3-like (EIL) proteins are essential transcription factors in the ethylene signaling of higher plants. The EIN3/EIL proteins bind to the promoter regions of the downstream genes and regulate their expression. This superfamily represents the DNA-binding domain of EIN3 family proteins, which consists of five α-helices that come together to form a globular domain.
Protein Domain
Name: Hexokinase
Type: Family
Description: Hexokinase is an important glycolytic enzyme that catalyzes the phosphorylation of keto- and aldohexoses (e.g. glucose, mannose and fructose) using MgATP as the phosphoryl donor. In vertebrates there are four major isoenzymes, commonly referred to as types I, II, III and IV. Type IV hexokinase, which is often incorrectly designated glucokinase, is only expressed in liver and pancreatic beta-cells and plays an important role in modulating insulin secretion; it is a protein with a molecular mass of about 50 kDa. Hexokinases of types I to III, which have low Km values for glucose, have a molecular mass of about 100 kDa. Structurally they consist of a very small N-terminal hydrophobic membrane-binding domain followed by two highly similar domains of 450 residues. The first domain has lost its catalytic activity and has evolved into a regulatory domain. In yeast there are three different isoenzymes: hexokinase PI (gene HXK1), PII (gene HXKB), and glucokinase (gene GLK1). All three proteins have a molecular mass of about 50 kDa. The hexokinase domain has an alpha/beta fold and is distinctly folded in two subdomains of unequal size: the large and small subdomains. The large subdomain comprises a six-stranded mixed β-sheet and a number of additional α-helices. On one side, the sheet packs against the small subdomain, and on the other side it is shielded by several α-helices. The dominant feature of the small subdomain is a five-stranded mixed β-sheet. The sheet is flanked by two helices on one side and by one helix on the other. The subdomain also has an additional β-sheet formed by two antiparallel strands. All these enzymes contain one (or two in the case of types I to III isozymes) strongly conserved region which has been shown to be involved in substrate binding.
Protein Domain
Name: Hexokinase, C-terminal
Type: Domain
Description: Hexokinase is an important enzyme that catalyses the ATP-dependent conversion of aldo- and keto-hexose sugars to the hexose-6-phosphate (H6P). The enzyme can catalyse this reaction on glucose, fructose, sorbitol and glucosamine, and as such is the first step in a number of metabolic pathways. The addition of a phosphate group to the sugar acts to trap it in a cell, since the negatively charged phosphate cannot easily traverse the plasma membrane. The enzyme is widely distributed in eukaryotes. There are three isozymes of hexokinase in yeast (PI, PII and glucokinase): isozymes PI and PII phosphorylate both aldo- and keto-sugars; glucokinase is specific for aldo-hexoses. All three isozymes contain two domains. Structural studies of yeast hexokinase reveal a well-defined catalytic pocket that binds ATP and hexose, allowing easy transfer of the phosphate from ATP to the sugar. Vertebrates contain four hexokinase isozymes, designated I to IV, where types I to III contain a duplication of the two-domain yeast-type hexokinases. Both the N- and C-terminal halves bind hexose and H6P, though in types I and III only the C-terminal half supports catalysis, while both halves support catalysis in type II. The N-terminal half is the regulatory region. Type IV hexokinase is similar to the yeast enzyme in containing only the two domains, and is sometimes incorrectly referred to as glucokinase. The different vertebrate isozymes differ in their catalysis, localisation and regulation, thereby contributing to the different patterns of glucose metabolism in different tissues. Whereas types I to III can phosphorylate a variety of hexose sugars and are inhibited by glucose-6-phosphate (G6P), type IV is specific for glucose and shows no G6P inhibition. Type I enzyme may have a catabolic function, producing H6P for energy production in glycolysis; it is bound to the mitochondrial membrane, which enables the coordination of glycolysis with the TCA cycle. Types II and III enzymes may have anabolic functions, providing H6P for glycogen or lipid synthesis. Type IV enzyme is found in the liver and pancreatic beta-cells, where it is controlled by insulin (activation) and glucagon (inhibition). In pancreatic beta-cells, type IV enzyme acts as a glucose sensor to modify insulin secretion. Mutations in type IV hexokinase have been associated with diabetes mellitus. Hexokinase, a fructose and glucose phosphorylating enzyme, contains two structurally similar domains represented by this family and . Some members of the family have two copies of each of these domains. This entry represents the more C-terminal domain.
Protein Domain
Name: Major facilitator superfamily
Type: Family
Description: Among the different families of transporters, only two occur ubiquitously in all classifications of organisms. These are the ATP-Binding Cassette (ABC) superfamily and the Major Facilitator Superfamily (MFS). The MFS transporters are single-polypeptide secondary carriers capable only of transporting small solutes in response to chemiosmotic ion gradients. The major facilitator superfamily (MFS) of membrane proteins represents the largest family of secondary transporters, with members from Archaea to Homo sapiens. MFS proteins transport a wide spectrum of substrates, including ions, carbohydrates, lipids, amino acids and peptides, nucleosides and other small molecules, in both directions across the membrane, in many instances catalysing active transport by transducing the energy stored in a proton electrochemical gradient into a concentration gradient of substrate. One remarkable characteristic of the MFS is the high sequence variety within the superfamily. Sequence identity is around 12-18%, but regions of functional similarity (e.g., substrate- or H+-binding sites) align only for very closely related MFS transporters. The hydrophobic amino acid content of 60-70% in most MFS members, the high alpha-helix content and an inherent symmetry of the proteins with regard to helix kinks and bends provide nonspecific overlapping of residues and probably account for the reported similarities. Structures from representative members show 12 transmembrane sections (TMSs) surrounding a central cavity, forming a semi-symmetrical structure. MFS includes 105 families based on phylogenetic analysis, sequence alignments, overlap of hydropathy plots, compatibility of repeat units, similarity of complexity profiles of transmembrane segments, shared protein domains and 3D structural similarities between transport proteins.
Protein Domain
Name: Tetracycline resistance protein TetA/multidrug resistance protein MdtG
Type: Family
Description: The tetracycline resistance protein Tet(A) is a tetracycline efflux protein that functions as a metal-tetracycline/H+ antiporter. This is an energy-dependent process that decreases the accumulation of the antibiotic in whole cells. Tet(A) is encoded by the transposon Tn10, and is an integral membrane protein with twelve potential transmembrane domains. Site-directed mutagenesis studies have shown that a negative charge at position 66 is essential for tetracycline transport, and that the region that includes the dipeptide plays an important role in metal-tetracycline transport; it perhaps acts as a gate that opens on the charge-charge interaction between Asp66 and the metal-tetracycline. The histidine at position 257 plays an essential role in H+ translocation. Multidrug resistance protein MdtG is found in enterobacteria and confers resistance to fosfomycin and deoxycholate. Both TetA and MdtG belong to the major facilitator superfamily (MFS), the largest and most diverse superfamily of secondary active transporters.
Protein Domain
Name: SRA-YDG
Type: Domain
Description: This domain has been termed SRA-YDG, for SET and Ring finger Associated, and because of the conserved YDG motif within the domain. Further characteristics of the domain are the conservation of up to 13 evenly spaced glycine residues and a VRV(I/V)RG motif. The domain is mainly found in plants and animals and in bacteria. In animals, this domain is associated with the Np95-like ring finger protein and the related gene product Np97, which contains PHD and RING finger domains and which is an important determinant in cell cycle progression. Np95 is a chromatin-associated ubiquitin ligase; its binding to histones is direct and shows a remarkable preference for histone H3 and its N-terminal tail. The SRA-YDG domain contained in Np95 is indispensable both for the interaction with histones and for chromatin binding in vivo. In plants the SRA-YDG domain is associated with the SET domain, found in a family of histone methyl transferases, and in bacteria it is found in association with HNH, a non-specific nuclease motif.
Protein Domain
Name: PUA-like superfamily
Type: Homologous_superfamily
Description: This superfamily represents domains with a PUA-like structure, consisting of a pseudo-barrel composed of mixed folded sheets of five strands. This structural motif is found in: PUA-containing proteins; the N-terminal of ATP sulphurylases, which contains extra structures, some similar to the PK β-barrel domain; and several bacterial hypothetical proteins, such as the N-terminal domain of YggJ. The PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was named after the proteins in which it was first found. PUA is a highly conserved RNA-binding motif found in a wide range of archaeal, bacterial and eukaryotic proteins, including enzymes that catalyse tRNA and rRNA post-transcriptional modifications, proteins involved in ribosome biogenesis and translation, as well as enzymes involved in proline biosynthesis. The structures of several PUA-RNA complexes reveal a common RNA recognition surface, but also some versatility in the way in which the motif binds to RNA. PUA motifs are involved in dyskeratosis congenita and cancer, pointing to links between RNA metabolism and human diseases.
Protein Domain
Name: Pre-SET domain
Type: Domain
Description: This region is found in a number of histone lysine methyltransferases (HMTase), N-terminal to the SET domain; it is generally described as the pre-SET domain. Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines and in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These nine cysteines coordinate three zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together two long segments of random coils and stabilising the SET domain. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity.
Protein Domain
Name: Post-SET domain
Type: Domain
Description: This region is found in a number of histone lysine methyltransferases (HMTase), C-terminal to the SET domain; it is generally described as the post-SET domain. Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines and in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These nine cysteines coordinate three zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together two long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity.
Protein Domain
Name: Histone H3-K9 methyltransferase, plant
Type: Family
Description: In general, members of this family methylate 'Lys-9' of histone H3. Members also methylate 'Lys-27' of histone H3 and 'Lys-20' of H4, and cytosine. H3 'Lys-9' methylation represents a specific tag for epigenetic transcriptional repression. These enzymes play a central role in gene silencing. The silencing mechanism via DNA CpNpG methylation requires the targeting of chromomethylase CMT3 to methylated histones, probably through an interaction with a heterochromatin protein 1-like adapter. The Arabidopsis homologue SUVH4 is directly required for the maintenance of DNA CpNpG and asymmetric methylation. It is also involved in the silencing of transposable elements. Methyltransferases (EC 2.1.1.-) constitute an important class of enzymes present in every life form. They transfer a methyl group, most frequently from S-adenosyl-L-methionine (SAM or AdoMet), to a nucleophilic acceptor such as oxygen, leading to S-adenosyl-L-homocysteine (AdoHcy) and a methylated molecule. All these enzymes have in common a conserved region of about 130 amino acid residues that allows them to bind SAM. The substrates that are methylated by these enzymes cover virtually every kind of biomolecule, ranging from small molecules to lipids, proteins and nucleic acids. Methyltransferases are therefore involved in many essential cellular processes including biosynthesis, signal transduction, protein repair, chromatin regulation and gene silencing. More than 230 families of methyltransferases have been described so far, of which more than 220 use SAM as the methyl donor.
Protein Domain
Name: Galactose oxidase/kelch, beta-propeller
Type: Homologous_superfamily
Description: This entry represents a β-propeller domain found in galactose oxidase and in Kelch repeat-containing proteins. The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase. Galactose oxidase is a monomeric enzyme that contains a single copper ion and catalyses the stereospecific oxidation of primary alcohols to their corresponding aldehydes. The protein contains an unusual covalent thioether bond between a tyrosine and a cysteine that forms during its maturation. Galactose oxidase is a three-domain protein: the N-terminal domain forms a jelly-roll sandwich, the central domain forms a seven-bladed β-propeller in which each blade is a four-stranded sheet, and the C-terminal domain has an immunoglobulin-like fold.
Protein Domain
Name: Cysteine alpha-hairpin motif superfamily
Type: Homologous_superfamily
Description: This entry represents the cysteine alpha-hairpin motif. Proteins with this structure include mature T-cell proliferation 1 neighbour protein and cytochrome c oxidase-assembly factors COX23 and COX19.
Protein Domain
Name: PPC domain
Type: Domain
Description: The Plants and Prokaryotes Conserved (PPC) domain contains a hydrophobic region at the C-terminus and, in plants, is often found in proteins with the AT-hook motif. Proteins with PPC domains are found in Bacteria, Archaea and the plant kingdom. The PPC domain has a single α-helix packed against an antiparallel β-sheet, which is formed by five β-strands. Three conserved histidine residues appear to form a zinc-binding site, and the domain has been observed to form homotrimers. The domain co-occurs with a thioredoxin-like domain in uncharacterized cyanobacterial proteins.
Protein Domain
Name: AT-hook motif nuclear-localized protein 15-29
Type: Family
Description: This entry includes AT-hook motif nuclear-localized proteins 15-29 (AHL15-29) from Arabidopsis [ ]. They have two conserved structural units, the AT-hook motif and the Plant and Prokaryote Conserved (PPC) domain, the latter also known as DUF296. Members of the AHL family regulate diverse aspects of growth and development in plants []. AHL20 has been shown to negatively regulate defenses in Arabidopsis [].
Protein Domain
Name: AT hook, DNA-binding motif
Type: Conserved_site
Description: AT hooks are DNA-binding motifs with a preference for A/T rich regions. These motifs are found in a variety of proteins, including the high mobility group (HMG) proteins [ ], in DNA-binding proteins from plants [] and in hBRG1 protein, a central ATPase of the human switching/sucrose non-fermenting (SWI/SNF) remodeling complex [].
Protein Domain
Name: Clathrin adaptor, mu subunit
Type: Family
Description: Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors. AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. 
Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins []. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface []. This entry represents the mu subunit of various clathrin adaptors (AP1, AP2 and AP3) [ ]. The mu subunit regulates the coupling of clathrin lattices with particular membrane proteins by self-phosphorylation via a mechanism that is still unclear []. The mu subunit possesses a highly conserved N-terminal domain of around 230 amino acids, which may be the region of interaction with other AP proteins; a linker region of between 10 and 42 amino acids; and a less well-conserved C-terminal domain of around 190 amino acids, which may be the site of specific interaction with the protein being transported in the vesicle [].
Protein Domain
Name: Mu homology domain
Type: Domain
Description: The mu homology domain (MHD) is an ~280 residue protein-protein interaction module, which is found in endocytic proteins involved in clathrin-mediated endocytosis: mu subunits of adaptor protein (AP) complexes AP-1, AP-2, AP-3, and AP-4; proteins of the stonin family; and proteins of the muniscin family (Syp1, FCHO1/2 and SGIP1). The MHD domain has an elongated, banana-shaped, all β-sheet structure. It can be considered as two β-sandwich subdomains (A and B), with subdomain B inserted between strands 6 and 15 of subdomain A, and joined edge to edge such that the convex surface is a continuous nine-stranded mixed β-sheet that runs the whole length of the molecule. The tyrosine-based signal binds to a site on the surface of two parallel β-sheet strands (β1 and β16) in subdomain A.
Protein Domain
Name: Clathrin adaptor, mu subunit, conserved site
Type: Conserved_site
Description: This entry represents the mu subunit of various clathrin adaptors (AP1, AP2 and AP3) [ ]. The mu subunit regulates the coupling of clathrin lattices with particular membrane proteins by self-phosphorylation via a mechanism that is still unclear []. The mu subunit possesses a highly conserved N-terminal domain of around 230 amino acids, which may be the region of interaction with other AP proteins; a linker region of between 10 and 42 amino acids; and a less well-conserved C-terminal domain of around 190 amino acids, which may be the site of specific interaction with the protein being transported in the vesicle [].
Protein Domain
Name: Beta-grasp domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents a domain with a β-grasp fold and a core structure consisting of β(2)-α-β(2), which is similar to that found in ubiquitin. Domains with this type of structure are found in the 2Fe-2S ferredoxin family (including putidaredoxin and adrenodoxin), the 2Fe-2S ferredoxin-related family (including aldehyde reductase, and xanthine dehydrogenase), the TGS family (including threonyl-tRNA synthetase) and the MoaD/ThiS family (including molybdopterin, and thiamine biosynthesis sulphur carrier protein).
Protein Domain
Name: 2Fe-2S ferredoxin-type iron-sulfur binding domain
Type: Domain
Description: Ferredoxins are small, acidic, electron transfer proteins that are ubiquitous in biological redox systems. They have either 4Fe-4S, 3Fe-4S, or 2Fe-2S clusters. Among them, ferredoxins with one 2Fe-2S cluster per molecule are present in plants, animals, and bacteria, and form a distinct ferredoxin family. They are proteins of around one hundred amino acids with four conserved cysteine residues to which the 2Fe-2S cluster is ligated. This conserved region is also found as a domain in various metabolic enzymes. Several structures of the 2Fe-2S ferredoxin-type domain have been determined. The domain is classified as a β-grasp, which is characterised as having a β-sheet comprised of four β-strands and one α-helix flanking the sheet. The two Fe atoms are coordinated tetrahedrally by the two inorganic S atoms and four cysteinyl S atoms.
Protein Domain
Name: Zinc finger, RING-type, conserved site
Type: Conserved_site
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. A number of eukaryotic and viral proteins contain a conserved cysteine-rich domain of 40 to 60 residues (called C3HC4 zinc-finger or 'RING' finger) that binds two atoms of zinc. There are two different variants, the C3HC4-type and the C3H2C3-type, which are clearly related despite the different cysteine/histidine pattern. The latter type is sometimes referred to as "RING-H2 finger". 
The 3D structure of the zinc ligation system is referred to as the "cross-brace" motif. This atypical conformation is also shared by the FYVE and PHD domains. Many proteins containing a RING finger play a key role in the ubiquitination pathway. The ubiquitination pathway generally involves three types of enzyme, known as E1, E2 and E3. E1 is the ubiquitin-activating enzyme and E2 the ubiquitin-conjugating enzyme. E1 acts first and passes ubiquitin to E2. E3 enzymes are ubiquitin protein ligases, responsible for substrate recognition. It has been shown that several RING fingers act as E3 enzymes in the ubiquitination process.
Protein Domain
Name: RNA recognition motif domain, eukaryote
Type: Domain
Description: Many eukaryotic proteins that are known or supposed to bind single-stranded RNA contain one or more copies of a putative RNA-binding domain of about 90 amino acids. This is known as the eukaryotic putative RNA-binding region RNP-1 motif [ , ], or RNA recognition motif (RRM). RRMs are found in a variety of RNA binding proteins, including heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs). The motif also appears in a few single stranded DNA binding proteins. The RRM structure consists of four strands and two helices arranged in an α/β sandwich, with a third helix present during RNA binding in some cases [].
Protein Domain
Name: Zinc finger, CCHC-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the CysCysHisCys (CCHC) type zinc finger domains, which have the sequence C-X2-C-X4-H-X4-C, where X can be any amino acid and each number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses. They are required for viral genome packaging and for the early infection process. 
They are also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding.
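The CCHC pattern given in this description (C-X2-C-X4-H-X4-C) maps directly onto a regular expression. The following is a minimal Python sketch for illustration only; it is not part of this database or of any annotation pipeline. The function name `find_cchc_motifs` and the toy sequence are hypothetical, and a pattern hit alone does not show that a protein carries a functional zinc finger.

```python
import re

# One-letter codes of the 20 standard amino acids (the "X" positions).
AA = "[ACDEFGHIKLMNPQRSTVWY]"

# C-X2-C-X4-H-X4-C, wrapped in a lookahead so overlapping hits are also reported.
CCHC = re.compile(rf"(?=(C{AA}{{2}}C{AA}{{4}}H{AA}{{4}}C))")


def find_cchc_motifs(sequence):
    """Return (1-based start position, matched residues) for each pattern hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in CCHC.finditer(seq)]


# Toy sequence made up for this example (an assumption, not a database record).
toy = "MKCFNCGKEGHIAKNCRAPRKKGCWKCGKEGHQMKDCTE"
for start, motif in find_cchc_motifs(toy):
    print(start, motif)
```

The lookahead is a design choice: a plain `finditer` would skip a hit that begins inside a previous match, whereas the zero-width lookahead tests every start position in the sequence.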
Protein Domain
Name: Harbinger transposase-derived nuclease, animal
Type: Family
Description: This entry represents the Harbinger transposase-derived proteins, mostly from animals. Proteins in this family may have nuclease activity, but do not appear to have transposase activity. Harbinger DNA transposons have been identified in protists, plants, insects, worms, and vertebrates. However, mammals do not have Harbinger transposons. In humans, no recognisable members of the Harbinger transposase superfamily are found. Instead, a widely expressed HARBI1 gene encoding a 350-amino acid protein derived from a Harbinger transposase has been identified. The HARBI1 protein is conserved in humans, rats, mice, cows, pigs, chickens, frogs and various bony fish. Conserved motifs found in the Harbinger transposases, which are expected to be catalytic centres of the nuclease/ligase reactions necessary for transposition, are also well preserved in the HARBI1 proteins. It was also proposed that these hypothetical HARBI1 nucleases are characterised by a strong DNA-target specificity.
Protein Domain
Name: Tyrosine-protein kinase, active site
Type: Active_site
Description: Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process. Protein kinases fall into three broad classes, characterised with respect to substrate specificity: serine/threonine-protein kinases; tyrosine-protein kinases; and dual specificity protein kinases (e.g. MEK, which phosphorylates both Thr and Tyr on target proteins). Protein kinase function is evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases. Tyrosine-protein kinases can transfer a phosphate group from ATP to a tyrosine residue in a protein. These enzymes can be divided into two main groups. Receptor tyrosine kinases (RTK) are transmembrane proteins involved in signal transduction; they play key roles in growth, differentiation, metabolism, adhesion, motility, death and oncogenesis. RTKs are composed of 3 domains: an extracellular domain (binds ligand), a transmembrane (TM) domain, and an intracellular catalytic domain (phosphorylates substrate). The TM domain plays an important role in the dimerisation process necessary for signal transduction. 
Cytoplasmic / non-receptor tyrosine kinases act as regulatory proteins, playing key roles in cell differentiation, motility, proliferation, and survival; an example is the Src family of protein-tyrosine kinases. This entry represents the tyrosine protein kinase active site. It also matches a number of proteins belonging to the atypical serine/threonine protein kinase BUD32 family, which lack the conventional structural elements necessary for substrate recognition and also lack the lysine residue that in all other serine/threonine kinases participates in the catalytic event.
Protein Domain
Name: Zinc finger C2H2-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [ , , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short β hairpin and an α helix (β/β/α structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2 [ ]. 
C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved β/β/α structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short α-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets. This entry represents the classical C2H2 zinc finger domain.
Protein Domain
Name: START domain
Type: Domain
Description: START (StAR-related lipid-transfer) is a lipid-binding domain in StAR, HD-ZIP and signalling proteins [ ]. StAR (Steroidogenic Acute Regulatory protein) is a mitochondrial protein that is synthesised in response to luteinising hormone stimulation []. Expression of the protein in the absence of hormone stimulation is sufficient to induce steroid production, suggesting that this protein is required in the acute regulation of steroidogenesis. Representatives of the START domain family have been shown to bind different ligands such as sterols (StAR protein) and phosphatidylcholine (PC-TP). Ligand binding by the START domain can also regulate the activities of other domains that co-occur with the START domain in multidomain proteins such as Rho-gap, the homeodomain, and the thioesterase domain [, ]. The crystal structure of the START domain of human MLN64 shows an alpha/beta fold built around a U-shaped incomplete β-barrel. Most importantly, the interior of the protein encompasses a 26 x 12 x 11 Angstrom hydrophobic tunnel that is apparently large enough to bind a single cholesterol molecule []. The START domain structure revealed an unexpected similarity to that of the birch pollen allergen Bet v 1 and to bacterial polyketide cyclases/aromatases [, ].
Protein Domain
Name: Cytochrome P450
Type: Family
Description: Cytochrome P450 enzymes are a superfamily of haem-containing mono-oxygenases that are found in all kingdoms of life, and which show extraordinary diversity in their reaction chemistry. In mammals, these proteins are found primarily in microsomes of hepatocytes and other cell types, where they oxidise steroids, fatty acids and xenobiotics, and are important for the detoxification and clearance of various compounds, as well as for hormone synthesis and breakdown, cholesterol synthesis and vitamin D metabolism. In plants, these proteins are important for the biosynthesis of several compounds such as hormones, defensive compounds and fatty acids. In bacteria, they are important for several metabolic processes, such as the biosynthesis of antibiotic erythromycin in Saccharopolyspora erythraea (Streptomyces erythraeus). Cytochrome P450 enzymes use haem to oxidise their substrates, using protons derived from NADH or NADPH to split the oxygen so a single atom can be added to a substrate. They also require electrons, which they receive from a variety of redox partners. In certain cases, cytochrome P450 can be fused to its redox partner to produce a bi-functional protein, such as with P450BM-3 from Bacillus megaterium [ ], which has haem and flavin domains. Organisms produce many different cytochrome P450 enzymes (at least 58 in humans), which together with alternative splicing can provide a wide array of enzymes with different substrate and tissue specificities. Individual cytochrome P450 proteins follow the nomenclature: CYP, followed by a number (family), then a letter (subfamily), and another number (protein); e.g. CYP3A4 is the fourth protein in family 3, subfamily A. In general, family members should share >40% identity, while subfamily members should share >55% identity. Cytochrome P450 proteins can also be grouped by two different schemes. One scheme was based on a taxonomic split: class I (prokaryotic/mitochondrial) and class II (eukaryotic microsomes).
The other scheme was based on the number of components in the system: class B (3-components) and class E (2-components). These classes merge to a certain degree. Most prokaryotes and mitochondria (and fungal CYP55) have 3-component systems (class I/class B) - a FAD-containing flavoprotein (NAD(P)H-dependent reductase), an iron-sulphur protein and P450. Most eukaryotic microsomes have 2-component systems (class II/class E) - NADPH:P450 reductase (FAD and FMN-containing flavoprotein) and P450. There are exceptions to this scheme, such as 1-component systems that resemble class E enzymes [ , , ]. The class E enzymes can be further subdivided into five sequence clusters, groups I-V, each of which may contain more than one cytochrome P450 family (e.g. CYP1 and CYP2 are both found in group I). The divergence of the cytochrome P450 superfamily into B- and E-classes, and further divergence into stable clusters within the E-class, appears to be very ancient, occurring before the appearance of eukaryotes. This family also includes germacrene A hydroxylase (GAO1; ) from plants such as lettuce (Lactuca sativa). GAO1 is required for the biosynthesis of germacrene-derived sesquiterpene lactones, which are characteristic natural products in members of the Asteraceae [ ].
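The CYP naming rule and the identity thresholds described in the Cytochrome P450 entry above can be illustrated with a short Python sketch. It is not part of the database; the names `CypName`, `parse_cyp_name` and `classify_by_identity` are hypothetical, and the thresholds are applied as simple cut-offs even though the description presents them only as a general guideline.

```python
import re
from typing import NamedTuple


class CypName(NamedTuple):
    """Components of a cytochrome P450 name such as CYP3A4 (hypothetical helper type)."""
    family: int
    subfamily: str
    protein: int


# "CYP", a family number, one or more subfamily letters, then a protein number.
_CYP_PATTERN = re.compile(r"^CYP(\d+)([A-Z]+)(\d+)$")


def parse_cyp_name(name: str) -> CypName:
    """Split a CYP name into family, subfamily and protein number."""
    match = _CYP_PATTERN.match(name.strip().upper())
    if match is None:
        raise ValueError(f"not a full CYP protein name: {name!r}")
    family, subfamily, protein = match.groups()
    return CypName(int(family), subfamily, int(protein))


def classify_by_identity(percent_identity: float) -> str:
    """Apply the >40% (family) and >55% (subfamily) guideline as hard cut-offs.

    Assumption: real assignments are curated; this only mirrors the rule of thumb.
    """
    if percent_identity > 55:
        return "same subfamily"
    if percent_identity > 40:
        return "same family"
    return "different family"


if __name__ == "__main__":
    print(parse_cyp_name("CYP3A4"))
    print(classify_by_identity(47.5))
```

Family-only labels such as CYP71 are deliberately rejected by `parse_cyp_name`, since they do not name an individual protein.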
Protein Domain
Name: Cytochrome P450, E-class, group I
Type: Family
Description: Cytochrome P450 enzymes are a superfamily of haem-containing mono-oxygenases that are found in all kingdoms of life, and which show extraordinary diversity in their reaction chemistry. In mammals, these proteins are found primarily in microsomes of hepatocytes and other cell types, where they oxidise steroids, fatty acids and xenobiotics, and are important for the detoxification and clearance of various compounds, as well as for hormone synthesis and breakdown, cholesterol synthesis and vitamin D metabolism. In plants, these proteins are important for the biosynthesis of several compounds such as hormones, defensive compounds and fatty acids. In bacteria, they are important for several metabolic processes, such as the biosynthesis of antibiotic erythromycin in Saccharopolyspora erythraea (Streptomyces erythraeus). Cytochrome P450 enzymes use haem to oxidise their substrates, using protons derived from NADH or NADPH to split the oxygen so a single atom can be added to a substrate. They also require electrons, which they receive from a variety of redox partners. In certain cases, cytochrome P450 can be fused to its redox partner to produce a bi-functional protein, such as with P450BM-3 from Bacillus megaterium [ ], which has haem and flavin domains. Organisms produce many different cytochrome P450 enzymes (at least 58 in humans), which together with alternative splicing can provide a wide array of enzymes with different substrate and tissue specificities. Individual cytochrome P450 proteins follow the nomenclature: CYP, followed by a number (family), then a letter (subfamily), and another number (protein); e.g. CYP3A4 is the fourth protein in family 3, subfamily A. In general, family members should share >40% identity, while subfamily members should share >55% identity. Cytochrome P450 proteins can also be grouped by two different schemes. One scheme was based on a taxonomic split: class I (prokaryotic/mitochondrial) and class II (eukaryotic microsomes).
The other scheme was based on the number of components in the system: class B (3-components) and class E (2-components). These classes merge to a certain degree. Most prokaryotes and mitochondria (and fungal CYP55) have 3-component systems (class I/class B) - a FAD-containing flavoprotein (NAD(P)H-dependent reductase), an iron-sulphur protein and P450. Most eukaryotic microsomes have 2-component systems (class II/class E) - NADPH:P450 reductase (FAD and FMN-containing flavoprotein) and P450. There are exceptions to this scheme, such as 1-component systems that resemble class E enzymes [ , , ]. The class E enzymes can be further subdivided into five sequence clusters, groups I-V, each of which may contain more than one cytochrome P450 family (e.g. CYP1 and CYP2 are both found in group I). The divergence of the cytochrome P450 superfamily into B- and E-classes, and further divergence into stable clusters within the E-class, appears to be very ancient, occurring before the appearance of eukaryotes. This entry represents class E cytochrome P450 proteins that fall into sequence cluster group I. Group I is richest in members, consisting of cytochrome P450 families CYP1, CYP2, CYP17, CYP21 and CYP71. The members of the first four families are of vertebrate origin, while those from CYP71 are derived from plants. CYP1 and CYP2 enzymes mainly metabolise exogenous substrates, whereas CYP17 and CYP21 are involved in metabolism of endogenous physiologically-active compounds. In the fungus Gibberella, P450 (FUS8) is a component in the biosynthetic pathway for the mycotoxin fusarin C.
FUS8 oxidizes carbon C-20 of the intermediate 20-hydroxy-fusarin to form the penultimate intermediate carboxy-fusarin C [ ]. This entry also includes cytochromes P450 (noroxomaritidine synthases and p-coumarate 3-hydroxylase) that catalyse an intramolecular para-para' C-C phenol coupling of 4'-O-methylnorbelladine during the biosynthesis of phenylpropanoids and Amaryllidaceae alkaloids, including haemanthamine- and crinamine-type alkaloids, which are promising anticancer agents [ , ].
Protein Domain
Name: Gnk2-homologous domain
Type: Domain
Description: Ginkbilobin-2 (Gnk2) is an antifungal protein found in the endosperm of Ginkgo seeds, which inhibits the growth of phytopathogenic fungi such as Fusarium oxysporum. Gnk2 has considerable homology (~85%) to embryo-abundant proteins (EAP) from the gymnosperms Picea abies and P. glauca. Plant EAP are expressed in the late stage of seed maturation and are involved in protection against environmental stresses such as drought. The sequence of Gnk2 is also 28-31% identical to the extracellular domain of cysteine-rich receptor-like kinases (CRK) from the angiosperm Arabidopsis. The CRK members are induced by pathogen infection and treatment with reactive oxygen species or salicylic acid and are involved in the hypersensitive reaction, which is a typical system of programmed cell death. In addition, there are at least 60 genes in Arabidopsis encoding the cysteine-rich secreted proteins (CRSP) with a Gnk2-homologous domain. Therefore, the proteins with a Gnk2-homologous domain are regarded as one of the largest protein superfamilies, although the role of the conserved Gnk2-homologous domain remains unclear [ , ]. The Gnk2-homologous domain is composed of two α-helices and a five-stranded β-sheet, which forms a compact single-domain architecture with an α+β fold. It contains a C-X(8)-C-X(2)-C motif. Cysteine residues form three intramolecular disulphide bridges: C1-C5, C2-C3, and C4-C6 [].
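The C-X(8)-C-X(2)-C motif mentioned in the Gnk2 entry above reads as: a cysteine, any eight residues, a cysteine, any two residues, a cysteine. A minimal Python sketch of scanning a sequence for that pattern follows; the function name `find_cx8cx2c` and the test sequence are made up for illustration (the sequence is a synthetic toy string, not a real Gnk2 sequence), and a pattern hit alone does not establish that a protein carries the domain.

```python
import re
from typing import List

# C-X(8)-C-X(2)-C written as a regular expression over one-letter amino acid codes.
# The lookahead lets overlapping occurrences be reported as well.
_MOTIF = re.compile(r"(?=(C[A-Z]{8}C[A-Z]{2}C))")


def find_cx8cx2c(sequence: str) -> List[int]:
    """Return 0-based start positions of every C-X(8)-C-X(2)-C occurrence."""
    return [m.start() for m in _MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Synthetic toy sequence (assumption, for demonstration only).
    toy = "MA" + "C" + "A" * 8 + "C" + "GG" + "C" + "K"
    print(find_cx8cx2c(toy))
```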
Protein Domain
Name: S-locus receptor kinase, C-terminal
Type: Domain
Description: This functionally uncharacterised domain of around 50 amino acids is found in the C terminus of S-locus receptor kinase proteins from plants [ ].
Protein Domain
Name: Protein OBERON
Type: Family
Description: This entry represents a plant-specific protein family, OBERON (OBE), also known as potyvirus VPg-interacting protein (PVIP). In Arabidopsis, there are four OBEs, OBE1-4. OBE1 together with OBE2 are required for the maintenance and/or establishment of both the shoot and root meristems, probably by controlling the expression of the meristem genes such as WUS, PLT1 and PLT2 and of genes required for auxin responses [ ]. All four OBE proteins are able to interact with OBE1 and 2. However, OBE3 and 4 do not self-interact or interact with each other []. OBE1 and OBE2 also bind to virus genome-linked proteins (VPgs) of a diverse range of potyviruses and function as ancillary factors to support potyvirus movement in plants []. This entry also includes protein TITANIA from rice. TITANIA is a transcriptional regulator of multiple metal transporter genes responsible for essential metals delivery to shoots for their normal growth [ ]. Proteins in this family are characterised by an N-terminal domain of variable length, a central cysteine-rich region and a relatively acidic C-terminal domain. They possess a PHD-type zinc finger.
Protein Domain
Name: AIG1-type guanine nucleotide-binding (G) domain
Type: Domain
Description: This entry represents the AIG1-type G domain. The P-loop guanosine triphosphatases (GTPases) control a multitude of biological processes, ranging from cell division, cell cycling, and signal transduction, to ribosome assembly and protein synthesis. GTPases exert their control by interchanging between an inactive GDP-bound state and an active GTP-bound state, thereby acting as molecular switches. The common denominator of GTPases is the highly conserved guanine nucleotide-binding (G) domain that is responsible for binding and hydrolysis of guanine nucleotides. The TRAFAC (translation factor related) class AIG1/Toc34/Toc159-like paraseptin GTPase family contains the following subfamilies []:
  • The GTPases of immunity-associated protein (GIMAP)/immune-associated nucleotide-binding protein (IAN) subfamily, which is conserved among vertebrates and angiosperm plants and has been postulated to regulate apoptosis, particularly in the context of diseases such as cancer, diabetes, and infections. The function of GIMAP/IAN GTPases has been linked to self defense in plants and to the development of T cells in vertebrates [, ].
  • Plant-specific Toc (translocon at the outer envelope membrane of chloroplasts) proteins. Toc proteins function as integral components of the chloroplast protein import machinery. The Toc translocon contains the two membrane-bound GTPases Toc33/34 and Toc159, which expose their G domains to the cytosol and recognise and then deliver precursor proteins through the translocation pore Toc75 [, ].
The GIMAP/IAN GTPases contain an avrRpt2 induced gene 1 (AIG1)-type G domain that exhibits the five motifs G1-G5 characteristic for GTP/GDP-binding proteins. In addition, the AIG1-type G domain contains a unique, highly conserved, hydrophobic motif between G3 and G4. It has a divergent version of the guanine recognition motif (G4) at the end of the core strand 5 and an additional helix alpha6 at the C terminus. The AIG1-type G domain contains a central β-sheet sandwiched by two layers of α-helices.
Protein Domain
Name: FKBP-type peptidyl-prolyl cis-trans isomerase domain
Type: Domain
Description: In vertebrates, FKBP-type peptidylprolyl isomerases ( ) are receptors for the two immunosuppressants, FK506 and rapamycin. The drugs inhibit T cell proliferation by arresting two distinct cytoplasmic signal transmission pathways. Peptidylprolyl isomerases accelerate protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. These proteins are found in a variety of organisms [ ]. This entry represents a domain found in FKBP-type peptidylprolyl isomerases.
Protein Domain
Name: SH2 domain
Type: Domain
Description: The Src homology 2 (SH2) domain is a protein domain of about 100 amino-acid residues first identified as a conserved sequence region between the oncoproteins Src and Fps [ ]. Similar sequences were later found in many other intracellular signal-transducing proteins []. SH2 domains function as regulatory modules of intracellular signalling cascades by binding with high affinity to phosphotyrosine-containing target peptides in a sequence-specific and strictly phosphorylation-dependent manner [, , , ]. SH2 domains recognise 3-6 residues C-terminal to the phosphorylated tyrosine in a fashion that differs from one SH2 domain to another. They are found in a wide variety of protein contexts, e.g. in association with catalytic domains of phospholipase Cγ (PLCγ) and the non-receptor protein tyrosine kinases; within structural proteins such as fodrin and tensin; and in a group of small adaptor molecules, e.g. Crk and Nck. The domains are frequently found as repeats in a single protein sequence and will then often bind both mono- and di-phosphorylated substrates. The structure of the SH2 domain belongs to the α+β class, its overall shape forming a compact flattened hemisphere. The core structural elements comprise a central hydrophobic anti-parallel β-sheet, flanked by 2 short α-helices. The loop between strands 2 and 3 provides many of the binding interactions with the phosphate group of its phosphopeptide ligand, and is hence designated the phosphate binding loop. The phosphorylated ligand binds perpendicular to the β-sheet and typically interacts with the phosphate binding loop and a hydrophobic binding pocket that interacts with a pY+3 side chain. The N- and C-termini of the domain are close together in space and on the opposite face from the phosphopeptide binding surface, and it has been speculated that this has facilitated their integration into surface-exposed regions of host proteins [ ].
Protein Domain
Name: YqgF/RNase H-like domain
Type: Domain
Description: The YqgF domain family is described as RNase H-like and typified by the Escherichia coli protein YqgF [ ]. YqgF domain-containing proteins are predicted to be ribonucleases or resolvases based on homology to RuvC Holliday junction resolvases. The group of proteins containing this domain are found primarily in the low-GC Gram-positive bacteria Holliday junction resolvases (HJRs) and in eukaryote orthologs. The RuvC HJRs are conspicuously absent in the low-GC Gram-positive bacterial lineage, with the exception of Ureaplasma urealyticum ( , [ ]). Furthermore, loss of function ruvC mutants of Escherichia coli show a residual HJR activity that cannot be ascribed to the prophage-encoded RusA resolvase []. This suggests that the YqgF family proteins could be alternative HJRs whose function partially overlaps with that of RuvC []. The functions of eukaryotic proteins having this domain are less well described. In Saccharomyces cerevisiae (Baker's yeast) Spt6p and its orthologues, the catalytic residues are substituted, indicating that they lack the enzymatic function of resolvases [ ]. Spt6p has been implicated in transcription initiation [] and in maintaining normal chromatin structure during transcription elongation []. Horizontal gene transfer, lineage-specific gene loss and gene family expansion, and non-orthologous gene displacement seem to have been major forces in the evolution of HJRs and related nucleases. The diversity of HJRs and related nucleases in bacteria and archaea contrasts with their near absence in eukaryotes. The few detected eukaryotic representatives of the endonuclease fold and the RNase H fold have probably been acquired from bacteria via horizontal gene transfer. The identity of the principal HJR(s) involved in recombination in eukaryotes remains uncertain; this function could be performed by topoisomerase IB or by a novel, so far undetected, class of enzymes.
Likely HJRs and related nucleases were identified in the genomes of numerous bacterial and eukaryotic DNA viruses. Gene flow between viral and cellular genomes has probably played a major role in the evolution of this class of enzymes. The YqgF domain is also found in Tex proteins, where it maintains the core structural elements and aligns especially well with RuvC nucleases, although Tex does not appear to possess nuclease activity [ ]. Tex (toxin expression) is a highly conserved bacterial protein involved in expression of critical toxin genes [].
Protein Domain
Name: Transcription elongation factor Spt6
Type: Family
Description: Spt6 is an essential histone chaperone that mediates nucleosome reassembly during gene transcription. The N-terminal ~300 residues are highly acidic, predicted to be disordered, and are necessary for binding both nucleosomes and the transcription factor Spn1/IWS1 [ ]. Spt6 associates with RNAPII via a tandem Src homology 2 (SH2) domain [, ]. Spt6-RNAPII association is required for efficient recruitment of the Ccr4-Not de-adenylation complex to transcribed genes for mRNA turnover [ ]. Spt6 has a key function in transcription elongation. In yeast, Spt6's interaction with phosphorylated tyrosine 1 at the C-terminal domain of RNAPII prevents premature recruitment of termination factors to genes [].
Protein Domain
Name: Tex-like domain superfamily
Type: Homologous_superfamily
Description: Tex (toxin expression) is a highly conserved bacterial protein involved in expression of critical toxin genes [ ]. The overall structure is notably flat and elongated. The most striking structural feature is a long, central helix spanning amino acid residues 274 to 322. The rest of the protein wraps around the central helix at both the N-terminal and C-terminal ends. Although the Tex structure is fairly compact, it can be largely described as a series of distinct domains []. This superfamily represents a domain comprising amino acids 115-327 and 456-502, flanking the YqgF domain ().
Protein Domain
Name: Transcription elongation factor Spt6, YqgF domain
Type: Domain
Description: The YqgF domain of Spt6 proteins is homologous to the E. coli Holliday junction resolvase RuvC [ ], but its putative catalytic site lacks the carboxylate side chains critical for coordinating the magnesium ions that mediate phosphodiester bond cleavage [].
Protein Domain
Name: Spt6 acidic, N-terminal domain
Type: Domain
Description: The N terminus of Spt6 is highly acidic. The full Spt6 protein is a transcription regulator, but the exact function of this acidic region is not certain.
Protein Domain
Name: RuvA domain 2-like
Type: Homologous_superfamily
Description: In prokaryotes, RuvA, RuvB, and RuvC process the universal DNA intermediate of homologous recombination, termed Holliday junction. The tetrameric DNA helicase RuvA specifically binds to the Holliday junction and facilitates the isomerization of the junction from the stacked folded configuration to the square-planar structure [ ]. In the RuvA tetramer, each subunit consists of three domains, I, II and III, where I and II form the major core that is responsible for Holliday junction binding and base pair rearrangements of Holliday junction executed at the crossover point, whereas domain III regulates branch migration through direct contact with RuvB. Domain 2 has a SAM (sterile alpha motif)-like α-bundle fold that occurs as a duplication containing two helix-hairpin-helix (HhH) motifs. The C-terminal domain (CTD) of the excision repair protein UvrC shows structural similarity to RuvA domain 2. The CTD of UvrC is essential for 5' incision in the prokaryotic nucleotide excision repair process, and acts to mediate structure-specific binding to single-stranded-double-stranded junction DNA [ ]. Domain III of NAD+-dependent DNA ligase consists of a duplication of two RuvA-like domains (four HhH motifs), and also contains a zinc-finger subdomain. DNA ligases catalyse the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilizing either ATP or NAD+ as a cofactor [ ].
Protein Domain
Name: Peptidase S28
Type: Family
Description: This group of serine peptidases belongs to MEROPS peptidase family S28 (clan SC). The predicted active site residues for members of this family and family S10 occur in the same order in the sequence: S, D, H. These serine proteases include several eukaryotic enzymes such as lysosomal Pro-X carboxypeptidase, dipeptidyl-peptidase II, and thymus-specific serine peptidase [ , , , ]. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes [ ]. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base [ ]. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ].
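The relationship between the linear order of the catalytic triad and the clan, as stated in the Peptidase S28 entry above, can be sketched as a simple lookup in Python. The helper names `triad_order` and `clan_for_triad` are hypothetical, only the three orders named in the description are covered, and the S/D/H positions used for the SC example are invented for illustration. Since the description says the order "commonly" reflects clan relationships, a match is a hint rather than a classification.

```python
from typing import Dict, Optional

# Triad orders and clans exactly as listed in the description above.
CLAN_BY_TRIAD_ORDER: Dict[str, str] = {
    "HDS": "PA",  # chymotrypsin clan
    "DHS": "SB",  # subtilisin clan
    "SDH": "SC",  # carboxypeptidase clan (includes families S10 and S28)
}


def triad_order(positions: Dict[str, int]) -> str:
    """Return the catalytic residues (one-letter codes) sorted by sequence position."""
    return "".join(sorted(positions, key=positions.get))


def clan_for_triad(positions: Dict[str, int]) -> Optional[str]:
    """Look up the clan suggested by the triad order, or None if it is not listed."""
    return CLAN_BY_TRIAD_ORDER.get(triad_order(positions))


if __name__ == "__main__":
    # Chymotrypsin numbering: His57, Asp102, Ser195.
    print(clan_for_triad({"S": 195, "H": 57, "D": 102}))
    # Invented positions, used only to show the SDH order.
    print(clan_for_triad({"S": 10, "D": 20, "H": 30}))
```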
Protein Domain
Name: Aminotransferases, class-I, pyridoxal-phosphate-binding site
Type: Binding_site
Description: Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). Pyridoxal 5'-phosphate (PLP) is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination [ , , ]. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors []. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy []. PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the ε-amino group of an active site lysine residue on the enzyme. The α-amino group of the substrate displaces the lysine ε-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic [ ]. Aminotransferases share certain mechanistic features with other pyridoxal-phosphate dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a lysine residue. On the basis of sequence similarity, these various enzymes can be grouped [ ] into subfamilies; these sequences are defined by the aminotransferases class-I pyridoxal-phosphate attachment site signature, which contains the lysine residue involved in pyridoxal-phosphate binding.
Protein Domain
Name: Tetrahydrofolate dehydrogenase/cyclohydrolase, catalytic domain
Type: Domain
Description: Enzymes that participate in the transfer of one-carbon units require the coenzyme tetrahydrofolate (THF). Various reactions generate one-carbon derivatives of THF, which can be interconverted between different oxidation states by methylene-THF dehydrogenase ( ), methenyl-THF cyclohydrolase ( ) and formyl-THF synthetase () [ , ]. The dehydrogenase and cyclohydrolase activities are expressed by a variety of multifunctional enzymes, including the tri-functional eukaryotic C1-tetrahydrofolate synthase []; a bifunctional eukaryotic mitochondrial protein; and the bifunctional Escherichia coli folD protein [ , ]. Methylene-tetrahydrofolate dehydrogenase and methenyltetrahydrofolate cyclohydrolase share an overlapping active site [ ], and as such are usually located together in proteins, acting in tandem on the carbon-nitrogen bonds of substrates other than peptide bonds. This entry represents the N-terminal catalytic domain of these enzymes.
Protein Domain
Name: Tetrahydrofolate dehydrogenase/cyclohydrolase
Type: Family
Description: Enzymes that participate in the transfer of one-carbon units require the coenzyme tetrahydrofolate (THF). Various reactions generate one-carbon derivatives of THF, which can be interconverted between different oxidation states by methylene-THF dehydrogenase ( ), methenyl-THF cyclohydrolase ( ) and formyl-THF synthetase () [ , ]. The dehydrogenase and cyclohydrolase activities are expressed by a variety of multifunctional enzymes, including the tri-functional eukaryotic C1-tetrahydrofolate synthase []; a bifunctional eukaryotic mitochondrial protein; and the bifunctional Escherichia coli folD protein [ , ]. Methylene-tetrahydrofolate dehydrogenase and methenyltetrahydrofolate cyclohydrolase share an overlapping active site [ ], and as such are usually located together in proteins, acting in tandem on the carbon-nitrogen bonds of substrates other than peptide bonds.
Protein Domain
Name: Tetrahydrofolate dehydrogenase/cyclohydrolase, NAD(P)-binding domain
Type: Domain
Description: Enzymes that participate in the transfer of one-carbon units require the coenzyme tetrahydrofolate (THF). Various reactions generate one-carbon derivatives of THF, which can be interconverted between different oxidation states by methylene-THF dehydrogenase ( ), methenyl-THF cyclohydrolase ( ) and formyl-THF synthetase () [ , ]. The dehydrogenase and cyclohydrolase activities are expressed by a variety of multifunctional enzymes, including the tri-functional eukaryotic C1-tetrahydrofolate synthase []; a bifunctional eukaryotic mitochondrial protein; and the bifunctional Escherichia coli folD protein [ , ]. Methylene-tetrahydrofolate dehydrogenase and methenyltetrahydrofolate cyclohydrolase share an overlapping active site [ ], and as such are usually located together in proteins, acting in tandem on the carbon-nitrogen bonds of substrates other than peptide bonds. This entry represents the NAD(P)-binding domain found in these enzymes.
Protein Domain
Name: Prefoldin beta-like
Type: Family
Description: Prefoldin (PFD) is a chaperone that interacts exclusively with type II chaperonins, hetero-oligomers lacking an obligate co-chaperonin that are found only in eukaryotes (chaperonin-containing T-complex polypeptide-1 (CCT)) and archaea. Eukaryotic PFD is a multi-subunit complex containing six polypeptides in the molecular mass range of 14-23 kDa. In archaea, on the other hand, PFD is composed of two types of subunits, two alpha and four beta. The six subunits associate to form two back-to-back up-and-down eight-stranded barrels, from which hang six coiled coils. Each subunit contributes one (beta subunits) or two (alpha subunits) beta hairpin turns to the barrels. The coiled coils are formed by the N and C termini of an individual subunit. Overall, this unique arrangement resembles a jellyfish. The eukaryotic PFD hexamer is composed of six different subunits; however, these can be grouped into two alpha-like (PFD3 and -5) and four beta-like (PFD1, -2, -4, and -6) subunits based on amino acid sequence similarity with their archaeal counterparts. Eukaryotic PFD has a six-legged structure similar to that seen in the archaeal homologue [ , ]. This family contains the archaeal beta subunit and eukaryotic prefoldin subunits 1, 2, 4 and 6. Eukaryotic PFD has been shown to bind both actin and tubulin co-translationally. The chaperone then delivers the target protein to CCT, interacting with the chaperonin through the tips of the coiled coils. No authentic target proteins of any archaeal PFD have been identified to date.
Protein Domain
Name: Prefoldin
Type: Homologous_superfamily
Description: The Prefoldin/GimC family of proteins is found in eukaryotes and archaea [ ]. Prefoldin is part of a molecular chaperone system that promotes the correct folding of nascent polypeptide chains. Prefoldin/GimC interacts with the nascent chain to stabilise it prior to its folding within the central cavity of a chaperonin. Prefoldin/GimC is a hexamer consisting of two types of subunits, alpha and beta. Archaeal prefoldin contains one type of alpha and one type of beta subunit [], while eukaryotic prefoldin/GimC contains two different but related alpha subunits and four related beta subunits []. The unconventional prefoldin RPB5 interactor-like protein from Drosophila (also known as URI) is a serine/threonine phosphatase inhibitor and is required for germ line cell viability and differentiation. It binds to protein phosphatase 1 alpha subunits with higher affinity than to the beta subunits [ ].
Protein Domain
Name: Prefoldin subunit 2
Type: Family
Description: This entry represents prefoldin subunit 2 (PFD2, also known as Gim4 in budding yeasts). PFD2 is part of the prefoldin heterohexamer complex (consisting of two PFD-alpha type and four PFD-beta type subunits) that binds specifically to cytosolic chaperonin (c-CPN) and transfers target proteins to it [ ]. It binds to the nascent polypeptide chain and promotes folding in an environment in which there are many competing pathways for non-native proteins [ , ].
Protein Domain
Name: Cytochrome c oxidase, subunit VIa
Type: Family
Description: Cytochrome c oxidase ( ) is an oligomeric enzymatic complex which is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen [ ]. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane. In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptidic subunits. One of these subunits is known as VIa in vertebrates and fungi. Mammals have two tissue-specific isoforms of VIa, a liver isoform (VIa-L) and a heart and skeletal muscle isoform (VIa-H). Mammalian VIa-H induces a slip in CcO (decrease in proton/electron stoichiometry) at high intramitochondrial ATP/ADP ratios, while VIa-L induces a permanent slip in CcO, depending on the presence of cardiolipin and palmitate [ ]. Only one form is found in fish [].
Protein Domain
Name: Tubulin, C-terminal
Type: Homologous_superfamily
Description: Microtubules are polymers of tubulin, a dimer of two 55kDa subunits, designated alpha and beta [ , ]. Within the microtubule lattice, α-β heterodimers associate in a head-to-tail fashion, giving rise to microtubule polarity. Fluorescent labelling studies have suggested that tubulin is oriented in microtubules with beta-tubulin toward the plus end []. For maximal rate and extent of polymerisation into microtubules, tubulin requires GTP. Two molecules of GTP are bound at different sites, termed N and E. At the E (Exchangeable) site, GTP is hydrolysed during incorporation into the microtubule. Close to the E site is an invariant region rich in glycine residues, which is found in both chains and is thought to control access of the nucleotide to its binding site [ ]. Most species, excepting simple eukaryotes, express a variety of closely related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. Gamma tubulin is found at microtubule-organising centres, such as the spindle poles or the centrosome, suggesting that it is involved in minus-end nucleation of microtubule assembly []. Further eukaryotic tubulins (gamma, epsilon, zeta) that are restricted to certain lineages or species have been reported [, ]. Bacterial and archaeal homologues of tubulin have been discovered. BtubA and BtubB, two bacterial homologues in the genus Prosthecobacter, have probably been derived by horizontal gene transfer [ , ]. This entry represents the extreme C-terminal structural domain of both alpha and beta tubulin. It forms a helix hairpin [ ].
Protein Domain
Name: Alpha tubulin
Type: Family
Description: Microtubules are polymers of tubulin, a dimer of two 55kDa subunits, designated alpha and beta [, ]. Within the microtubule lattice, α-β heterodimers associate in a head-to-tail fashion, giving rise to microtubule polarity. Fluorescent labelling studies have suggested that tubulin is oriented in microtubules with beta-tubulin toward the plus end [ ]. For maximal rate and extent of polymerisation into microtubules, tubulin requires GTP. Two molecules of GTP are bound at different sites, termed N and E. At the E (Exchangeable) site, GTP is hydrolysed during incorporation into the microtubule. Close to the E site is an invariant region rich in glycine residues, which is found in both chains and is thought to control access of the nucleotide to its binding site []. Most species, excepting simple eukaryotes, express a variety of closely related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species [ ]. This entry represents alpha tubulin.
Protein Domain
Name: Tubulin
Type: Family
Description: Microtubules are polymers of tubulin, a dimer of two 55kDa subunits, designated alpha and beta [ , ]. Within the microtubule lattice, α-β heterodimers associate in a head-to-tail fashion, giving rise to microtubule polarity. Fluorescent labelling studies have suggested that tubulin is oriented in microtubules with beta-tubulin toward the plus end []. For maximal rate and extent of polymerisation into microtubules, tubulin requires GTP. Two molecules of GTP are bound at different sites, termed N and E. At the E (Exchangeable) site, GTP is hydrolysed during incorporation into the microtubule. Close to the E site is an invariant region rich in glycine residues, which is found in both chains and is thought to control access of the nucleotide to its binding site [ ]. Most species, excepting simple eukaryotes, express a variety of closely related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. Gamma tubulin is found at microtubule-organising centres, such as the spindle poles or the centrosome, suggesting that it is involved in minus-end nucleation of microtubule assembly []. Further eukaryotic tubulins (gamma, epsilon, zeta) that are restricted to certain lineages or species have been reported [, ]. Bacterial and archaeal homologues of tubulin have been discovered. BtubA and BtubB, two bacterial homologues in the genus Prosthecobacter, have probably been derived by horizontal gene transfer [ , ].
Protein Domain
Name: Tubulin/FtsZ, GTPase domain
Type: Domain
Description: This entry represents a GTPase domain found in all tubulin chains, such as tubulin alpha, beta and gamma chains, plant ARC3 and prokaryotic FtsZ and CetZ proteins [ , ]. These proteins are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ (homologue of eukaryotic tubulin) is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells [, ]. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea. CetZ co-exists with FtsZ in many archaea. CetZ does not affect cell division; instead, it is involved in cell shape control []. Arabidopsis chloroplast protein ARC3 (At1g75010) is a Z-ring accessory protein involved in the initiation of plastid division and division site placement [, ].
Protein Domain
Name: Tubulin, conserved site
Type: Conserved_site
Description: Microtubules are polymers of tubulin, a dimer of two 55kDa subunits, designated alpha and beta [, ]. Within the microtubule lattice, α-β heterodimers associate in a head-to-tail fashion, giving rise to microtubule polarity. Fluorescent labelling studies have suggested that tubulin is oriented in microtubules with beta-tubulin toward the plus end []. For maximal rate and extent of polymerisation into microtubules, tubulin requires GTP. Two molecules of GTP are bound at different sites, termed N and E. At the E (Exchangeable) site, GTP is hydrolysed during incorporation into the microtubule. Close to the E site is an invariant region rich in glycine residues, which is found in both chains and is thought to control access of the nucleotide to its binding site []. Most species, excepting simple eukaryotes, express a variety of closely related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. Gamma tubulin is found at microtubule-organising centres, such as the spindle poles or the centrosome, suggesting that it is involved in minus-end nucleation of microtubule assembly []. This entry represents the glycine-rich conserved site near the E (Exchangeable) site that controls the access of the nucleotide to its binding site.
Protein Domain
Name: Tubulin/FtsZ, C-terminal
Type: Homologous_superfamily
Description: This domain superfamily is found in the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. These proteins are GTPases and are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea. This is the C-terminal domain.
Protein Domain
Name: Tubulin/FtsZ, 2-layer sandwich domain
Type: Domain
Description: This domain is found in the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. These proteins are GTPases and are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea. This is the C-terminal domain.
Protein Domain
Name: Homeobox domain
Type: Domain
Description: The homeobox domain or homeodomain was first identified in a number of Drosophila homeotic and segmentation proteins, but is now known to be well-conserved in many other animals, including vertebrates [ , ]. Hox genes encode homeodomain-containing transcriptional regulators that operate differential genetic programs along the anterior-posterior axis of animal bodies []. The domain binds DNA through a helix-turn-helix (HTH) structure. The HTH motif is characterised by two α-helices, which make intimate contacts with the DNA and are joined by a short turn. The second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions, which occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. The first helix helps to stabilise the structure. The motif is very similar in sequence and structure in a wide range of DNA-binding proteins (e.g., cro and repressor proteins, homeotic proteins, etc.). One of the principal differences between HTH motifs in these different proteins arises from the stereo-chemical requirement for glycine in the turn which is needed to avoid steric interference of the β-carbon with the main chain: for cro and repressor proteins the glycine appears to be mandatory, while for many of the homeotic and other DNA-binding proteins the requirement is relaxed.
Protein Domain
Name: Leucine zipper, homeobox-associated
Type: Domain
Description: This region is a plant specific leucine zipper that is always found associated with a homeobox [].
Protein Domain
Name: Homeobox, conserved site
Type: Conserved_site
Description: The homeobox domain or homeodomain was first identified in a number of Drosophila homeotic and segmentation proteins, but is now known to be well-conserved in many other animals, including vertebrates [ , ]. Hox genes encode homeodomain-containing transcriptional regulators that operate differential genetic programs along the anterior-posterior axis of animal bodies []. The domain binds DNA through a helix-turn-helix (HTH) structure. The HTH motif is characterised by two α-helices, which make intimate contacts with the DNA and are joined by a short turn. The second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions, which occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. The first helix helps to stabilise the structure. The motif is very similar in sequence and structure in a wide range of DNA-binding proteins (e.g., cro and repressor proteins, homeotic proteins, etc.). One of the principal differences between HTH motifs in these different proteins arises from the stereo-chemical requirement for glycine in the turn which is needed to avoid steric interference of the β-carbon with the main chain: for cro and repressor proteins the glycine appears to be mandatory, while for many of the homeotic and other DNA-binding proteins the requirement is relaxed.
Protein Domain
Name: Glycoside hydrolase family 16
Type: Domain
Description: Glycoside hydrolase family 16 (GH16) [ ] contains functionally heterogeneous members, including lichenase (); xyloglucan xyloglucosyltransferase ( ); agarase ( ); kappa-carrageenase ( ); endo-beta-1,3-glucanase ( ); endo-beta-1,3-1,4-glucanase ( ); endo-beta-galactosidase ( ). These enzymes share a common ancestor and have diverged significantly in their primary sequence. The GH16 catalytic domain has a classical sandwich-like β-jelly roll fold, formed by two main, closely packed and curved antiparallel β-sheets, creating a deep channel harboring the catalytic machinery. Even though the GH16 domains have now diverged significantly in their primary sequences, they all feature a common catalytic motif, E-[ILV]-D-[IVAF]-[VILMF](0,1)-E. The two glutamic acid residues in the conserved motif are the nucleophile and the general base involved in catalysis, whereas the aspartic acid residue is important in maintaining the relative position of these catalytic amino acids [, ]. Two closely clustered conserved glutamates have been shown [ ] to be involved in the catalytic activity of Bacillus licheniformis lichenase. This domain contains these residues.
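The catalytic motif quoted above is written in PROSITE-style pattern syntax, which maps directly onto a regular expression. The following is a minimal sketch of how one might scan a protein sequence for it; the helper names (`prosite_to_regex`, `find_gh16_motifs`) and the toy sequences in the usage note are hypothetical and are not part of this database or of any GH16 tool.

```python
import re

# PROSITE-style pattern as quoted in the GH16 description.
GH16_PATTERN = "E-[ILV]-D-[IVAF]-[VILMF](0,1)-E"


def prosite_to_regex(pattern: str) -> str:
    """Translate the small subset of PROSITE syntax used here.

    Handles only '-' separators, [..] residue sets and (n) / (m,n)
    repeat counts; other PROSITE features are not supported.
    """
    regex = pattern.replace("-", "")
    # (0,1) -> {0,1}; (3) -> {3}
    return re.sub(r"\((\d+(?:,\d+)?)\)", r"{\1}", regex)


GH16_REGEX = re.compile(prosite_to_regex(GH16_PATTERN))


def find_gh16_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched residues) for each hit."""
    return [(m.start() + 1, m.group())
            for m in GH16_REGEX.finditer(sequence.upper())]
```

For example, on the made-up peptide `AAELDFVEGG` the function reports one hit, `ELDFVE`, starting at residue 3. A regex match only flags a candidate site; it says nothing about whether the surrounding fold is actually a GH16 domain.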
Protein Domain
Name: Xyloglucan endo-transglycosylase, C-terminal
Type: Domain
Description: This entry represents the C terminus (approximately 60 residues) of plant xyloglucan endo-transglycosylases (XET). Xyloglucan is the predominant hemicellulose in the cell walls of most dicotyledons. With cellulose, it forms a network that strengthens the cell wall. XET catalyses the splitting of xyloglucan chains and the linking of the newly generated reducing end to the non-reducing end of another xyloglucan chain, thereby loosening the cell wall [ , , ].
Protein Domain
Name: Xyloglucan endotransglucosylase/hydrolase
Type: Family
Description: This entry includes a group of xyloglucan endotransglucosylase/hydrolases (XTHs) from plants [ ]. Xyloglucan is a soluble hemicellulose with a backbone of beta-1,4-linked glucose units, partially substituted with alpha-1,6-linked xylopyranose branches. It binds noncovalently to cellulose, cross-linking the adjacent cellulose microfibrils, giving it a key structural role as a matrix polymer. XTHs catalyse xyloglucan endohydrolysis (XEH) and/or endotransglycosylation (XET) and are involved in the modification of cell wall structure by cleaving and, often, also re-joining xyloglucan molecules in primary plant cell walls [].
Protein Domain
Name: Glycoside hydrolase, family 16, active site
Type: Active_site
Description: O-Glycosyl hydrolases ( ) are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [ , ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. Glycoside hydrolase family 16 comprises enzymes with a number of known activities: lichenase ( ); xyloglucan xyloglucosyltransferase ( ); agarase ( ); kappa-carrageenase ( ); endo-beta-1,3-glucanase ( ); endo-beta-1,3-1,4-glucanase ( ); endo-beta-galactosidase ( ). Two closely clustered and conserved glutamates, which are part of this signature, have been shown to be involved in the catalytic activity of Bacillus licheniformis lichenase.
Protein Domain
Name: Membrane transport protein
Type: Family
Description: This entry represents a mostly uncharacterised family of membrane transport proteins found in eukaryotes, bacteria and archaea. Most characterised members of this family are the PIN components of auxin efflux systems from plants. These carriers are saturable, auxin-specific, and localized to the basal ends of auxin transport-competent cells [ , ]. Plants typically possess several of these proteins, each displaying a unique tissue-specific expression pattern. They are expressed in almost all plant tissues including vascular tissues and roots, and influence many processes including the establishment of embryonic polarity, plant growth, apical hook formation in seedlings and the photo- and gravitropic responses. These plant proteins are typically 600-700 amino acyl residues long and exhibit 8-12 transmembrane segments.
Protein Domain
Name: SERRATE/Ars2, N-terminal
Type: Domain
Description: This domain can be found in the N terminus of SERRATE (SE) from plants and its homologue, Ars2, from animals. They play a role in nuclear RNA metabolism. They interact with the nuclear cap-binding complex (CBC) and mediate interactions with diverse RNA processing and transport machineries in a transcript-dependent manner. Interestingly, the plant SERRATE does not have the RNA recognition motif (RRM) domain found in metazoans and S. pombe [ ].
Protein Domain
Name: P-type ATPase, subfamily IV
Type: Family
Description: P-ATPases (also known as E1-E2 ATPases) ([intenz:3.6.3.-]) are found in bacteria and in a number of eukaryotic plasma membranes and organelles []. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy. There are many different classes of P-ATPases, which transport specific types of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2. Type IV ATPases have been shown to be involved in the transport of phospholipids [ , ], being involved in signal transduction, cell division, and vesicular transport. These ATPases are found in eukaryotes.
Protein Domain
Name: Zinc finger, PMZ-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [ , , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry describes a plant mutator transposase zinc finger.
Protein Domain
Name: MULE transposase domain
Type: Domain
Description: This entry represents a domain found in Mutator-like elements (MULE)-encoded transposases, some of which also contain a zinc-finger motif [ , ]. This domain is also found in a transposase for the insertion sequence element IS256 in transposon Tn4001 [].
Protein Domain
Name: Zinc finger, SWIM-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [ , , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the SWIM (SWI2/SNF2 and MuDR) zinc-binding domain, which is found in a variety of prokaryotic and eukaryotic proteins, such as mitogen-activated protein kinase kinase kinase 1 (or MEKK1). 
It is also found in the related protein MEX (MEKK1-related protein X), a testis-expressed protein that acts as an E3 ubiquitin ligase through the action of E2 ubiquitin-conjugating enzymes in the proteasome degradation pathway; the SWIM domain is critical for MEX ubiquitination [ ]. SWIM domains are also found in the homologous recombination protein Sws1 [], as well as in several hypothetical proteins.
Protein Domain
Name: FAR1 DNA binding domain
Type: Domain
Description: Phytochrome A is the primary photoreceptor for mediating various far-red light-induced responses in higher plants. It has been found that the proteins governing this response, which include FAR-RED ELONGATED HYPOCOTYL3 (FHY3) and FAR-RED-IMPAIRED RESPONSE1 (FAR1), are a pair of homologous proteins sharing significant sequence homology to mutator-like transposases. These proteins appear to be novel transcription factors, which are essential for activating the expression of FHY1 and FHL (for FHY1-like) and related genes, whose products are required for light-induced phytochrome A nuclear accumulation and subsequent light responses in plants. The FRS (FAR1 Related Sequences) family of proteins share a similar domain structure to mutator-like transposases, including an N-terminal C2H2 zinc finger domain, a central putative core transposase domain, and a C-terminal SWIM motif (named after SWI2/SNF and MuDR transposases). It seems plausible that the FRS family represent transcription factors derived from mutator-like transposases [ , ]. This entry represents a domain found in FAR1 and FRS proteins. It contains a WRKY-like fold and is therefore most likely a zinc-binding DNA-binding domain.
Protein Domain
Name: Argonaute, linker 1 domain
Type: Domain
Description: ArgoL1 is a region found in argonaute [ ] proteins. It normally co-occurs with BAG domain () and Piwi domain (). It is a linker region between the N-terminal and the PAZ domains. It contains an α-helix packed against a three-stranded antiparallel β-sheet with two long β-strands (beta8 and beta9) of the sheet spanning one face of the adjacent N and PAZ domains. L1 together with linker 2, L2, PAZ and ArgoN forms a compact global fold [ ].
Protein Domain
Name: PAZ domain
Type: Domain
Description: This domain is named after the proteins Piwi, Argonaute and Zwille. It is also found in the CAF protein from Arabidopsis thaliana. The function of the domain is unknown, but it has been found in the middle region of a number of members of the Argonaute protein family, which also contain the Piwi domain ( ) in their C-terminal region [ ]. Several members of this family have been implicated in the development and maintenance of stem cells through the RNA-mediated gene-quelling mechanisms associated with the protein DICER.
Protein Domain
Name: Piwi domain
Type: Domain
Description: The piwi domain [ ] is a protein domain found in piwi proteins and a large number of related nucleic acid-binding proteins, especially those that bind and cleave RNA. The function of the domain is double-stranded RNA-guided hydrolysis of single-stranded RNA, as has been determined in the argonaute family of related proteins [].
Protein Domain
Name: Dullard phosphatase domain, eukaryotic
Type: Domain
Description: This entry represents the putative phosphatase domain of a family of eukaryotic proteins including "Dullard" [ ], and the NLI interacting factor (NIF)-like phosphatases [].
Protein Domain
Name: FCP1 homology domain
Type: Domain
Description: Yeast FCP1 is an essential protein serine phosphatase ( ) that dephosphorylates the C-terminal domain (CTD) of RNA polymerase II. FCP1 orthologs are present in all known eukaryote proteomes. The N-terminal domain of FCP1 corresponds to the catalytic unit of the phosphatase and has been referred to as the FCP1 homology domain. The FCP1 homology domain, which is a ~180-residue module, is also found in many other proteins of unknown function. It contains a DxDx(T/V) motif preceded by four hydrophobic residues characteristic of a large family of metal-dependent phosphohydrolases and phosphotransferases. The first aspartate residue is likely to participate in catalysis, whereas the second could have a role in substrate recognition [ , , , ].
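The DxDx(T/V) signature described above, preceded by four hydrophobic residues, can be expressed as a short regular expression. The sketch below is illustrative only: the description does not say which residues count as hydrophobic, so the set A, V, I, L, M, F, W, Y is an assumption, and the function name and the toy sequence in the usage note are hypothetical.

```python
import re

# ASSUMPTION: the source does not define "hydrophobic"; this set is one
# common choice and should be adjusted to taste.
HYDROPHOBIC = "AVILMFWY"

# Four hydrophobic residues, then D-x-D-x-(T/V). The two aspartates are
# captured by name so their positions can be reported separately.
FCP1_MOTIF = re.compile(
    rf"[{HYDROPHOBIC}]{{4}}(?P<asp1>D).(?P<asp2>D).[TV]"
)


def find_dxdx_motifs(sequence: str) -> list[dict]:
    """Return one record per motif hit, using 1-based residue positions."""
    hits = []
    for m in FCP1_MOTIF.finditer(sequence.upper()):
        hits.append({
            "start": m.start() + 1,
            "motif": m.group(),
            "first_asp": m.start("asp1") + 1,   # proposed catalytic Asp
            "second_asp": m.start("asp2") + 1,  # proposed substrate-recognition Asp
        })
    return hits
```

On the made-up peptide `GSLLVLDLDETLVH` this yields a single hit, `LLVLDLDET`, starting at residue 3 with the two aspartates at positions 7 and 9. As with any short motif, matches are candidates only and would need domain-level evidence to confirm.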
Protein Domain
Name: Transcription factor, MADS-box
Type: Domain
Description: Human serum response factor (SRF) is a ubiquitous nuclear protein important for cell proliferation and differentiation. SRF function is essential for transcriptional regulation of numerous growth-factor-inducible genes, such as c-fos oncogene and muscle-specific actin genes. A core domain of around 90 amino acids is sufficient for the activities of DNA-binding, dimerisation and interaction with accessory factors. Within the core is a DNA-binding region, designated the MADS box [ ], that is highly similar to many eukaryotic regulatory proteins: among these are MCM1, the regulator of cell type-specific genes in budding yeast; DSRF, a Drosophila trachea development factor; the MEF2 family of myocyte-specific enhancer factors; and the Agamous and Deficiens families of plant homeotic proteins. In SRF, the MADS box has been shown to be involved in DNA-binding and dimerisation [ ]. Proteins belonging to the MADS family function as dimers, the primary DNA-binding element of which is an anti-parallel coiled coil of two amphipathic α-helices, one from each subunit. The DNA wraps around the coiled coil allowing the basic N-termini of the helices to fit into the DNA major groove. The chain extending from the helix N-termini reaches over the DNA backbone and penetrates into the minor groove. A 4-stranded, anti-parallel β-sheet packs against the coiled-coil face opposite the DNA and is the central element of the dimerisation interface. The MADS-box domain is commonly found associated with the K-box region (see ).
Protein Domain
Name: Transcription factor, K-box
Type: Domain
Description: MADS genes in plants encode key developmental regulators of vegetative and reproductive development. The majority of the plant MADS proteins share a stereotypical MIKC structure. It comprises (from N- to C-terminal) an N-terminal domain, which is, however, present only in a minority of proteins; a MADS domain (see , ), which is the major determinant of DNA-binding but which also performs dimerisation and accessory factor binding functions; a weakly conserved intervening (I) domain, which constitutes a key molecular determinant for the selective formation of DNA-binding dimers; a keratin-like (K-box) domain, which promotes protein dimerisation; and a C-terminal (C) domain, which is involved in transcriptional activation or in the formation of ternary or quaternary protein complexes. The 80-amino acid K-box domain was originally identified as a region with low but significant similarity to a region of keratin, which is part of the coiled-coil sequence constituting the central rod-shaped domain of keratin [ , , ]. The K-box protein-protein interaction domain, which mediates heterodimerization of MIKC-type MADS proteins, contains several heptad repeats in which the first and the fourth positions are occupied by hydrophobic amino acids, suggesting that the K-box domain forms three amphipathic α-helices referred to as K1, K2, and K3 [ ].
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom