Search results 1001 to 1100 out of 38750 for *

Category restricted to ProteinDomain (x)

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Cytokinin dehydrogenase, C-terminal domain superfamily
Type: Homologous_superfamily
Description: This domain superfamily is found towards the C terminus of cytokinin dehydrogenase and vanillyl-alcohol oxidase.
Protein Domain
Name: Cytokinin dehydrogenase 1, FAD/cytokinin binding domain
Type: Domain
Description: This domain adopts an alpha+beta sandwich structure with an antiparallel β-sheet, in a ferredoxin-like fold. It is predominantly found in plant cytokinin dehydrogenase 1, where it is capable of binding both FAD and cytokinin substrates. The substrate displays a 'plug-into-socket' binding mode that seals the catalytic site and precisely positions the carbon atom undergoing oxidation in close contact with the reactive locus of the flavin.
Protein Domain
Name: FAD-binding domain, PCMH-type
Type: Domain
Description: Flavoenzymes catalyse a wide range of biochemical reactions. They are involved in the dehydrogenation of a variety of metabolites, in electron transfer from and to redox centres, in light emission, and in the activation of oxygen for oxidation and hydroxylation reactions. About 1% of all eukaryotic and prokaryotic proteins are predicted to contain a flavin adenine dinucleotide (FAD)-binding domain. According to structural similarities and conserved sequence motifs, FAD-binding domains have been grouped in three main families: (i) the ferredoxin reductase (FR)-type FAD-binding domain, (ii) the FAD-binding domains that adopt a Rossmann fold and (iii) the p-cresol methylhydroxylase (PCMH)-type FAD-binding domain. The FAD cofactor consists of adenosine monophosphate (AMP) linked to flavin mononucleotide (FMN) by a pyrophosphate bond. The AMP moiety is composed of the adenine ring bonded to a ribose that is linked to a phosphate group. The FMN moiety is composed of the isoalloxazine-flavin ring linked to a ribitol, which is connected to a phosphate group. The flavin functions mainly in a redox capacity, being able to take up two electrons from one substrate and release them two at a time to a substrate or coenzyme, or one at a time to an electron acceptor. The catalytic function of the FAD is concentrated in the isoalloxazine ring, whereas the ribityl phosphate and the AMP moiety mainly stabilise cofactor binding to protein residues. The PCMH-type FAD-binding domain consists of two α-β subdomains: one is composed of three parallel β-strands (B1-B3) surrounded by α-helices, and is packed against the second subdomain containing five antiparallel β-strands (B4-B8) surrounded by α-helices. The two subdomains accommodate the FAD cofactor between them. In the PCMH proteins the coenzyme FAD is also covalently attached to a tyrosine located outside the FAD-binding domain in the C-terminal catalytic domain. This domain is found in: FAD-linked oxidases (N-terminal domain), such as vanillyl-alcohol oxidase, the flavoprotein subunit of p-cresol methylhydroxylase, D-lactate dehydrogenases (cytochrome), cholesterol oxidases, and cytokinin dehydrogenase 1; uridine diphospho-N-acetylenolpyruvylglucosamine reductase (MurB) (N-terminal domain); and the CO dehydrogenase flavoprotein (N-terminal domain) family, which includes xanthine oxidase (domain 3), subunit A of xanthine dehydrogenase (domain 3), the medium subunit of quinoline 2-oxidoreductase (QorM), and the beta-subunit of 4-hydroxybenzoyl-CoA reductase (HrcB) (N-terminal domain).
Protein Domain
Name: FAD linked oxidase, N-terminal
Type: Domain
Description: Various enzymes use FAD as a cofactor. Most of these enzymes are oxygen-dependent oxidoreductases containing a covalently bound FAD group, which is attached to a histidine via an 8-alpha-(N3-histidyl)-riboflavin linkage. One of these enzymes, vanillyl-alcohol oxidase (VAO), has a solved structure; the alignment includes the FAD-binding site, called the PP-loop, between residues 99-110. The FAD molecule is covalently bound in the known structure; however, the residue that links to the FAD is not in the alignment. VAO catalyses the oxidation of a wide variety of substrates, ranging from aromatic amines to 4-alkylphenols. Other enzymes included in this family are the MurB family members, UDP-N-acetylenolpyruvoylglucosamine reductases involved in the biosynthesis of peptidoglycan, and D-lactate dehydrogenases, among many other oxidoreductases.
Protein Domain
Name: FAD-linked oxidase-like, C-terminal
Type: Homologous_superfamily
Description: This superfamily represents a structural domain found at the C terminus of several FAD-linked oxidases. This domain consists of two structural subdomains: subdomain 1 is a 2-layer a/b or 3-layer a/b/a sandwich, and subdomain 2 is either an orthogonal α-bundle or a second 2-layer a/b sandwich. It can be found in the following proteins: the vanillyl-alcohol family, which includes the flavoprotein vanillyl-alcohol oxidase as well as the flavoprotein subunit of p-cresol methylhydroxylase (the other subunit being a short-chain cytochrome c), both of which contain covalently bound FAD; D-lactate dehydrogenases (cytochrome), peripheral membrane respiratory enzymes involved in electron transfer for the energization of active transport of a variety of sugars and amino acids in bacteria; cholesterol oxidases, monomeric flavoenzymes that catalyse the oxidation and isomerisation of cholesterol to cholest-4-en-3-one; and cytokinin dehydrogenase 1, which has a major role in the control of plant cytokinin hormone levels by catalysing their irreversible oxidation.
Protein Domain
Name: FAD-binding, type PCMH, subdomain 2
Type: Homologous_superfamily
Description: According to structural similarities and conserved sequence motifs, FAD-binding domains have been grouped in three main families: (i) the ferredoxin reductase (FR)-type FAD-binding domain, (ii) the FAD-binding domains that adopt a Rossmann fold and (iii) the p-cresol methylhydroxylase (PCMH)-type FAD-binding domain. The PCMH-type FAD-binding domain consists of two α-β subdomains: one is composed of three parallel β-strands (B1-B3) surrounded by α-helices, and is packed against the second subdomain containing five antiparallel β-strands (B4-B8) surrounded by α-helices. The two subdomains accommodate the FAD cofactor between them. This superfamily represents the second (C-terminal) subdomain, which is found in: the CO dehydrogenase flavoprotein (N-terminal domain) family, which includes xanthine oxidase (domain 3), subunit A of xanthine dehydrogenase (domain 3), the medium subunit of quinoline 2-oxidoreductase (QorM), and the beta-subunit of 4-hydroxybenzoyl-CoA reductase (HrcB) (N-terminal domain); and uridine diphospho-N-acetylenolpyruvylglucosamine reductase (MurB) (N-terminal domain).
Protein Domain
Name: PSP, proline-rich
Type: Domain
Description: PSP is a proline-rich domain of unknown function found in spliceosome associated proteins.
Protein Domain
Name: Timeless, C-terminal
Type: Domain
Description: This entry represents the C-terminal domain found in the Timeless (TIM) proteins. This domain can be found in TIM homologues, mostly from animals. In hTIM, this domain has been shown to bind to the PARP-1 catalytic domain. The timeless gene in Drosophila melanogaster is involved in circadian rhythm control. Drosophila contains two paralogs, dTIM and dTIM2, acting in clock/photoreception and chromosome integrity/photoreception respectively. The mammalian TIMELESS (TIM) protein, originally identified based on its similarity to Drosophila dTIM, interacts with the clock proteins dCRY and dPER and is essential for circadian rhythm generation and photo-entrainment in the fly. However, phylogenetic sequence analysis has demonstrated that dTIM2 is likely to be the orthologue of mammalian TIM and other widely conserved TIM-like proteins in eukaryotes. These proteins include Saccharomyces cerevisiae Tof1, Schizosaccharomyces pombe Swi1, and Caenorhabditis elegans TIM. These proteins are not involved in the core clock mechanism, but instead play important roles in chromosome integrity, efficient cell growth and/or development, with the exception of dTIM2, which has an additional function in retinal photoreception. Saccharomyces cerevisiae Tof1 is a subunit of a replication-pausing checkpoint complex (Tof1-Mrc1-Csm3) that acts at the stalled replication fork to promote sister chromatid cohesion after DNA damage, facilitating gap repair of damaged DNA. Schizosaccharomyces pombe Swi1 and Swi3 form the fork protection complex that coordinates leading- and lagging-strand synthesis and stabilizes stalled replication forks. In humans, Timeless forms a stable complex with its partner protein Tipin. The Timeless-Tipin complex has been reported to travel along with the replication fork during unperturbed DNA replication. Moreover, the Timeless-Tipin-Claspin complex contributes to full activation of the ATR-Chk1 signaling pathway through the recruitment of Chk1 to arrested replication forks for sufficient ATR-mediated phosphorylation. It also interacts with PARP-1, and this interaction is required for efficient homologous recombination repair.
Protein Domain
Name: Timeless, N-terminal
Type: Domain
Description: This entry represents the N-terminal domain of the Timeless protein. The timeless gene in Drosophila melanogaster is involved in circadian rhythm control. Drosophila contains two paralogs, dTIM and dTIM2, acting in clock/photoreception and chromosome integrity/photoreception respectively. The mammalian TIMELESS (TIM) protein, originally identified based on its similarity to Drosophila dTIM, interacts with the clock proteins dCRY and dPER and is essential for circadian rhythm generation and photo-entrainment in the fly. However, phylogenetic sequence analysis has demonstrated that dTIM2 is likely to be the orthologue of mammalian TIM and other widely conserved TIM-like proteins in eukaryotes. These proteins include Saccharomyces cerevisiae Tof1, Schizosaccharomyces pombe Swi1, and Caenorhabditis elegans TIM. These proteins are not involved in the core clock mechanism, but instead play important roles in chromosome integrity, efficient cell growth and/or development, with the exception of dTIM2, which has an additional function in retinal photoreception. Saccharomyces cerevisiae Tof1 is a subunit of a replication-pausing checkpoint complex (Tof1-Mrc1-Csm3) that acts at the stalled replication fork to promote sister chromatid cohesion after DNA damage, facilitating gap repair of damaged DNA. Schizosaccharomyces pombe Swi1 and Swi3 form the fork protection complex that coordinates leading- and lagging-strand synthesis and stabilizes stalled replication forks. In humans, Timeless forms a stable complex with its partner protein Tipin. The Timeless-Tipin complex has been reported to travel along with the replication fork during unperturbed DNA replication. Moreover, the Timeless-Tipin-Claspin complex contributes to full activation of the ATR-Chk1 signaling pathway through the recruitment of Chk1 to arrested replication forks for sufficient ATR-mediated phosphorylation. It also interacts with PARP-1, and this interaction is required for efficient homologous recombination repair.
Protein Domain
Name: Pseudouridine synthase II, N-terminal
Type: Domain
Description: Pseudouridine synthases catalyse the isomerisation of uridine to pseudouridine (Psi) in a variety of RNA molecules, and may function as RNA chaperones. Pseudouridine is the most abundant modified nucleotide found in all cellular RNAs. There are four distinct families of pseudouridine synthases that share no global sequence similarity, but which do share the same fold of their catalytic domain(s) and uracil-binding site and are descended from a common molecular ancestor. The catalytic domain consists of two subdomains, each of which has an α+β structure that has some similarity to the ferredoxin-like fold (note: some pseudouridine synthases contain additional domains). The active site is the most conserved structural region of the superfamily and is located between the two homologous domains. These families are: pseudouridine synthase I, TruA; pseudouridine synthase II, TruB, which contains an additional C-terminal PUA domain; pseudouridine synthase RsuA (RluB, RluE and RluF are also part of this family); pseudouridine synthase RluA (TruC, RluC and RluD belong to this family); and pseudouridine synthase TruD, which has a natural circular permutation in the catalytic domain, as well as an insertion of a family-specific α+β subdomain. TruB is responsible for the pseudouridine residue present in the T loops of virtually all tRNAs. TruB recognises the preformed 3-D structure of the T loop primarily through shape complementarity. It accesses its substrate uridyl residue by flipping out the nucleotide and disrupts the tertiary structure of tRNA. This entry represents an N-terminal domain found in pseudouridine synthase TruB, as well as in Cbf5p, which modifies rRNA.
Protein Domain
Name: tRNA pseudouridine synthase II, TruB
Type: Family
Description: Pseudouridine synthases catalyse the isomerisation of uridine to pseudouridine (Psi) in a variety of RNA molecules, and may function as RNA chaperones. Pseudouridine is the most abundant modified nucleotide found in all cellular RNAs. There are four distinct families of pseudouridine synthases that share no global sequence similarity, but which do share the same fold of their catalytic domain(s) and uracil-binding site and are descended from a common molecular ancestor. The catalytic domain consists of two subdomains, each of which has an α+β structure that has some similarity to the ferredoxin-like fold (note: some pseudouridine synthases contain additional domains). The active site is the most conserved structural region of the superfamily and is located between the two homologous domains. These families are: pseudouridine synthase I, TruA; pseudouridine synthase II, TruB, which contains an additional C-terminal PUA domain; pseudouridine synthase RsuA (RluB, RluE and RluF are also part of this family); pseudouridine synthase RluA (TruC, RluC and RluD belong to this family); and pseudouridine synthase TruD, which has a natural circular permutation in the catalytic domain, as well as an insertion of a family-specific α+β subdomain. TruB is responsible for the pseudouridine residue present in the T loops of virtually all tRNAs. TruB recognises the preformed 3-D structure of the T loop primarily through shape complementarity. It accesses its substrate uridyl residue by flipping out the nucleotide and disrupts the tertiary structure of tRNA. This model is built on a seed alignment of bacterial proteins only. Saccharomyces cerevisiae protein YNL292w (Pus4) has been shown to be the pseudouridine 55 synthase of both cytosolic and mitochondrial compartments, active at no other position on tRNA and the only enzyme active at that position in the species. A distinct yeast protein, YLR175w (centromere/microtubule-binding protein CBF5), is an rRNA pseudouridine synthase, and the archaeal set is much more similar to CBF5 than to Pus4. It is unclear whether the archaeal proteins found by this model are tRNA pseudouridine 55 synthases like TruB, rRNA pseudouridine synthases like CBF5, or (as suggested by the absence of paralogs in the Archaea) both. CBF5 likely has additional, eukaryotic-specific functions.
Protein Domain
Name: Peptidase M24
Type: Domain
Description: This entry contains proteins that belong to MEROPS peptidase family M24 (clan MG), which share a common structural fold, the "pita-bread" fold. The fold contains both alpha helices and an anti-parallel beta sheet within two structurally similar domains that are thought to be derived from an ancient gene duplication. The active site, where conserved, is located between the two domains. The fold is common to methionine aminopeptidase, aminopeptidase P, prolidase, agropine synthase and creatinase. Though many of these peptidases require a divalent cation, creatinase is not a metal-dependent enzyme. The entry also contains proteins that have lost catalytic activity, for example Spt16, which is a component of the FACT complex. The crystal structure of the N-terminal domain of Spt16, determined to 2.1 Å, reveals an aminopeptidase P fold whose enzymatic activity has been lost. This fold binds directly to histones H3-H4 through an interaction with their globular core domains, as well as with their N-terminal tails. The FACT complex is a stable heterodimer in Saccharomyces cerevisiae (Baker's yeast) comprising Spt16p and Pob3p. The complex plays a role in transcription initiation and promotes binding of TATA-binding protein (TBP) to a TATA box in chromatin; it also facilitates RNA Polymerase II transcription elongation through nucleosomes by destabilising and then reassembling nucleosome structure.
Protein Domain
Name: Peptidase M24, methionine aminopeptidase
Type: Family
Description: Over 70 metallopeptidase families have been identified to date. In these enzymes a divalent cation, which is usually zinc but may be cobalt, manganese or copper, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. In some families of co-catalytic metallopeptidases, two metal ions are observed in crystal structures ligated by five amino acids, with one amino acid ligating both metal ions. The known metal ligands are His, Glu, Asp or Lys. At least one other residue is required for catalysis, which may play an electrophilic role. Many metalloproteases contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. This group of metallopeptidases belongs to MEROPS peptidase family M24 (clan MG), subfamilies M24A and M24B. Methionine aminopeptidase (MAP) is responsible for the removal of the amino-terminal (initiator) methionine from nascent eukaryotic cytosolic and cytoplasmic prokaryotic proteins if the penultimate amino acid is small and uncharged. All MAPs studied to date are monomeric proteins that require cobalt ions for activity. Two subfamilies of MAP enzymes are known to exist. While evolutionarily related, they share only a limited amount of sequence similarity, mostly clustered around the residues shown to be involved in cobalt binding. The first subfamily consists of enzymes from prokaryotes as well as eukaryotic MAP-1, while the second is made up of archaeal MAP and eukaryotic MAP-2. The second subfamily also includes proteins that do not seem to be MAPs but are clearly evolutionarily related, such as mouse proliferation-associated protein 1 and fission yeast curved DNA-binding protein.
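The two forms of the motif described in this entry can be written as regular expressions. The following is a minimal Python sketch, not part of the database record: the residue sets chosen for 'a', 'b' and 'c' are assumptions made for illustration (the description only says 'most often valine or threonine', 'uncharged' and 'hydrophobic'), and the function name find_motifs is hypothetical.

```python
import re

# Loose form of the motif: H, E, any two residues, H.
HEXXH = re.compile(r"HE..H")

# Stricter 'abXHEbbHbc' form. The residue classes below are assumptions:
A = "[VT]"                # 'a': most often valine or threonine
B = "[ACFGILMNPQSTVWY]"   # 'b': uncharged (here: excludes D, E, K, R, H)
C = "[AFILMVWY]"          # 'c': hydrophobic (one possible definition)
STRICT = re.compile(f"{A}{B}.HE{B}{B}H{B}{C}")


def find_motifs(sequence):
    """Return (start, matched_text, is_strict) for each non-overlapping HEXXH hit."""
    hits = []
    for m in HEXXH.finditer(sequence):
        # The strict motif begins three residues before the first H of HEXXH.
        start = m.start() - 3
        is_strict = start >= 0 and STRICT.match(sequence, start) is not None
        hits.append((m.start(), m.group(), is_strict))
    return hits


# Made-up test peptides, used only to exercise the patterns.
print(find_motifs("GIDVVAHELTHAVTDY"))
print(find_motifs("GGAAAHEDKHAAGG"))
```

A hit that satisfies only the loose pattern is weak evidence on its own, since the description notes that HEXXH is relatively common.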
Protein Domain
Name: Peptidase M24A, methionine aminopeptidase, subfamily 2
Type: Family
Description: This group of metallopeptidases belongs to MEROPS peptidase family M24 (clan MG), subfamily M24A. Methionine aminopeptidase (MAP) catalyses the hydrolytic cleavage of the N-terminal methionine from newly synthesised polypeptides if the penultimate amino acid is small, with different tolerance to Val and Thr at this position. All MAPs studied to date are monomeric proteins that require cobalt ions for activity. Two subfamilies of MAP enzymes are known to exist. While evolutionarily related, they share only a limited amount of sequence similarity, mostly clustered around the residues shown, in the Escherichia coli MAP, to be involved in cobalt binding. The first subfamily consists of enzymes from prokaryotes as well as eukaryotic MAP-1, while the second is made up of archaeal MAP and eukaryotic MAP-2 and includes proteins that do not seem to be MAPs but are clearly evolutionarily related, such as mouse proliferation-associated protein 1 and fission yeast curved DNA-binding protein.
Protein Domain
Name: Peptidase M24A, methionine aminopeptidase, subfamily 2, binding site
Type: Binding_site
Description: Over 70 metallopeptidase families have been identified to date. In these enzymes a divalent cation, which is usually zinc but may be cobalt, manganese or copper, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. In some families of co-catalytic metallopeptidases, two metal ions are observed in crystal structures ligated by five amino acids, with one amino acid ligating both metal ions. The known metal ligands are His, Glu, Asp or Lys. At least one other residue is required for catalysis, which may play an electrophilic role. Many metalloproteases contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. This group of metallopeptidases belongs to MEROPS peptidase family M24 (clan MG), subfamily M24A. Methionine aminopeptidase (MAP) is responsible for the removal of the amino-terminal (initiator) methionine from nascent eukaryotic cytosolic and cytoplasmic prokaryotic proteins if the penultimate amino acid is small and uncharged. All MAPs studied to date are monomeric proteins that require cobalt ions for activity. Two subfamilies of MAP enzymes are known to exist. While evolutionarily related, they share only a limited amount of sequence similarity, mostly clustered around the residues shown, in the Escherichia coli MAP, to be involved in cobalt binding. The first subfamily consists of enzymes from prokaryotes as well as eukaryotic MAP-1, while the second is made up of archaeal MAP and eukaryotic MAP-2 and includes proteins that do not seem to be MAPs but are clearly evolutionarily related, such as mouse proliferation-associated protein 1 and fission yeast curved DNA-binding protein. This entry represents a cobalt-binding site.
Protein Domain
Name: Translation initiation factor IF2/IF5, N-terminal
Type: Homologous_superfamily
Description: The beta subunit of archaeal and eukaryotic translation initiation factor 2 (IF2beta) and the N-terminal domain of translation initiation factor 5 (IF5) show significant sequence homology. Archaeal IF2beta contains two independent structural domains: an N-terminal mixed alpha/beta core domain (topological similarity to the common core of ribosomal proteins L23 and L15e), and a C-terminal domain consisting of a zinc-binding C4 finger. Archaeal IF2beta is a ribosome-dependent GTPase that stimulates the binding of initiator Met-tRNA(i)(Met) to the ribosomes, even in the absence of other factors. The C-terminal domain of eukaryotic IF5 is involved in the formation of the multi-factor complex (MFC), an important intermediate for the 43S pre-initiation complex assembly. IF5 interacts directly with IF1, IF2beta and IF3c, which together with IF2-bound Met-tRNA(i)(Met) form the MFC. This superfamily represents the N-terminal alpha/beta domain found in IF2beta and IF5.
Protein Domain
Name: Translation initiation factor IF2/IF5 domain
Type: Domain
Description: The beta subunit of archaeal and eukaryotic translation initiation factor 2 (IF2beta) and the N-terminal domain of translation initiation factor 5 (IF5) show significant sequence homology. Archaeal IF2beta contains two independent structural domains: an N-terminal mixed alpha/beta core domain (topological similarity to the common core of ribosomal proteins L23 and L15e), and a C-terminal domain consisting of a zinc-binding C4 finger. Archaeal IF2beta is a ribosome-dependent GTPase that stimulates the binding of initiator Met-tRNA(i)(Met) to the ribosomes, even in the absence of other factors. The C-terminal domain of eukaryotic IF5 is involved in the formation of the multi-factor complex (MFC), an important intermediate for the 43S pre-initiation complex assembly. IF5 interacts directly with IF1, IF2beta and IF3c, which together with IF2-bound Met-tRNA(i)(Met) form the MFC. This entry represents both the N-terminal and zinc-binding domains of IF2, as well as a domain in IF5.
Protein Domain
Name: W2 domain
Type: Domain
Description: Translation initiation is a sophisticated, well-regulated and highly coordinated cellular process in eukaryotes, in which at least 11 eukaryotic initiation factors (eIFs) are involved. The W2 domain (named for its two invariant tryptophans) is a region of ~165 amino acids found at the C terminus of the following eIFs: eukaryotic translation initiation factor 2B epsilon (eIF-2B-epsilon); eukaryotic translation initiation factor 4 gamma (eIF-4-gamma); and eukaryotic translation initiation factor 5 (eIF-5), a GTPase-activating protein (GAP) specific for eIF2. The W2 domain has a globular fold and is composed exclusively of alpha-helices. The structure can be divided into a structural C-terminal core onto which the two N-terminal helices are attached. The core contains two aromatic/acidic residue-rich regions (AA boxes), which are important for mediating protein-protein interactions.
Protein Domain
Name: Translation initiation factor IF2/IF5, zinc-binding
Type: Homologous_superfamily
Description: The beta subunit of archaeal and eukaryotic translation initiation factor 2 (IF2beta) and the N-terminal domain of translation initiation factor 5 (IF5) show significant sequence homology. Archaeal IF2beta contains two independent structural domains: an N-terminal mixed alpha/beta core domain (topological similarity to the common core of ribosomal proteins L23 and L15e), and a C-terminal domain consisting of a zinc-binding C4 finger. Archaeal IF2beta is a ribosome-dependent GTPase that stimulates the binding of initiator Met-tRNA(i)(Met) to the ribosomes, even in the absence of other factors. The C-terminal domain of eukaryotic IF5 is involved in the formation of the multi-factor complex (MFC), an important intermediate for the 43S pre-initiation complex assembly. IF5 interacts directly with IF1, IF2beta and IF3c, which together with IF2-bound Met-tRNA(i)(Met) form the MFC. This entry represents the zinc-binding C4 domain with a zinc-bound β-ribbon motif, which is found in IF2beta and IF5.
Protein Domain
Name: ASX, DEUBAD domain
Type: Domain
Description: This entry represents the DEUBiquitinase ADaptor (DEUBAD) domain from Asx and homologues. It contains a characteristic LXXLL motif, which is detected in diverse transcription factors, coactivators and co-repressors and is implicated in mediating interactions between them. This domain interacts with the UCH37-like domain (ULD) from the Calypso and BAP1 deubiquitinases (DUBs), an interaction that is required for activation of DUB activity.
Protein Domain
Name: Zinc finger, GATA-type
Type: Domain
Description: This entry represents GATA-type zinc fingers (Znf). A number of transcription factors (including erythroid-specific transcription factor and nitrogen regulatory proteins) specifically bind the DNA sequence (A/T)GATA(A/G) in the regulatory regions of genes. They are consequently termed GATA-binding transcription factors. The interactions occur via highly conserved Znf domains in which the zinc ion is coordinated by four cysteine residues. NMR studies have shown the core of the Znf to comprise two irregular anti-parallel β-sheets and an α-helix, followed by a long loop to the C-terminal end of the finger. The N-terminal part, which includes the helix, is similar in structure, but not sequence, to the N-terminal zinc module of the glucocorticoid receptor DNA-binding domain. The helix and the loop connecting the two β-sheets interact with the major groove of the DNA, while the C-terminal tail wraps around into the minor groove. It is this tail that is the essential determinant of specific binding. Interactions between the Znf and DNA are mainly hydrophobic, explaining the preponderance of thymines in the binding site; a large number of interactions with the phosphate backbone have also been observed. Two GATA zinc fingers are found in the GATA transcription factors. However, there are several proteins that contain only a single copy of the domain.
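The consensus (A/T)GATA(A/G) quoted in this entry maps directly onto a character-class pattern. The sketch below is an illustration only, not part of the database record: the function name find_gata_sites and the test sequence are hypothetical, and the scan covers the given strand only (the reverse complement is not searched).

```python
import re

# (A/T)GATA(A/G) as a regular expression. The lookahead lets the scan
# report sites that overlap each other.
GATA_SITE = re.compile(r"(?=([AT]GATA[AG]))")


def find_gata_sites(dna):
    """Return (0-based start, site) for every consensus match on the given strand."""
    return [(m.start(), m.group(1)) for m in GATA_SITE.finditer(dna.upper())]


# Made-up sequence containing two overlapping consensus sites.
print(find_gata_sites("ccAGATAGATAAcc"))
```

A sequence match indicates only a candidate site; the description above attributes binding specificity to the C-terminal tail of the finger, which a pattern scan does not model.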
Protein Domain
Name: Zinc finger, NHR/GATA-type
Type: Homologous_superfamily
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents a zinc finger motif found in nuclear hormone receptors and in erythroid transcription factor GATA-1. Nuclear hormone receptors usually have two copies of this motif, while GATA-1 has one copy. The zinc fingers in nuclear receptors are generally regarded as DNA-binding domains, while those in GATA-1 have been implicated in protein recognition (of FOG proteins).
Protein Domain
Name: Bax inhibitor 1-related
Type: Family
Description: Bax inhibitor-1 (BI-1) is a suppressor of apoptosis that interacts with BCL2 and BCL-X. Human Bax BI-1 is an evolutionarily conserved integral membrane protein localised to the ER membrane, with 6-7 membrane-spanning domains. The C termini of the mammalian BI-1 proteins are composed of basic amino acids resembling some nuclear targeting sequences, but otherwise the predicted proteins lack motifs that suggest this function. BI-1 also regulates cell death triggered by ER stress. BI-1 appears to exert its effect through an interaction with calmodulin. The crystal structure of a bacterial member reveals that these proteins mediate a calcium leak across the membrane in a pH-dependent manner. Calcium homoeostasis balances passive calcium leak with active calcium uptake. The structure exists in a pore-closed and a pore-open conformation, at pH 8 and pH 6 respectively. This entry represents BI-1 and related sequences, including lifeguard proteins, which resemble BI-1 and also act as apoptotic regulators.
Protein Domain
Name: S-locus glycoprotein domain
Type: Domain
Description: In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles [ ]. Most of the proteins within this family contain an apple-like domain (), which is predicted to possess protein- and/or carbohydrate-binding functions.
Protein Domain
Name: Bulb-type lectin domain
Type: Domain
Description: A bulb lectin super-family (Amaryllidaceae, Orchidaceae and Alliaceae) contains a ~115-residue-long domain whose overall three-dimensional fold is very similar to that of Dictyostelium discoideum comitin, an actin-binding protein, and Curculigo latifolia curculin, a sweet-tasting and taste-modifying protein [ , , ]. This domain generally binds mannose, but in at least one protein, curculin, it is apparently devoid of mannose-binding activity. Each bulb-type lectin domain consists of three sequential β-sheet subdomains (I, II, III) that are inter-related by pseudo three-fold symmetry. The three subdomains are flat four-stranded, antiparallel β-sheets. Together they form a 12-stranded β-barrel in which the barrel axis coincides with the pseudo three-fold axis.
Protein Domain
Name: Invasin/intimin cell-adhesion fragments
Type: Homologous_superfamily
Description: Two types of pathogenic Escherichia coli, enteropathogenic E. coli (EPEC) and enterohemorrhagic E. coli (EHEC), cause diarrhoeal disease by disrupting the intestinal environment through the intimate attachment of the bacteria to the intestinal epithelium. This process is mediated by intimin, an outer membrane protein that is homologous to the invasins of pathogenic Yersinia. EPEC and EHEC form characteristic lesions on infected mammalian cells called actin pedestals. Each of these two pathogens injects its own translocated intimin receptor (Tir) molecule into the plasma membranes of host cells. Interaction of translocated Tir with the bacterial outer membrane protein intimin is required to trigger the assembly of actin into focused pedestals beneath bound bacteria [ ].
Protein Domain
Name: Bacterial Ig-like domain, group 2
Type: Domain
Description: The Ig-like fold is part of proteins with important roles in different physiological processes [ ]. This entry represents the bacterial Ig-like domain (Big2). This domain is mainly found in a variety of bacterial and phage surface proteins such as intimins, but has also been found in several eukaryotic proteins []. Intimin (Eae protein) is a bacterial cell-adhesion molecule that mediates the intimate bacterial host-cell interaction. It contains three domains: two immunoglobulin-like domains and a C-type lectin-like module, implying that carbohydrate recognition may be important in intimin-mediated cell adhesion [ , ]. The structure of this domain was also described in the tail tube protein (TTP) of phage lambda. This protein assembles in hexameric rings that stack on top of each other [ , ]. Nuclear pore membrane glycoprotein 210 from humans (POM210) also belongs to this group of proteins. This nucleoporin is essential for nuclear pore assembly and fusion, nuclear pore spacing, as well as structural integrity [ ].
Protein Domain
Name: GPI transamidase component Gaa1
Type: Family
Description: GPI (glycosylphosphatidylinositol) transamidase is a multiprotein complex required for a terminal step in attaching the GPI anchor to proteins. Gpi16, Gpi8 and Gaa1 form a sub-complex of the GPI transamidase.
Protein Domain
Name: Armadillo
Type: Repeat
Description: The armadillo (Arm) repeat is an approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila melanogaster segment polarity gene armadillo, which is involved in signal transduction through wingless. Animal Arm-repeat proteins function in various processes, including intracellular signalling and cytoskeletal regulation, and include such proteins as beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumour suppressor protein, and the nuclear transport factor importin-alpha, amongst others [ ]. A subset of these proteins is conserved across eukaryotic kingdoms. In higher plants, some Arm-repeat proteins function in intracellular signalling like their mammalian counterparts, while others have novel functions []. The 3-dimensional fold of an armadillo repeat is known from the crystal structure of beta-catenin, where the 12 repeats form a superhelix of alpha helices with three helices per unit [ ]. The cylindrical structure features a positively charged groove, which presumably interacts with the acidic surfaces of the known interaction partners of beta-catenin.
Protein Domain
Name: RNA-binding, CRM domain
Type: Domain
Description: The CRM domain is an ~100-amino acid RNA-binding domain. The name chloroplast RNA splicing and ribosome maturation (CRM) has been suggested to reflect the functions established for the four characterised members of the family: Zea mays (Maize) CRS1 ( ), CAF1 ( ) and CAF2 ( ) proteins and the Escherichia coli protein YhbY ( ). The CRM domain is found in eubacteria, archaea, and plants. The CRM domain is represented as a stand-alone protein in archaea and bacteria, and in single- and multi-domain proteins in plants. It has been suggested that prokaryotic CRM proteins existed as ribosome-associated proteins prior to the divergence of archaea and bacteria, and that they were co-opted in the plant lineage as RNA binding modules by incorporation into diverse protein contexts. Plant CRM domains are predicted to reside not only in the chloroplast, but also in the mitochondrion and the nucleo/cytoplasmic compartment. The diversity of the CRM domain family in plants suggests a diverse set of RNA targets [, ]. The CRM domain is a compact alpha/beta domain consisting of a four-stranded beta sheet and three alpha helices with an α-β-α-β-α-β-β topology. The beta sheet face is basic, consistent with a role in RNA binding. Proximal to the basic beta sheet face is another moiety that could contribute to nucleic acid recognition. Connecting strand beta1 and helix alpha2 is a loop with a six amino acid motif, GxxG flanked by large aliphatic residues, within which one 'x' is typically a basic residue [ ]. Escherichia coli YhbY is associated with pre-50S ribosomal subunits, which implies a function in ribosome assembly. GFP fused to a single-domain CRM protein from maize localises to the nucleolus, suggesting that an analogous activity may have been retained in plants [ ]. A CRM domain-containing protein in plant chloroplasts has been shown to function in group I and II intron splicing [].
In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes []. YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome [].
Protein Domain
Name: ADC synthase
Type: Homologous_superfamily
Description: This domain superfamily is characteristic of ADC synthases, including: Anthranilate synthase aminodeoxyisochorismate synthase/lyase subunit, TrpE, which catalyses the formation of anthranilate (o-aminobenzoate) and pyruvic acid from chorismate and glutamine [ , ]. P-aminobenzoate synthase component I (Aminodeoxychorismate synthase), which catalyses the two-step biosynthesis of 4-amino-4-deoxychorismate (ADC), a precursor of p-aminobenzoate and folate in microorganisms [ ]. Menaquinone-specific isochorismate synthase MenF, which catalyses the conversion of chorismate to isochorismate [ ]. Salicylate synthetase Irp9, which is involved in the biosynthesis of the siderophore yersiniabactin in Yersinia enterocolitica [ ]. Salicylate synthase MbtI (Mycobactin synthetase protein I), which is involved in salicylate production [ ].
Protein Domain
Name: Mevalonate/galactokinase
Type: Family
Description: Mevalonate kinase ( ) and galactokinases ( ) belong to this family. Mevalonate kinase may be a regulatory site in the cholesterol biosynthetic pathway. It is also involved in mevalonate catabolism. Galactokinase takes part in the first reaction of galactose metabolism by converting galactose to galactose-1-phosphate.
Protein Domain
Name: GHMP kinase, C-terminal domain
Type: Domain
Description: This domain is found in homoserine kinases ( ), galactokinases ( ) and mevalonate kinases ( ). These kinases make up the GHMP kinase superfamily of ATP-dependent enzymes [ ]. These enzymes are involved in the biosynthesis of isoprenes and amino acids as well as in carbohydrate metabolism. The C-terminal domain of homoserine kinase has a central α-β plait fold and an insertion of four helices, which, together with the N-terminal fold, create a novel nucleotide binding fold [].
Protein Domain
Name: GHMP kinase N-terminal domain
Type: Domain
Description: The galacto- ( ), homoserine ( ), mevalonate ( ) and phosphomevalonate ( ) kinases contain, in their N-terminal section, a conserved domain with a Gly/Ser-rich region which is probably involved in the binding of ATP [ , ]. This group of kinases has been called 'GHMP' (from the first letter of their substrates). This domain is also found in diphosphomevalonate decarboxylases, which are structurally related members of the GHMP superfamily [ ], but do not possess kinase activity.
Protein Domain
Name: Allene oxide cyclase
Type: Family
Description: This family consists of several plant specific allene oxide cyclase proteins ( ). The allene oxide cyclase (AOC)-catalysed step in jasmonate (JA) biosynthesis is important in the wound response of tomato [ ].
Protein Domain
Name: Protein of unknown function UPF0301
Type: Family
Description: This entry describes proteins of unknown function. Proteins in this family include AlgH from Pseudomonas aeruginosa. AlgH is involved in the transcriptional regulation of alginate biosynthesis [ ]. However, there is no evidence for such function in proteins belonging to this family.
Protein Domain
Name: MICOS complex subunit Mic10
Type: Family
Description: Mic10 (also known as MINOS1) is a component of the MICOS complex, a large protein complex of the mitochondrial inner membrane that plays crucial roles in the maintenance of crista junctions, inner membrane architecture, and formation of contact sites to the outer membrane [ ].
Protein Domain
Name: Eukaryotic-type methylenetetrahydrofolate reductase
Type: Family
Description: This entry represents a family that includes methylenetetrahydrofolate reductase. The enzyme activities methylenetetrahydrofolate reductase ( ) and 5,10-methylenetetrahydrofolate reductase (FADH) ( ) differ in that the former (assigned in many eukaryotes) is defined to use NADP+ as an acceptor, while the latter (assigned in many bacteria) is flexible with respect to the acceptor. Both convert 5-methyltetrahydrofolate to 5,10-methylenetetrahydrofolate. From a larger set of proteins assigned as one or the other, this family describes the subset of proteins found in eukaryotes, and currently designated methylenetetrahydrofolate reductase ( ). This protein is an FAD-containing flavoprotein.
Protein Domain
Name: Methylenetetrahydrofolate reductase-like
Type: Family
Description: This represents the catalytic domain of 5,10-methylenetetrahydrofolate reductase from prokaryotes and methylenetetrahydrofolate reductase (MTHFR) from eukaryotes ( ). Both generate 5-methyltetrahydrofolate from 5,10-methylenetetrahydrofolate. Mammalian and yeast MTHFRs are homodimers in which each subunit contains an N-terminal catalytic domain, and a C-terminal regulatory domain to which the allosteric inhibitor adenosylmethionine binds [ ]. NADPH is the preferred reductant. In humans, there are several clinically significant mutations in MTHFR that result in hyperhomocysteinemia, which is a risk factor for the development of cardiovascular disease []. The bacterial enzyme is a homotetramer. MTHFRs of enteric bacteria comprise shorter chains around 300 residues in length. Their sequences can be aligned with the N-terminal catalytic domains of the eukaryotic MTHFRs [ ]. Escherichia coli MTHFR, like plant MTHFRs, prefers NADH as the source of reducing equivalents []. The structure of E. coli MTHFR is known to be a TIM barrel [].
Protein Domain
Name: Deoxynucleoside kinase
Type: Family
Description: This family consists of various deoxynucleoside kinases including cytidine ( ), guanosine ( ), adenosine ( ) and thymidine kinase ( ), which also phosphorylates deoxyuridine and deoxycytidine. These enzymes catalyse the production of deoxynucleotide 5'-monophosphate from a deoxynucleoside, using ATP and yielding ADP in the process.
Protein Domain
Name: NADP-dependent oxidoreductase domain
Type: Domain
Description: The aldo-keto reductase family includes a number of related monomeric NADPH-dependent oxidoreductases, such as aldehyde reductase, aldose reductase, prostaglandin F synthase, xylose reductase, rho crystallin, and many others [ ]. All possess a similar structure, with a β-α-β fold characteristic of nucleotide binding proteins []. The fold comprises a parallel β-8/α-8-barrel, which contains a novel NADP-binding motif. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the β-sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones [ ]. Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases [ ]. Some proteins of this entry contain a K+ ion channel beta chain regulatory domain; these are reported to have oxidoreductase activity [ ]. This entry represents the NADP-dependent oxidoreductase domain found in these proteins.
Protein Domain
Name: Aldo-keto reductase
Type: Family
Description: In general, the aldo-keto reductase (AKR) protein superfamily members reduce carbonyl substrates such as sugar aldehydes, keto-steroids, keto-prostaglandins, retinals, quinones, and lipid peroxidation by-products [ , ]. However, there are some exceptions, such as the reduction of steroid double bonds catalysed by AKR1D enzymes (5beta-reductases) and the oxidation of proximate carcinogen trans-dihydrodiol polycyclic aromatic hydrocarbons, while the beta-subunits of potassium gated ion channels (AKR6 family) control Kv channel opening []. Structurally, they contain an (alpha/beta)8-barrel motif, display large loops at the back of the barrel which govern substrate specificity, and have a conserved cofactor binding domain. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones []. They catalyse an ordered bi bi kinetic mechanism in which NAD(P)H cofactor binds first and leaves last []. Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases [].
Protein Domain
Name: Aldo/keto reductase, conserved site
Type: Conserved_site
Description: The aldo-keto reductase family includes a number of related monomeric NADPH-dependent oxidoreductases, such as aldehyde reductase, aldose reductase, prostaglandin F synthase, xylose reductase, rho crystallin, and many others [ ]. All possess a similar structure, with a β-α-β fold characteristic of nucleotide binding proteins [ ]. The fold comprises a parallel beta-8/alpha-8-barrel, which contains a novel NADP-binding motif. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones []. Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases []. Some proteins of this entry contain a K+ ion channel beta chain regulatory domain; these are reported to have oxidoreductase activity [ ].
Protein Domain
Name: Actin family
Type: Family
Description: Actin [ , ] is a ubiquitous protein involved in the formation of filaments that are major components of the cytoskeleton. These filaments interact with myosin to produce a sliding effect, which is the basis of muscular contraction and many aspects of cell motility, including cytokinesis. Each actin protomer binds one molecule of ATP and has one high affinity site for either calcium or magnesium ions, as well as several low affinity sites. Actin exists as a monomer in low salt concentrations, but filaments form rapidly as salt concentration rises, with the consequent hydrolysis of ATP. Actin from many sources forms a tight complex with deoxyribonuclease (DNase I) although the significance of this is still unknown. The formation of this complex results in the inhibition of DNase I activity, and actin loses its ability to polymerise. It has been shown that an ATPase domain of actin shares similarity with ATPase domains of hexokinase and hsp70 proteins [ , ]. In vertebrates there are three groups of actin isoforms: alpha, beta and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. In plants there are many isoforms which are probably involved in a variety of functions such as cytoplasmic streaming, cell shape determination, tip growth, graviperception, cell wall deposition, etc. Recently some divergent actin-like proteins have been identified in several species.
These proteins include centractin (actin-RPV) from mammals, fungi (yeast ACT5, Neurospora crassa ro-4) and Pneumocystis carinii, which seems to be a component of a multi-subunit centrosomal complex involved in microtubule-based vesicle motility (this subfamily is known as ARP1); the ARP2 subfamily, which includes chicken ACTL, Saccharomyces cerevisiae ACT2, Drosophila melanogaster 14D and Caenorhabditis elegans actC; the ARP3 subfamily, which includes actin 2 from mammals, Drosophila 66B, yeast ACT4 and Schizosaccharomyces pombe act2; and the ARP4 subfamily, which includes yeast ACT3 and Drosophila 13E.
Protein Domain
Name: Ubiquilin
Type: Family
Description: Ubiquitin [ ] is a protein of seventy-six amino acid residues, found in all eukaryotic cells, whose sequence is extremely well conserved from protozoans to vertebrates. It is widely known as a post-translational tag used to signal a protein's hydrolytic destruction. Other functions for ubiquitin depend on its differential internal isopeptide linkages. In addition, several ubiquitin-like proteins have been discovered from genome-sequencing efforts, other structural studies, and genetic screens. These new data show that proteins with the ubiquitin domain are adaptable, transposable genetic elements, which have been appended to other genes and utilised for many different cellular functions, depending on the ubiquitin-like protein's identity, subcellular location, and method of covalent attachment. The post-translational ligation of proteins to members of the ubiquitin superfamily can signal many different fates for the target protein []. Ubiquitin is a globular protein, the last four C-terminal residues (Leu-Arg-Gly-Gly) extending from the compact structure to form a 'tail' important for its function. The latter is mediated by the covalent conjugation of ubiquitin to target proteins, by an isopeptide linkage between the C-terminal glycine and the epsilon amino group of lysine residues in the target proteins. Ubiquilin is a ubiquitin-like (UBL) protein and has an N-terminal UBL domain and a C-terminal Ub-associated (UBA) domain in its structure.
Protein Domain
Name: Heat shock chaperonin-binding
Type: Domain
Description: This describes a heat shock chaperonin-binding motif found in the stress-inducible phosphoprotein STI1. Both N- and C-termini of STI1 are capable of binding heat shock proteins [ ] and the domain is found both singly and duplicated in other proteins.
Protein Domain
Name: DNA/RNA helicase, ATP-dependent, DEAH-box type, conserved site
Type: Conserved_site
Description: A number of eukaryotic and prokaryotic proteins have been characterised [ , , ] on the basis of their structural similarity. They all seem to be involved in ATP-dependent, nucleic-acid unwinding. There are two subfamilies of such proteins, the D-E-A-D-box and D-E-A-H-box families. Proteins that belong to the subfamily which have His instead of the second Asp are said to be 'D-E-A-H-box' proteins [, , ]. Proteins currently known to belong to this subfamily include yeast PRP2, PRP16, PRP22 and PRP43, involved in various ATP-requiring steps of the pre-mRNA splicing process; fission yeast prh1, which may be involved in pre-mRNA splicing; Drosophila male-less (mle) protein required in males for dosage compensation of X chromosome linked genes; yeast RAD3, a DNA helicase involved in excision repair of DNA damaged by UV light, bulky adducts or cross-linking agents; fission yeast rad15 (rhp3) and mammalian DNA excision repair protein XPD (ERCC-2); yeast CHL1 (or CTF1), which is important for chromosome transmission and normal cell cycle progression in G(2)/M; yeast TPS1, Caenorhabditis elegans hypothetical proteins C06E1.10 and K03H1.2; Poxviruses' early transcription factor 70kDa subunit which acts with RNA polymerase to initiate transcription from early gene promoters; Vaccinia virus putative helicase I8; and Escherichia coli putative RNA helicase hrpA. All these proteins share a number of conserved sequence motifs. Some of them are specific to this family while others are shared by other ATP-binding proteins or by proteins belonging to the helicases 'superfamily' [ ].
Protein Domain
Name: Coatomer delta subunit
Type: Family
Description: This entry represents the delta subunit of the coatomer complex, which is involved in the regulation of intracellular protein trafficking between the endoplasmic reticulum and the Golgi complex [ ]. Proteins synthesised on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer [ ]. While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins []. For example, the coatomer COP1 (coat protein complex 1) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi []. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes []. Activated small guanine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerise, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated with ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta', gamma, delta, epsilon and zeta subunits.
Protein Domain
Name: Protein of unknown function DUF212
Type: Family
Description: No protein in this family has been characterized. This family is believed to be related to the PAP2 family, which includes phosphatases such as type 2 phosphatidic acid phosphatase (PAP2) and haloperoxidases.
Protein Domain
Name: TRAF-like
Type: Homologous_superfamily
Description: The tumour necrosis factor receptor (TNFR) associated factors (TRAFs) act as signal transducers for both TNFRs and interleukin-1/Toll-like receptors. TRAFs function in immunity, embryonic development, stress response and bone metabolism through their induction of cell proliferation, differentiation, and apoptosis [ ]. TRAFs are characterised by two domains: an N-terminal domain containing RING and zinc finger motifs that is essential for the activation of downstream effectors, and a C-terminal TRAF domain that is essential for self-association and receptor interaction []. The TRAF-domain like fold is a β-sandwich consisting of 8 strands in 2 β-sheets and has a circularly permuted Greek-key immunoglobulin-fold topology that contains an extra strand. The substrate-binding domain (SBD) of the SIAH (seven in absentia homologue) family of proteins is structurally highly similar to the TRAF domain. The SIAH SBD interacts with a number of proteins, and is involved in TNF-alpha-mediated NFkappaB activation [ ].
Protein Domain
Name: Serine acetyltransferase, N-terminal
Type: Domain
Description: The N-terminal domain of serine acetyltransferase has a sequence that is conserved in plants [ ] and bacteria [].
Protein Domain
Name: Serine O-acetyltransferase
Type: Family
Description: The biosynthesis of L-cysteine is the predominant way by which inorganic sulphur is incorporated into organic compounds. In this process, the most abundant utilizable source of sulphur, inorganic sulphate, is taken up and reduced to sulphide. Sulphide is used to produce L-cysteine, which serves for protein synthesis or the production of other sulphur-containing organic compounds. Two routes for cysteine biosynthesis in nature have been documented. Serine transacetylase (also known as serine O-acetyltransferase; ) catalyzes steps in pathway I, the activation of L-serine by acetyl-coenzyme A, yielding O-acetyl-L-serine.
Protein Domain
Name: Hexapeptide repeat
Type: Repeat
Description: A variety of bacterial transferases contain a repeat structure composed of tandem repeats of a [LIV]-G-X(4) hexapeptide, which, in the tertiary structure of LpxA (Acyl-[acyl-carrier-protein]-UDP-N-acetylglucosamine O-acyltransferase) [ ], has been shown to form a left-handed parallel β-helix. A number of different transferase protein families contain this repeat, such as the bifunctional protein GlmU, galactoside acetyltransferase-like proteins [], the gamma-class of carbonic anhydrases [], and tetrahydrodipicolinate-N-succinyltransferases (DapD), the latter containing an extra N-terminal 3-helical domain []. It has been shown that most hexapeptide acyltransferases form catalytic trimers with three symmetrical active sites [].
Protein Domain
Name: Hexapeptide transferase, conserved site
Type: Conserved_site
Description: A variety of bacterial transferases contain a repeat structure composed of tandem repeats of a [LIV]-G-X(4) hexapeptide, which, in the tertiary structure of LpxA (UDP N-acetylglucosamine acyltransferase) [], has been shown to form a left-handed parallel beta helix. A number of different transferase protein families contain this repeat, such as galactoside acetyltransferase-like proteins [], the gamma-class of carbonic anhydrases [], and tetrahydrodipicolinate-N-succinyltransferases (DapD), the latter containing an extra N-terminal 3-helical domain []. The signature pattern of this entry represents a fourfold repeat of the [LIV]-G-x(4) hexapeptide.
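The signature described above can be illustrated with a minimal Python sketch. It assumes the pattern is read literally as four consecutive [LIV]-G-x(4) units with no tolerated mismatches; the pattern actually used by the database may be more permissive. The function name find_signature and the toy sequence are hypothetical and serve only as illustration.

```python
import re

# Assumption: one hexapeptide unit is [LIV], then G, then any four residues.
HEXAPEPTIDE = "[LIV]G.{4}"
# Assumption: the signature is exactly four such units in a row.
SIGNATURE = re.compile(f"(?:{HEXAPEPTIDE}){{4}}")

def find_signature(sequence):
    """Return (start, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in SIGNATURE.finditer(sequence.upper())]

# Hypothetical toy sequence, not a real protein.
toy = "MK" + "IGDNVR" + "LGAHSE" + "VGKNCT" + "IGPYAW" + "QQ"
print(find_signature(toy))
```

Because finditer reports only non-overlapping matches, a longer run of repeats is reported as consecutive 24-residue windows rather than as every possible offset.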
Protein Domain
Name: Trimeric LpxA-like superfamily
Type: Homologous_superfamily
Description: This domain superfamily is characterised by trimeric LpxA-like enzymes that display a single-stranded left-handed β-helix fold, composed of tandem repeats of a hexapeptide, as represented by the Bacterial transferase hexapeptide repeat, where the hexapeptide repeats correspond to individual strands. Many bacterial transferases contain this domain. The structures of several proteins with this domain have been determined, including UDP N-acetylglucosamine acyltransferase (LpxA, ) from Escherichia coli, the first enzyme in the lipid A biosynthetic pathway [ ]; galactoside acetyltransferase (GAT, LacA, ) from E. coli, a gene product of the lac operon that may assist cellular detoxification [ ]; gamma-class Archaeon carbonic anhydrase (), a zinc-containing enzyme that catalyses the reversible hydration of carbon dioxide [ ]; tetrahydrodipicolinate-N-succinyltransferase (DapD) from Mycobacterium bovis, an enzyme from the lysine biosynthetic pathway that contains an extra N-terminal 3-helical domain []; and the C-terminal domain of N-acetylglucosamine 1-phosphate uridyltransferase (GlmU, ) from E. coli, a trimeric bifunctional enzyme that catalyses the last two sequential reactions in the de novo biosynthetic pathway for UDP-N-acetylglucosamine, an essential precursor for many biomolecules [ ].
Protein Domain
Name: Cytochrome b561 and DOMON domain-containing protein
Type: Family
Description: This entry represents a group of DOMON domain (named after DOpamine beta-MOnooxygenase N-terminal domain) containing proteins from plants. They also contain a cytochrome b561 domain C-terminal to the DOMON domain. The DOMON domain could bind catecholamines and thereby regulate the function of the cytochrome b561 domain. Proteins in this family may act as catecholamine-responsive trans-membrane electron transporters [ ].
Protein Domain
Name: DOMON domain
Type: Domain
Description: The DOMON domain is a 110-125 residue long domain which has been identified in the physiologically important enzyme dopamine beta-monooxygenase and in several other secreted and transmembrane proteins from both plants and animals. It has been named after DOpamine beta-MOnooxygenase N-terminal domain. The DOMON domain can be found in one to four copies and in association with other domains, such as the Cu-ascorbate dependent monooxygenase domain, the epidermal growth factor domain, the trypsin inhibitor-like domain (TIL), the SEA domain and the Reelin domain [ ]. The DOMON domain may be involved in heme and sugar recognition []. The sequence conservation is predominantly centred around patches of hydrophobic residues. The secondary structure prediction of the DOMON domain points to an all-β-strand fold with seven or eight core strands supported by a buried core of conserved hydrophobic residues. There is a characteristic motif with two small positions (Gly or Ser) corresponding to a conserved turn immediately C-terminal to strand three. It has been proposed that the DOMON domain might form a β-sandwich structure, with the strands distributed into two beta sheets as is seen in many extracellular adhesion domains such as the immunoglobulin, fibronectin type III, cadherin and PKD domains [].
Protein Domain
Name: Recombination protein RecR, conserved site
Type: Conserved_site
Description: The bacterial protein RecR is an important regulator in the RecFOR homologous recombination pathway during DNA repair. It acts with RecF and RecO, forming a complex that facilitates the loading of RecA onto ssDNA. RecR is a zinc metalloprotein consisting of an N-terminal helix-hairpin-helix (HhH) motif, a middle region containing a zinc finger motif and a Toprim domain, and a C-terminal domain comprising a divergent Walker B motif and a C-terminal helix. This conserved site represents the C4-type zinc finger, which contains four strictly conserved cysteine residues that coordinate a zinc ion. Mutations in this region affect bacterial survival, suggesting that it plays an important role, likely in DNA binding.
Protein Domain
Name: SRA1/Sec31
Type: Domain
Description: This domain can be found in several hypothetical mammalian steroid receptor RNA activator proteins. The SRA-RNAs encode stable proteins that are widely expressed and upregulated in breast cancer cell lines. SRA-RNA is a steroid receptor co-activator which acts as a functional RNA. This domain is also found at the C terminus of Sec31, a component of the coat protein complex II (COPII), which promotes the formation of transport vesicles from the endoplasmic reticulum (ER). COPII has two main functions: the physical deformation of the endoplasmic reticulum membrane into vesicles and the selection of cargo molecules.
Protein Domain
Name: Amino acid transporter, transmembrane domain
Type: Domain
Description: This transmembrane domain is found in many amino acid transporters, including UNC-47 and MTR. UNC-47 encodes a vesicular gamma-aminobutyric acid (GABA) transporter (VGAT) and is predicted to have 10 transmembrane domains (UNC47_CAEEL). MTR is an N system amino acid transporter protein involved in methyltryptophan resistance (MTR_NEUCR). Other proteins with this domain include proline transporters and amino acid transporters whose specificity has not yet been identified.
Protein Domain
Name: Signal recognition particle, SRP72 subunit
Type: Family
Description: SRP72 is a core component of the signal recognition particle ribonucleoprotein complex that functions in targeting nascent secretory proteins to the endoplasmic reticulum membrane. SRP72 binds the 7S RNA only in the presence of SRP68.
Protein Domain
Name: Signal recognition particle, SRP72 subunit, RNA-binding
Type: Domain
Description: The signal recognition particle (SRP) is a multimeric protein which, along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome. In eukaryotes this retards its elongation until SRP docks the ribosome-polypeptide complex to the RER membrane via the SR receptor. Eukaryotic SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300-nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane. In archaea, the SRP complex contains 7S RNA like its eukaryotic counterpart, yet includes only two of the six protein subunits found in the eukaryotic complex: SRP19 and SRP54. This entry represents the RNA-binding domain of the SRP72 subunit. This domain is responsible for the binding of SRP72 to the 7S SRP RNA.
Protein Domain
Name: Legume lectin, alpha chain, conserved site
Type: Conserved_site
Description: Legume lectins are one of the largest lectin families, with more than 70 lectins reported. Leguminous plant lectins resemble each other in their physicochemical properties, although they differ in their carbohydrate specificities. They consist of two or four subunits with a relative molecular mass of 30 kDa, and each subunit has one carbohydrate-binding site. The interaction with sugars requires tightly bound calcium and manganese ions. The structural similarities of these lectins are reported by primary structural analyses and X-ray crystallographic studies. X-ray studies have shown that the folding of the polypeptide chains in the region of the carbohydrate-binding sites is also similar, despite differences in the primary sequences. The carbohydrate-binding sites of these lectins consist of two conserved amino acids on beta pleated sheets. One of these loops contains transition metals, calcium and manganese, which keep the amino acid residues of the sugar-binding site at the required positions. Amino acid sequences of this loop play an important role in the carbohydrate-binding specificities of these lectins. These lectins bind either glucose/mannose or galactose. The exact function of legume lectins is not known, but they may be involved in the attachment of nitrogen-fixing bacteria to legumes and in protection against pathogens. Some legume lectins are proteolytically processed to produce two chains, beta (which corresponds to the N-terminal) and alpha (C-terminal). The lectin concanavalin A (conA) from Canavalia ensiformis (jack bean) is exceptional in that the two chains are transposed and ligated (by formation of a new peptide bond). The N terminus of mature conA thus corresponds to that of the alpha chain and the C terminus to that of the beta chain. The signature pattern of the entry is located in the N-terminal region of the alpha chain.
Protein Domain
Name: PAN/Apple domain
Type: Domain
Description: Plasma kallikrein and coagulation factor XI are two related plasma serine proteases that are activated by factor XIIa and share the same domain topology: an N-terminal region that contains four tandem repeats of about 90 amino acids and a C-terminal catalytic domain. The 90 amino-acid repeated domain contains 6 conserved cysteines. It has been shown that three disulfide bonds link the first and sixth, second and fifth, and third and fourth cysteines. The domain can be drawn in the shape of an apple and has accordingly been called the 'apple domain'. The apple domains of plasma prekallikrein are known to mediate its binding to high molecular weight kininogen, while the apple domains of factor XI bind to factor XIIa, platelets, kininogen, factor IX and heparin. The apple domains display some sequence similarity with the N domain of plasminogen/hepatocyte growth factor (HGF) and with some nematode and protozoan proteins. They all belong to the same domain superfamily, which has been called the PAN module. The N domain of hepatocyte growth factor binds to the c-Met receptor and to heparin. The structure of the PAN module of HGF has been solved. It contains a characteristic hairpin-loop structure stabilised by two disulfide bridges; Cys-1 and Cys-6 are not conserved in HGF PAN modules. Apart from the cysteines, there are a number of other conserved positions in the apple domain. This entry represents the PAN domain of the plasma kallikrein/coagulation factor XI subgroup proteins.
Protein Domain
Name: TPMT family
Type: Family
Description: This family consists of thiopurine S-methyltransferase, thiol S-methyltransferase and thiocyanate methyltransferase. Thiopurine S-methyltransferase is a cytosolic enzyme that catalyses S-methylation of aromatic and heterocyclic sulphydryl compounds, including anticancer and immunosuppressive thiopurines. Thiopurine is commonly used to suppress the immune system in autoimmune disease and inflammatory bowel disease, and to prevent transplanted organ rejection. Thiopurine drugs can be toxic to cells but their effect can be mediated by the incorporation of thioguanine nucleotides into DNA. Thiocyanate methyltransferase and thiol S-methyltransferase are involved in glucosinolate metabolism and defense against phytopathogens. Thiocyanate methyltransferase is highly reactive to thiocyanate (NCS-) derived from myrosinase-mediated hydrolysis of glucosinolates upon tissue damage.
Protein Domain
Name: Protein BPS1, chloroplastic
Type: Family
Description: This family includes BPS1 (Protein BYPASS 1) from plants, a protein required for normal root and shoot development. It prevents constitutive production of a root-mobile carotenoid-derived signalling compound that is capable of arresting shoot and leaf development.
Protein Domain
Name: Glyoxalase-like domain
Type: Domain
Description: This domain is related to the glyoxalase domain.
Protein Domain
Name: Guanylate kinase-like domain
Type: Domain
Description: Guanylate kinase (GK) catalyzes the ATP-dependent phosphorylation of GMP into GDP. It is essential for recycling GMP and, indirectly, cGMP. In prokaryotes (such as Escherichia coli), lower eukaryotes (such as yeast) and vertebrates, GK is a highly conserved monomeric protein of about 200 amino acids. GK has been shown to be structurally similar to protein A57R (or SalG2R) from various strains of Vaccinia virus. Proteins containing one or more copies of the DHR domain, an SH3 domain and a C-terminal GK-like domain are collectively termed MAGUKs (membrane-associated guanylate kinase homologues), and include Drosophila lethal(1)discs large-1 tumor suppressor protein (gene dlg1); mammalian tight junction protein Zo-1; a family of mammalian synaptic proteins that seem to interact with the cytoplasmic tail of NMDA receptor subunits (SAP90/PSD-95, CHAPSYN-110/PSD-93, SAP97/DLG1 and SAP102); vertebrate 55kDa erythrocyte membrane protein (p55); Caenorhabditis elegans protein lin-2; rat protein CASK; and human proteins DLG2 and DLG3. There is an ATP-binding site (P-loop) in the N-terminal section of GK, which is not conserved in the GK-like domain of the above proteins. However, these proteins retain the residues known, in GK, to be involved in the binding of GMP.
Protein Domain
Name: Guanylate kinase
Type: Family
Description: Guanylate kinase, also called GMP kinase, is essential for recycling GMP and, indirectly, cGMP. This enzyme transfers a phosphate from ATP to GMP, yielding ADP and GDP. Guanylate kinase is a highly conserved monomer and is important for the activation of various antiviral drugs.
Protein Domain
Name: Guanylate kinase/L-type calcium channel beta subunit
Type: Domain
Description: This entry represents a domain found in guanylate kinase and in L-type calcium channels. Guanylate kinase (GK) catalyzes the ATP-dependent phosphorylation of GMP into GDP. It is essential for recycling GMP and, indirectly, cGMP. In prokaryotes (such as Escherichia coli), lower eukaryotes (such as yeast) and vertebrates, GK is a highly conserved monomeric protein of about 200 amino acids. GK has been shown to be structurally similar to protein A57R (or SalG2R) from various strains of Vaccinia virus. L-type calcium channels are formed from different alpha-1 subunit isoforms that determine the pharmacological properties of the channel, since they form the drug-binding domain. Other properties, such as gating voltage-dependence, G protein modulation and kinase susceptibility, are influenced by alpha-2, delta and beta subunits.
Protein Domain
Name: Guanylate kinase, conserved site
Type: Conserved_site
Description: Guanylate kinase (GK) catalyzes the ATP-dependent phosphorylation of GMP into GDP. It is essential for recycling GMP and, indirectly, cGMP. In prokaryotes (such as Escherichia coli), lower eukaryotes (such as yeast) and vertebrates, GK is a highly conserved monomeric protein of about 200 amino acids. GK has been shown to be structurally similar to protein A57R (or SalG2R) from various strains of Vaccinia virus. Proteins containing one or more copies of the DHR domain, an SH3 domain and a C-terminal GK-like domain are collectively termed MAGUKs (membrane-associated guanylate kinase homologues), and include Drosophila lethal(1)discs large-1 tumor suppressor protein (gene dlg1); mammalian tight junction protein Zo-1; a family of mammalian synaptic proteins that seem to interact with the cytoplasmic tail of NMDA receptor subunits (SAP90/PSD-95, CHAPSYN-110/PSD-93, SAP97/DLG1 and SAP102); vertebrate 55kDa erythrocyte membrane protein (p55); Caenorhabditis elegans protein lin-2; rat protein CASK; and human proteins DLG2 and DLG3. There is an ATP-binding site (P-loop) in the N-terminal section of GK, which is not conserved in the GK-like domain of the above proteins. However, these proteins retain the residues known, in GK, to be involved in the binding of GMP. This signature pattern covers a highly conserved region that contains two arginines and a tyrosine that are involved in GMP binding.
Protein Domain
Name: Protein RETICULATA-related
Type: Family
Description: This entry represents RETICULATA and related proteins from plants. The Arabidopsis RETICULATA protein is involved in the differential development of bundle sheath and mesophyll cell chloroplasts.
Protein Domain
Name: 1,3-beta-glucan synthase component FKS1-like, domain-1
Type: Domain
Description: This domain is likely to be the 'Class I' region just N-terminal to the first set of transmembrane helices that is involved in 1,3-beta-glucan synthesis itself.
Protein Domain
Name: Peptidase S10, serine carboxypeptidase
Type: Family
Description: This group of serine peptidases belongs to MEROPS peptidase family S10 (clan SC). The type example is carboxypeptidase Y from Saccharomyces cerevisiae (Baker's yeast). All known carboxypeptidases are either metallo carboxypeptidases or serine carboxypeptidases. The catalytic activity of the serine carboxypeptidases, like that of the trypsin family serine proteases, is provided by a charge relay system involving an aspartic acid residue hydrogen-bonded to a histidine, which is itself hydrogen-bonded to a serine. The sequences surrounding the active site serine and histidine residues are highly conserved in all the serine carboxypeptidases.
Protein Domain
Name: Serine carboxypeptidase, serine active site
Type: Active_site
Description: All known carboxypeptidases are either metallo carboxypeptidases or serine carboxypeptidases. The catalytic activity of the serine carboxypeptidases, like that of the trypsin family serine proteases, is provided by a charge relay system involving an aspartic acid residue hydrogen-bonded to a histidine, which is itself hydrogen-bonded to a serine. Proteins known to be serine carboxypeptidases include: barley and wheat serine carboxypeptidases I, II and III; yeast carboxypeptidase Y (YSCY) (gene PRC1), a vacuolar protease involved in degrading small peptides; yeast KEX1 protease, involved in killer toxin and alpha-factor precursor processing; fission yeast sxa2, a probable carboxypeptidase involved in degrading or processing mating pheromones; Penicillium janthinellum carboxypeptidase S1; Aspergillus niger carboxypeptidase pepF; Aspergillus saitoi carboxypeptidase cpdS; vertebrate protective protein / cathepsin A, a lysosomal protein which is not only a carboxypeptidase but also essential for the activity of both beta-galactosidase and neuraminidase; mosquito vitellogenic carboxypeptidase (VCP); Naegleria fowleri virulence-related protein Nf314; yeast hypothetical protein YBR139w; and Caenorhabditis elegans hypothetical proteins C08H9.1, F13D12.6, F32A5.3, F41C3.5 and K10B2.2. In higher plants and fungi, serine carboxypeptidases are found in the cell vacuoles. In animal cells, serine carboxypeptidases are found in lysosomes. The sequences surrounding the active site histidine residue are highly conserved in all these serine carboxypeptidases.
Protein Domain
Name: Glycoside hydrolase, family 19, catalytic
Type: Domain
Description: O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. Glycoside hydrolase family 19 comprises enzymes with only one known activity: chitinase. Chitinases are enzymes that catalyse the hydrolysis of the beta-1,4-N-acetyl-D-glucosamine linkages in chitin polymers. Chitinases belong to glycoside hydrolase families 18 or 19. Chitinases of family 19 (also known as classes I, II and IV) are enzymes from plants that function in the defence against fungal and insect pathogens by destroying their chitin-containing cell wall. Some family 19 chitinases are found in bacteria. Class I and II chitinases are similar in their catalytic domains. Class I chitinases have an N-terminal cysteine-rich, chitin-binding domain which is separated from the catalytic domain by a proline- and glycine-rich hinge region. Class II chitinases lack both the chitin-binding domain and the hinge region. Class IV chitinases are similar to class I, but they are smaller in size due to certain deletions. Despite the lack of any significant sequence homology with lysozymes, structural analysis reveals that family 19 chitinases, together with family 46 chitosanases, are similar to several lysozymes, including those from T4-phage and from goose. The structures suggest that the different enzyme groups arose from a common ancestral glycohydrolase antecedent to the procaryotic/eucaryotic divergence.
Protein Domain
Name: Lysozyme-like domain superfamily
Type: Homologous_superfamily
Description: This entry represents a lysozyme-like domain superfamily found in glycosyl hydrolases and transglycosylases.
Protein Domain
Name: tRNA-binding domain
Type: Domain
Description: This domain is found in prokaryotic methionyl-tRNA synthetases, prokaryotic phenylalanyl-tRNA synthetases, the yeast GU4 nucleic-binding protein (G4p1 or p42, ARC1), human tyrosyl-tRNA synthetase, endothelial-monocyte activating polypeptide II and the export-related chaperone CsaA. G4p1 binds specifically to tRNA and forms a complex with methionyl-tRNA synthetases. In human tyrosyl-tRNA synthetase, this domain may direct tRNA to the active site of the enzyme. This domain may perform a common function in tRNA aminoacylation.
Protein Domain
Name: Ribosomal protein L1/ribosomal biogenesis protein
Type: Family
Description: Ribosomal protein L1 is the largest protein from the large ribosomal subunit. The L1 protein contains two domains: a 2-layer alpha/beta domain and a 3-layer alpha/beta domain that interrupts the first domain. The two domains cycle between open and closed conformations via a hinge motion. In Escherichia coli, L1 is known to bind to the 23S rRNA. The RNA-binding site of L1 is highly conserved, with both mRNA and rRNA binding at the same site. Like several other large ribosomal subunit proteins, L1 displays RNA chaperone activity. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups: eubacterial L1; algal and plant chloroplast L1; cyanelle L1; archaebacterial L1; vertebrate L10A; and yeast Utp30, Rpl1a, Rpl1b and Mrpl1. This entry also matches ribosome biogenesis proteins, such as Cic1, which associates with the proteasome and is required for the degradation of specific substrates and for the synthesis of 60S ribosome subunits.
Protein Domain
Name: Ribosomal protein L1-like
Type: Homologous_superfamily
Description: This entry represents a structural domain common to several L1 ribosomal proteins and related proteins. It consists of one alpha/beta subdomain interrupted by another alpha/beta subdomain. Ribosomal protein L1 is the largest protein from the large ribosomal subunit. In Escherichia coli, L1 is known to bind to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups: eubacterial L1; algal and plant chloroplast L1; cyanelle L1; archaebacterial L1; vertebrate L10A; and yeast L1-A and L1-B.
Protein Domain
Name: Ribosomal protein L1, conserved site
Type: Conserved_site
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein L1 is the largest protein from the large ribosomal subunit. In Escherichia coli, L1 is known to bind to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups: eubacterial L1; algal and plant chloroplast L1; cyanelle L1; archaebacterial L1; vertebrate L10A; and yeast SSM1. The signature pattern in this entry identifies the best conserved region, located in the central section of these proteins. It lies at the end of an α-helix thought to be involved in RNA binding.
Protein Domain
Name: Ribosomal protein L1, bacterial-type
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein L1 is the largest protein from the large ribosomal subunit. In Escherichia coli, L1 is known to bind to the 23S rRNA. This model describes bacterial and chloroplast ribosomal protein L1. Most mitochondrial L1 sequences are sufficiently divergent to be contained in a different entry.
Protein Domain
Name: Ribosomal protein L1, 3-layer alpha/beta-sandwich
Type: Homologous_superfamily
Description: Ribosomal protein L1 is the largest protein from the large ribosomal subunit. The L1 protein contains two domains: a 2-layer alpha/beta domain and a 3-layer alpha/beta domain that interrupts the first domain. This entry represents the 3-layer domain. It has been shown that the 2-layer alpha/beta-sandwich domain is responsible for RNA binding, while the 3-layer alpha/beta domain stabilises the L1-rRNA complex. The 3-layer domain of ribosomal protein TthL1 hinders binding of the intact protein to RNA owing to interdomain flexibility. As a result, the rate of complex formation with mRNA is higher for the isolated domain I than for intact TthL1.
Protein Domain
Name: Exostosin-like
Type: Family
Description: There are five identified human EXT family proteins (EXT1, EXT2, EXTL1, EXTL2 and EXTL3), which are members of the hereditary multiple exostoses family of tumor suppressors. They are glycosyltransferases required for the biosynthesis of heparan sulfate. Hereditary multiple exostoses (EXT) is an autosomal dominant disorder that is characterised by the appearance of multiple outgrowths of the long bones (exostoses) at their epiphyses. Mutations in two homologous genes, EXT1 and EXT2, are responsible for the EXT syndrome. The human and mouse EXT genes have at least two homologues in the invertebrate Caenorhabditis elegans, indicating that they do not function exclusively as regulators of bone growth. EXT1 and EXT2 have both been shown to encode glycosyltransferases involved in the chain elongation step of heparan sulfate biosynthesis. This entry also includes Arabidopsis xyloglucan galactosyltransferase KATAMARI1 and Drosophila melanogaster EXT homologues.
Protein Domain
Name: Class I-like SAM-dependent O-methyltransferase
Type: Family
Description: Members of this family are O-methyltransferases, including catechol O-methyltransferase, caffeoyl-CoA O-methyltransferase and norbelladine 4'-O-methyltransferase. The family also includes bacterial O-methyltransferases that may be involved in antibiotic production.
Protein Domain
Name: Glycoside hydrolase family 63
Type: Family
Description: O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. This family of enzymes belongs to glycosyl hydrolase family 63. They catalyse the specific cleavage of the non-reducing terminal glucose residue from Glc(3)Man(9)GlcNAc(2). Mannosyl oligosaccharide glucosidase is the first enzyme in the N-linked oligosaccharide processing pathway. This family also includes glucosylglycerate hydrolase, which catalyses the hydrolysis of glucosylglycerate to glycerate and glucose during mycobacterial recovery from nitrogen starvation. This process promotes the rapid mobilisation of the glucosylglycerate that accumulates under these conditions. This enzyme can also hydrolyse mannosylglycerate with lower efficiency.
Protein Domain
Name: Peptidase M28
Type: Domain
Description: Over 70 metallopeptidase families have been identified to date. In these enzymes a divalent cation, which is usually zinc but may be cobalt, manganese or copper, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. In some families of co-catalytic metallopeptidases, two metal ions are observed in crystal structures ligated by five amino acids, with one amino acid ligating both metal ions. The known metal ligands are His, Glu, Asp or Lys. At least one other residue is required for catalysis, which may play an electrophilic role. Many metalloproteases contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. This domain is found in metallopeptidases belonging to the MEROPS peptidase family M28 (aminopeptidase Y, clan MH) and in non-peptidase homologues such as transferrin receptor proteins. Members containing this domain also contain a transferrin receptor-like dimerisation domain and a protease-associated PA domain.
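The stringent motif 'abXHEbbHbc' described in this entry can be expressed as a pattern match. The following is only a minimal sketch: the residue classes are assumptions, since the entry does not list them. Here 'a' is restricted to V or T (the entry says only "most often"), 'b' is taken as any residue other than D, E, K, R or P (histidine is treated as uncharged), 'c' uses an assumed hydrophobic set, and 'X' is any residue except proline. The function name and the test sequence are hypothetical.

```python
import re

# Assumed residue classes; the entry text does not enumerate them.
ALL_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"
CHARGED = set("DEKR")  # assumption: histidine is treated as uncharged
UNCHARGED_NO_PRO = "".join(
    r for r in ALL_RESIDUES if r not in CHARGED and r != "P"
)  # 'b' positions, proline excluded
HYDROPHOBIC = "ACFILMVW"  # 'c' position, an assumed hydrophobic set
ANY_NO_PRO = ALL_RESIDUES.replace("P", "")  # 'X' position, proline excluded

# a b X H E b b H b c
STRINGENT_HEXXH = re.compile(
    f"[VT][{UNCHARGED_NO_PRO}][{ANY_NO_PRO}]HE"
    f"[{UNCHARGED_NO_PRO}][{UNCHARGED_NO_PRO}]H"
    f"[{UNCHARGED_NO_PRO}][{HYDROPHOBIC}]"
)


def find_stringent_hexxh(sequence):
    """Return (start, match) pairs for non-overlapping motif hits (0-based start)."""
    return [
        (m.start(), m.group())
        for m in STRINGENT_HEXXH.finditer(sequence.upper())
    ]


if __name__ == "__main__":
    toy = "MKVAAHELGHSLGGG"  # made-up sequence, not a real protein
    print(find_stringent_hexxh(toy))
```

A plain HEXXH search would accept many more sequences; the extra flanking positions and the proline exclusion are what make this form more selective, at the cost of missing members whose 'a' position is not V or T.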
Protein Domain
Name: Signal peptidase complex subunit 1
Type: Family
Description: Translocation of polypeptide chains across the endoplasmic reticulum (ER) membrane is triggered by signal sequences. Subsequently, the signal recognition particle interacts with its membrane receptor and the ribosome-bound nascent chain is targeted to the ER, where it is transferred into a protein-conducting channel. At some point, a second signal sequence recognition event takes place in the membrane and translocation of the nascent chain through the membrane occurs. The signal sequence of most secretory and membrane proteins is cleaved off at this stage. Cleavage is carried out by the signal peptidase complex (SPC) as soon as the lumenal domain of the translocating polypeptide is large enough to expose its cleavage site to the enzyme. The signal peptidase complex is possibly also involved in proteolytic events in the ER membrane other than the processing of the signal sequence, for example the further digestion of the cleaved signal peptide or the degradation of membrane proteins. This family represents the signal peptidase complex subunit 1 (SPCS1) and its homologues, such as Spc1 from budding yeasts. The signal peptidase complex cleaves the signal sequence from proteins targeted to the endoplasmic reticulum (ER). Mammalian signal peptidase is a complex of five different polypeptide chains, while the budding yeast SPC comprises four proteins. Budding yeast Spc1 has been shown to be a nonessential component of the signal peptidase complex. However, the Drosophila spase12 (yeast Spc1 homologue) null alleles are embryonic lethal. Interestingly, human SPC12 has been linked to post-translational processing of proteins involved in virion assembly and secretion from flaviviruses. Signal peptidase complex-like protein DTM1 from Oryza sativa (rice) also belongs to this family. This protein functions in tapetum development during early meiosis.
It may play a role in the endoplasmic reticulum (ER) membrane in the early stages of tapetum development in anthers.
Protein Domain
Name: Ribosomal protein L22/L17, conserved site
Type: Conserved_site
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein L22 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L22 is known to bind 23S rRNA. It belongs to a family of ribosomal proteins which includes: bacterial L22; algal and plant chloroplast L22 (in legumes L22 is encoded in the nucleus instead of the chloroplast); cyanelle L22; archaebacterial L22; mammalian L17; plant L17 and yeast YL17.
Protein Domain
Name: Ribosomal protein L22/L17, eukaryotic/archaeal
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. This family describes the ribosomal protein of the eukaryotic cytosol and of the Archaea, variously designated as L17, L22, and L23.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom