Search results 4201 to 4300 out of 38750 for *

Category restricted to ProteinDomain

Type Details Score
Protein Domain
Name: Protein NO VEIN, C-terminal
Type: Domain
Description: This domain of unknown function is found at the C terminus of Protein NO VEIN from Arabidopsis, a protein essential for cell fate determination during embryogenesis. NO VEIN mediates this process through an auxin-dependent pathway. The domain is also found in some restriction endonucleases.
Protein Domain
Name: Actin-related protein 2/3 complex subunit 4
Type: Family
Description: Arp2/3 binds to pre-existing actin filaments and nucleates new daughter filaments, and thus becomes incorporated into the dynamic actin network at the leading edge of motile cells and other actin-based protrusive structures. In order to nucleate filaments, Arp2/3 must bind to a member of the N-WASp/SCAR protein family. Arp2 and Arp3 are thought to be brought together after activation, forming an actin-like nucleus for actin monomers to bind and create a new actin filament. In the absence of an activating protein, Arp2/3 shows very little nucleation activity. Recent research has focused on the binding and hydrolysis of ATP by Arp2 and Arp3, and crystal structures of the Arp2/3 complex have been solved. The human Arp2/3 complex is composed of ARP2, ARP3, ARPC1B/p41-ARC, ARPC2/p34-ARC, ARPC3/p21-ARC, ARPC4/p20-ARC and ARPC5/p16-ARC. This family represents the ARPC4/p20-ARC subunit.
Protein Domain
Name: Alpha/gamma-adaptin-binding protein p34
Type: Family
Description: p34 is a protein involved in membrane trafficking. It is known to interact with both alpha and gamma adaptin. It has been speculated that p34 may play a chaperone role, such as preventing the soluble adaptors from co-assembling with soluble clathrin, or helping to remove the adaptors from the coated vesicle. It may also aid in the recruitment of soluble adaptors onto the membrane.
Protein Domain
Name: THAP4-like, heme-binding beta-barrel domain
Type: Domain
Description: Nitrobindins (Nbs), which constitute a heme-protein family spanning from bacteria to Homo sapiens, display an all-β-barrel structural organization. Proteins containing this domain are putatively related to fatty acid-binding proteins (FABPs). This domain can be found in THAP4 from mammals and At1g79260 from Arabidopsis. THAP4 catalyzes the heme-based conversion of peroxynitrite into nitrate (NO3-) in vitro. At1g79260 is a nitrophorin-like heme-binding protein that may reversibly bind nitric oxide (NO) and be involved in NO transport. This entry also includes the β-barrel domain of Caenorhabditis elegans protein male abnormal 7 (Mab-7), which plays an important role in determining body shape and sensory ray morphology.
Protein Domain
Name: Vta1/Callose synthase, N-terminal domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents the N-terminal domain of vacuolar protein sorting-associated proteins and callose synthases. This domain contains seven alpha helices arranged into two antiparallel three-helix bundle modules. It has been proposed that vacuolar protein sorting-associated protein Vta1 interacts with Vps60 and Vps46/Did2 via this N-terminal domain.
Protein Domain
Name: MoaA/NifB/PqqE, iron-sulphur binding, conserved site
Type: Conserved_site
Description: A number of proteins involved in the biosynthesis of metallo cofactors have been shown to be evolutionarily related. These include:
  • Bacterial and archaebacterial protein moaA, which is involved in the biosynthesis of the molybdenum cofactor (molybdopterin; MPT).
  • Arabidopsis thaliana (Mouse-ear cress) cnx2, a protein involved in molybdopterin biosynthesis and which is highly similar to moaA.
  • Bacillus subtilis narA, which seems to be the moaA ortholog in that bacterium.
  • Bacterial protein nifB (or fixZ), which is involved in the biosynthesis of the nitrogenase iron-molybdenum cofactor.
  • Bacterial protein pqqE, which is involved in the biosynthesis of the cofactor pyrrolo-quinoline-quinone (PQQ).
  • Pyrococcus furiosus cmo, a protein involved in the synthesis of a molybdopterin-based tungsten cofactor.
  • Caenorhabditis elegans hypothetical protein F49E2.1.
These proteins share, in their N-terminal region, a conserved domain that contains three cysteines. In moaA, these cysteines have been shown to be important for biological activity by binding a [4Fe-4S] cluster. The three cysteines each coordinate one Fe, while S-adenosylmethionine is the fourth ligand to the cluster and binds to its unique Fe as an N/O chelate.
Protein Domain
Name: Molybdenum cofactor biosynthesis protein A
Type: Family
Description: The majority of molybdenum-containing enzymes utilise a molybdenum cofactor (MoCF or Moco) consisting of a Mo atom coordinated via a cis-dithiolene moiety to molybdopterin (MPT). MoCF is ubiquitous in nature, and the pathway for MoCF biosynthesis is conserved in all three domains of life. MoCF-containing enzymes function as oxidoreductases in carbon, nitrogen, and sulphur metabolism. In Escherichia coli, biosynthesis of MoCF is a three-stage process. It begins with the MoaA and MoaC conversion of GTP to the meta-stable pterin intermediate precursor Z. The second stage involves MPT synthase (MoaD and MoaE), which converts precursor Z to MPT; MoeB is involved in the recycling of MPT synthase. The final step in MoCF synthesis is the attachment of mononuclear Mo to MPT, a process that requires MoeA and which is enhanced by MogA in an Mg(2+)-ATP-dependent manner. MoCF is the active co-factor in eukaryotic and some prokaryotic molybdo-enzymes, but the majority of bacterial enzymes requiring MoCF need a modification of MPT for it to be active; MobA is involved in the attachment of a nucleotide monophosphate to MPT, resulting in the MGD co-factor, the active co-factor for most prokaryotic molybdo-enzymes. Bacterial two-hybrid studies have revealed the close interactions between MoeA, MogA, and MobA in the synthesis of MoCF. Moreover, the close functional association of MoeA and MogA in the synthesis of MoCF is supported by the fact that the known eukaryotic homologues of MoeA and MogA exist as fusion proteins: CNX1 of Arabidopsis thaliana (Mouse-ear cress), mammalian Gephyrin and Drosophila melanogaster (Fruit fly) Cinnamon. This entry represents the MoaA protein (molybdenum cofactor biosynthesis protein A), also known as cyclic pyranopterin monophosphate synthase or GTP 3',8-cyclase.
MoaA is a member of the wider S-adenosylmethionine (SAM)-dependent enzyme family, whose members catalyze the formation of protein and/or substrate radicals by reductive cleavage of SAM via a [4Fe-4S] cluster. Monomeric and homodimeric forms of MoaA have been observed in vivo, and it is not clear what the physiologically relevant form of the enzyme is. The core of each monomer consists of an incomplete TIM barrel, formed by the N-terminal region of the protein, containing a [4Fe-4S] cluster. The C-terminal region of the protein, which also contains a [4Fe-4S] cluster, consists of a β-sheet covering the lateral opening of the barrel, an extended loop and three α-helices. The N-terminal [4Fe-4S] cluster is coordinated by 3 cysteines and an exchangeable SAM molecule, while the C-terminal [4Fe-4S] cluster, also coordinated by 3 cysteines, is the binding and activation site for GTP.
Protein Domain
Name: Molybdenum cofactor biosynthesis protein A-like, twitch domain
Type: Domain
Description: This entry represents the iron-sulfur cluster-binding twitch domain of GTP 3',8-cyclase, which is also known as molybdenum cofactor biosynthesis protein A (MoaA) in bacteria and archaea, molybdenum cofactor biosynthesis protein 1 (MOCS1) in most eukaryotes, and molybdenum cofactor biosynthesis enzyme CNX2 in plants. These proteins belong to a family of enzymes involved in the synthesis of metallo-cofactors. Each subunit of the MoaA dimer comprises an N-terminal SAM domain that contains the [4Fe-4S] cluster typical for this family of enzymes, as well as an additional [4Fe-4S] cluster in the C-terminal domain that is unique to MoaA proteins and is involved in substrate binding. The unique Fe site of the C-terminal [4Fe-4S] cluster is thought to be involved in the binding and activation of 5'-GTP. Mutations in the human MoCF biosynthesis proteins MOCS1, MOCS2 or GEPH cause MoCF Deficiency type A (MOCOD), causing the loss of activity of MoCF-containing enzymes and resulting in neurological abnormalities and death. The majority of molybdenum-containing enzymes utilise a molybdenum cofactor (MoCF or Moco) consisting of a Mo atom coordinated via a cis-dithiolene moiety to molybdopterin (MPT). MoCF is ubiquitous in nature, and the pathway for MoCF biosynthesis is conserved in all three domains of life. MoCF-containing enzymes function as oxidoreductases in carbon, nitrogen, and sulphur metabolism. In Escherichia coli, biosynthesis of MoCF is a three-stage process. It begins with the MoaA and MoaC conversion of GTP to the meta-stable pterin intermediate precursor Z. The second stage involves MPT synthase (MoaD and MoaE), which converts precursor Z to MPT; MoeB is involved in the recycling of MPT synthase. The final step in MoCF synthesis is the attachment of mononuclear Mo to MPT, a process that requires MoeA and which is enhanced by MogA in an Mg(2+)-ATP-dependent manner.
MoCF is the active co-factor in eukaryotic and some prokaryotic molybdo-enzymes, but the majority of bacterial enzymes requiring MoCF need a modification of MPT for it to be active; MobA is involved in the attachment of a nucleotide monophosphate to MPT, resulting in the MGD co-factor, the active co-factor for most prokaryotic molybdo-enzymes. Bacterial two-hybrid studies have revealed the close interactions between MoeA, MogA, and MobA in the synthesis of MoCF. Moreover, the close functional association of MoeA and MogA in the synthesis of MoCF is supported by the fact that the known eukaryotic homologues of MoeA and MogA exist as fusion proteins: CNX1 of Arabidopsis thaliana (Mouse-ear cress), mammalian Gephyrin and Drosophila melanogaster (Fruit fly) Cinnamon.
Protein Domain
Name: Ribosomal protein S2, eukaryotic/archaeal
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. This family describes the ribosomal protein of the eukaryotic cytosol and of archaea that is homologous to S2 of bacteria. It is typically designated SA in eukaryotes and SA or S2 in archaea.
Protein Domain
Name: ATP-grasp fold, succinyl-CoA synthetase-type
Type: Domain
Description: The ATP-grasp superfamily currently includes 17 groups of enzymes, catalysing ATP-dependent ligation of a carboxylate-containing molecule to an amino or thiol group-containing molecule. They contribute predominantly to macromolecular synthesis. ATP hydrolysis is used to activate a substrate. For example, DD-ligase transfers phosphate from ATP to D-alanine in the first step of catalysis. In the second step, the resulting acylphosphate is attacked by a second D-alanine to produce a DD dipeptide following phosphate elimination. The ATP-grasp domain contains three conserved motifs, corresponding to the phosphate binding loop and the Mg(2+) binding site. The fold is characterised by two α-β subdomains that grasp the ATP molecule between them. Each subdomain provides a variable loop that forms a part of the active site, which is completed by regions of other domains not conserved between the various ATP-grasp enzymes. The ATP-grasp domain represented by this entry is found primarily in succinyl-CoA synthetases.
Protein Domain
Name: RAVE complex protein Rav1 C-terminal
Type: Domain
Description: This domain family is found in the C-terminal region of the protein Rav1, a component of the RAVE (regulator of the ATPase of vacuolar and endosomal membranes) complex. Rav1p is involved in regulating the glucose-dependent assembly and disassembly of vacuolar ATPase V1 and V0 subunits.
Protein Domain
Name: Exportin-2, C-terminal
Type: Domain
Description: Exportin-2, also known as CAS, is an export receptor for importin-alpha. It binds strongly to importin-alpha only in the presence of RanGTP, forming an importin-alpha/CAS/RanGTP complex. Exportin-2 mediates importin-alpha re-export from the nucleus to the cytoplasm after import substrates have been released into the nucleoplasm. This entry represents the C-terminal domain of XPO2. Structural studies of its yeast homologue, Cse1, indicate that this domain binds to both the transport-orchestrating protein RanGTP and the cargo molecule that is being exported.
Protein Domain
Name: Golgi to ER traffic protein 4
Type: Family
Description: In budding yeast, Get4 is part of the GET complex that inserts tail-anchored (TA) proteins into the endoplasmic reticulum membrane. In humans, Get4 is part of the BAG6/BAT3 complex, which maintains misfolded proteins and proteins containing hydrophobic patches in a soluble state and facilitates their proper delivery to the endoplasmic reticulum, or alternatively promotes their sorting to the proteasome, where they undergo degradation. The BAG6/BAT3 complex is involved in the post-translational delivery of tail-anchored/type II transmembrane proteins to the endoplasmic reticulum membrane.
Protein Domain
Name: GSKIP domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents the structural domain of GSK3-beta interaction protein (GSKIP), which binds to GSK3-beta. It is also found as a short domain towards the N terminus of clustered mitochondria protein, also known as clueless in Drosophila, which is involved in proper cytoplasmic distribution of mitochondria.
Protein Domain
Name: Clustered mitochondria protein, N-terminal
Type: Domain
Description: This entry represents the N-terminal domain of the clustered mitochondria protein, also known as clueless protein in Drosophila. The function of this domain is not known. This domain is found in association with .
Protein Domain
Name: Glu-tRNAGln amidotransferase C subunit
Type: Family
Description: This entry includes the C subunit of the bacterial/archaeal aspartyl/glutamyl-tRNA(Asn/Gln) amidotransferases (known as GatC) and eukaryotic Glu-tRNAGln amidotransferases. Aspartyl/glutamyl-tRNA(Asn/Gln) amidotransferase (EC 6.3.5.-) allows the formation of correctly charged Asn-tRNA(Asn) or Gln-tRNA(Gln) through the transamidation of misacylated Asp-tRNA(Asn) or Glu-tRNA(Gln) in organisms that lack either or both of asparaginyl-tRNA and glutaminyl-tRNA synthetases. The reaction takes place in the presence of glutamine and ATP through an activated phospho-Asp-tRNA(Asn) or phospho-Glu-tRNA(Gln). The enzyme is composed of three subunits: A (an amidase), B and C. It also exists in eukaryotes as a protein targeted to the mitochondria. The heterotrimer GatABC is involved in converting Glu to Gln and/or Asp to Asn when the amino acid is attached to the appropriate tRNA. In Lactobacillus, GatABC is responsible for producing tRNA(Gln). In Archaea, GatABC is responsible for producing tRNA(Asn), while GatDE is responsible for producing tRNA(Gln). In lineages that include Thermus, Chlamydia, or Acidithiobacillus, the GatABC complex catalyses the formation of both tRNA(Gln) and tRNA(Asn).
Protein Domain
Name: ATP synthase, F0 complex, subunit D, mitochondrial
Type: Family
Description: Transmembrane ATPases are membrane-bound enzyme complexes/ion transporters that use ATP hydrolysis to drive the transport of protons across a membrane. Some transmembrane ATPases also work in reverse, harnessing the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. There are several different types of transmembrane ATPases, which can differ in function (ATP hydrolysis and/or synthesis), structure (e.g., F-, V- and A-ATPases, which contain rotary motors) and in the type of ions they transport. The different types include:
  • F-ATPases (ATP synthases, F1F0-ATPases), which are found in mitochondria, chloroplasts and bacterial plasma membranes, where they are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts).
  • V-ATPases (V1V0-ATPases), which are primarily found in eukaryotes, where they function as proton pumps that acidify intracellular compartments and, in some cases, transport protons across the plasma membrane. They are also found in bacteria.
  • A-ATPases (A1A0-ATPases), which are found in Archaea and function like F-ATPases, though with respect to their structure and some inhibitor responses, A-ATPases are more closely related to the V-ATPases.
  • P-ATPases (E1E2-ATPases), which are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes.
  • E-ATPases, which are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP.
F-ATPases (also known as ATP synthases, F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), with additional subunits in mitochondria. Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring of C subunits forms the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse in bacteria, hydrolysing ATP to create a proton gradient.
This entry represents subunit D from the F0 complex in F-ATPases found in mitochondria. The D subunit is part of the peripheral stalk that links the F1 and F0 complexes together, and which acts as a stator to prevent certain subunits from rotating with the central rotary element. The peripheral stalk differs in subunit composition between mitochondrial, chloroplast and bacterial F-ATPases.
In mitochondria, the peripheral stalk is composed of one copy each of subunits OSCP (oligomycin sensitivity conferral protein), F6, B and D. There is no homologue of subunit D in bacterial or chloroplast F-ATPase, whose peripheral stalks are composed of one copy of the delta subunit (homologous to OSCP), and two copies of subunit B in bacteria, or one copy each of subunits B and B' in chloroplasts and photosynthetic bacteria.
Protein Domain
Name: Glycosyl hydrolase family 100
Type: Family
Description: O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. Glycosyl hydrolase family 100 includes enzymes with invertase activity.
Protein Domain
Name: Taxilin family
Type: Family
Description: The taxilin family of proteins includes alpha-, beta- and gamma-taxilins. They bind to members of the syntaxin protein family, which are implicated in intracellular vesicle traffic. Taxilins might therefore be involved in this process. Alpha-taxilin may also be involved in calcium-dependent exocytosis in neuroendocrine cells, while gamma-taxilin (also known as Elrg) may have a role in cell cycle progression.
Protein Domain
Name: Transferrin-like domain
Type: Domain
Description: Transferrins are eukaryotic iron-binding glycoproteins that control the level of free iron in biological fluids. Evidence suggests that members of the transferrin (TF) family arose from the duplication and fusion of two homologous domains, with each duplicated domain binding one iron atom. Members of the family include blood serotransferrin (siderophilin); milk lactotransferrin (lactoferrin); egg white ovotransferrin (conalbumin); and membrane-associated melanotransferrin. Family members that do not bind iron have also been discovered, including inhibitor of carbonic anhydrase (ICA), which strongly binds to and inhibits certain isoforms of carbonic anhydrase. This entry represents the transferrin-like domain, which can be further divided into two subdomains that form a cleft inside of which the iron atom is bound in iron-transporting transferrin. The iron-coordinating residues consist of an aspartic acid, two tyrosines and a histidine, as well as an arginine that coordinates a requisite anion. In addition to iron and anion liganding residues, the transferrin-like domain contains conserved cysteine residues involved in disulphide bond formation.
Protein Domain
Name: Chorismate synthase, conserved site
Type: Conserved_site
Description: Chorismate synthase (CS; 5-enolpyruvylshikimate-3-phosphate phospholyase; 1-carboxyvinyl-3-phosphoshikimate phosphate-lyase; E.C. 4.2.3.5) catalyzes the seventh and final step in the shikimate pathway, which is used in prokaryotes, fungi and plants for the biosynthesis of aromatic amino acids. It catalyzes the 1,4-trans elimination of the phosphate group from 5-enolpyruvylshikimate-3-phosphate (EPSP) to form chorismate, which can then be used in phenylalanine, tyrosine or tryptophan biosynthesis. Chorismate synthase requires the presence of a reduced flavin mononucleotide (FMNH2 or FADH2) for its activity. Chorismate synthase from various sources shows a high degree of sequence conservation. It is a protein of about 360 to 400 amino-acid residues. Depending on the capacity of these enzymes to regenerate the reduced form of FMN, chorismate synthases are divided into two groups: enzymes, mostly from plants and eubacteria, that sequester CS from the cellular environment are monofunctional, while those that can generate reduced FMN at the expense of NADPH, such as found in fungi and the ciliated protozoan Euglena gracilis, are bifunctional, having an additional NADPH:FMN oxidoreductase activity. Recently, bifunctionality of the Mycobacterium tuberculosis enzyme (MtCS) was determined by measurements of both chorismate synthase and NADH:FMN oxidoreductase activities. Since shikimate pathway enzymes are present in bacteria, fungi and apicomplexan parasites (such as Toxoplasma gondii, Plasmodium falciparum, and Cryptosporidium parvum) but absent in mammals, they are potentially attractive targets for the development of new therapies against infectious diseases such as tuberculosis (TB). This entry represents conserved regions from chorismate synthase that are rich in basic residues.
Protein Domain
Name: Chorismate synthase
Type: Family
Description: Chorismate synthase (CS; 5-enolpyruvylshikimate-3-phosphate phospholyase; 1-carboxyvinyl-3-phosphoshikimate phosphate-lyase; E.C. 4.2.3.5) catalyzes the seventh and final step in the shikimate pathway, which is used in prokaryotes, fungi and plants for the biosynthesis of aromatic amino acids. It catalyzes the 1,4-trans elimination of the phosphate group from 5-enolpyruvylshikimate-3-phosphate (EPSP) to form chorismate, which can then be used in phenylalanine, tyrosine or tryptophan biosynthesis. Chorismate synthase requires the presence of a reduced flavin mononucleotide (FMNH2 or FADH2) for its activity. Chorismate synthase from various sources shows a high degree of sequence conservation. It is a protein of about 360 to 400 amino-acid residues. Depending on the capacity of these enzymes to regenerate the reduced form of FMN, chorismate synthases are divided into two groups: enzymes, mostly from plants and eubacteria, that sequester CS from the cellular environment are monofunctional, while those that can generate reduced FMN at the expense of NADPH, such as found in fungi and the ciliated protozoan Euglena gracilis, are bifunctional, having an additional NADPH:FMN oxidoreductase activity. Recently, bifunctionality of the Mycobacterium tuberculosis enzyme (MtCS) was determined by measurements of both chorismate synthase and NADH:FMN oxidoreductase activities. Since shikimate pathway enzymes are present in bacteria, fungi and apicomplexan parasites (such as Toxoplasma gondii, Plasmodium falciparum, and Cryptosporidium parvum) but absent in mammals, they are potentially attractive targets for the development of new therapies against infectious diseases such as tuberculosis (TB).
Protein Domain
Name: Cytochrome c oxidase assembly protein COX16
Type: Family
Description: Cytochrome c oxidase assembly protein COX16 is required for the assembly of cytochrome c oxidase. It is found in the inner membrane of the mitochondrion.
Protein Domain
Name: Ribophorin I
Type: Family
Description: Ribophorin I is an essential subunit of oligosaccharyltransferase (OST), which is also known as dolichyl-diphosphooligosaccharide--protein glycosyltransferase. OST catalyses the transfer of an oligosaccharide from dolichol pyrophosphate to selected asparagine residues of nascent polypeptides as they are translocated into the lumen of the rough endoplasmic reticulum. In the past, Ribophorin I and OST48 were thought to be responsible for OST catalytic activity, but it is in reality the STT subunits of OST that are responsible for this activity. Both yeast and mammalian proteins are glycosylated, but the sites are not conserved. Glycosylation may contribute towards general solubility but is unlikely to be involved in a specific biochemical function. Most family members are predicted to have a transmembrane helix at the C terminus of this region.
Protein Domain
Name: Phenylalanine ammonia-lyase, shielding domain superfamily
Type: Homologous_superfamily
Description: The ubiquitous higher plant enzyme phenylalanine ammonia-lyase (PAL) is a key biosynthetic catalyst in phenylpropanoid assembly. PAL catalyses the non-oxidative deamination of L-phenylalanine to trans-cinnamic acid. PAL contains a catalytic Ala-Ser-Gly triad that is post-translationally cyclised. PAL is structurally similar to the mechanistically related histidine ammonia-lyase (HAL), with PAL having an additional approximately 160 residues extending from the common fold. Catalysis in PAL may be governed by the dipole moments of seven α-helices associated with the PAL active site. The cofactor 3,5-dihydro-5-methylidene-4H-imidazol-4-one (MIO) resides atop the positive poles of three helices, which increases its electrophilicity. Plant and fungal PAL enzymes contain an approximately 100-residue long C-terminal multi-helix domain, which might play a role in the rapid response of PAL in the regulation of phenylpropanoid biosynthesis by destabilising the enzyme. This entry also includes fungal proteins such as Phenylalanine ammonia-lyase CLZ10, which mediates the biosynthesis of squalestatin S1, a compound with potent cholesterol-lowering activity that targets squalene synthase (SS), and Phenylalanine ammonia-lyase hkm12, which is involved in the biosynthesis of hancockiamides, an unusual new family of N-cinnamoylated piperazines. This superfamily represents the shielding domain at the C terminus of PAL, which is tightly connected to the core domain through the exceptionally long 55-residue helix α-17. The shielding domain restricts access to the active centre so that the risk of inactivation by nucleophiles in conjunction with dioxygen is minimised. This may help PAL to function, for instance, in stressed plant tissue. It should be noted that PAL forms its electrophilic prosthetic group autocatalytically from its own polypeptide, rendering it independent of any cofactor and thus facilitating its upregulation.
Protein Domain
Name: Aromatic amino acid lyase
Type: Family
Description: This family includes phenylalanine ammonia-lyase (PAL), histidine ammonia-lyase (HAL), and tyrosine aminomutase. PAL and HAL are members of the Lyase class I_like superfamily of enzymes that catalyze similar beta-elimination reactions and are active as homotetramers. Both PAL and HAL contain a catalytic Ala-Ser-Gly triad that is post-translationally cyclised. PAL is a key biosynthetic catalyst in phenylpropanoid assembly in plants and fungi, and is involved in the biosynthesis of a wide variety of secondary metabolites such as flavonoids, furanocoumarin phytoalexins and cell wall components. These compounds are important for normal growth and in responses to environmental stress. HAL catalyses the first step in histidine degradation, the removal of an ammonia group from histidine to produce urocanic acid. The core domains of PAL and HAL share about 30% sequence identity, with PAL containing an additional approximately 160 residues extending from the common fold. Tyrosine 2,3-aminomutase has aminomutase activity and, to a much lesser extent, ammonia-lyase activity. PAL is being explored as an enzyme substitution therapy for phenylketonuria (PKU), a disorder that involves an inability to metabolize phenylalanine. HAL failure in humans results in the disease histidinemia.
Protein Domain
Name: Phenylalanine/histidine ammonia-lyases, active site
Type: Active_site
Description: This entry represents the active site of phenylalanine ammonia-lyase (PAL) and the mechanistically related protein histidine ammonia-lyase (HAL). Both contain a catalytic Ala-Ser-Gly triad that is post-translationally cyclised. PAL is a key biosynthetic catalyst in phenylpropanoid assembly in plants and fungi, and is involved in the biosynthesis of a wide variety of secondary metabolites such as flavonoids, furanocoumarin phytoalexins and cell wall components. These compounds are important for normal growth and in responses to environmental stress. PAL catalyses the removal of an ammonia group from phenylalanine to form trans-cinnamate. HAL catalyses the first step in histidine degradation, the removal of an ammonia group from histidine to produce urocanic acid. The core domains of PAL and HAL share about 30% sequence identity, with PAL containing an additional approximately 160 residues extending from the common fold. The two types of enzymes are functionally and structurally related. They are the only enzymes known to have the modified amino acid dehydroalanine (DHA) in their active site. A serine residue has been shown to be the precursor of this essential electrophilic moiety. The region around the active site serine is well conserved and has been used as the signature pattern for this entry.
Protein Domain
Name: Phenylalanine ammonia-lyase
Type: Family
Description: The ubiquitous higher plant enzyme phenylalanine ammonia-lyase (PAL) is a key biosynthetic catalyst in phenylpropanoid assembly. PAL catalyses the non-oxidative deamination of L-phenylalanine to trans-cinnamic acid. PAL contains a catalytic Ala-Ser-Gly triad that is post-translationally cyclised. PAL is structurally similar to the mechanistically related histidine ammonia-lyase (HAL), with PAL having an additional approximately 160 residues extending from the common fold. Catalysis in PAL may be governed by the dipole moments of seven α-helices associated with the PAL active site. The cofactor 3,5-dihydro-5-methylidene-4H-imidazol-4-one (MIO) resides atop the positive poles of three helices, which increases its electrophilicity. Plant and fungal PAL enzymes contain an approximately 100-residue C-terminal multi-helix domain, which might play a role in the rapid response of PAL in the regulation of phenylpropanoid biosynthesis by destabilising the enzyme. This entry also includes fungal proteins such as phenylalanine ammonia-lyase CLZ10, which mediates the biosynthesis of squalestatin S1, which has potent cholesterol-lowering activity by targeting squalene synthase (SS), and phenylalanine ammonia-lyase hkm12, which is involved in the biosynthesis of hancockiamides, an unusual new family of N-cinnamoylated piperazines.
Protein Domain
Name: Peptidyl-tRNA hydrolase
Type: Family
Description: Peptidyl-tRNA hydrolase (PTH) is a bacterial enzyme that cleaves peptidyl-tRNA or N-acyl-aminoacyl-tRNA to yield free peptides or N-acyl-amino acids and tRNA. The natural substrates for this enzyme may be peptidyl-tRNAs that drop off the ribosome during protein synthesis. Bacterial PTH has been found to be evolutionarily related to a yeast protein. This group also contains chloroplast RNA splicing 2 (CRS2), a closely related nuclear-encoded protein required for the splicing of nine group II introns in chloroplasts.
Protein Domain
Name: Vacuolar protein sorting-associated protein 8, central domain
Type: Domain
Description: Vps8 is one of the Golgi complex components necessary for vacuolar sorting. Eukaryotic cells contain a highly dynamic endo-membrane system, in which individual organelles keep their identity despite continuous vesicle generation and fusion. Vesicles that bud from a donor membrane are targeted and delivered to each individual organelle, where they release their cargo after fusion with the acceptor membrane. Vps8 is the core component of the endosomal tethering complex CORVET (class C core vacuole/endosome tethering). Vps8 co-operates with Vps21-GTP to mediate endosomal clustering in a reaction that is dependent on Vps3. Vps8 is the only CORVET subunit that is enriched on late endosomes, suggesting that it is a marker for the maturation of late endosomes. Late endosomes form intralumenal vesicles, and the resulting multivesicular bodies fuse with the vacuole to release their cargoes. This entry represents the central domain of Vps8.
Protein Domain
Name: Pyruvate dehydrogenase (acetyl-transferring) E1 component, alpha subunit, subgroup y
Type: Family
Description: Members of this protein family are the alpha subunit of the E1 component of pyruvate dehydrogenase (PDH). This entry represents one branch of a larger family that includes E1-alpha proteins from 2-oxoisovalerate dehydrogenase, acetoin dehydrogenase, another PDH clade, etc. The pyruvate dehydrogenase complex catalyses the overall conversion of pyruvate to acetyl-CoA and carbon dioxide. It contains multiple copies of three enzymatic components: pyruvate dehydrogenase (E1), dihydrolipoamide acetyltransferase (E2) and lipoamide dehydrogenase (E3).
Protein Domain
Name: NADH:ubiquinone oxidoreductase intermediate-associated protein 30
Type: Domain
Description: Mitochondrial complex I intermediate-associated protein 30 (CIA30) is present in human and mouse, and also in Schizosaccharomyces pombe (Fission yeast), which does not contain the NADH dehydrogenase component of complex I or many of the other essential subunits. This means it is not directly involved in oxidative phosphorylation. In Drosophila it has been shown to be a chaperone required for the assembly of complex I.
Protein Domain
Name: Large ribosomal RNA subunit accumulation protein YceD
Type: Family
Description: This family includes the large ribosomal RNA subunit accumulation protein YceD. Gene knockout in Escherichia coli leads to a significant reduction of 23S rRNA. These proteins are nearly universally conserved in bacteria and plants. In Nicotiana benthamiana leaves the protein is localized in chloroplasts.
Protein Domain
Name: L-gulonolactone oxidase, plant
Type: Family
Description: This entry represents a family of L-gulonolactone oxidases. At least seven distinct members are found in Arabidopsis thaliana (Mouse-ear cress). This group of proteins may be involved in the biosynthesis of ascorbic acid.
Protein Domain      
Protein Domain
Name: Neurolysin/Thimet oligopeptidase, domain 2
Type: Homologous_superfamily
Description: Thimet oligopeptidase and neurolysin are closely related zinc-dependent metallopeptidases that metabolize small bioactive peptides. They cleave many substrates at the same sites, but they recognise different positions on others. This entry represents an alpha orthogonal bundle domain found in these and related peptidases.
Protein Domain
Name: Peptidase M3A/M3B catalytic domain
Type: Domain
Description: This group of metallopeptidases belongs to MEROPS peptidase family M3 (clan MA(E)), subfamilies M3A and M3B. The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA. The Thimet oligopeptidase family is a large family of archaeal, bacterial and eukaryotic oligopeptidases that cleave medium-sized peptides. The group contains: mitochondrial intermediate peptidase; neurolysin, mitochondrial precursor; Thimet oligopeptidase; dipeptidyl carboxypeptidase; oligopeptidase A; and oligoendopeptidase F.
Protein Domain
Name: Matrilin, coiled-coil trimerisation domain
Type: Domain
Description: This entry represents a short domain found in the matrilin (cartilage matrix) proteins. It forms a coiled-coil structure and contains a single cysteine residue at its start, which is likely to form a disulphide bridge with a corresponding cysteine in an upstream EGF domain, thereby spanning the VWA domain of the protein. This domain is likely to be responsible for protein trimerisation.
Protein Domain
Name: Glycine-rich domain-containing protein-like
Type: Family
Description: This entry includes Arabidopsis Glycine-rich domain-containing protein 1 and 2 (GRDP1/2). They are involved in development and stress responses.
Protein Domain
Name: Cyclic phosphodiesterase
Type: Homologous_superfamily
Description: This entry represents a β-barrel domain consisting of a duplication of a beta/alpha/beta/alpha/beta motif, which is found in plant cyclic phosphodiesterases (CPDases), as well as catalytic domains from mammalian 2',3'-cyclic nucleotide 3'-phosphodiesterase (CNPase), and bacterial and archaeal LigT-like 2',3'-cyclic phosphodiesterases (originally identified as 2'-5' RNA ligases). This β-barrel domain is similar in structure to the β-barrel found in prokaryotic DNA topoisomerases I and III. The catalytic domain of CNPase from animals catalyzes the hydrolysis of nucleoside 2',3'-cyclic monophosphates to nucleoside 2'-monophosphates. The archaeobacterial LigT-like enzymes hydrolyze 2',3'-cyclic phosphate in (oligo)nucleotides and join the produced 2'-phosphate to a 5'-hydroxyl group of another (oligo)nucleotide to form atypical 2',5'-linkages. Such activity has not been reported for CNPase.
Protein Domain
Name: 2',3'-cyclic-nucleotide 3'-phosphodiesterase
Type: Family
Description: 2',3'-cyclic nucleotide phosphodiesterases (CPDases) are enzymes that catalyse at least two distinct steps in the splicing of tRNA introns in eukaryotes. The active site is characterised by two conserved histidine residues. The enzyme has six cysteine residues, four of which are involved in forming two intra-molecular disulphide bridges. One of these bridges is involved in the catalytic activity of the enzyme, as it opens when CPDase is semi-reduced. Proteins in this entry catalyse the reaction: nucleoside 2',3'-cyclic phosphate + H2O = nucleoside 2'-phosphate.
Protein Domain
Name: Complex 1 LYR protein domain
Type: Domain
Description: Proteins containing this domain include an accessory subunit of the higher eukaryotic NADH dehydrogenase complex. In Saccharomyces cerevisiae, the Isd11 protein has been shown to play a role in Fe/S cluster biogenesis in mitochondria. We have named this family LYR after a highly conserved tripeptide motif close to the N terminus of these proteins.
Protein Domain
Name: Alpha,alpha-trehalose-phosphate synthase
Type: Family
Description: This enzyme catalyzes the key, penultimate step in biosynthesis of trehalose, a compatible solute made as an osmoprotectant in some species in all three domains of life. The gene symbol OtsA stands for osmotically regulated trehalose synthesis A. Trehalose helps protect against both osmotic and thermal stresses, and is made from two glucose subunits. This entry excludes glucosylglycerol-phosphate synthase, an enzyme of an analogous osmoprotectant system in many cyanobacterial strains. This entry does not identify archaeal examples, as they are more divergent than glucosylglycerol-phosphate synthase. Sequences that score in the gray zone between the trusted and noise cutoffs include a number of yeast multidomain proteins in which the N-terminal domain may be functionally equivalent to this family. The gray zone also includes the OtsA of Corynebacterium glutamicum (and related species), shown to be responsible for synthesis of only trace amounts of trehalose while the majority is synthesized by the TreYZ pathway; the significance of OtsA in this species is unclear.
Protein Domain
Name: XPG conserved site
Type: Conserved_site
Description: Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous forms of XP, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 kDa nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region. XPG (ERCC-5) belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1; and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]).
It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved. The first pattern corresponds to the central part of the N-region; the second pattern is part of the I-region and includes the putative catalytic core pentapeptide.
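The pentapeptide E-A-[DE]-A-[QS] quoted above is written in a PROSITE-style notation, where hyphens separate positions and square brackets list the residues allowed at one position. As a minimal sketch, assuming only that simple subset of the notation, it can be turned into a regular expression and used to scan a protein sequence. The helper names and the toy sequences are hypothetical and are not taken from any database entry.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern to a regex string.

    Only handles the subset used here: positions separated by '-',
    single residues, bracketed alternatives such as [DE], and 'x'
    for any residue. Repeats, anchors and exclusions are not handled.
    """
    parts = []
    for element in pattern.split("-"):
        parts.append("." if element == "x" else element)
    return "".join(parts)

# Pentapeptide quoted in the description above.
PENTAPEPTIDE = "E-A-[DE]-A-[QS]"
PENTAPEPTIDE_RE = re.compile(prosite_to_regex(PENTAPEPTIDE))

def find_motif(sequence: str):
    """Return (0-based start, matched residues) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in PENTAPEPTIDE_RE.finditer(sequence)]

# Toy sequences (hypothetical, for illustration only).
hits = find_motif("MKEAEAQLL")
no_hits = find_motif("MKEAKAQLL")
```

Here `hits` holds one match starting at position 2, while the second sequence has a lysine at the third motif position and yields no match.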
Protein Domain
Name: Phosphoenolpyruvate carboxykinase, N-terminal
Type: Homologous_superfamily
Description: Phosphoenolpyruvate carboxykinase (PEPCK) catalyses the first committed (rate-limiting) step in hepatic gluconeogenesis, namely the reversible decarboxylation of oxaloacetate to phosphoenolpyruvate (PEP) and carbon dioxide, using either ATP or GTP as a source of phosphate. The ATP-utilising and GTP-utilising enzymes form two divergent subfamilies, which have little sequence similarity but which retain conserved active site residues. ATP-utilising PEPCKs are monomers or oligomers of identical subunits found in certain bacteria, yeast, trypanosomatids, and plants, while GTP-utilising PEPCKs are mainly monomers found in animals and some bacteria. Both require divalent cations for activity, such as magnesium or manganese. One cation interacts with the enzyme at metal binding site 1 to elicit activation, while the second cation interacts at metal binding site 2 to serve as a metal-nucleotide substrate. In bacteria, fungi and plants, PEPCK is involved in the glyoxylate bypass, an alternative to the tricarboxylic acid cycle. PEPCK helps to regulate blood glucose levels. The rate of gluconeogenesis can be controlled through transcriptional regulation of the PEPCK gene by cAMP (the mediator of glucagon and catecholamines), glucocorticoids and insulin. In general, PEPCK expression is induced by glucagon, catecholamines and glucocorticoids during periods of fasting and in response to stress, but is inhibited by (glucose-induced) insulin upon feeding. In type II diabetes, this regulation system can fail, resulting in increased gluconeogenesis that in turn raises glucose levels. PEPCK consists of an N-terminal and a catalytic C-terminal domain, with the active site and metal ions located in a cleft between them. Both domains have an alpha/beta topology, and the two topologies are partly similar to one another.
Substrate binding causes PEPCK to undergo a conformational change, which accelerates catalysis by forcing bulk solvent molecules out of the active site. PEPCK uses an alpha/beta/alpha motif for nucleotide binding, which differs from the motifs of other kinase domains. GTP-utilising PEPCK has a PEP-binding domain and two kinase motifs to bind GTP and magnesium. This superfamily represents the N-terminal domain found in both GTP-utilising and ATP-utilising phosphoenolpyruvate carboxykinase enzymes.
Protein Domain
Name: Phosphoenolpyruvate carboxykinase, ATP-utilising
Type: Family
Description: Phosphoenolpyruvate carboxykinase (PEPCK) catalyses the first committed (rate-limiting) step in hepatic gluconeogenesis, namely the reversible decarboxylation of oxaloacetate to phosphoenolpyruvate (PEP) and carbon dioxide, using either ATP or GTP as a source of phosphate. The ATP-utilising and GTP-utilising enzymes form two divergent subfamilies, which have little sequence similarity but which retain conserved active site residues. ATP-utilising PEPCKs are monomers or oligomers of identical subunits found in certain bacteria, yeast, trypanosomatids, and plants, while GTP-utilising PEPCKs are mainly monomers found in animals and some bacteria. Both require divalent cations for activity, such as magnesium or manganese. One cation interacts with the enzyme at metal binding site 1 to elicit activation, while the second cation interacts at metal binding site 2 to serve as a metal-nucleotide substrate. In bacteria, fungi and plants, PEPCK is involved in the glyoxylate bypass, an alternative to the tricarboxylic acid cycle. PEPCK helps to regulate blood glucose levels. The rate of gluconeogenesis can be controlled through transcriptional regulation of the PEPCK gene by cAMP (the mediator of glucagon and catecholamines), glucocorticoids and insulin. In general, PEPCK expression is induced by glucagon, catecholamines and glucocorticoids during periods of fasting and in response to stress, but is inhibited by (glucose-induced) insulin upon feeding. In type II diabetes, this regulation system can fail, resulting in increased gluconeogenesis that in turn raises glucose levels. PEPCK consists of an N-terminal and a catalytic C-terminal domain, with the active site and metal ions located in a cleft between them. Both domains have an alpha/beta topology, and the two topologies are partly similar to one another.
Substrate binding causes PEPCK to undergo a conformational change, which accelerates catalysis by forcing bulk solvent molecules out of the active site. PEPCK uses an alpha/beta/alpha motif for nucleotide binding, which differs from the motifs of other kinase domains. GTP-utilising PEPCK has a PEP-binding domain and two kinase motifs to bind GTP and magnesium. This entry represents ATP-utilising phosphoenolpyruvate carboxykinase enzymes.
Protein Domain
Name: Phosphoenolpyruvate carboxykinase (ATP), conserved site
Type: Conserved_site
Description: Phosphoenolpyruvate carboxykinase (ATP) (PEPCK) catalyses the formation of phosphoenolpyruvate by decarboxylation of oxaloacetate while hydrolysing ATP, a rate-limiting step in gluconeogenesis (the biosynthesis of glucose). This conserved site represents a highly conserved region that contains four acidic residues and is located in the central part of the enzyme. The beginning of this conserved site is located about 10 residues C-terminal to an ATP-binding motif 'A' (P-loop) and is also part of the ATP-binding domain.
Protein Domain
Name: Phosphoenolpyruvate carboxykinase, C-terminal
Type: Homologous_superfamily
Description: Phosphoenolpyruvate carboxykinase (PEPCK) catalyses the first committed (rate-limiting) step in hepatic gluconeogenesis, namely the reversible decarboxylation of oxaloacetate to phosphoenolpyruvate (PEP) and carbon dioxide, using either ATP or GTP as a source of phosphate. The ATP-utilising and GTP-utilising enzymes form two divergent subfamilies, which have little sequence similarity but which retain conserved active site residues. ATP-utilising PEPCKs are monomers or oligomers of identical subunits found in certain bacteria, yeast, trypanosomatids, and plants, while GTP-utilising PEPCKs are mainly monomers found in animals and some bacteria. Both require divalent cations for activity, such as magnesium or manganese. One cation interacts with the enzyme at metal binding site 1 to elicit activation, while the second cation interacts at metal binding site 2 to serve as a metal-nucleotide substrate. In bacteria, fungi and plants, PEPCK is involved in the glyoxylate bypass, an alternative to the tricarboxylic acid cycle. PEPCK helps to regulate blood glucose levels. The rate of gluconeogenesis can be controlled through transcriptional regulation of the PEPCK gene by cAMP (the mediator of glucagon and catecholamines), glucocorticoids and insulin. In general, PEPCK expression is induced by glucagon, catecholamines and glucocorticoids during periods of fasting and in response to stress, but is inhibited by (glucose-induced) insulin upon feeding. In type II diabetes, this regulation system can fail, resulting in increased gluconeogenesis that in turn raises glucose levels. PEPCK consists of an N-terminal and a catalytic C-terminal domain, with the active site and metal ions located in a cleft between them. Both domains have an alpha/beta topology, and the two topologies are partly similar to one another.
Substrate binding causes PEPCK to undergo a conformational change, which accelerates catalysis by forcing bulk solvent molecules out of the active site. PEPCK uses an alpha/beta/alpha motif for nucleotide binding, which differs from the motifs of other kinase domains. GTP-utilising PEPCK has a PEP-binding domain and two kinase motifs to bind GTP and magnesium. This superfamily represents the C-terminal domain found in both GTP-utilising and ATP-utilising phosphoenolpyruvate carboxykinase enzymes.
Protein Domain
Name: Peptide-N4-(N-acetyl-beta-glucosaminyl)asparagine amidase A
Type: Family
Description: Peptide-N4-(N-acetyl-beta-glucosaminyl)asparagine amidase A (PNGase A), unlike many other amidases, is capable of hydrolysing glycopeptides with an alpha-1,3-fucosylated asparagine-bound N-acetylglucosamine (GlcNAc). PNGase A is a heterodimer composed of a large and a small subunit. This entry represents the PNGase A precursor, which contains both subunits and is activated by proteolytic cleavage.
Protein Domain
Name: Sorting nexin/Vps5-like, C-terminal
Type: Domain
Description: Vps5 is a sorting nexin that functions in membrane trafficking. This is the C-terminal dimerisation domain [ ].
Protein Domain
Name: Cell division cycle protein 123
Type: Family
Description: This family contains a number of eukaryotic cell division cycle 123 (Cdc123, also known as D123) proteins approximately 330 residues long. It has been shown that mutated variants of Cdc123 exhibit temperature-dependent differences in their degradation rate. Budding yeast Cdc123 regulates the cell cycle in a nutrient-dependent manner.
Protein Domain      
Protein Domain
Name: WIYLD domain
Type: Domain
Description: This entry represents a presumed domain which has been predicted to contain three alpha helices. It was named the WIYLD domain based on the pattern of the most conserved residues. This domain appears to be specific to plant SET-domain proteins.
Protein Domain
Name: Phospholipase C, phosphatidylinositol-specific, Y domain
Type: Domain
Description: Phosphatidylinositol-specific phospholipase C (PI-PLC), a eukaryotic intracellular enzyme, plays an important role in signal transduction processes. It catalyzes the hydrolysis of 1-phosphatidyl-D-myo-inositol 4,5-bisphosphate into the second messenger molecules diacylglycerol and inositol 1,4,5-trisphosphate. This catalytic process is tightly regulated by reversible phosphorylation and binding of regulatory proteins. In mammals, there are at least 6 different isoforms of PI-PLC, which differ in their domain structure, their regulation, and their tissue distribution. Lower eukaryotes also possess multiple isoforms of PI-PLC. All eukaryotic PI-PLCs contain two regions of homology, sometimes referred to as 'X-box' and 'Y-box'. The order of these two regions is always the same (NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance between these two regions is only 50-100 residues, but in the gamma isoforms one PH domain, two SH2 domains, and one SH3 domain are inserted between the two PLC-specific domains. The two conserved regions have been shown to be important for the catalytic activity. C-terminal to the Y-box, there is a C2 domain possibly involved in Ca-dependent membrane attachment.
Protein Domain
Name: Signal peptidase complex subunit 2
Type: Family
Description: This family represents the Signal peptidase complex subunit 2 (SPCS2) and its homologues, such as Spc2 from budding yeasts. The signal peptidase complex cleaves the signal sequence from proteins targeted to the endoplasmic reticulum (ER). Mammalian signal peptidase is a complex of five different polypeptide chains, while the budding yeast SPC comprises four proteins. Budding yeast Spc2 has been shown to be a nonessential component of the signal peptidase complex. Spc2 has been shown to enhance the enzymatic activity of the SPC and facilitate the interactions between different components of the translocation site. Translocation of polypeptide chains across the endoplasmic reticulum (ER) membrane is triggered by signal sequences. Subsequently, signal recognition particle interacts with its membrane receptor and the ribosome-bound nascent chain is targeted to the ER, where it is transferred into a protein-conducting channel. At some point, a second signal sequence recognition event takes place in the membrane and translocation of the nascent chain through the membrane occurs. The signal sequence of most secretory and membrane proteins is cleaved off at this stage. Cleavage occurs by the signal peptidase complex (SPC) as soon as the lumenal domain of the translocating polypeptide is large enough to expose its cleavage site to the enzyme. The signal peptidase complex is possibly also involved in proteolytic events in the ER membrane other than the processing of the signal sequence, for example the further digestion of the cleaved signal peptide or the degradation of membrane proteins.
Protein Domain
Name: Eukaryotic translation initiation factor 3 subunit F
Type: Family
Description: Eukaryotic translation initiation factor 3 subunit F is a component of the eukaryotic translation initiation factor 3 (eIF-3) complex, which is involved in protein synthesis and, together with other initiation factors, stimulates binding of mRNA and methionyl-tRNAi to the 40S ribosome. In humans, it exhibits a deubiquitinase activity that regulates Notch activation.
Protein Domain
Name: Nematode resistance protein-like HSPRO1, N-terminal
Type: Domain
Description: This entry represents the N terminus (approximately 180 residues) of plant HSPRO1, which is believed to confer resistance to nematodes [ ]. Proteins containing this domain also include HSPRO2, which is involved in basal resistance [].
Protein Domain
Name: Hs1pro-1, C-terminal
Type: Domain
Description: This entry represents the C terminus (approximately 270 residues) of a number of plant Hs1pro-1 proteins, which are believed to confer nematode resistance [ ].
Protein Domain
Name: Chromosome segregation protein Spc25, C-terminal
Type: Domain
Description: The Ndc80 complex is a conserved outer kinetochore protein complex consisting of Ndc80 (Hec1), Nuf2, Spc24 and Spc25. The Ndc80 complex is required for chromosome segregation and spindle checkpoint activity. This entry represents the C-terminal domain of Spc25.
Protein Domain
Name: BolA protein
Type: Family
Description: This family consists of the morpho-protein BolA from Escherichia coli and its various homologues. In E. coli, over-expression of this protein causes round morphology, and the protein may be involved in switching the cell between elongation and septation systems during cell division. The expression of BolA is growth rate regulated and is induced during the transition into the stationary phase. BolA is also induced by stress during early stages of growth and may have a general role in stress response. It has also been suggested that BolA can induce the transcription of penicillin binding proteins 6 and 5. IbaG is a BolA homologue involved in acid resistance.
Protein Domain
Name: Ribosomal protein S16
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein S16 is one of the proteins from the small ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups: eubacterial S16; algal and plant chloroplast S16; cyanelle S16; and Neurospora crassa mitochondrial S24 (cyt-21). S16 proteins have about 100 amino-acid residues. There are two paralogues in Arabidopsis thaliana, RPS16-1 (chloroplastic) and RPS16-2 (targeted to the chloroplast and the mitochondrion).
Protein Domain
Name: Ribosomal protein S16 domain superfamily
Type: Homologous_superfamily
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein S16 is one of the proteins from the small ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups: eubacterial S16; algal and plant chloroplast S16; cyanelle S16; and Neurospora crassa mitochondrial S24 (cyt-21). S16 proteins have about 100 amino-acid residues. There are two paralogues in Arabidopsis thaliana, RPS16-1 (chloroplastic) and RPS16-2 (targeted to the chloroplast and the mitochondrion). This superfamily represents the structural domain of ribosomal S16 proteins, consisting of an α-β 2-layer sandwich.
Protein Domain
Name: Alliinase, EGF-like domain
Type: Domain
Description: Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesised from sulphoxide cysteine derivatives by alliinase, whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defence system. This entry represents the N-terminal EGF-like domain.
Protein Domain
Name: Biotin--acetyl-CoA-carboxylase ligase
Type: Family
Description: The biotin operon of Escherichia coli contains 5 structural genes involved in the synthesis of biotin. Transcription of the operon is regulated via one of these proteins, the biotin ligase BirA. BirA is an asymmetric protein with 3 specific domains: an N-terminal DNA-binding domain, a central catalytic domain and a C-terminal domain of unknown function. The ligase reaction intermediate, biotinyl-5'-AMP, is the co-repressor that triggers DNA binding by BirA. The α-helical N-terminal domain of the BirA protein has the helix-turn-helix structure of DNA-binding proteins with a central DNA recognition helix. BirA undergoes several conformational changes related to repressor function, and the N-terminal DNA-binding domain is connected to the rest of the molecule through a hinge that allows relocation of the domains during the reaction. Biotin binding causes a large structural change thought to facilitate ATP binding. Two repressor molecules form the operator-repressor complex, with dimer formation occurring simultaneously with DNA binding. DNA binding may also cause a conformational change which allows this co-operative interaction. In the dimer structure, the β-sheets in the central domain of each monomer are arranged side-by-side to form a single, seamless β-sheet. The apparent orthologues among the eukaryotes are larger proteins that contain a domain with high sequence similarity to BirA.
Protein Domain
Name: Biotin protein ligase, C-terminal
Type: Domain
Description: This C-terminal domain has an SH3-like barrel fold, the function of which is unknown. It is found associated with prokaryotic bifunctional transcriptional repressors and eukaryotic enzymes involved in biotin utilization. In Escherichia coli the biotin operon repressor (BirA) is a bifunctional protein. BirA acts both as the acetyl-CoA carboxylase biotin holoenzyme synthetase and as the biotin operon repressor. DNA sequence analysis of mutations indicates that the helix-turn-helix DNA-binding region is located at the N terminus, while mutations affecting enzyme function, although mapping over a large region, are found mainly in the central part of the protein's primary sequence.
Protein Domain
Name: Leghaemoglobin, iron-binding site
Type: Binding_site
Description: Globins are haem-containing proteins involved in binding and/or transporting oxygen. They belong to a very large and well-studied family that is widely distributed in many organisms. Globins have evolved from a common ancestor and can be divided into three groups: single-domain globins, and two types of chimeric globins, flavohaemoglobins and globin-coupled sensors. Bacteria have all three types of globins, while archaea lack flavohaemoglobins, and eukaryotes lack globin-coupled sensors. Several functionally different haemoglobins can coexist in the same species. The major types of globins include:
  • Haemoglobin (Hb): tetramer of two alpha and two beta chains, although embryonic and foetal forms can substitute the alpha or beta chain for ones with higher oxygen affinity, such as gamma, delta, epsilon or zeta chains. Hb transports oxygen from lungs to other tissues in vertebrates. Hb proteins are also present in unicellular organisms where they act as enzymes or sensors.
  • Myoglobin (Mb): monomeric protein responsible for oxygen storage in vertebrate muscle.
  • Neuroglobin: a myoglobin-like haemoprotein expressed in vertebrate brain and retina, where it is involved in neuroprotection from damage due to hypoxia or ischemia. Neuroglobin belongs to a branch of the globin family that diverged early in evolution.
  • Cytoglobin: an oxygen sensor expressed in multiple tissues. Related to neuroglobin.
  • Erythrocruorin: highly cooperative extracellular respiratory proteins found in annelids and arthropods that are assembled from as many as 180 subunits into hexagonal bilayers.
  • Leghaemoglobin (legHb or symbiotic Hb): occurs in the root nodules of leguminous plants, where it facilitates the diffusion of oxygen to symbiotic bacteroids in order to promote nitrogen fixation.
  • Non-symbiotic haemoglobin (NsHb): occurs in non-leguminous plants, and can be over-expressed in stressed plants.
  • Flavohaemoglobins (FHb): chimeric, with an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD/FAD-binding domain. FHb provides protection against nitric oxide via its C-terminal domain, which transfers electrons to haem in the globin.
  • Globin-coupled sensors: chimeric, with an N-terminal myoglobin-like domain and a C-terminal domain that resembles the cytoplasmic signalling domain of bacterial chemoreceptors. They bind oxygen, and act to initiate an aerotactic response or regulate gene expression.
  • Protoglobin: a single-domain globin found in archaea that is related to the N-terminal domain of globin-coupled sensors.
  • Truncated 2/2 globin: lack the first helix, giving them a 2-over-2 instead of the canonical 3-over-3 α-helical sandwich fold. Can be divided into three main groups (I, II and III) based on structural features.
This entry is found in leghaemoglobins from leguminous and non-leguminous plants, and also non-symbiotic haemoglobins from other plants. Leghaemoglobins were first identified in the root nodules of leguminous plants, where they are crucial for supplying sufficient oxygen to root nodule bacteria for nitrogen fixation to occur.
Although leghaemoglobin and myoglobin share a common fold, and both regulate the facilitated diffusion of oxygen, leghaemoglobins regulate oxygen affinity through a mechanism different from that of myoglobin, using a novel combination of haem pocket amino acids that lower the oxygen affinity. In non-leguminous plants, leghaemoglobins play a role in the respiratory metabolism of root cells. The structure of leghaemoglobins is similar to that of haemoglobins and myoglobins, although there is little sequence conservation. The proteins are largely α-helical, eight helices providing the scaffold for a well-defined haem-binding pocket. By contrast with the tetrameric mammalian globin assembly, the plant form is monomeric. Non-symbiotic haemoglobins (NsHb) play important roles in a variety of cellular processes. A class I NsHb from cotton plants can be induced in plant roots as a defence mechanism against pathogen invasion, possibly by modulating nitric oxide (NO) levels. Several NsHbs appear to play a role in NO scavenging in plants, indicating that the primordial function of haemoglobins may well be to protect against nitrosative stress and to modulate NO signalling functions. The signature pattern of this entry exclusively identifies plant haemoglobin sequences. It is centred on a histidine that acts as the haem iron distal ligand.
Protein Domain
Name: Indole-3-glycerol phosphate synthase, conserved site
Type: Conserved_site
Description: Indole-3-glycerol phosphate synthase (IGPS) catalyses the fourth step in the biosynthesis of tryptophan, the ring closure of 1-(2-carboxy-phenylamino)-1-deoxyribulose into indole-3-glycerol phosphate. In some bacteria, IGPS is a single-chain enzyme. In others, such as Escherichia coli, it is the N-terminal domain of a bifunctional enzyme that also catalyses N-(5'-phosphoribosyl)anthranilate isomerase (PRAI) activity, the third step of tryptophan biosynthesis. In fungi, IGPS is the central domain of a trifunctional enzyme that contains a PRAI C-terminal domain and a glutamine amidotransferase (GATase) N-terminal domain. A structure of the IGPS domain of the bifunctional enzyme from the mesophilic bacterium E. coli (eIGPS) has been compared with the monomeric indole-3-glycerol phosphate synthase from the hyperthermophilic archaeon Sulfolobus solfataricus (sIGPS). Both are single-domain (beta/alpha)8 barrel proteins, with one (eIGPS) or two (sIGPS) additional helices inserted before the first beta strand. This entry represents a highly conserved region within the N-terminal section of IGPS, which has been shown to be part of the active site cavity.
Protein Domain
Name: Indole-3-glycerol phosphate synthase domain
Type: Domain
Description: Indole-3-glycerol phosphate synthase (IGPS) catalyses the fourth step in the biosynthesis of tryptophan, the ring closure of 1-(2-carboxy-phenylamino)-1-deoxyribulose into indole-3-glycerol phosphate. In some bacteria, IGPS is a single-chain enzyme. In others, such as Escherichia coli, it is the N-terminal domain of a bifunctional enzyme that also catalyses N-(5'-phosphoribosyl)anthranilate isomerase (PRAI) activity, the third step of tryptophan biosynthesis. In fungi, IGPS is the central domain of a trifunctional enzyme that contains a PRAI C-terminal domain and a glutamine amidotransferase (GATase) N-terminal domain. A structure of the IGPS domain of the bifunctional enzyme from the mesophilic bacterium E. coli (eIGPS) has been compared with the monomeric indole-3-glycerol phosphate synthase from the hyperthermophilic archaeon Sulfolobus solfataricus (sIGPS). Both are single-domain (beta/alpha)8 barrel proteins, with one (eIGPS) or two (sIGPS) additional helices inserted before the first beta strand.
Protein Domain
Name: Iron-sulfur cluster assembly scaffold protein IscU
Type: Family
Description: Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S]. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: the ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems. The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as an S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly. The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity.
SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. SufA is homologous to IscA, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets. In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N terminus, assembling an FeS cluster that is transferred to nitrogenase apoproteins. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen. This entry represents IscU from the ISC system, a homologue of the N-terminal region of NifU (NIF system), an Fe-S cluster assembly protein found mostly in nitrogen-fixing bacteria. IscU is a scaffold protein on which Fe-S clusters are assembled before transfer to apoproteins. This family includes largely proteobacterial and eukaryotic forms and excludes the true NifU proteins from Klebsiella sp. and Anabaena sp. as well as the archaeal homologues.
Protein Domain
Name: Cytochrome cd1-nitrite reductase-like, haem d1 domain superfamily
Type: Homologous_superfamily
Description: Cytochrome cd1 (cyt cd1) nitrite reductase is a dimeric enzyme of the bacterial periplasm that plays a key role in denitrification, the respiratory reduction of nitrite to nitric oxide in the nitrogen cycle. Each subunit of the cyt cd1 dimer contains one cytochrome c and one d1 haem group. The active site contains a specialised d1 haem, where the nitrite substrate is bound and reduced. This d1 haem is bound in an 8-bladed β-propeller, which is also found in some members of the WD40 repeat-containing proteins. This superfamily represents the d1 haem-binding domain.
Protein Domain
Name: Dim1 family
Type: Family
Description: This entry represents fission yeast Dim1 and its homologues, including Dib1 from budding yeasts, YLS8 from plants and TXNL4 from animals. Dim1 was originally identified as a mitosis protein. Later, it was found to interact with the spliceosome component Prp6, which is involved in pre-mRNA splicing. Dim1 may act at the level of mRNA, which impacts the functioning of the APC/C, a critical complex in controlling mitotic progression. Although the Dim proteins exhibit a thioredoxin-like fold, they lack the disulfide bond required for thioredoxin redox activity.
Protein Domain
Name: Myotubularin-like, phosphatase domain
Type: Domain
Description: This entry represents the phosphatase domain within eukaryotic myotubularin-related proteins. Myotubularin is a dual-specificity lipid phosphatase that dephosphorylates phosphatidylinositol 3-phosphate and phosphatidylinositol (3,5)-bisphosphate. Mutations in genes encoding myotubularin-related proteins have been associated with disease. The protein exists as a dimer with twofold symmetry, in which the dimerization is mediated by the phosphatase domain. Myotubularin phosphatases are members of the protein tyrosine phosphatase (PTP) superfamily. The PTP domain is found in a diverse group of enzymes that catalyse phosphoester hydrolysis using a cysteine nucleophile and an arginine residue that binds to oxygen atoms of the phosphate. These two catalytically essential residues are found in a Cys-x(5)-Arg motif, which is a hallmark of PTP domains. The PTP superfamily of enzymes includes tyrosine-specific, dual-specificity, low molecular weight, and Cdc25 phosphatases. All of these enzymes utilise phosphoproteins as substrates. Unlike these PTPs, enzymes that contain the tensin and myotubularin PTP domains utilise phosphoinositides as their physiological substrates. Myotubularins are 3-phosphatases specific for membrane-embedded PtdIns3P and PtdIns(3,5)P2, two PIs that function within the endosomal-lysosomal pathway. The myotubularin phosphatase domain consists of a central seven-stranded beta sheet flanked by thirteen alpha helices. Although its core structure is similar to that of other PTP superfamily members, the myotubularin phosphatase domain is much larger. It contains an extra C-terminal region, which could be implicated in protein-protein interactions. The active site motif forms a P-loop at the base of a substrate-binding pocket that is characteristic of PTP domains.
This pocket is significantly deeper than that of other PTPs, which could explain the difference in substrate specificity. The myotubularin family includes catalytically inactive members, or pseudophosphatases, which contain inactivating substitutions in the phosphatase domain.
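The Cys-x(5)-Arg motif described above can be illustrated with a short regular-expression scan. This is only a minimal sketch: the function name `find_cx5r` and the toy sequence are hypothetical and are not part of the database or of any annotation pipeline. A match shows that the pattern is present and says nothing about catalytic activity.

```python
import re

# Cys-x(5)-Arg: one cysteine, any five residues, then one arginine.
# The lookahead lets overlapping occurrences be reported as well.
CX5R_LOOKAHEAD = re.compile(r"(?=(C[A-Z]{5}R))")

def find_cx5r(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched residues) for each Cys-x(5)-Arg hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in CX5R_LOOKAHEAD.finditer(seq)]

# Hypothetical toy sequence used only for illustration.
toy_sequence = "MKTAVCSDGWDRSAQL"
print(find_cx5r(toy_sequence))
```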
Protein Domain
Name: BING4, C-terminal domain
Type: Domain
Description: This C-terminal domain is found in the BING4 family of nucleolar WD40 repeat proteins.
Protein Domain
Name: Thermonuclease active site
Type: Active_site
Description: Staphylococcus aureus secretes a thermostable nuclease, known as thermonuclease (TNase), which is a calcium-dependent enzyme that catalyzes the hydrolysis of both DNA and RNA at the 5' position of the phosphodiester bond, yielding 3'-mononucleotides and dinucleotides. This signature contains the three residues, two arginines and a glutamate, that form the active site. The sequence of the TNase of S. aureus is evolutionarily related to TNases of other Staphylococcus spp. as well as to several other proteins.
Protein Domain
Name: Staphylococcal nuclease (SNase-like), OB-fold
Type: Domain
Description: Staphylococcus aureus nuclease (SNase) homologues, previously thought to be restricted to bacteria and archaea, are also found in eukaryotes. Staphylococcal nuclease has a multi-domain organisation. The human cellular coactivator p100 contains four repeats, each of which is an SNase homologue. These repeats are unlikely to possess SNase-like activities as each lacks equivalent SNase catalytic residues, yet they may mediate p100's single-stranded DNA-binding function. A variety of proteins, including many that are still uncharacterised, belong to this group. SNase domains have an OB-fold consisting of a closed or partly open β-barrel with Greek key topology.
Protein Domain
Name: Histone H2B
Type: Family
Description: Histone H2B is one of the five histones, along with H1/H5, H2A, H3 and H4. Two copies of each of the H2A, H2B, H3, and H4 histones assemble to form the core of the nucleosome. The nucleosome forms an octameric structure that wraps DNA in a left-handed manner. Histones can undergo several different types of post-translational modifications that affect transcription, DNA repair, DNA replication and chromosomal stability. The HBR (histone H2B repression) domain within the H2B N terminus is important for nucleosome assembly by the histone chaperone FACT.
Protein Domain
Name: Kinetochore protein Nuf2, N-terminal
Type: Domain
Description: This entry represents the N-terminal domain of kinetochore protein Nuf2, which is part of the Ndc80 complex. This domain folds as a CH domain consisting of a four-helix bundle containing the parallel helices alphaA, alphaC, alphaE, and alphaG, similar to that in Ndc80. The CH domains from Nuf2 and Ndc80 form the globular rod at one end of the complex. The ability of the Ndc80 complex to bind microtubules resides in the tightly packed CH domains from Nuf2 and Ndc80. This complex binds to the spindle and is required for chromosome segregation and spindle checkpoint activity.
Protein Domain
Name: Malate dehydrogenase, type 2
Type: Family
Description: Malate dehydrogenases catalyse the interconversion of malate and oxaloacetate using dinucleotide cofactors. The enzymes in this entry are found in archaea, bacteria and eukaryotes and fall into two distinct groups. The first group are cytoplasmic, NAD-dependent enzymes which participate in the citric acid cycle. The second group are found in plant chloroplasts, use NADP as cofactor, and participate in the C4 cycle. Structural studies indicate that these enzymes are homodimers with very similar overall topology, though the chloroplast enzymes also have N- and C-terminal extensions, and all contain the classical Rossmann fold for NAD(P)H binding. Substrate specificity is determined by a mobile loop at the active site which uses charge balancing to discriminate between the correct substrates (malate and oxaloacetate) and other potential oxo/hydroxyacid substrates the enzyme may encounter within the cell.
Protein Domain
Name: Malate dehydrogenase, NADP-dependent, plants
Type: Family
Description: This entry represents the NADP-dependent malate dehydrogenase found in plants, mosses and green algae and localised to the chloroplast. Malate dehydrogenase converts oxaloacetate into malate, a critical step in the C4 cycle which allows circumvention of the effects of photorespiration. Malate is subsequently transported from the chloroplast to the cytoplasm (and then to the bundle sheath cells in C4 plants). The plant and moss enzymes are light-regulated via cysteine disulphide bonds. The enzyme from sorghum has been crystallised.
Protein Domain
Name: GTP-binding protein TrmE/Aminomethyltransferase GcvT, domain 1
Type: Homologous_superfamily
Description: This entry represents an alpha/beta domain found in GTP-binding protein TrmE (N-terminal domain) and domain 1 of glycine cleavage T-protein (also known as aminomethyltransferase). TrmE is a guanine nucleotide-binding protein conserved between bacteria and eukaryotes. It is involved in the modification of uridine bases at the first anticodon (wobble) position of tRNAs. The N-terminal portion of the protein is necessary for mediating dimer formation within the protein. Glycine cleavage T-protein is part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the oxidative decarboxylation of glycine. The T-protein is an aminomethyl transferase. The N-terminal region (residues 14-35) of domain 1 plays a crucial role in H-protein interaction.
Protein Domain
Name: Glycine cleavage T-protein, C-terminal barrel domain
Type: Domain
Description: This entry represents the C-terminal β-barrel domain of glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase.
Protein Domain
Name: Aminomethyltransferase, folate-binding domain
Type: Domain
Description: This domain is found at the N terminus of glycine cleavage T-proteins, which are part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein (aminomethyltransferase) is a folate-dependent enzyme that catalyses the release of ammonia and the transfer of the methylene carbon unit (C1 unit) to tetrahydrofolate (H4folate) from the aminomethyl intermediate attached to the lipoate cofactor of H-protein. This domain is also found in YgfZ proteins. YgfZ in E. coli is a folate-binding protein involved in RNA modification and regulation of chromosomal replication initiation. YgfZ is not an aminomethyltransferase but is likely a folate-dependent regulatory protein. This domain could represent a folate-binding domain.
Protein Domain
Name: YgfZ/GcvT conserved site
Type: Conserved_site
Description: This entry represents a conserved site, which includes the motif KGCYxGQE, that is found in glycine cleavage system T proteins (GcvT) and in the bacterial tRNA-modifying protein YgfZ.
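As a rough illustration of how a compact motif such as KGCYxGQE can be turned into a search pattern, the sketch below treats each upper-case letter as a fixed residue and a lower-case `x` as any residue. The helper names `motif_to_regex` and `has_site` are hypothetical, and this reading of the notation is an assumption rather than the signature definition used by the database.

```python
import re

def motif_to_regex(motif: str) -> re.Pattern:
    """Translate a compact motif: upper-case letters are literal, 'x' is any residue."""
    parts = ["[A-Z]" if ch == "x" else re.escape(ch) for ch in motif]
    return re.compile("".join(parts))

# Pattern built from the motif quoted in the entry description.
YGFZ_GCVT_SITE = motif_to_regex("KGCYxGQE")

def has_site(sequence: str) -> bool:
    """True if the sequence contains at least one match to the motif."""
    return YGFZ_GCVT_SITE.search(sequence.upper()) is not None

# Hypothetical toy sequences, not real GcvT or YgfZ proteins.
print(has_site("MMKGCYLGQEAA"))
print(has_site("MMKGCYLGQDAA"))
```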
Protein Domain
Name: HAD-superfamily hydrolase, subfamily IIID
Type: Domain
Description: This family of sequences appears to belong to the Haloacid Dehalogenase (HAD) superfamily of enzymes by virtue of the presence of three catalytic domains, in this case: LLVLD(ILV)D(YH)T, I(VMG)IWS, and (DN)(VC)K(PA)Lx{15-17}T(IL)(MH)(FV)DD(IL)(GRS)(RK)N. Since this family has no large "cap" domain between motifs 1 and 2 or between 2 and 3, it is formally a "class III" HAD.
Protein Domain
Name: Ribosomal protein L6
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. L6 is a protein from the large (50S) subunit. In Escherichia coli, it is located in the aminoacyl-tRNA binding site of the peptidyltransferase centre, and is known to bind directly to 23S rRNA. It belongs to a family of ribosomal proteins, including L6 from bacteria, cyanelles (structures that perform similar functions to chloroplasts, but have structural and biochemical characteristics of Cyanobacteria) and mitochondria; and L9 from mammals, Drosophila, plants and yeast. L6 contains two domains with almost identical folds, suggesting that it was derived by the duplication of an ancient RNA-binding protein gene. Analysis reveals several sites on the protein surface where interactions with other ribosome components may occur, the N terminus being involved in protein-protein interactions and the C terminus containing possible RNA-binding sites.
Protein Domain
Name: Ribosomal protein L6, alpha-beta domain
Type: Domain
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named according to the ribosomal subunit to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. L6 is a protein from the large (50S) subunit. In Escherichia coli, it is located in the aminoacyl-tRNA binding site of the peptidyltransferase centre, and is known to bind directly to 23S rRNA. It belongs to a family of ribosomal proteins, including L6 from bacteria, cyanelles (structures that perform similar functions to chloroplasts, but have structural and biochemical characteristics of Cyanobacteria) and mitochondria; and L9 from mammals, Drosophila, plants and yeast. L6 contains two domains with almost identical folds, suggesting that it was derived by the duplication of an ancient RNA-binding protein gene. Analysis reveals several sites on the protein surface where interactions with other ribosome components may occur, the N terminus being involved in protein-protein interactions and the C terminus containing possible RNA-binding sites. This entry represents the α-β domain found duplicated in ribosomal L6 proteins. This domain consists of two β-sheets and one α-helix packed around a single core.
Protein Domain
Name: Ribosomal protein L6, bacterial-type
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid the elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surfaced-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waal's contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ]. L6 is a protein from the large (50S) subunit. In Escherichia coli, it is located in the aminoacyl-tRNA binding site of the peptidyltransferase centre, and is known to bind directly to 23S rRNA. It belongs to a family of ribosomal proteins, including L6 from bacteria, cyanelles (structures that perform similar functions to chloroplasts, but have structural and biochemical characteristics of Cyanobacteria) and mitochondria; and L9 from mammals, Drosophila, plants and yeast. L6 contains two domains with almost identical folds, suggesting that it was derived by the duplication of an ancient RNA-binding protein gene. Analysis reveals several sites on the protein surface where interactions with other ribosome components may occur, the N terminus being involved in protein-protein interactions and the C terminus containing possible RNA-binding sites [].
Protein Domain
Name: Ribosomal protein L6, conserved site
Type: Conserved_site
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ]. This pattern identifies ribosomal protein L6, which is one of the proteins from the large ribosomal subunit. In Escherichia coli, L6 is known to bind directly to the 23S rRNA and is located at the aminoacyl-tRNA binding site of the peptidyltransferase centre. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups: bacterial, algal chloroplast, cyanelle, archaeal and Marchantia polymorpha mitochondrial L6; yeast mitochondrial YmL6 (gene MRPL6); and mammalian, Drosophila melanogaster, plant and yeast L9 [ , , ]. This signature finds the L6 proteins from most organisms, while plant L6 and the L9 proteins are also found in .
Protein Domain
Name: Shugoshin, C-terminal
Type: Domain
Description: This entry represents the C-terminal domain of Shugoshin (Sgo1) kinetochore-attachment proteins. Shugoshin has a conserved coiled-coil N-terminal domain and a highly conserved C-terminal basic region ( ). Shugoshin is a crucial target of Bub1 kinase that plays a central role in chromosome cohesion during mitotic and meiotic divisions by preventing premature dissociation of the cohesin complex from centromeres after prophase, when most of the cohesin complex dissociates from chromosome arms [ , ]. Shugoshin is thought to act by protecting Rec8 and Rad21 at the centromeres from separase degradation during anaphase I (during meiosis) so that sister chromatids remain tethered []. Shugoshin also acts as a spindle checkpoint component required for sensing tension between sister chromatids during mitosis; its degradation when they separate prevents cell cycle arrest and chromosome loss in anaphase, a time when sister chromatids are no longer under tension. Human shugoshin is diffusible and mediates kinetochore-driven formation of kinetochore-microtubules during bipolar spindle assembly []. Further, the primary role of shugoshin is to ensure bipolar attachment of kinetochores, and its role in protecting cohesion has co-developed to facilitate this process [].
Protein Domain
Name: RNA pyrophosphohydrolase RppH
Type: Family
Description: This entry represents an RNA pyrophosphohydrolase belonging to the nudix hydrolase family, RppH subfamily. It accelerates the degradation of transcripts by removing pyrophosphate from the 5'-end of triphosphorylated RNA, leading to a more labile monophosphorylated state that can stimulate subsequent ribonuclease cleavage [].
Protein Domain
Name: Probable zinc-ribbon domain, plant
Type: Domain
Description: This eukaryotic domain has no known function.
Protein Domain
Name: WIBG, Mago-binding
Type: Domain
Description: Partner of Y14 and mago (PYM, also known as WIBG) is a key regulator of the exon junction complex (EJC), a multiprotein complex that associates immediately upstream of the exon-exon junction on mRNAs, serves as a positional landmark for the intron-exon structure of genes and directs post-transcriptional processes in the cytoplasm such as mRNA export, nonsense-mediated mRNA decay (NMD) or translation [ , ]. The N-terminal domain of PYM adopts a small globular all-beta-domain structure, with a three-stranded β-sheet and a contiguous β-hairpin. It binds to both Mago and Y14 [ ].
Protein Domain
Name: Domain of unknown function DUF1221
Type: Domain
Description: This is a group of plant proteins, most of which are hypothetical and of unknown function. All members contain the domain, suggesting that they may possess kinase activity.
Protein Domain
Name: Centromere protein X
Type: Family
Description: Centromere protein X (CENP-X) is a component of several different complexes, including the multisubunit FA complex, the heterotetrameric CENP-T-W-S-X complex and the APITD1/CENPS complex. The Fanconi anemia (FA) core complex is involved in DNA damage repair and genome maintenance. The FA complex is composed of CENPS, FANCA, FANCB, FANCC, FANCE, FANCF, FANCG, FANCL/PHF9, FANCM, FAAP24 and CENPX. CENP-X interacts with CENPS, FANCM and FAAP24 [ , ]. Inner kinetochore subunit mhf2 is the dsDNA-binding component of the FANCM-MHF complex, important for gene conversion at blocked replication forks [] and non-crossover recombination during mitosis and meiosis []. The CENP-T-W-S-X complex binds and supercoils DNA and plays an important role in kinetochore assembly [ ]. The APITD1/CENPS complex is composed of at least CENP-S and CENP-X and is essential for the stable assembly of the outer kinetochore [ ].
Protein Domain
Name: DM13 domain
Type: Domain
Description: The DM13 domain has been identified in animal proteins containing a DOMON domain likely to function as cytochromes involved in as yet unidentified redox reactions potentially related to protein hydroxylation or oxidative cross-linking. However, it is also found in bacteria. The DM13 domain contains a nearly absolutely conserved cysteine, which could potentially be involved in a redox reaction either as a naked thiol group or by binding a prosthetic group like heme [ ]. The DM13 domain is predicted to have a β-strand-rich fold [].
Protein Domain
Name: Protein of unknown function DUF2996
Type: Family
Description: This family of proteins has no known function.
Protein Domain
Name: Nucleolar 27S pre-rRNA processing, Urb2/Npa2, C-terminal
Type: Domain
Description: This entry represents a conserved domain found towards the C terminus of proteins involved in ribosome biogenesis, such as the Urb2 protein from yeast [ ].
Protein Domain
Name: Glycerol-3-phosphate O-acyltransferase, alpha helical bundle, N-terminal
Type: Domain
Description: Stromal glycerol-3-phosphate acyltransferases (GPAT) are responsible for the selective incorporation of saturated and unsaturated fatty-acyl chains into chloroplast membranes, which is an important determinant of a plant's ability to tolerate chilling temperatures [ ]. This entry represents the N-terminal four-helix alpha-helical bundle found in these proteins.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom